
forum - Re: [abinit-forum] problem with parallelization variables

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )


Re: [abinit-forum] problem with parallelization variables


  • From: Mohua Bhattacharya <mohua.simoom@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] problem with parallelization variables
  • Date: Thu, 18 Feb 2010 20:02:10 -0500

Hi Francois Bottin,

Thank you very much for your help.
Mohua

On Thu, Feb 18, 2010 at 3:54 AM, BOTTIN Francois <francois.bottin@cea.fr> wrote:
Your version is too old (about three years old)!
Please switch to a more recent one (5.8.4, for example).
You are in the "stone age" of parallelisation in ABINIT:
I don't know whether, at that time, paral_kgb was already activated, the MPI_FFT precompilation flag already removed, and so on.
Any of these would explain your k-point-only parallelization.

You will find the paral test suite in ~abinit/build/tests.
In this directory you can execute:
make tests_paral paral_host="YourHostName" paral_mode=seqpar
To choose "YourHostName" from among the hosts already defined, see:
~abinit/tests/Scripts/run-parallel-tests.pl


Regards,
Francois

Mohua Bhattacharya wrote:
Hello,

I am trying to do band-FFT parallelization using version 5.3.5 and 8 processors.
I found an example input file, t_bandfft.in (included below), and tried to run it. I am including the input file and snippets of the log file. I was expecting the log file to contain some information showing that band parallelization is activated, but I don't see any. I don't know what is going on, and I really need help with this.

Thank you very much for your time
Regards
Mohua


*input file*


# Gold with one vacancy (107 atoms of gold).
 nstep 15
 ecut   24
 ngfft 108 108 108

#The following parameters are completely unrealistic even for parallel testing
#They were used to check that the file can run correctly in sequential
# (they lead to a 5-minute sequential run).
#To activate these, comment the three lines above.

#nstep 1    nline 1   ecut 2.5

iscf 7 npulayit 10

useylm 0
#For the parallelisation

npband  4 #27
npfft  2    #4
timopt -1
fftalg 401 wfoptalg 4 fft_opt_lob 2 iprcch 0 intxc 0

 nband 648

 acell 3*23.01
 occopt 3 tsmear 0.002  enunit 2    #output in all units
 nkpt 1 istwfk 1

 rprim  1. 0. 0.
        0. 1. 0.
        0. 0. 1.
 toldfe 1.e-7
 znucl  79.0

 natom  107 ntypat 1 typat  107*1
#total number of atoms= 107
...




*Snippets of the log file*
inkpts : istwfk preprocessed, gives following first values (max. 6): 1
 distrb2: enter  mpi_enreg%parareel=           0
 mpi_enreg%paralbd=           0
 mpi_enreg%paral_compil_respfn=           0

 distrb2: WARNING -
 nproc=   8 >= nkpt=   1* nsppol=   1
 The number of processors is larger than nkpt. This is a waste.
 distrb2: exit
 invars1: mkmem  undefined in the input file. Use default mkmem  = nkpt
 invars1: With nkpt_me=    1 and mkmem  =     1, ground state wf handled in core.
 invars1: mkqmem undefined in the input file. Use default mkqmem = nkpt
 invars1: With nkpt_me=    1 and mkqmem =     1, ground state wf handled in core.
 invars1: mk1mem undefined in the input file. Use default mk1mem = nkpt
 invars1: With nkpt_me=    1 and mk1mem =     1, ground state wf handled in core.

 Symmetries : space group Pm -3 m (#221); Bravais cP (primitive cubic)
 inkpts: Sum of    1 k point weights is    1.000000

 inkpts : istwfk preprocessed, gives following first values (max. 6): 1
 chkneu : initialized the occupation numbers for occopt=    3
.......
 For input ecut=  2.400000E+01 best grid ngfft=     108     108     108
      max ecut=  2.717841E+01
 input values of ngfft(1) =108 ngfft(2) =108 ngfft(3) =108 are alright and will be used
 getng: value of mgfft=     108 and nfft=     1259712
 getng: values of ngfft(4),ngfft(5),ngfft(6)     109     109     108
 getmpw: optimal value of mpw=   68315

 getdim_nloc : deduce lmnmax  =   8, lnmax  =   2,
                     lmnmaxso=   8, lnmaxso=   2.
 memory : analysis of memory needs
================================================================================
 Values of the parameters that define the memory need of the present run
  intxc =         0  ionmov =         0    iscf =         7     ixc =        1
 lmnmax =         2   lnmax =         2   mband =       648  mffmem =        1
P  mgfft =       108   mkmem =         1 mpssoang=         3     mpw =    68315
 mqgrid =      3001   natom =       107    nfft =   1259712    nkpt =        1
 nloalg =         4  nspden =         1 nspinor =         1  nsppol =        1
   nsym =        48  n1xccc =      2501  ntypat =         1  occopt =        3
================================================================================
P This job should need less than                    1117.178 Mbytes of memory.
 Rough estimation (10% accuracy) of disk space for files :
 WF disk file :    675.480 Mbytes ; DEN or POT disk file :      9.613 Mbytes.
================================================================================

 Biggest array : cg(disk), with    675.4799 MBytes.
-P-0000  leave_test : synchronization done...
 memana : allocated an array of    675.480 Mbytes, for testing purposes.
 memana : allocated    1117.178 Mbytes, for testing purposes.
 The job will continue.
 -outvars: echo values of preprocessed input variables --------
    acell    2.3010000000E+01  2.3010000000E+01  2.3010000000E+01 Bohr
      amu    1.96966540E+02
     ecut    2.40000000E+01 Hartree
   enunit         2
   fftalg       401
 fft_opt_l         2
   iprcch         0
istwfk      1
P    mkmem         1
    natom       107
    npfft         2
   npband         4
 npulayit        10
    nband       648
    ngfft       108     108     108
     nkpt         1
    nstep        15
     nsym        48
   ntypat         1



 getdim_nloc : deduce lmnmax  =   8, lnmax  =   2,
                     lmnmaxso=   8, lnmaxso=   2.
 distrb2: enter  mpi_enreg%parareel=           0
 mpi_enreg%paralbd=           0
 mpi_enreg%paral_compil_respfn=           0

 distrb2: WARNING -
 nproc=   8 >= nkpt=   1* nsppol=   1
 The number of processors is larger than nkpt. This is a waste.
 distrb2: exit
.......
pspatm: atomic psp has been read  and splines computed

  1.14368199E+07                                ecore*ucvol(ha*bohr**3)
-P-0000  wfconv:   648 bands initialized randomly with npw= 68315, for ikpt=     1
-P-0000  leave_test : synchronization done...
 newkpt: loop on k-points done in parallel
 pareigocc : MPI_ALLREDUCE
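
A log like the one above can be checked automatically for the distrb2 warning, which in this thread was the visible sign that the 8 processors were being distributed over k-points only. The following is a minimal Python sketch, not part of ABINIT: the function name kpoint_only_warning is hypothetical, and it assumes the warning keeps exactly the wording shown in this log (other versions may print something different).

```python
import re

def kpoint_only_warning(log_text):
    """Return (nproc, nkpt) if the distrb2 'waste' warning is present, else None.

    Assumption: the warning text matches the wording quoted in this thread.
    """
    if "The number of processors is larger than nkpt" not in log_text:
        return None
    match = re.search(r"nproc=\s*(\d+)\s*>=\s*nkpt=\s*(\d+)", log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Excerpt copied from the log in this message.
sample = """
 distrb2: WARNING -
 nproc=   8 >= nkpt=   1* nsppol=   1
 The number of processors is larger than nkpt. This is a waste.
"""
result = kpoint_only_warning(sample)
print(result)
```

A non-None result only says that the warning was printed; it does not by itself explain why band-FFT parallelization was not activated.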







On Wed, Feb 17, 2010 at 10:32 AM, Mohua Bhattacharya <mohua.simoom@gmail.com> wrote:

   Dear Francois Bottin,

   Thank you very much for your input. The version that I am using
   is 5.3.5.
   I changed the relevant variables to

   paral_kgb 1
   npband 2
   npfft 4

   but I get the same problem. I was hoping you could tell me
   where to find the paral tests that you mentioned.
   It would be really helpful.

   Thank you very much for your time,
   Regards
   Mohua


   On Wed, Feb 17, 2010 at 3:04 AM, BOTTIN Francois
   <francois.bottin@cea.fr> wrote:

       It's strange: paral_kgb 1 seems to have no effect. Your
       parallelization is performed over k-points only, with
       2 k-points on each processor and 7 processors in all.
       Which version are you using?
       I suggest that you run tests P, R, T, X & Y of the paral
       test series to check whether your build can perform
       calculations in parallel using paral_kgb 1.

       I would also mention that your input variables are not set
       properly. See:
       http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varpar.html#npband
       and all the related variables.
       For example, your input file gives nproc=npband*npfft=4x4=16,
       not the 8 you want (set 4x2 in your case).
       In addition, npband has to be a divisor of nband (10 in your
       case, which 4 does not divide).
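
The two rules above can be checked before submitting a job. The following is a minimal Python sketch, not part of ABINIT: the function name check_kgb_layout is hypothetical, and the extra npkpt factor in the product is an assumption that goes beyond the formula quoted above (it defaults to 1, as in the input file under discussion).

```python
def check_kgb_layout(nproc, npband, npfft, nband, npkpt=1):
    """Return a list of problems with a proposed band/FFT processor layout.

    Rules taken from the message above: the processor grid must multiply
    out to nproc, and npband must divide nband. Including npkpt in the
    product is an assumption of this sketch.
    """
    problems = []
    product = npkpt * npband * npfft
    if product != nproc:
        problems.append(f"npkpt*npband*npfft = {product}, but nproc = {nproc}")
    if nband % npband != 0:
        problems.append(f"npband = {npband} does not divide nband = {nband}")
    return problems

# First attempt in this thread: 8 processors, npband 4, npfft 4, 10 bands.
first_try = check_kgb_layout(nproc=8, npband=4, npfft=4, nband=10)
# Suggested grid for the 648-band gold test: npband 4, npfft 2.
gold_test = check_kgb_layout(nproc=8, npband=4, npfft=2, nband=648)

print(first_try)
print(gold_test)
```

An empty list only means that these two rules are satisfied; it says nothing about whether the installed version actually supports paral_kgb.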

       Best regards,
       Francois

       Mohua Bhattacharya wrote:

           Hello,
           I am trying to optimize the bulk Nb structure with ABINIT. I
           have 14 k-points, 10 bands and 8 processors. I want to
           use both k-point and band-FFT parallelization. (This is just
           a test that I am running before moving on to larger
           systems.) So I set my parallelization variables as

           paral_kgb 1
           npband 4
           npfft 4
           npkpt 1

           wfoptalg 4
           fft_opt_lob 2
           fftalg 401
           iprcch 0
           istwfk 14*1

           The output file shows


           P newkpt: treating     10 bands with npw=    1190 for
           ikpt=   1 by node    0
           P newkpt: treating     10 bands with npw=    1191 for
           ikpt=   2 by node    0
           P newkpt: treating     10 bands with npw=    1192 for
           ikpt=   3 by node    1
           P newkpt: treating     10 bands with npw=    1205 for
           ikpt=   4 by node    1
           P newkpt: treating     10 bands with npw=    1190 for
           ikpt=   5 by node    2
           P newkpt: treating     10 bands with npw=    1191 for
           ikpt=   6 by node    2
           P newkpt: treating     10 bands with npw=    1196 for
           ikpt=   7 by node    3
           P newkpt: treating     10 bands with npw=    1194 for
           ikpt=   8 by node    3
           P newkpt: treating     10 bands with npw=    1195 for
           ikpt=   9 by node    4
           P newkpt: treating     10 bands with npw=    1220 for
           ikpt=  10 by node    4
           P newkpt: treating     10 bands with npw=    1196 for
           ikpt=  11 by node    5
           P newkpt: treating     10 bands with npw=    1192 for
           ikpt=  12 by node    5
           P newkpt: treating     10 bands with npw=    1204 for
           ikpt=  13 by node    6
           P newkpt: treating     10 bands with npw=    1212 for
           ikpt=  14 by node    6
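
The distribution in the listing above can be tallied with a short script. The following is a minimal Python sketch, not part of ABINIT: the function name kpoints_per_node is hypothetical, and it assumes the output keeps the "ikpt= N by node M" wording shown above, possibly wrapped over two lines as in this e-mail.

```python
import re
from collections import Counter

def kpoints_per_node(output_text):
    """Count how many k-points each node treats, from 'newkpt: treating' lines."""
    # Collapse all whitespace so that entries wrapped over two lines still match.
    flat = " ".join(output_text.split())
    pairs = re.findall(r"ikpt=\s*(\d+) by node\s*(\d+)", flat)
    return Counter(int(node) for _ikpt, node in pairs)

# First three entries copied from the output above.
sample = """
P newkpt: treating     10 bands with npw=    1190 for
ikpt=   1 by node    0
P newkpt: treating     10 bands with npw=    1191 for
ikpt=   2 by node    0
P newkpt: treating     10 bands with npw=    1192 for
ikpt=   3 by node    1
"""
counts = kpoints_per_node(sample)
print(dict(counts))
```

Applied to the full 14-entry listing, the tally gives two k-points on each of nodes 0 to 6, which is a pure k-point distribution over 7 processors.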



           and the log file has

           mkrho: loop on k-points and spins done in parallel

           It looks like band parallelization is not even invoked.
           Could someone please help me with this?
           I am confused about how I should instruct ABINIT to use
           a certain number of processors for band parallelization and
           some others for k-point parallelization.

           Thank you very much for your time,
           Regards

           Mohua





       --
       ##############################################################
       Francois Bottin                    tel: 01 69 26 41 73
       CEA/DIF                            fax: 01 69 26 70 77
       BP 12 Bruyeres-le-Chatel         email: Francois.Bottin@cea.fr
       ##############################################################









