
Re: [abinit-forum] problem with parallelization variables


  • From: Mohua Bhattacharya <mohua.simoom@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] problem with parallelization variables
  • Date: Wed, 17 Feb 2010 16:00:03 -0500

Hello,

I am trying to do band-FFT parallelization using version 5.3.5 and 8 processors.
I found an example input file, t_bandfft.in (attached below), and tried to run it. I am including the input file and snippets of the log file. I was expecting the log file to contain some indication that band parallelization is activated, but I don't see any, and I don't know what is going on. I would really appreciate some help with this.
Thank you very much for your time
Regards
Mohua


Input file:


# Gold with one vacancy (107 atoms of gold). 

 nstep 15
 ecut   24
 ngfft 108 108 108

#The following parameters are completely unrealistic, even for parallel testing.
#They were used to check that the file runs correctly in sequential mode
# (they lead to a 5-minute sequential run).
#To activate them, comment out the three lines above.

#nstep 1    nline 1   ecut 2.5

iscf 7 npulayit 10

useylm 0 

#For the parallelisation

npband  4 #27 
npfft  2    #4
timopt -1
fftalg 401 wfoptalg 4 fft_opt_lob 2 
iprcch 0 intxc 0

 nband 648

 acell 3*23.01
 occopt 3 tsmear 0.002 
 enunit 2    #output in all units
 nkpt 1 istwfk 1

 rprim  1. 0. 0.
          0. 1. 0. 
          0. 0. 1. 
 toldfe 1.e-7
 znucl  79.0

 natom  107 ntypat 1 typat  107*1
#total number of atoms= 107
...
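
For reference, here is a minimal sketch of the parallelization block this 8-processor run would seem to need, based on the advice in the quoted replies below. The paral_kgb line is not visible in the portion of t_bandfft.in reproduced above, so this block is an assumption, not the original file:

 # Hypothetical parallelization block for 8 MPI processes
 # (not part of t_bandfft.in as shown above)
 paral_kgb 1                              # activate the k-point/band/FFT parallelization scheme
 npband 4                                 # 4 processes over bands; 4 divides nband = 648
 npfft  2                                 # 2 processes over the FFT; npband * npfft = 8 = nproc
 wfoptalg 4  fftalg 401  fft_opt_lob 2    # kept as in the example file above
 istwfk 1                                 # kept as in the example file above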




Snippets of the log file:



inkpts : istwfk preprocessed, gives following first values (max. 6): 1
  distrb2: enter 
  mpi_enreg%parareel=           0
  mpi_enreg%paralbd=           0
  mpi_enreg%paral_compil_respfn=           0

 distrb2: WARNING -
  nproc=   8 >= nkpt=   1* nsppol=   1
  The number of processors is larger than nkpt. This is a waste.
  distrb2: exit 
 invars1: mkmem  undefined in the input file. Use default mkmem  = nkpt
 invars1: With nkpt_me=    1 and mkmem  =     1, ground state wf handled in core.
 invars1: mkqmem undefined in the input file. Use default mkqmem = nkpt
 invars1: With nkpt_me=    1 and mkqmem =     1, ground state wf handled in core.
 invars1: mk1mem undefined in the input file. Use default mk1mem = nkpt
 invars1: With nkpt_me=    1 and mk1mem =     1, ground state wf handled in core.

 Symmetries : space group Pm -3 m (#221); Bravais cP (primitive cubic)
 inkpts: Sum of    1 k point weights is    1.000000

 inkpts : istwfk preprocessed, gives following first values (max. 6): 1
 chkneu : initialized the occupation numbers for occopt=    3
.......
 For input ecut=  2.400000E+01 best grid ngfft=     108     108     108
       max ecut=  2.717841E+01
 input values of ngfft(1) =108 ngfft(2) =108 ngfft(3) =108 are alright and will be used
 getng: value of mgfft=     108 and nfft=     1259712
 getng: values of ngfft(4),ngfft(5),ngfft(6)     109     109     108
 getmpw: optimal value of mpw=   68315

 getdim_nloc : deduce lmnmax  =   8, lnmax  =   2,
                      lmnmaxso=   8, lnmaxso=   2.
 memory : analysis of memory needs
================================================================================
 Values of the parameters that define the memory need of the present run
   intxc =         0  ionmov =         0    iscf =         7     ixc =         1
  lmnmax =         2   lnmax =         2   mband =       648  mffmem =         1
P  mgfft =       108   mkmem =         1 mpssoang=         3     mpw =     68315
  mqgrid =      3001   natom =       107    nfft =   1259712    nkpt =         1
  nloalg =         4  nspden =         1 nspinor =         1  nsppol =         1
    nsym =        48  n1xccc =      2501  ntypat =         1  occopt =         3
================================================================================
P This job should need less than                    1117.178 Mbytes of memory.
  Rough estimation (10% accuracy) of disk space for files :
  WF disk file :    675.480 Mbytes ; DEN or POT disk file :      9.613 Mbytes.
================================================================================

 Biggest array : cg(disk), with    675.4799 MBytes.
-P-0000  leave_test : synchronization done...
 memana : allocated an array of    675.480 Mbytes, for testing purposes.
 memana : allocated    1117.178 Mbytes, for testing purposes.
 The job will continue.
 -outvars: echo values of preprocessed input variables --------
     acell    2.3010000000E+01  2.3010000000E+01  2.3010000000E+01 Bohr
       amu    1.96966540E+02
      ecut    2.40000000E+01 Hartree
    enunit         2
    fftalg       401
 fft_opt_l         2
    iprcch         0
istwfk      1
P    mkmem         1
     natom       107
     npfft         2
    npband         4
  npulayit        10
     nband       648
     ngfft       108     108     108
      nkpt         1
     nstep        15
      nsym        48
    ntypat         1



 getdim_nloc : deduce lmnmax  =   8, lnmax  =   2,
                      lmnmaxso=   8, lnmaxso=   2.
  distrb2: enter 
  mpi_enreg%parareel=           0
  mpi_enreg%paralbd=           0
  mpi_enreg%paral_compil_respfn=           0

 distrb2: WARNING -
  nproc=   8 >= nkpt=   1* nsppol=   1
  The number of processors is larger than nkpt. This is a waste.
  distrb2: exit 
.......
pspatm: atomic psp has been read  and splines computed

   1.14368199E+07                                ecore*ucvol(ha*bohr**3)
-P-0000  wfconv:   648 bands initialized randomly with npw= 68315, for ikpt=     1
-P-0000  leave_test : synchronization done...
 newkpt: loop on k-points done in parallel
 pareigocc : MPI_ALLREDUCE







On Wed, Feb 17, 2010 at 10:32 AM, Mohua Bhattacharya <mohua.simoom@gmail.com> wrote:
Dear Francois Bottin,

Thank you very much for your inputs. The version that I am using is 5.3.5.
I changed the necessary variables to

paral_kgb 1
npband 2
npfft 4

but it has the same problem. I was hoping you could tell me where to find the paral tests that you mentioned; that would be really helpful.

Thank you very much for your time,
Regards
Mohua


On Wed, Feb 17, 2010 at 3:04 AM, BOTTIN Francois <francois.bottin@cea.fr> wrote:
That's strange; paral_kgb 1 seems to have no effect. Your parallelization is
performed on k-points only, with 2 k-points on each processor and 7 processors in total.
Which version are you using?
I suggest you launch tests P, R, T, X & Y of the paral test series to
check the ability of your code to perform calculations in parallel using paral_kgb 1.

I would also mention that your input variables are not set properly. See:
http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varpar.html#npband
and all the related variables.
For example, in your input file nproc = npband*npfft = 4x4 = 16, not 8 as you want (set 4x2 in your case).
In addition, npband has to be a divisor of nband (10 in your case).
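
Put concretely, here is a worked sketch based only on the two constraints just stated (it is not part of the original message, and it assumes npkpt stays at 1 so that only npband and npfft enter the product):

 # With 8 MPI processes and nband = 10:
 #   npband * npfft = nproc = 8
 #   npband must divide nband = 10  ->  npband in {1, 2, 5, 10}
 paral_kgb 1
 npband 2      # 2 divides nband = 10
 npfft  4      # 2 * 4 = 8 = number of MPI processes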

Best regards,
Francois

Mohua Bhattacharya wrote:

Hello,
 I am trying to optimize a bulk Nb structure with abinit. I have 14 k-points, 10 bands and 8 processors. I want to use both k-point and band-FFT parallelization. (This is just a test that I am running before moving on to larger systems.) So I set my parallelization variables as

paral_kgb 1
npband 4
npfft 4
npkpt 1

wfoptalg 4
fft_opt_lob 2
fftalg 401
iprcch 0
istwfk 14*1

The output file shows


P newkpt: treating     10 bands with npw=    1190 for ikpt=   1 by node    0
P newkpt: treating     10 bands with npw=    1191 for ikpt=   2 by node    0
P newkpt: treating     10 bands with npw=    1192 for ikpt=   3 by node    1
P newkpt: treating     10 bands with npw=    1205 for ikpt=   4 by node    1
P newkpt: treating     10 bands with npw=    1190 for ikpt=   5 by node    2
P newkpt: treating     10 bands with npw=    1191 for ikpt=   6 by node    2
P newkpt: treating     10 bands with npw=    1196 for ikpt=   7 by node    3
P newkpt: treating     10 bands with npw=    1194 for ikpt=   8 by node    3
P newkpt: treating     10 bands with npw=    1195 for ikpt=   9 by node    4
P newkpt: treating     10 bands with npw=    1220 for ikpt=  10 by node    4
P newkpt: treating     10 bands with npw=    1196 for ikpt=  11 by node    5
P newkpt: treating     10 bands with npw=    1192 for ikpt=  12 by node    5
P newkpt: treating     10 bands with npw=    1204 for ikpt=  13 by node    6
P newkpt: treating     10 bands with npw=    1212 for ikpt=  14 by node    6



and the log file has

mkrho: loop on k-points and spins done in parallel

It looks like band parallelization is not even invoked. Could someone please help me with this?
I am confused about how I should instruct abinit to use a certain number of processors for band parallelization and others for k-point parallelization.
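
For what it is worth, here is one hedged sketch of how the 8 processors could be split between k-points and bands for 14 k-points and 10 bands. It assumes the distribution follows nproc = npkpt * npband * npfft and that npband must divide nband; these assumptions are drawn from the reply above, not from this message, and the exact combination is only illustrative:

 # Hypothetical split of 8 MPI processes between k-point and band parallelization
 paral_kgb 1
 npkpt  2      # 2 groups of processes over the 14 k-points (7 k-points per group)
 npband 2      # 2 processes over the 10 bands; 2 divides nband = 10
 npfft  2      # 2 processes over the FFT; 2 * 2 * 2 = 8 = nproc
 # (npfft should also be compatible with the FFT grid dimensions)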

Thank you very much for your time,
Regards

Mohua





--
##############################################################
Francois Bottin                    tel: 01 69 26 41 73
CEA/DIF                            fax: 01 69 26 70 77
BP 12 Bruyeres-le-Chatel         email: Francois.Bottin@cea.fr
##############################################################





