forum@abinit.org
Subject: The ABINIT Users Mailing List ( CLOSED )
List archive
- From: "Josef W. Zwanziger" <jzwanzig@jzwanzig.org>
- To: forum@abinit.org
- Subject: Re: [abinit-forum] ecutsm and electronic convergence
- Date: Thu, 18 Dec 2008 08:49:13 -0800 (PST)
OK, here are some test results. One-sentence summary: I *think* the problem is the wfoptalg 4 algorithm, and npband /= 1 adds a little bit too. But the biggest problem seems to be in wfoptalg 4.
Josef W. Zwanziger
Professor of Chemistry
Canada Research Chair in NMR Studies of Materials
Director, Atlantic Region Magnetic Resonance Centre
Department of Chemistry
Dalhousie University
6274 Coburg Road
Halifax, Nova Scotia B3H 4J3 Canada
tel: +1.902.494.1960
fax: +1.902.494.1310
web: http://jwz.chem.dal.ca
jzwanzig@jzwanzig.org,jzwanzig@dal.ca
From: BOTTIN Francois <francois.bottin@cea.fr>
To: forum@abinit.org
Sent: Thursday, December 18, 2008 10:09:13 AM
Subject: Re: [abinit-forum] ecutsm and electronic convergence
Hi Joe,
Josef W. Zwanziger wrote:
I have one last question.
Is this problem related to the band-only or to the two-dimensional band/FFT level of parallelization?
Could you please check whether the convergence problems also appear with npfft/=1 and npband=1?
Many thanks,
Francois
Compiler: Intel 10.1 + MKL + MPICH2 compiled with Intel 10.1
System: BCC iron primitive unit cell, atompaw-generated PAW with parameters from http://www.abinit.org/PAW/MAIN/ATOMICDATA/026-Fe/
(this is Marc's set I believe).
ngkpt 8 8 8
tolvrs = 1.0D-16 (extremely tight convergence to exaggerate any differences)
ecut 15
pawecutdg 30
ecutsm1 0.0
ecutsm2 0.5
1) Baseline data:
paral_kgb 0,
wfoptalg = 10 (default)
Convergence at ETOT 25 and ETOT 24 in the two cases.
2) paral_kgb 0, wfoptalg 4, nloalg 4, fftalg 401:
Two cases: ETOT 24, ETOT 36
note: you can't run the defaults on nloalg and fftalg with wfoptalg 4.
3) paral_kgb 1, npband 4, npfft 1, npkpt 4, fft_opt_lob 2 together with wfoptalg 4 from above
two cases: ETOT 23, ETOT 41
4) as in 3, but npband 1, npfft 1, npkpt 4
ETOT 23, ETOT 37
5) npband 1, npfft 4, npkpt 4, otherwise as in 3
ETOT 22, ETOT 38
So, phenomenologically, the presence of ecutsm /= 0 makes convergence take 50% longer in the wfoptalg 4 case, and the further presence of npband /= 1 adds another 20%.
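As a rough check of the arithmetic behind this summary, the short Python sketch below recomputes the relative change in the number of SCF steps for each of the five cases. The step counts are copied from the results above; it assumes that the first number in each pair is the ecutsm 0.0 run and the second the ecutsm 0.5 run. The dictionary labels and the function name are only illustrative.

```python
# SCF step at which each run converged ("ETOT n"), copied from cases 1-5.
# Assumption: each pair is ordered (ecutsm = 0.0, ecutsm = 0.5).
runs = {
    "1: paral_kgb 0, wfoptalg 10": (25, 24),
    "2: paral_kgb 0, wfoptalg 4": (24, 36),
    "3: npband 4, npfft 1, npkpt 4": (23, 41),
    "4: npband 1, npfft 1, npkpt 4": (23, 37),
    "5: npband 1, npfft 4, npkpt 4": (22, 38),
}


def extra_steps_percent(ecutsm_off, ecutsm_on):
    """Relative change (in percent) in step count when ecutsm is non-zero."""
    return 100.0 * (ecutsm_on - ecutsm_off) / ecutsm_off


slowdown = {label: extra_steps_percent(off, on) for label, (off, on) in runs.items()}

for label, pct in slowdown.items():
    print(f"{label:32s} {pct:+6.1f} %")
```

With these numbers the baseline (case 1) is essentially unchanged, case 2 comes out at 50%, and cases 3 to 5 fall between roughly 60% and 80%, with the npband 4 run (case 3) the largest.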
From: BOTTIN Francois <francois.bottin@cea.fr>
To: forum@abinit.org
Sent: Thursday, December 18, 2008 10:09:13 AM
Subject: Re: [abinit-forum] ecutsm and electronic convergence
Hi Joe,
Josef W. Zwanziger wrote:
Hi,
I replicated Emmanuel's problem with paral_kgb set to 1. With k-point-only parallelization, everything works fine (with and without ecutsm). It seems to go bad only with band parallelization turned on.
Thanks, this is good news.
I have one last question.
Is this problem related to the band-only or to the two-dimensional band/FFT level of parallelization?
Could you please check whether the convergence problems also appear with npfft/=1 and npband=1?
Many thanks,
Francois
Joe
From: BOTTIN Francois <francois.bottin@cea.fr>
To: forum@abinit.org
Sent: Thursday, December 18, 2008 9:48:05 AM
Subject: Re: [abinit-forum] ecutsm and electronic convergence
Hi Emmanuel, Joe and Marc,
Emmanuel Arras wrote:
I got the same results as you: it works just fine without band parallelization.
The parameters I use are:
paral_kgb 1
fftalg 401
wfoptalg 4
nloalg 4
I'm afraid I'm not qualified to look into this any further... I'll just have to avoid using ecutsm in the future, and hope the bug has no other consequences...
Thanks a lot for the help.
Emmanuel ARRAS
PS: the input file I gave you was an old version. I agree with everything you say. (However, it is interesting to note that it is actually not necessary to define paral_kgb 1 to turn on the band parallel stuff. Somehow, it works without it, but twice as slowly. Thus it can be confusing.)
Your problem is very interesting and I have followed the discussion closely (I am concerned by, and perhaps implicated in, the problem...). I have a few comments and questions about it.
Emmanuel, I think the band/FFT/k-point parallelism is not available without explicitly setting paral_kgb=1.
(see http://www.abinit.org/Infos_v5.6/input_variables/varpar.html#paral_kgb)
The default is paral_kgb=0, which is the traditional k-point-only parallelization.
So, if you don't explicitly write paral_kgb=1, you are working with the k-point-only parallelization.
In this case, npband, npkpt and npfft are not used (if I'm right, the echo of outvars is probably misleading), and you just use the number of processors you request at execution.
This probably explains your scaling and timing report.
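To make the point above concrete, here is a minimal Python sketch of the rule as described in this message. It is not ABINIT code: the function name, the returned dictionary and the processor count in the example are hypothetical, and the only behaviour it encodes is "unless paral_kgb is explicitly 1, npkpt/npband/npfft are disregarded and the run is k-point-only".

```python
KGB_VARS = ("npkpt", "npband", "npfft")


def effective_parallelism(params, nproc):
    """Illustrate which parallelization applies, following the description above.

    params : dict of variables as written in the input file
    nproc  : number of processors requested at execution (hypothetical here)
    """
    requested = {name: params[name] for name in KGB_VARS if name in params}
    if params.get("paral_kgb", 0) != 1:
        # Default (paral_kgb = 0): traditional k-point-only parallelization;
        # any npkpt/npband/npfft present in the input is not used.
        return {"mode": "k-point only", "nproc": nproc, "ignored": sorted(requested)}
    # paral_kgb = 1: the band/FFT/k-point distribution is taken from the input.
    return {"mode": "band/FFT/k-point", "nproc": nproc, "used": requested}


# Input with npband set but paral_kgb omitted, as in the PS quoted above.
print(effective_parallelism({"npband": 4, "npkpt": 4}, nproc=16))
# Same input with paral_kgb 1 written explicitly.
print(effective_parallelism({"paral_kgb": 1, "npband": 4, "npkpt": 4}, nproc=16))
```

Under this reading, the first input runs on all requested processors in k-point-only mode even though npband appears in the file.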
I also have some questions. Which of the following causes the convergence problems, compared with sequential CG and k-point-parallel CG (paral_kgb=0, wfoptalg=0):
the triple parallelization (paral_kgb=1, npkpt/=1, npband=npfft=1),
the band level of parallelization (paral_kgb=1 and npband/=1), or
the LOBPCG algorithm (paral_kgb=0, wfoptalg=4 with or without nloalg=4 and fftalg=401)?
Emmanuel, could you send me your LAST input and output files (as well as the pseudo)? Many thanks.
Perhaps you have detected a problem that has been present in the triple parallelization or in LOBPCG for a long time and
is related to other convergence problems seen elsewhere.
Good work,
Francois
--
Francois Bottin           tel: 01 69 26 41 73
CEA/DIF                   fax: 01 69 26 70 77
BP 12 Bruyeres-le-Chatel  email: Francois.Bottin@cea.fr