
forum - Re: RE : [abinit-forum] PAW parallel calculation remains idle

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: BOTTIN Francois <francois.bottin@cea.fr>
  • To: forum@abinit.org
  • Subject: Re: RE : [abinit-forum] PAW parallel calculation remains idle
  • Date: Fri, 31 Jul 2009 10:00:04 +0200
  • Organization: CEA-DAM

Dear Xenophon,

The npband keyword is meaningful only for the paral_kgb=1 parallelization (with wfoptalg=4 or 14),
and your problems do not come from it. If you want, you can continue with it (in PAW).

For band-only parallelization (paral_kgb=0 with wfoptalg=1 or 11),
the npband keyword no longer has any meaning.
In this case, the distribution over processors is driven by nbdblock.
That is why I said I suspect you are running sequentially when you use wfoptalg=1
with npband≠1 (by default, nbdblock is equal to 1 in this case).
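
The keyword rules described above can be summarized in a short Python sketch. It only encodes what is stated in this thread (npband applies with paral_kgb=1 and wfoptalg=4 or 14; nbdblock applies with paral_kgb=0 and wfoptalg=1 or 11). The function names are hypothetical and are not part of ABINIT, and combinations not discussed here are reported as unknown.

```python
def band_distribution_keyword(paral_kgb, wfoptalg):
    """Return the keyword that drives band distribution, per the rules in this thread."""
    if paral_kgb == 1 and wfoptalg in (4, 14):
        return "npband"
    if paral_kgb == 0 and wfoptalg in (1, 11):
        return "nbdblock"
    return None  # combination not discussed in this thread


def ignored_keywords(inp):
    """List band keywords present in `inp` (a dict) that have no effect for its scheme."""
    # Assumption: paral_kgb is treated as 0 when it is absent from the input.
    active = band_distribution_keyword(inp.get("paral_kgb", 0), inp.get("wfoptalg"))
    if active is None:
        return []  # unknown scheme: make no claim
    return [k for k in ("npband", "nbdblock") if k in inp and k != active]


# The case suspected to run sequentially: wfoptalg 1 with npband 6 and no nbdblock.
print(ignored_keywords({"wfoptalg": 1, "npband": 6}))  # ['npband']
```

With wfoptalg 1 and only npband set, the sketch reports npband as having no effect, which matches the suspicion that such a run falls back to nbdblock=1.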

To summarize, you have trouble using wfoptalg=1 (or 11) with nbdblock≠1 in parallel, only in PAW and not with norm-conserving (NC) pseudopotentials.
Could you try the calculation (wfoptalg=1 and nbdblock=6) sequentially (as is done in v3/t41) in order
to verify whether it works at the sequential level?

If yes (as expected), that would suggest that the problem comes from an MPI statement (probably
at the vtowfk.F90 level).
As I said previously, I have not found any parallel test case equivalent to the sequential v3/t41 one.
I think this feature of ABINIT (wfoptalg=1, nbdblock≠1 in parallel) is no longer maintained.

Thanks,
Francois

Xenophon Krokidis wrote:
Dear Marc,

Yes, the behavior with wfoptalg=11 is the same as with wfoptalg=1. However, the problem is related to the "nbdblock" keyword, because as soon as I use npband instead I have no problem.

Xenophon Krokidis

Xenophon Krokidis, PhD
*Scienomics* <www.scienomics.com>
T: + 33 (0)1 40 07 56 60
F: + 33 (0)1 71 19 75 83
M: + 33 (0)6 76 68 06 47



Marc.TORRENT@cea.fr wrote:
Dear Xenophon,

I'm not sure that wfoptalg=1 ever worked with PAW in any past version of ABINIT.
It is an old "parallelization over bands" that was implemented before the
introduction of PAW.
More precisely:
It uses the conjugate gradient algorithm by blocks...
But at the time of the implementation (as PAW did not exist yet), the conjugate gradient
solved a standard eigenvalue problem (H.Psi=Eps.Psi); for PAW, the problem is now a
generalized eigenvalue problem (H.Psi=Eps.S.Psi)... There is probably an issue with
the treatment of this "S" overlap operator with wfoptalg=1.
But, as the wfoptalg=4/14 algorithm is much more efficient than the
wfoptalg=1/11 one, we never worked on the latter with PAW.
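
The difference between the two eigenvalue problems mentioned above can be illustrated with a small NumPy example. This is only a sketch on hypothetical random matrices (H, S and their size are made up for illustration). It is not ABINIT's conjugate gradient, and it does not diagnose the hang discussed in this thread. It shows that eigenpairs of the standard problem do not satisfy H.Psi=Eps.S.Psi once S differs from the identity, and that the generalized solutions are orthonormal with respect to S.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Hypothetical small symmetric "Hamiltonian" H and positive-definite overlap S.
A = rng.standard_normal((n, n))
H = (A + A.T) / 2
B = rng.standard_normal((n, n))
S = B @ B.T + n * np.eye(n)

# Standard problem H psi = eps psi (the case where S is the identity).
eps_std, psi_std = np.linalg.eigh(H)

# Generalized problem H psi = eps S psi, reduced to a standard one with the
# Cholesky factor S = L L^T:  (L^-1 H L^-T) y = eps y,  psi = L^-T y.
L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
eps_gen, y = np.linalg.eigh(Linv @ H @ Linv.T)
psi_gen = Linv.T @ y

# Residuals of H psi - eps S psi, column by column.
residual_gen = np.linalg.norm(H @ psi_gen - (S @ psi_gen) * eps_gen)
residual_wrong = np.linalg.norm(H @ psi_std - (S @ psi_std) * eps_std)

# Generalized eigenvectors are orthonormal in the S metric: psi^T S psi = 1.
overlap = psi_gen.T @ S @ psi_gen

print(residual_gen, residual_wrong)
```

The first residual is at round-off level while the second is not, so a solver written for the standard problem needs explicit handling of S (in both the residual and the orthonormalization) before it can be reused for the generalized one.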

By the way, did you test wfoptalg=11 ?
I suspect the issue is the same.


Marc Torrent


-------- Original message --------
From: Xenophon Krokidis [mailto:Xenophon.Krokidis@scienomics.com]
Date: Mon. 27/07/2009 17:32
To: forum@abinit.org
Subject: Re: [abinit-forum] PAW parallel calculation remains idle
Dear François,

Let me summarize the situation. I want to perform a PAW calculation at
one k-point (0.5 0.5 0.5). Below I list the trials I ran and my observations.


* PAW

    * Parallel (wfoptalg 1; nbdblock 6): ABINIT stays idle.
    * Parallel (wfoptalg 1; npband 6): ABINIT runs fine.
    * Parallel (wfoptalg 14; npband 6; paral_kgb 1, ...): ABINIT runs fine.
    * Sequential: ABINIT runs fine.

* Norm-conserving pseudopotentials

    * Parallel (wfoptalg 1; nbdblock 6): ABINIT runs fine.
    * Sequential: ABINIT runs fine.

So it seems that there is a problem when parallelization over bands is
defined via the nbdblock keyword.

Xenophon

Xenophon Krokidis, PhD
Scienomics
T: + 33 (0)1 40 07 56 60
F: + 33 (0)1 71 19 75 83
M: + 33 (0)6 76 68 06 47


BOTTIN Francois wrote:
Dear Xenophon,
Xenophon Krokidis wrote:

Dear Francois,
I followed Emmanuel's and your recommendations and I still get problems. I summarize below:
Using the input (see below), I get a message saying that only Gamma points should work and that I need to set istwfk 1.

If you set shiftk 0 0 0 in your input file, as Emmanuel said, you should get the Gamma point (kpt 0 0 0). The code should then detect it and echo istwfk 2.
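
The effect of shiftk on a 1 1 1 grid can be sketched in a few lines of Python. The assumption here is the usual shifted-grid construction k = (n + shift) / ngkpt in reduced coordinates for a single shift vector, ignoring symmetry reduction and folding into the first Brillouin zone. kpoint_grid is a hypothetical helper, not an ABINIT routine.

```python
from itertools import product


def kpoint_grid(ngkpt, shiftk):
    """Reduced k-points (n + shift) / ngkpt for a single shift vector."""
    return [
        tuple((n + s) / g for n, s, g in zip(idx, shiftk, ngkpt))
        for idx in product(*(range(g) for g in ngkpt))
    ]


def is_gamma_only(kpts):
    """True when the grid consists of the single point (0, 0, 0)."""
    return len(kpts) == 1 and all(c == 0 for c in kpts[0])


shifted = kpoint_grid((1, 1, 1), (0.5, 0.5, 0.5))
unshifted = kpoint_grid((1, 1, 1), (0, 0, 0))
print(shifted, is_gamma_only(shifted))      # [(0.5, 0.5, 0.5)] False
print(unshifted, is_gamma_only(unshifted))  # [(0.0, 0.0, 0.0)] True
```

With ngkpt 1 1 1, a shift of 0.5 0.5 0.5 gives the single k-point (0.5 0.5 0.5) used in the original runs, while shiftk 0 0 0 gives Gamma.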

Giving this option allows the calculation to run; however, the output reports that all 30 bands are treated by node 0. So the question is: am I running in parallel over bands or over FFT?

Yes. At the top level of ABINIT, it is the plane-wave coefficients that are distributed over the processors. Deeper within the code, the distribution switches between this one and a band distribution. For more information, see the article F. Bottin, S. Leroux, A. Knyazev and G. Zerah, /Large scale ab initio calculations based on three levels of parallelization/, Comput. Mat. Science *42*, 329 (2008), or the presentations at the 3rd and 4th ABINIT Workshops.


Moreover, reading http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varpar.html#paral_kgb I found the following (which makes me think that kgb parallelization does not work in the PAW case; is this correct?):
The keywords wfoptalg <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/vardev.html#wfoptalg> =4, nloalg <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/vardev.html#nloalg> =4, fftalg <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/vardev.html#fftalg> =401, intxc <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/vardev.html#intxc> =0, and fft_opt_lob <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varpar.html#fft_opt_lob> =2 have to be used for a band/FFT/k-point parallelisation. Spin-polarized ( nsppol <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varbas.html#nsppol> =2 or nspden <http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varbas.html#nspden> =2) as well as PAW method calculations are now allowed in production in the framework of band/FFT/k-point parallelisation.

No. "... are NOW allowed in ....". The PAW/spin polarized calculations with bandfft parallelization are in production and should work.
Regards, Francois
NB: What is going on with wfoptalg=1 in parallel? I cannot find any parallel test for this case, only a sequential one (v3/t41). Dear Xenophon, does your calculation work with wfoptalg=1 nbdblock=6 in sequential?


Best regards, Xenophon

# pALA-paw
paral_kgb 1
npband 6
wfoptalg 14
# keywords for kgb parallelization
nloalg 4
fftalg 401
intxc 0
fft_opt_lob 2
strprecon 0.1
acell 4.601012 7.043279 6.411174 Angstr
angdeg 89.847876 89.911537 62.696451
chkprim 0
ecut 15 Hartree
ecutsm 0.5 Hartree
pawecutdg 45 Hartree
ionmov 2
ntime 200
tolmxf 0.00005
kptrlen 20.0
nkpt 0
kptopt 1
ngkpt 1 1 1

Xenophon Krokidis, PhD
*Scienomics* <www.scienomics.com>
T: + 33 (0)1 40 07 56 60
F: + 33 (0)1 71 19 75 83
M: + 33 (0)6 76 68 06 47


BOTTIN Francois wrote:

Dear Xenophon,
In fact, you are using the band-only parallelization driven by wfoptalg=1 and nbdblock=6. What Emmanuel suggests is to use the triple parallelization (wfoptalg=14, npband=6). Note that, in this case, you also need paral_kgb=1 and a few other input variables (see Emmanuel's email). But perhaps you need a feature that is not available in the second parallelization scheme!
Moreover, if your calculation works fine with the second scheme, this will give indications about the problem.
Regards, Francois
Xenophon Krokidis wrote:

Dear Emmanuel,
Thank you for the response. Please find some more details below.
a) The system I am simulating is a beta-sheet structure of polyalanine (a polymer chain, not an isolated molecule), therefore there are periodic boundary conditions. The 0.5 0.5 0.5 k-point gives excellent results compared to Gamma.
b) The same input but with FHI potentials (norm-conserving, not PAW ones) works.
c) The parallelization I want to use is over bands only.
d) The same input (with PAW) runs easily on one core (the parallel one does not run).
e) I am using ABINIT 5.8.3.
Thank you for your kind support.
Xenophon
Xenophon Krokidis, PhD
*Scienomics* <www.scienomics.com>
T: + 33 (0)1 40 07 56 60
F: + 33 (0)1 71 19 75 83
M: + 33 (0)6 76 68 06 47


Emmanuel Arras wrote:

Hi,
First, there is a tab after nband, and abinit doesn't like that.
Then, I'm guessing the computation is too heavy to be performed on only one core, and that is why nothing happens. Indeed, you are talking about a parallel calculation, and since you have only a single k-point I guess you mean band/FFT parallelisation. In that case you must set /paral_kgb 1/ to activate band parallelisation (since abinit 5.6 I'd say, but I'm not sure). You must then specify npband, npfft, npkpt (see http://www.abinit.org/documentation/helpfiles/for-v5.8/input_variables/varpar.html#paral_kgb)

Two other things:
- I guess you are computing a molecule, and I guess you want to compute only the Gamma point, so you should add /shiftk 0 0 0/ because the default is 0.5 0.5 0.5.
- If you want to perform the computation for the isolated molecule, I'd say your box is too small, and you don't need /optcell/, but that may not be your aim.
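
A stray tab such as the one reported after nband at the start of this message is easy to miss by eye. The following is a minimal Python sketch for locating tab characters in the text of an input file. find_tabs and the sample text are hypothetical and only illustrate the check; nothing here is an ABINIT tool.

```python
def find_tabs(text):
    """Return (line_number, column) for every tab character in `text` (1-based)."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((lineno, col))
    return hits


# Hypothetical excerpt with a tab between "nband" and its value.
sample = "nstep 50\nnband\t30\nnsppol 1\n"
print(find_tabs(sample))  # [(2, 6)]

# Replacing each tab with a single space removes the problem characters.
cleaned = sample.replace("\t", " ")
print(find_tabs(cleaned))  # []
```

To check a real file, the same function can be applied to its contents read as a string.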

Emmanuel ARRAS


xenophon.krokidis@scienomics.com wrote:

Dear All,
I am trying to perform a PAW calculation with the setup given below. However, no SCF step is printed out, although the calculation seems to be running (several wfoptalg values were tried, such as 1 and 11). Any ideas?
Thank you in advance for your kind help.
Xenophon
nbdblock 6
wfoptalg=1
strprecon 0.1
acell 5.0557 6.9684 6.1975 Angstr
angdeg 90.1725 89.9883 55.884
chkprim 0
ecut 15 Hartree
ecutsm 0.5 Hartree
pawecutdg 45 Hartree
ionmov 2
ntime 200
tolmxf 0.00005
nkpt 0
kptopt 1
ngkpt 1 1 1
nstep 50
ixc 11
nband 30
nsppol 1
natom 20
nsym 1
symrel 1 0 0 0 1 0 0 0 1
ntypat 4
occopt 1
strtarget 0 0 0 0 0 0
dilatmx 1.1
optcell 2
prtden 1
toldff 0.000001
typat 1 2 2 1 2 2 2 2 3 3 4 4 4 4 4 4 4 4 4 4
xred
  0.514506 0.517578 0.337544
  0.354529 0.65856 0.149121
  0.543363 0.523401 0.948735
  0.429453 0.418779 0.837167
  0.589942 0.278259 0.647364
  0.401054 0.412616 0.448164
  0.321035 0.887354 0.165817
  0.625195 0.0490203 0.661742
  0.792921 0.505765 0.898215
  0.152579 0.428644 0.397331
  0.722061 0.497055 0.381453
  0.220395 0.441001 0.880055
  0.819528 0.25585 0.644386
  0.554285 0.85897 0.171695
  0.192376 0.999226 0.0304452
  0.191765 0.978743 0.310754
  0.753587 0.958288 0.80614
  0.392761 0.076104 0.666813
  0.756119 0.937023 0.526133
  0.124522 0.681636 0.145734
znucl 7 6 8 1


-- Emmanuel ARRAS L_Sim (Laboratoire de Simulation Atomistique) SP2M / INAC CEA Grenoble tel : 00 33 (0)4 387 86862







--
##############################################################
Francois Bottin tel: 01 69 26 41 73
CEA/DIF fax: 01 69 26 70 77
BP 12 Bruyeres-le-Chatel email: Francois.Bottin@cea.fr
##############################################################





