forum - Re: [abinit-forum] parallelism over bands in ABINIT

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )


  • From: "Guillaume Dumont" <dumont.guillaume@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] parallelism over bands in ABINIT
  • Date: Thu, 2 Nov 2006 10:54:21 -0500

Dear Dr. Geneste,

I have also had this kind of problem with abinit-4.6.5, but I can't recall exactly what the error message was. I have since tested the new band and FFT parallelism available in abinit 5.2.3, and it seems to work well. If you already have a `hostname`.ac file, you only have to add the following lines to it:

enable_parallel="yes"
enable_mpi="yes"
with_mpi_cppflags="-DMPI_FFT"
enable_smart_config="no"

Then in your abinit input file you have to use the following input variables:

fftalg 401
wfoptalg 4
fft_opt_lob 2
npband x
npfft y
iprcch 0

where x and y are integers that can take values between 1 and the number of processors you are using (8 in your case) with the following restrictions:

x*y = npband*npfft = number of cpus
npband has to be a divisor of nband (nband % npband = 0)
npfft has to be a divisor of BOTH ngfft(2) and ngfft(3)
(ngfft(2) % npfft = 0 and ngfft(3) % npfft = 0)

In my own experience, I had some trouble when either npband or npfft was set to 1, so you may want to avoid those cases.
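
To make the restrictions concrete, here is a small Python sketch that lists the admissible (npband, npfft) pairs for a given processor count. It is not part of ABINIT; the function name valid_splits is made up for illustration. It reads the restrictions as divisibility conditions: npband divides nband, and npfft divides both ngfft(2) and ngfft(3). The values ngfft(2) = ngfft(3) = 64 are an assumption for the example only, since the real values depend on your cell and cutoff and can be read from the ABINIT output.

```python
def valid_splits(ncpus, nband, ngfft2, ngfft3, allow_one=True):
    """List (npband, npfft) pairs compatible with the restrictions.

    Assumed reading of the restrictions:
      npband * npfft == ncpus
      nband  % npband == 0
      ngfft2 % npfft  == 0 and ngfft3 % npfft == 0
    With allow_one=False, pairs in which either value is 1 are dropped.
    """
    pairs = []
    for npband in range(1, ncpus + 1):
        if ncpus % npband != 0:
            continue  # npband must divide the processor count
        npfft = ncpus // npband
        if nband % npband != 0:
            continue  # bands must split evenly over npband
        if ngfft2 % npfft != 0 or ngfft3 % npfft != 0:
            continue  # FFT grid must split evenly over npfft
        if not allow_one and 1 in (npband, npfft):
            continue  # optionally skip the cases reported as troublesome
        pairs.append((npband, npfft))
    return pairs


if __name__ == "__main__":
    # 8 processors and 144 bands as in the original question;
    # the FFT grid dimensions (64, 64) are hypothetical.
    for npband, npfft in valid_splits(8, 144, 64, 64, allow_one=False):
        print(f"npband {npband}   npfft {npfft}")
```

Under these assumed grid dimensions, the pairs left for 8 processors and 144 bands once the value 1 is excluded are npband 2 with npfft 4, and npband 4 with npfft 2.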

Hope this helps



On 11/2/06, Geneste Gregory < Gregory.Geneste@ecp.fr > wrote:
Dear all,

we are presently trying to run the parallel version of ABINIT.
Our calculation contains 2 k-points and 144 bands.
We set nbdblock to 4 and wfoptalg to 1. The input file seems OK since it
runs perfectly with the sequential version of the code.
We run on 8 processors, and we systematically get the same error message:

ITER STEP NUMBER     1
vtorho : nnsclo_now=  2, note that nnsclo,dbl_nnsclo,istep=  0 0  1
1525-108 Error encountered while attempting to allocate a data object.  The
program will stop.
ERROR: 0031-300  Forcing all remote tasks to exit due to exit code 1 in task 6
ERROR: 0031-250  task 5: Terminated
ERROR: 0031-250  task 2: Terminated
ERROR: 0031-250  task 3: Terminated
ERROR: 0031-250  task 7: Terminated
ERROR: 0031-250  task 0: Terminated
ERROR: 0031-250  task 1: Terminated
ERROR: 0031-250  task 4: Terminated


The calculation is performed on an IBM power4 machine.
Did anybody obtain this kind of message? Can anyone identify the problem?
Thanks in advance for your help.

--
Gregory Geneste
Enseignant Chercheur
Laboratoire SPMS
Ecole Centrale de Paris
Grande Voie des Vignes
92295 Chatenay-Malabry cedex
tel : 01 41 13 16 23




--
Guillaume Dumont
=========================
guillaume.dumont.1@umontreal.ca
dumont.guillaume@gmail.com
(514) 341 5298
(514) 343 6111 ext. 13279

