forum@abinit.org
Subject: The ABINIT Users Mailing List ( CLOSED )
- From: matthieu verstraete <matthieu.jean.verstraete@gmail.com>
- To: forum@abinit.org
- Subject: Re: [abinit-forum] response-function jobs crash prematurely
- Date: Mon, 2 Mar 2009 17:18:05 +0100
Please follow netiquette _thoroughly_ - we need a lot more information.
As for the bug: one was indeed fixed in 5.6, but that one caused a crash before the calculation of the perturbation, not after. As a rule, the computer is always right, so something is using up your memory...
Matthieu
On Sat, Feb 28, 2009 at 6:58 PM, P. Ganesh <pganesh@ciw.edu> wrote:
Dear Takeshi,
Thanks for the suggestions.
I have previously run similar calculations for the same system on another cluster, a quad-core machine with ~1 GB per processor, and had no problems. Moreover, in the present case the calculation runs if I submit it with nodes=3:ppn=6, but it fails with nodes=6:ppn=6, nodes=18:ppn=6, or nodes=27:ppn=4. This seems to imply that the memory per processor is sufficient for the calculation to proceed, so I don't understand the standard error message reporting insufficient virtual memory. Also, the jobs always crash right after the first perturbation has finished (see the end of the log file below).
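For reference, the total process counts behind those four submissions can be worked out as follows. This is only a minimal sketch: it assumes the usual PBS/Torque reading of "nodes=N:ppn=M" as N nodes with M processes each, and the helper name total_ranks is hypothetical.

```python
import re


def total_ranks(spec: str) -> int:
    """Return nodes * ppn for a 'nodes=N:ppn=M' resource string."""
    match = re.fullmatch(r"nodes=(\d+):ppn=(\d+)", spec)
    if match is None:
        raise ValueError(f"unrecognised resource spec: {spec!r}")
    nodes, ppn = map(int, match.groups())
    return nodes * ppn


# Outcomes exactly as reported in the message above.
reported = {
    "nodes=3:ppn=6": "runs",
    "nodes=6:ppn=6": "fails",
    "nodes=18:ppn=6": "fails",
    "nodes=27:ppn=4": "fails",
}

# Total number of MPI processes requested by each submission.
counts = {spec: total_ranks(spec) for spec in reported}

for spec, outcome in reported.items():
    print(f"{spec:<16} {counts[spec]:>4} MPI processes  {outcome}")
```

Under that assumption only the 18-process run completes, while the failing runs use 36 or 108 processes. Among the listed submissions, the rank numbers 88 and 92 in the abort messages can only come from one of the 108-process runs.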
I looked up the earlier mailing list thread "frustrating RF calculations", but a way around my problem wasn't evident from it, except that the scaling of the array with k-points was supposedly going to be fixed in v5.6.5. But I get the same error (as detailed above) with v5.6.5.
Thanks,
Ganesh
Last few lines of the 'log' file:
----iterations are completed or convergence reached----
Thirteen components of 2nd-order total energy (hartree) are
1,2,3: 0th-order hamiltonian combined with 1st-order wavefunctions
kin0= 1.46751026E+04 eigvalue= 9.16178409E+02 local= 6.04883542E+03
4,5,6: 1st-order hamiltonian combined with 1st and 0th-order wfs
loc psp = -1.18192722E+03 Hartree= 1.02037734E+03 xc= -2.65606951E+02
note that "loc psp" includes a xc core correction that could be resolved
7,8,9: eventually, occupation + non-local contributions
edocc= 0.00000000E+00 enl0= 1.22011047E+04 enl1= -6.81403715E+04
1-9 gives the relaxation energy (to be shifted if some occ is /=2.0)
erelax= -3.47263072E+04
10,11,12 Non-relaxation contributions : frozen-wavefunctions and Ewald
fr.local= 2.58350344E+02 fr.nonlo= 3.44237800E+04 Ewald= 2.13861335E+02
prtene3 : non-relax= 3.489599E+04
13,14 Frozen wf xc core corrections (1) and (2)
frxc 1 = 0.00000000E+00 frxc 2 = 0.00000000E+00
Resulting in :
2DEtotal= 0.1696844672E+03 Ha. Also 2DEtotal= 0.461734917110E+04 eV
(2DErelax= -3.4726307230E+04 Ha. 2DEnonrelax= 3.4895991698E+04 Ha)
( non-var. 2DEtotal : 2.3484233656E+02 Ha)
rank 92 in job 1 abe0224_35176 caused collective abort of all ranks
exit status of rank 92: killed by signal 9
rank 88 in job 1 abe0224_35176 caused collective abort of all ranks
exit status of rank 88: killed by signal 9
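The two abort messages can be decoded mechanically. The following is a minimal Python sketch, assuming a POSIX system (where signal 9 is SIGKILL); the variable names are illustrative only. SIGKILL is what the Linux out-of-memory killer sends, and batch systems enforcing a memory limit commonly use it too. That would be consistent with the reported virtual-memory error, although the log alone does not prove the cause.

```python
import re
import signal

# The abort lines copied from the end of the log above.
log_tail = """\
rank 92 in job 1 abe0224_35176 caused collective abort of all ranks
exit status of rank 92: killed by signal 9
rank 88 in job 1 abe0224_35176 caused collective abort of all ranks
exit status of rank 88: killed by signal 9
"""

pattern = re.compile(r"exit status of rank (\d+): killed by signal (\d+)")

# Map each killed rank to the symbolic name of the signal it received.
killed = [
    (int(rank), signal.Signals(int(signum)).name)
    for rank, signum in pattern.findall(log_tail)
]

for rank, name in killed:
    print(f"rank {rank}: {name}")
```

On the nodes where those ranks ran, the system log (for example the output of dmesg) is the usual place to check whether the kernel killed the processes for lack of memory.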
NISHIMATSU Takeshi wrote:
--- I wrote:
1GB of memory per processor
You need more memory >= 4GB, I guess.
Or search in this ML with keyword of "frustrating RF calculation".
--
Takeshi
-- It is the very mind itself That leads the mind astray; Of the mind, It is essential to lose it, But, do not be mindless. (The Unfettered Mind)
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Matthieu Verstraete
European Theoretical Spectroscopy Facility (ETSF)
Dpto. Fisica de Materiales,
U. del Pais Vasco,
Centro Joxe Mari Korta, Av. de Tolosa, 72, Phone: +34-943018393
E-20018 Donostia-San Sebastian, Spain Fax : +34-943018390
Mail : matthieu.jean.verstraete@gmail.com
http://www-users.york.ac.uk/~mjv500