forum@abinit.org
Subject: The ABINIT Users Mailing List ( CLOSED )
- From: "P. Ganesh" <pganesh@ciw.edu>
- To: forum@abinit.org
- Subject: Re: [abinit-forum] response-function jobs crash prematurely
- Date: Sat, 28 Feb 2009 12:58:26 -0500
Dear Takeshi,

Thanks for the suggestions. I have previously run similar calculations for the same system on another cluster, a quad-core machine with ~1 GB per processor, and had no problem. Even so, for the present case I found that I can get the calculations to run if I submit with:

    nodes=3:ppn=6

but they fail when I submit with:

    nodes=6:ppn=6
    nodes=18:ppn=6
    nodes=27:ppn=4

This seems to imply that the memory per processor is sufficient for the calculations to proceed. But then I don't understand the standard error message that says insufficient virtual memory. Also, the jobs always crash right after the first perturbation has finished (see the log excerpt below).

I had looked up the previous mailing list post "frustrating RF calculations", but the way around my problem wasn't evident from it, except that the scaling of the array with k-points was supposedly going to be fixed in v5.6.5. But I get the same error (as detailed above) with v5.6.5.

Thanks,
Ganesh

Last few lines of the 'log' file:

    ----iterations are completed or convergence reached----

    Thirteen components of 2nd-order total energy (hartree) are
    1,2,3: 0th-order hamiltonian combined with 1st-order wavefunctions
        kin0= 1.46751026E+04 eigvalue= 9.16178409E+02 local= 6.04883542E+03
    4,5,6: 1st-order hamiltonian combined with 1st and 0th-order wfs
        loc psp = -1.18192722E+03 Hartree= 1.02037734E+03 xc= -2.65606951E+02
    note that "loc psp" includes a xc core correction that could be resolved
    7,8,9: eventually, occupation + non-local contributions
        edocc= 0.00000000E+00 enl0= 1.22011047E+04 enl1= -6.81403715E+04
    1-9 gives the relaxation energy (to be shifted if some occ is /=2.0)
        erelax= -3.47263072E+04
    10,11,12 Non-relaxation contributions : frozen-wavefunctions and Ewald
        fr.local= 2.58350344E+02 fr.nonlo= 3.44237800E+04 Ewald= 2.13861335E+02
    prtene3 : non-relax= 3.489599E+04
    13,14 Frozen wf xc core corrections (1) and (2)
        frxc 1 = 0.00000000E+00 frxc 2 = 0.00000000E+00
    Resulting in :
    2DEtotal= 0.1696844672E+03 Ha. Also 2DEtotal= 0.461734917110E+04 eV
        (2DErelax= -3.4726307230E+04 Ha. 2DEnonrelax= 3.4895991698E+04 Ha)
        ( non-var. 2DEtotal : 2.3484233656E+02 Ha)

    rank 92 in job 1 abe0224_35176 caused collective abort of all ranks
      exit status of rank 92: killed by signal 9
    rank 88 in job 1 abe0224_35176 caused collective abort of all ranks
      exit status of rank 88: killed by signal 9

NISHIMATSU Takeshi wrote:
> --- I wrote:
> > 1GB of memory per processor
> You need more memory >= 4GB, I guess.
> Or search in this ML with keyword of "frustrating RF calculation".
> --
> Takeshi

--
It is the very mind itself
That leads the mind astray;
Of the mind,
It is essential to lose it,
But, do not be mindless.
(The Unfettered Mind)
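For readers following the thread, the node requests quoted in the message can be turned into total process counts with a small sketch. It assumes one MPI rank per requested core (nodes times ppn), which the message does not state explicitly; the names `requests`, `total_ranks`, and `candidates` are illustrative only. Since ranks are numbered from 0, an abort reported by rank 92 needs a job with at least 93 ranks, which narrows down which of the listed requests could have produced the log excerpt.

```python
# Node requests quoted in the message: (nodes, ppn, outcome reported by the poster).
requests = [
    (3, 6, "ran"),
    (6, 6, "failed"),
    (18, 6, "failed"),
    (27, 4, "failed"),
]


def total_ranks(nodes, ppn):
    # Assumption (not stated in the message): one MPI rank per requested core.
    return nodes * ppn


summary = [(n, p, total_ranks(n, p), outcome) for n, p, outcome in requests]
for n, p, ranks, outcome in summary:
    print(f"nodes={n}:ppn={p} -> {ranks} ranks, {outcome}")

# Ranks are numbered from 0, so a message from rank 92 needs at least 93 ranks.
highest_rank_seen = 92
candidates = [(n, p) for n, p, ranks, _ in summary if ranks > highest_rank_seen]
print("requests large enough to contain rank 92:", candidates)
```

Under that assumption the only request that ran used 18 ranks, while every failing request used 36 or more, so the failures track the total number of processes rather than the number of processes per node. This is only a consistency check on the numbers in the message, not a diagnosis of the crash.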