forum@abinit.org
Subject: The ABINIT Users Mailing List ( CLOSED )
- From: "Toby D. Young" <tyoung@ippt.gov.pl>
- To: forum@abinit.org
- Subject: parallel bail out
- Date: Tue, 22 Jan 2008 10:52:07 +0100
- Organization: IPPT, PAS
Hello users,
I am having trouble running my input file on a cluster system with 4
processors; that's two two-processor machines. I'm quite a newbie at
abinitp.
I can see two worrying messages in the log file:
p1_7266: p4_error: alloc_p4_msg failed: 0
which (after some other output) continues with
p2_6059: p4_error: interrupt SIGx: 13
p2_6059: (955.289062) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0-1 exited with status 1.
Googling did not help much, except to suggest that this may be an
mpiexec problem with writing to disk(?). Has anyone had this problem, or
does anyone know how to get around it?
I haven't had any trouble running the parallel tests / tutorials.
Thanks in advance.
Best,
Toby
=====================
complete message
p1_7266: (949.214844) xx_shmalloc: returning NULL; requested 3280880 bytes
p1_7266: (949.214844) p4_shmalloc returning NULL; request = 3280880 bytes
You can increase the amount of memory by setting the environment variable
P4_GLOBMEMSIZE (in bytes); the current size is 4194304
p1_7266: p4_error: alloc_p4_msg failed: 0
=====================
iter Etot(hartree) deltaE(h) residm vres2 diffor maxfor
getcut: wavevector= 0.0000 0.0000 0.0000 ngfft= 80 80 128
ecut(hartree)= 50.000 => boxcut(ratio)= 2.05901
ewald : nr and ng are 3 and 25
ITER STEP NUMBER 1
vtorho : nnsclo_now= 2, note that nnsclo,dbl_nnsclo,istep= 0 0 1
-P-0000 leave_test : synchronization done...
vtorho: loop on k-points and spins done in parallel
p2_6059: p4_error: interrupt SIGx: 13
p2_6059: (955.289062) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0-1 exited with status 1.
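The shmalloc lines in the log point at one thing to try: a request for 3280880 bytes failed against a shared-memory pool of 4194304 bytes, and the message itself names P4_GLOBMEMSIZE as the setting that raises the limit. The sketch below only does the arithmetic. It reads the two numbers out of the log text and prints a candidate export line. The function name `suggest_globmemsize` and the multiplier `factor=4` are assumptions made for illustration, not MPICH recommendations, and a larger pool may or may not cure the later SIGPIPE (errno = 32) on the other process.

```python
import re

# Excerpt of the log quoted in this message.
LOG = """\
p1_7266: (949.214844) xx_shmalloc: returning NULL; requested 3280880 bytes
You can increase the amount of memory by setting the environment variable
P4_GLOBMEMSIZE (in bytes); the current size is 4194304
"""


def suggest_globmemsize(log_text, factor=4):
    """Return a candidate P4_GLOBMEMSIZE in bytes, or None if the log has no hint.

    `factor` is an arbitrary illustrative multiplier (an assumption),
    not a value taken from the MPICH documentation.
    """
    requested = re.search(r"requested\s+(\d+)\s+bytes", log_text)
    current = re.search(r"current size is\s+(\d+)", log_text)
    if not (requested and current):
        return None
    # Grow the pool well past both the failed request and the old limit.
    return factor * max(int(requested.group(1)), int(current.group(1)))


size = suggest_globmemsize(LOG)
print(f"export P4_GLOBMEMSIZE={size}")
```

The printed line would have to take effect in the environment of every MPI process on both machines. How that is propagated depends on the launcher and the batch system, so treat it as a starting point rather than a fix.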
--
Toby D. Young - Adiunkt (Assistant Professor)
Department of Computational Science
Institute of Fundamental Technological Research
Polish Academy of Sciences
Room 206, Swietokrzyska 21
00-049 Warsaw, POLAND