forum - Re: [abinit-forum] parallel bail out

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: "Zhang Ting" <zhangting1980323@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] parallel bail out
  • Date: Thu, 24 Jan 2008 19:43:47 +0800

Hi, I think the problem may be a matter of write permissions on disk. As far as I remember, you can add a "umask" setting to your environment file .bashrc, so that files created by abinip can be read and written by other machines.
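
To illustrate what a umask setting changes, here is a minimal Python sketch, assuming a POSIX system. It creates a scratch file under two different masks and reports the resulting permission bits. The helper name mode_of_new_file and the file name probe.out are made up for this illustration, and the two mask values are only examples, not a recommendation for any particular cluster.

```python
import os
import stat
import tempfile


def mode_of_new_file(mask):
    """Create a scratch file under the given umask and return its permission bits.

    Hypothetical helper for illustration only; assumes a POSIX filesystem.
    """
    old_mask = os.umask(mask)  # os.umask returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as scratch:
            path = os.path.join(scratch, "probe.out")
            # Request rw for everyone; the umask then clears the masked bits.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old_mask)  # always restore the caller's mask


restrictive = mode_of_new_file(0o077)  # owner-only access
permissive = mode_of_new_file(0o002)   # group may also write
print(oct(restrictive), oct(permissive))
```

A umask line in .bashrc only affects processes started from shells that read that file, so whether it reaches the MPI processes on the other nodes depends on how they are launched.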


Regards

                                                        Zhang Ting
                                                        Peking Univ.
                                                        Jan, 24th, 2008

2008/1/22, Toby D. Young <tyoung@ippt.gov.pl>:



Hello users,

I am having trouble running my input file on a cluster system with 4
processors; that's two two-processor machines. I'm quite a newbie at
abinip.

I can see from the log file two worrying messages:

p1_7266:  p4_error: alloc_p4_msg failed: 0

which (after some other output) continues with

p2_6059:  p4_error: interrupt SIGx: 13
p2_6059: (955.289062) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0-1 exited with status 1.

Googling did not help much, except to suggest that this may be an
mpiexec problem with writing to disk(?). Has anyone had this problem,
or does anyone know how to get round it?

I haven't had any trouble running the parallel tests / tutorials.

Thanks in advance.
Best,
        Toby



=====================
complete message


p1_7266: (949.214844) xx_shmalloc: returning NULL; requested 3280880 bytes
p1_7266: (949.214844) p4_shmalloc returning NULL; request = 3280880 bytes
You can increase the amount of memory by setting the environment variable P4_GLOBMEMSIZE (in bytes); the current size is 4194304
p1_7266:  p4_error: alloc_p4_msg failed: 0
=====================
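
The message above names its own remedy: raise P4_GLOBMEMSIZE, which is given in bytes. As a rough sketch (not a tested fix), the Python below pulls the two numbers out of that log text and builds an environment with a larger value. The factor of 4 is an arbitrary assumption, and the names parse_p4_sizes, proposed and env are invented for this example. Whether the variable reaches the processes on the remote nodes depends on the MPI launcher.

```python
import os
import re

# Log text copied from the message above (line wrapping as posted).
LOG = """p1_7266: (949.214844) xx_shmalloc: returning NULL; requested 3280880
bytes p1_7266: (949.214844) p4_shmalloc returning NULL; request =
3280880 bytes You can increase the amount of memory by setting the
environment variable P4_GLOBMEMSIZE (in bytes); the current size is
4194304 p1_7266:  p4_error: alloc_p4_msg failed: 0"""


def parse_p4_sizes(log):
    """Return (requested_bytes, current_pool_bytes) from the p4 log text."""
    requested = int(re.search(r"requested\s+(\d+)", log).group(1))
    current = int(re.search(r"current size is\s+(\d+)", log).group(1))
    return requested, current


requested, current = parse_p4_sizes(LOG)

# The current pool is exactly 4 MiB; the failed request is about 3.1 MiB,
# so the pool was presumably already partly in use when the request came.
proposed = 4 * current  # ASSUMPTION: arbitrary factor, tune for your job

# Environment to hand to the launcher, e.g. subprocess.run(cmd, env=env).
env = dict(os.environ, P4_GLOBMEMSIZE=str(proposed))
print(requested, current, env["P4_GLOBMEMSIZE"])
```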

     iter   Etot(hartree)      deltaE(h)  residm     vres2    diffor    maxfor

getcut: wavevector=  0.0000  0.0000  0.0000  ngfft=  80  80 128
         ecut(hartree)=     50.000    => boxcut(ratio)=   2.05901

ewald : nr and ng are    3 and   25

ITER STEP NUMBER     1
vtorho : nnsclo_now=  2, note that nnsclo,dbl_nnsclo,istep=  0 0  1
-P-0000  leave_test : synchronization done...
vtorho: loop on k-points and spins done in parallel
p2_6059:  p4_error: interrupt SIGx: 13
p2_6059: (955.289062) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0-1 exited with status 1.



--

Toby D. Young - Adiunkt (Assistant Professor)
Department of Computational Science
Institute of Fundamental Technological Research
Polish Academy of Sciences
Room 206, Swietokrzyska 21
00-049 Warsaw, POLAND



