
forum - Re: [abinit-forum] job crashed in silicon cluster gw calculation

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: "Anglade Pierre-Matthieu" <anglade@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] job crashed in silicon cluster gw calculation
  • Date: Thu, 20 Jul 2006 10:10:47 +0200

Hi,
Following your hints, I've had a look at csigme and found a lot of
memory that was never deallocated. I'm currently testing the fix. However,
I'm still unable to reproduce your computation and the bug you
encountered. I hope you will be able to help me:
- the first run, with input epsilon1.in, runs perfectly
- the second run, with epsilon2.in, fails while reading the _KSS file
I get the following output:
"""
testing Kohn-Sham structure file: to_DS1_KSS
At line 188 of file ../../../src/01managempi/hdr_io.F90 (Unit 23 "to_DS1_KSS")
"""
I suppose I'm doing something wrong? (I copied the _KSS
file from the first run to to_DS1_KSS.)

PMA

On 7/19/06, Deyu Lu <dylu@ucdavis.edu> wrote:
Anglade and Fabien:
Setting the stack memory to unlimited does help with this issue. I
managed to finish one test with this setting. After some debugging,
I found that static variables in routine csigme.F90 consumed a lot of
memory in my case. With the default 10 MB of stack memory, the job always
crashed. Though I haven't figured out why my previous test failed even
with "ulimit -s unlimited", I believe these jobs can be finished on
machines with enough memory.
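
As a rough illustration of the stack-limit setting discussed above, the following Python sketch inspects the current stack limit and raises the soft limit to the hard limit, which is approximately what "ulimit -s unlimited" does when the hard limit is itself unlimited. The helper names describe_limit and raise_stack_limit are hypothetical, and the sketch assumes a Unix-like system where Python's standard resource module is available. Nothing here is specific to abinit.

```python
import resource


def describe_limit(value):
    """Return a human-readable form of a limit given in bytes."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // (1024 * 1024)} MiB"


def raise_stack_limit():
    """Try to raise the soft stack limit to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. If the system
    refuses the change, the limits are left as they were.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    try:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    except (ValueError, OSError):
        # Some platforms reject the request; keep the existing limits.
        pass
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("before: soft =", describe_limit(soft), "hard =", describe_limit(hard))
    soft, hard = raise_stack_limit()
    print("after:  soft =", describe_limit(soft), "hard =", describe_limit(hard))
```

Resource limits are inherited by child processes, so a job launched from such a script (for instance with subprocess.run) would start with the raised limit. If the hard limit is finite, only an administrator can lift it further.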

Best
Deyu

On Tue, 2006-07-18 at 07:23 +0200, Anglade Pierre-Matthieu wrote:
> On 7/18/06, Deyu Lu <dylu@ucdavis.edu> wrote:
> > Anglade:
> > I haven't done that yet. I plan to submit the job later to the
> > teragrid supercomputer center and see how it works.
> >
> > Thanks
> > Deyu
>
> If abinit fails, this is going to cost you some computing credit.
> Moreover, once your data file is computed, abinit fails pretty
> quickly on the second input, doesn't it? Maybe you don't need a
> supercomputer to get there? At the least, I would advise you to run
> a debugger on your case beforehand to find out what is going wrong.
>
> regards
>
> PMA




--
Pierre-Matthieu Anglade


