forum - Re: [abinit-forum] job crashed in silicon cluster gw calculation

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: Deyu Lu <dylu@ucdavis.edu>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] job crashed in silicon cluster gw calculation
  • Date: Mon, 17 Jul 2006 15:26:30 -0700

Anglade:
I haven't done that yet. I plan to submit the job later to the
teragrid supercomputer center and see how it works.

Thanks
Deyu

On Mon, 2006-07-17 at 21:33 +0200, Anglade Pierre-Matthieu wrote:
> Hi,
> I've tried to trigger this bug on my own version of abinit. Sadly it
> fails somewhere else beforehand. I'll try again later with a more
> stable version. In the meantime, would you mind trying another
> abinit build compiled with a different compiler? It may work better...
> PMA
>
> On 7/17/06, Deyu Lu <dylu@ucdavis.edu> wrote:
> > Dear Fabien:
> > Thank you for your help. After setting "ulimit -s unlimited", the
> > problem still persists. Abinit 5.1.2 quits while computing the vxc matrix. I
> > have listed some relevant info below.
> >
> > Deyu
> >
> > [root@zinfandel si5h12]# ulimit -a
> > core file size (blocks, -c) unlimited
> > data seg size (kbytes, -d) unlimited
> > file size (blocks, -f) unlimited
> > pending signals (-i) 1024
> > max locked memory (kbytes, -l) 32
> > max memory size (kbytes, -m) unlimited
> > open files (-n) 1024
> > pipe size (512 bytes, -p) 8
> > POSIX message queues (bytes, -q) 819200
> > stack size (kbytes, -s) unlimited
> > cpu time (seconds, -t) unlimited
> > max user processes (-u) unlimited
> > virtual memory (kbytes, -v) unlimited
> > file locks (-x) unlimited
> >
> > Error message:
> > [root@zinfandel si5h12]# abinis < si5h12x.files > epsilon2.log
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image            PC                Routine  Line     Source
> > libpthread.so.0  0000003A4EA0C430  Unknown  Unknown  Unknown
> > libpthread.so.0  0000003A4EA0C2CE  Unknown  Unknown  Unknown
> > libguide.so      0000002A9557CA1A  Unknown  Unknown  Unknown
> >
> >
> > The end of the log file:
> > cvxclda: calculating Vxc using ixc = 7
> > cvxclda: calling rhohxc to calculate Vxc[n_val] (excluding non-linear
> > core corrections)
> > cvxclda: rhohxc returned Exc[n_val] = -9.5730 [Ha]
> > and <Vxc[n_val]> = -0.0519 [Ha]
> >
> >
> > vxc(1:3,1:3)=
> > (-0.3894028,-9.3494627E-31) (0.0000000E+00,0.0000000E+00)
> >
> >
> > On Mon, 2006-07-17 at 09:37 +0200, Fabien Bruneval wrote:
> > > In Paris, we encountered the same kind of problems with the intel
> > > fortran compiler. It seems that ifort does nasty things with the stack
> > > memory.
> > >
> > > Can you try to set the stack size of your machine to maximum with the
> > > statement:
> > > ulimit -s unlimited
> > >
> > > and then run your calculations again in the same session.
> > >
> > > Tell me if it solves the problem.
> > >
> > >
> > > Fabien
> > >
> > >
> > >
> > > dylu@ucdavis.edu wrote:
> > > > Indeed, it is a memory issue. I ran a test job with smaller npwwfn,
> > > > nband, and npwsigx. Abinit 5.1.2 finished the job, while Abinit 4.6.5
> > > > quit when reading the KSS file, giving the error message "Segmentation
> > > > fault (core dumped)". I guess that the KSS file reader handles memory
> > > > differently in these two versions.
> > > >
> > > > Deyu
> > > >
> > > > # Si5H12 in a SC box
> > > > # convergence test against ecuteps
> > > >
> > > > ndtset 2
> > > > acell 25 25 25
> > > > #ecut 15.0
> > > > ecut 3.5
> > > > istwfk 1
> > > >
> > > > # DATASET 1: Calculation of the screening (epsilon^-1 matrix)
> > > > optdriver1 3
> > > > getkss 1
> > > > #npwwfn11 1503 #1503 1.579
> > > > #npwwfn21 2109 #2109 2.021
> > > > #npwwfn31 2969 #2969 2.527
> > > > #npwwfn41 3431 #3431 2.779
> > > > #npwwfn51 3911 #3911 3.032
> > > > npwwfn1 895
> > > > nband1 32
> > > > nsheps 25 #25 305 0.537
> > > > #nqptdm 1
> > > > #qptdm 0.000010 0.000020 0.000030
> > > >
> > > > #DATASET 2: Calculation of the Self-Energy matrix elements (GW corrections)
> > > >
> > > > optdriver2 4
> > > > getscr2 -1
> > > > npwwfn2 895
> > > > npwsigx2 895
> > > > nband2 32
> > > > nkptgw 1
> > > > kptgw 0.000 0.000 0.000
> > > > bdgw 14 18
> > > >
> > > > # GW calculation general parameters
> > > > ppmfrq 4.3 eV
> > > >
> > > > #Definition of the atom types
> > > > ntypat 2 # There are 2 types of atom
> > > > znucl 14 1 # The keyword "znucl" refers to the atomic number of the
> > > > # possible type(s) of atom. The pseudopotential(s)
> > > > # mentioned in the "files" file must correspond
> > > > # to the type(s) of atom. Here, the types are
> > > > # Silicon and Hydrogen.
> > > > natom 17 # There are 17 atoms
> > > > typat 5*1 12*2 # Silicon: 1-5; Hydrogen: 6-17
> > > >
> > > > #Definition of the k-point grid
> > > > nkpt 1 # Only one k point is needed for isolated system,
> > > > # taken by default to be 0.0 0.0 0.0
> > > > diemac 1.0 # Although this is not mandatory, it is worthwhile to
> > > > # precondition the SCF cycle. The model dielectric
> > > > diemix 0.5 # function used as the standard preconditioner
> > > > # is described in the "dielng" input variable section.
> > > > # Here, we follow the prescriptions for molecules
> > > > # in a big box
> > > >
> > > > #Definition of the atoms
> > > > xcart 0.0 0.0 0.0 # EXP Si-H bond length 2.797 Bohr (1.48 A)
> > > > 2.5168172178E+00 -2.5168172178E+00 -2.5168172178E+00
> > > > -2.5168172178E+00 2.5168172178E+00 -2.5168172178E+00
> > > > -2.5168172178E+00 -2.5168172178E+00 2.5168172178E+00
> > > > 2.5168172178E+00 2.5168172178E+00 2.5168172178E+00
> > > > 4.1600819556E+00 -4.1600819556E+00 -9.1988967829E-01
> > > > 4.1600819556E+00 -9.1988967829E-01 -4.1600819556E+00
> > > > 9.1988967829E-01 -4.1600819556E+00 -4.1600819556E+00
> > > > -9.1988967829E-01 4.1600819556E+00 -4.1600819556E+00
> > > > -4.1600819556E+00 9.1988967829E-01 -4.1600819556E+00
> > > > -4.1600819556E+00 4.1600819556E+00 -9.1988967829E-01
> > > > -4.1600819556E+00 -4.1600819556E+00 9.1988967829E-01
> > > > -4.1600819556E+00 -9.1988967829E-01 4.1600819556E+00
> > > > -9.1988967829E-01 -4.1600819556E+00 4.1600819556E+00
> > > > 9.1988967829E-01 4.1600819556E+00 4.1600819556E+00
> > > > 4.1600819556E+00 9.1988967829E-01 4.1600819556E+00
> > > > 4.1600819556E+00 4.1600819556E+00 9.1988967829E-01
> > > >
> > > > # Use only symmorphic operations
> > > > symmorphi 0
> > > >
> >
> >
>
>
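
Note on the stack-size suggestion in this thread: "ulimit -s unlimited" only affects the shell session it is typed in and the processes started from that session, which is why the job has to be rerun in the same session. The same check can be sketched in Python, assuming a Linux/Unix host and using only the standard `resource` module. The helper names (`describe`, `stack_limits`, `raise_stack_to_hard_limit`) are illustrative and are not part of ABINIT.

```python
import resource


def describe(limit):
    """Render an rlimit value (in bytes) roughly the way `ulimit -s` would."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} kbytes"


def stack_limits():
    """Return the (soft, hard) stack limits of the current process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_STACK)


def raise_stack_to_hard_limit():
    """Try to raise the soft stack limit to the hard limit.

    Child processes started afterwards inherit the new limit.
    If the system refuses the change, the limits are left as they were.
    """
    soft, hard = stack_limits()
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
        except (ValueError, OSError):
            pass  # keep the existing limits
    return stack_limits()


if __name__ == "__main__":
    soft, hard = raise_stack_to_hard_limit()
    print("stack size soft:", describe(soft), "hard:", describe(hard))
```

A job launched from this process afterwards (for example with `subprocess.run`) inherits the raised limit. As the thread shows, an unlimited stack did not cure this particular crash, so the check only rules out one possible cause.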



