
forum - Re: [abinit-forum] Crash in abinit 5.4.3 due to improper memory use

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: paul fons <paul-fons@aist.go.jp>
  • To: Anglade Pierre-Matthieu <anglade@gmail.com>
  • Cc: forum@abinit.org
  • Subject: Re: [abinit-forum] Crash in abinit 5.4.3 due to improper memory use
  • Date: Mon, 24 Sep 2007 13:00:08 +0900

The problem first came to my attention when I used an optimized (Intel compiler) build of abinit.  I then recompiled with the -check option turned on, with complete symbol information, and with the -traceback option (although it appears that when an error is flagged via the check option, a traceback isn't shown); this is what I posted.  In my limited experience, I have found the Intel compiler to give much better performance than gfortran or g95 (although I have used both before on a PowerPC when compiling abinit).  abinit and mpich2 were both compiled as 64-bit.  I did try to use a debugger, but it appears that the Intel 64-bit debugger has problems loading symbols (it goes into an apparently endless loop trying to load them).  As I want to deal with larger cells, I have been wanting to use a 64-bit version.  Things may get better next month with the "Leopard" OS from Apple, which supposedly will be extensively 64-bit compatible and will also be more careful about assigning code threads (read MPI processes in my case) back to the same CPU to avoid cache misses.  I am planning to purchase a little cluster (three four-core Mac Pros) shortly to try to speed things up and would very much like to get to the bottom of this problem.

I feared, though I have no proof, that this may be a 64-bit problem.  I was planning to compile a 32-bit version to see if these problems go away.  Now, with your suggestion, I will compile a 64-bit version with gfortran and see if this helps.  The funny thing is, I downloaded a precompiled version of abinit for Intel Linux and have been running that with the same input file with no problems so far.  Of course, I have the latest Intel compiler (10.0.17), and the precompiled 5.4.3 binary may have been built with an older version of the Intel compiler.  I realize I have more things to try, but I thought I would ask whether anyone has run into similar issues (with the Intel compiler) to avoid reinventing the wheel.

Thanks for your help

Paul


On Sep 21, 2007, at 9:26 PM, Anglade Pierre-Matthieu wrote:

Hi,

I remember this kind of error message with the Intel compiler.
Sometimes Abinit expects a certain size for a variable when it is
passed from one routine to another. But since the variable is
sometimes unused, it is not allocated, and you can get such an error
message.
Your own case is probably a little bit different. Maybe the
Intel compiler treats arrays larger than a certain size differently,
and PAWRAD has grown too much.

Can you reproduce this bug with other compilers? For instance g95?
With that one, if you compile with "-g -ftrace=full", you will get
a lot of information about the precise instruction that is badly
handled in the code.
Have you tried the debugging and runtime checking options of the
Intel compiler? They will likely also give you lots of useful
information about the location of the bug.

regards

PMA

On 9/21/07, Paul Fons <paul-fons@aist.go.jp> wrote:

Greetings,
  I was doing a convergence study for a structure at cutoff
energies of 20, 30, 40, and 50 Ha.  Strangely, all worked fine until I reached 50
Ha.  For ecut 50 Ha, abinit crashes without warning during the first SCF
loop. The version of abinit used is the current version, 5.4.3, compiled
with the latest Intel (10.0.17) Fortran compiler on Darwin.  All of the
automatic tests of the build work fine.  I have not seen strange crashes of
this type before and would like to track down the cause.  As the intel
compiler has a check option which generates runtime code that checks for
proper memory use, I compiled a "debug" version of abinit and ran the same
code again.  I would use the 64 bit debugger that comes with the compiler to
nail down a little bit more about location, but it has trouble finding some
symbols in dynamic libraries upon reading of the abinis binary that put it
into a seemingly endless error loop (that has nothing to do with the abinit
memory problem).

Any ideas as to what might be going wrong?

The input file as well as the full output (up to the crash) is included
below.  The code is compiled using 64 bits, so memory allocation failures
should not be a problem.

The last breath of the run is below.  The Intel compiler reports that PAWRAD
is being used without being allocated.




== DATASET  1 ==================================================================

 dtsetcopy : copying area  algalch    the actual size (           3 ) of the index (           1 )  differs from its standard size ( 0 )
 dtsetcopy : copying area  kberry     the actual size (          20 ) of the index (           2 )  differs from its standard size ( 1 )
 dtsetcopy : copying area  nband      the actual size (          10 ) of the index (           1 )  differs from its standard size ( 1 )
  dtsetcopy : allocated densty= T
 dtsetcopy : copying area  mixalch    the actual size (           3 ) of the index (           1 )  differs from its standard size ( 0 )
 dtsetcopy : copying area  mixalch    the actual size (           3 ) of the index (           2 )  differs from its standard size ( 0 )
 dtsetcopy : copying area  shiftk     the actual size (           8 ) of the index (           2 )  differs from its standard size ( 1 )

 getdim_nloc : deduce lmnmax  =  16, lnmax  =   4,
                      lmnmaxso=  16, lnmaxso=   4.
forrtl: severe (408): fort: (8): Attempt to fetch from allocatable variable PAWRAD when it is not allocated




The contents of the four status files are also included here:
 Status file, with repetition rate  49, status number    99

  Level abinit         : call driver
  Level driver         : call gstate
  Level gstate         : call scfcv
  Level scfcv          : call vtorho
  istep   =    1
  Level vtorho         : loop ikpt
  isppol  =    1
  ikpt    =    4

 Status file, with repetition rate  49, status number    99

  Level abinit         : call driver
  Level driver         : call gstate
  Level gstate         : call scfcv
  Level scfcv          : call vtorho
  istep   =    1
  Level vtorho         : call vtowfk
  isppol  =    1
  ikpt    =    6
  Level vtowfk         : deallocate

 Status file, with repetition rate  49, status number    99

  Level abinit         : call driver
  Level driver         : call gstate
  Level gstate         : call scfcv
  Level scfcv          : call vtorho
  istep   =    1
  Level vtorho         : call vtowfk
  isppol  =    1
  ikpt    =    9
  Level vtowfk         : call pw_orthon
  inonsc  =    2

 Status file, with repetition rate  49, status number    50

  Level abinit         : call driver
  Level driver         : call gstate
  Level gstate         : call scfcv
  Level scfcv          : call vtorho
  istep   =    1
  Level vtorho         : call vtowfk
  isppol  =    1
  ikpt    =   10
  Level vtowfk         : call normev
  inonsc  =    1
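
To compare how far each process had progressed when the run died, status dumps like the four above can be reduced to a call chain plus loop counters. The following is only a minimal Python sketch and not part of abinit: the function name parse_status is hypothetical, and it assumes nothing beyond the layout visible above ("Level <routine> : <action>" lines and "name = integer" lines). The sample text is the fourth status file quoted above.

```python
import re

# Assumed line shapes, taken only from the status files quoted above.
LEVEL_RE = re.compile(r"^\s*Level\s+(\w+)\s*:\s*(.+?)\s*$")
COUNTER_RE = re.compile(r"^\s*(\w+)\s*=\s*(-?\d+)\s*$")


def parse_status(text):
    """Return (levels, counters) for one status-file dump.

    levels   -- list of (routine, action) pairs, outermost routine first
    counters -- dict of loop counters such as istep, isppol, ikpt, inonsc
    """
    levels = []
    counters = {}
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            levels.append((match.group(1), match.group(2)))
            continue
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return levels, counters


sample = """\
 Status file, with repetition rate  49, status number    50

  Level abinit         : call driver
  Level driver         : call gstate
  Level gstate         : call scfcv
  Level scfcv          : call vtorho
  istep   =    1
  Level vtorho         : call vtowfk
  isppol  =    1
  ikpt    =   10
  Level vtowfk         : call normev
  inonsc  =    1
"""

levels, counters = parse_status(sample)
chain = " -> ".join(routine for routine, _ in levels)
print(chain, "|", levels[-1][1], "|", counters)
```

Running the same function over each status file gives one line per process. In this run all four were inside vtorho during istep = 1, at different k-points.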










Dr. Paul Fons

Nano-Optics Research Team

Team Leader

National Institute for Advanced Industrial Science & Technology

METI

Center for Applied Near-Field Optics Research (CANFOR)

AIST Central 4, Higashi 1-1-1

Tsukuba, Ibaraki JAPAN 305-8568




tel. +81-298-61-5636

fax. +81-298-61-2939






The following lines are in a Japanese font

〒305-8562 茨城県つくば市つくば中央東 1-1-1
産業技術総合研究所
近接場光応用工学研究センター
近接場光基礎研究チーム チーム長
ポール・フォンス








-- 
Pierre-Matthieu Anglade

Paul Fons
Team Leader
Nano-Optics Research Team

Center for Applied Near-Field Optics
National Institute of Advanced Industrial Technology
Tsukuba Central 4, Higashi 1-1-1
Tsukuba, Ibaraki Japan 305-8562

tel. +81-298-61-5635
fax. +81-298-61-2939

The following lines are in a Japanese font
〒305-8562 茨城県つくば市つくば中央東 1-1-1
産業技術総合研究所
近接場光応用工学研究センター
近接場光基礎研究チーム チーム長
ポール・フォンス





