
forum - Re: Re: Re: [abinit-forum] Parallel outkss?

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )




  • From: cdhogan@roma2.infn.it
  • To: forum@abinit.org
  • Subject: Re: Re: Re: [abinit-forum] Parallel outkss?
  • Date: Thu, 24 Feb 2005 13:02:38 +0100

Hi again,

This technical post concerns generating the KSS file in a parallel run. It
follows on from my previous post, so it will not be of much interest to most
people. I'm worried that Xavier reports no problems, so I hope I'm not just
doing something stupid! Perhaps it's related to the use of kssform=3?

Below are the detailed specs (I fear the wrath of the eagle-eyed Mikami)
of the tests I've run. Basically, I'm seeing an array-bounds violation in
outkss.f. I'd be very interested in seeing this matter resolved.

I'd really like to avoid going back to a 4.1.X version, since the output
is interfaced elsewhere and the header changes make that a real pain.

Any takers?

Conor


Systems:
---------------
IBM SP4 - AIX 5.2
with IBM(R) XL Fortran Enterprise Edition V9.1

GNU/Linux RedHat 9.0 2.4.28-SMP
Intel(R) Xeon(TM) Dual CPU 2.40GHz 4 GB RAM
ifc Version 7.1 (7.1.038)
also ifort Version 8.0 (8.0.039)
mpich-1.2.6 ch_shmem device

--------------------------
IBM SP4: 4.3.3
--------------------------
Serial
--------
Compilation: Fine, with and without opt
Run (SCF): Fine
Run (KSS): Fine

Parallel: [see makefile.macros.noopt-par]
--------
Compilation: unable to compile with O3 optimization.
Compiled ok with no optimization.
Run (SCF): Gave errors when reading input file (problem with rprim, etc)
Run (KSS): Tried using the DEN file from the serial 4.3.3 run; hangs after reading DEN.

--------------------------
IBM SP4: 4.4.3
--------------------------
Serial:
--------
Compilation: Fine, with and without opt
Run (SCF): Fine
Run (KSS): Fine

Parallel: [see makefile.macros.noopt-par]
---------
Compilation: Fine, with and without optimization (some minor problems in
Src_2nonlocal with -O3)
Run (SCF): Fine
Run (KSS, 1 cpu, -O0): Seems ok: eigenvector normalization fine
Run (KSS, >1 cpu, -O0): Eigenvalues ok, eigenvector normalization wrong
e.g.
Test on the normalization of the wavefunctions
min sum_G |a(n,k,G)| = 0.000000
max sum_G |a(n,k,G)| = INF
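
To make clear what a healthy result would look like, here is a minimal Python sketch of such a normalization test. It is not the ABINIT routine. It assumes the reported quantity is the squared norm sum_G |a(n,k,G)|^2, which should be 1 for every band (the label in the output omits the square, so that is an assumption), and the function names are hypothetical.

```python
import math

def normalization_range(coeffs):
    """coeffs[n] holds the complex plane-wave coefficients a(n,k,G)
    of band n at one k-point. Returns (min, max) over bands of
    sum_G |a(n,k,G)|^2."""
    norms = []
    for band in coeffs:
        # Plain multiplication, so an overflow yields inf instead of raising.
        norms.append(sum(a.real * a.real + a.imag * a.imag for a in band))
    return min(norms), max(norms)

def looks_normalized(coeffs, tol=1e-6):
    """True only if every band norm is finite and within tol of 1."""
    lo, hi = normalization_range(coeffs)
    return math.isfinite(hi) and abs(lo - 1.0) <= tol and abs(hi - 1.0) <= tol

# Two normalized bands, then a pair resembling the broken parallel output.
good = [[1 + 0j, 0j], [0.6 + 0j, 0.8j]]
bad = [[0j, 0j], [complex(float("inf"), 0.0), 0j]]
print(looks_normalized(good), looks_normalized(bad))
```

A minimum of 0 together with an infinite maximum, as in the output above, means at least one band came back empty and at least one holds non-finite or overflowing values.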

makefile.macros.sp4-noopt-par:
--------------------------
MACHINE=ibm
FC=mpxlf90_r
FFLAGS=-O0 -q64 -qnolm -qstrict
FFLAGS_LIBS=-O0 -q64 -qnolm -qstrict -qfixed
FLINK= -q64 -bloadmap:loadmap -bmaxdata:0x70000000
PP=/usr/lib/cpp
CPP_FLAGS=-P -q64
CC=mpcc
CFLAGS=-O -q64
AR_COMMAND = ar -X 32_64
LIBS= $(LAPACK) $(BLAS)

--------------------------
Linux: 4.4.3
Linux: 4.3.3
--------------------------
Serial: Fine.
Parallel: [see makefile.macros.linux-noopt-par and makefile.macros.linux-opt-par]
---------
Compilation: Fine, with and without optimization (some minor problems in
Src_2nonlocal with vectorization)
Run (SCF): Fine
Run (KSS, 1 cpu, -O0): Seems ok: eigenvector normalization fine
Run (KSS, >1 cpu, -O0): Eigenvalues ok, eigenvector normalization wrong
e.g.
Test on the normalization of the wavefunctions
min sum_G |a(n,k,G)| = 0.000000
max sum_G |a(n,k,G)| = +++++++++

Recompiled with:
--------------
-O0 -CB -CS -CU (in FFLAGS_PAR)
Array bounds exceeded In Procedure: outkss Line 1211
Traced the error to cg(1,X):

wfg(ig,ib)=cmplx(cg(1,ts(ig)+(ib-1)*npw_k*nspinor+k_index),&
& cg(2,ts(ig)+(ib-1)*npw_k*nspinor+k_index),&
& kind(0.0d0))

and there I stopped.
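
One way to narrow this down is to compare the range of column indices that the expression above can produce with the allocated second dimension of cg. The Python sketch below only illustrates that check and is not a diagnosis: cg_ncols is a hypothetical name for that dimension, npw_k = 100 is a made-up value, and the idea that a process might hold storage for just one k-point is an assumption used to build a failing case.

```python
def cg_column_range(ts, nband, npw_k, nspinor, k_index):
    """Lowest and highest 1-based column index reached by
    ts(ig) + (ib-1)*npw_k*nspinor + k_index for ib = 1..nband."""
    lowest = min(ts) + k_index
    highest = max(ts) + (nband - 1) * npw_k * nspinor + k_index
    return lowest, highest

def within_bounds(ts, nband, npw_k, nspinor, k_index, cg_ncols):
    """True if every column touched lies inside 1..cg_ncols."""
    lowest, highest = cg_column_range(ts, nband, npw_k, nspinor, k_index)
    return lowest >= 1 and highest <= cg_ncols

# Hypothetical numbers: 30 bands as in the input file, npw_k invented.
npw_k, nspinor, nband = 100, 1, 30
ts = list(range(1, npw_k + 1))        # identity mapping, for illustration
cg_ncols = npw_k * nspinor * nband    # assumed: room for one k-point only

first = within_bounds(ts, nband, npw_k, nspinor, 0, cg_ncols)
second = within_bounds(ts, nband, npw_k, nspinor, cg_ncols, cg_ncols)
print(first, second)
```

With these numbers the first k-point fits exactly, while an offset of one full k-point block runs past the end of the array. Printing the same three quantities (k_index, the highest index and the size of cg) on each process just before line 1211 would show whether something similar happens in the real run.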

makefile.macros.linux-opt-par:
--------------------------
MACHINE=P6
FC=/usr/local/mpich-1.2.6/bin/mpif90
FFLAGS=-FR -w90 -w95 -tpp7 -xW -O3 -pc64
FFLAGS_LIBS= -w90 -w95 -tpp7 -xW -O3 -pc64
FFLAGS_Src_2nonlocal = -FR -w90 -w95 -tpp7 -O3 -pc64
FLINK = -xW
CPP=/lib/cpp
CPP_FLAGS=-P -traditional -D__IFC
FFLAGS_PAR= $(FFLAGS) -I /usr/local/mpich-1.2.6/include
MPI_A=/usr/local/mpich-1.2.6/lib/libmpich.a
LIBS_PAR=$(LIBS) $(MPI_A)

makefile.macros.linux-noopt-par:
--------------------------
MACHINE=P6
FC=/usr/local/mpich-1.2.6/bin/mpif90
FFLAGS=-FR -w90 -w95 -O0
FFLAGS_LIBS= -w90 -w95 -O0
FLINK =
(rest as above)

Abinit input: (an Aluminium slab, reduced for testing)
--------------------------
irdden 1
occopt 4
chkprim 0
tsmear 0.01
optcell 0
ionmov 0
ntime 100
dilatmx 1.05
ecutsm 0.5
ixc 7
ntypat 1
znucl 13
natom 11
typat 11*1
xcart -1.3272427601E-10 3.0651281645E+00 -2.1670484574E+01
2.6544788564E+00 1.5325640821E+00 -1.7298078871E+01
-1.3272424851E-10 -2.2988514182E-10 -1.2983059095E+01
-1.3272427601E-10 3.0651281645E+00 -8.6440933841E+00
2.6544788564E+00 1.5325640821E+00 -4.3151771182E+00
-1.3272424851E-10 -2.2988514182E-10 0.0000000000E+00
-1.3272427601E-10 3.0651281645E+00 4.3151771182E+00
2.6544788564E+00 1.5325640821E+00 8.6440933841E+00
-1.3272424851E-10 -2.2988514182E-10 1.2983059095E+01
-1.3272427601E-10 3.0651281645E+00 1.7298078871E+01
2.6544788564E+00 1.5325640821E+00 2.1670484574E+01
acell 5.308957713 5.308957713 108.368645526892
rprim 1.0000000000E+00 0.0000000000E+00 0.0000000000E+00
-5.0000000000E-01 8.6602540378E-01 0.0000000000E+00
0.0000000000E+00 0.0000000000E+00 1.0000000000E+00
ecut 10.0
iprcell 45
diemix 0.75
diemac 4.0
prtwf 1
iscf -2
nband 30
getden 1
prtvol 3
tolwfr 1.0d-6
nstep 500
nbandkss 30
kssform 3
nshift 1
shiftk 0 0 0
kptopt 0
istwfk 6*1
nkpt 6
kpt 0.00000000E+00 0.00000000E+00 0.00000000E+00
2.50000000E-02 0.00000000E+00 0.00000000E+00
5.00000000E-02 0.00000000E+00 0.00000000E+00
7.50000000E-02 0.00000000E+00 0.00000000E+00
7.50000000E-02 0.00000000E+00 0.00000000E+00
7.50000000E-02 0.00000000E+00 0.00000000E+00
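
The k-point list above has six rows, matching nkpt 6, but the last three rows are identical. Whether that repetition is intended (the input was reduced for testing) is not stated. The Python sketch below is a small consistency check on such a list; the function name and tolerance are hypothetical, and it does not parse ABINIT input files.

```python
def check_kpoints(kpts, nkpt, tol=1e-8):
    """Return (count_ok, duplicates). count_ok says whether the number
    of rows equals nkpt; duplicates lists 1-based index pairs of
    k-points that coincide within tol."""
    count_ok = len(kpts) == nkpt
    duplicates = []
    for i in range(len(kpts)):
        for j in range(i + 1, len(kpts)):
            if all(abs(a - b) <= tol for a, b in zip(kpts[i], kpts[j])):
                duplicates.append((i + 1, j + 1))
    return count_ok, duplicates

# The six k-points from the input file above.
kpts = [
    (0.0, 0.0, 0.0),
    (0.025, 0.0, 0.0),
    (0.05, 0.0, 0.0),
    (0.075, 0.0, 0.0),
    (0.075, 0.0, 0.0),
    (0.075, 0.0, 0.0),
]
count_ok, duplicates = check_kpoints(kpts, nkpt=6)
print(count_ok, duplicates)
```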


