forum - Re: [abinit-forum] v.5.3.x is slower than v.5.2.x ?!

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: Masayoshi Mikami <mmikami@rc.m-kagaku.co.jp>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] v.5.3.x is slower than v.5.2.x ?!
  • Date: Thu, 26 Apr 2007 10:25:36 +0900

Dear all,

I have run a relatively large case (the model is confidential,
sorry!) to compare the speed of v.5.3.4 with that of v.5.2.x
(actually v.5.2.3), using the same input for both.
The relevant parameters are:
iscf=7/optcell=2/ionmov=3/ixc=11 (PBE);
two Broyden steps, with 12 SCF cycles
per Broyden step. I used a high ecut (50 Hartree)
for relatively hard Troullier-Martins pseudopotentials.
The jobs ran on a single Itanium2 node (abinis).
abinis estimates the required memory as 565 Mbytes.

The result of "diff" was like this:
-------
< .Version 5.2.3 of ABINIT
---
> .Version 5.3.4 of ABINIT
(snip)
< Total energy(eV)= -8.52960977840337E+03 ; Band energy (Ha)= -2.0397416022E+01
---
> Total energy(eV)= -8.52960977840336E+03 ; Band energy (Ha)= -2.0397416022E+01
(snip)
< - Total cpu time (s,m,h): 7433.6 123.89 2.065
< - Total wall clock time (s,m,h): 7433.6 123.89 2.065
---
> - Total cpu time (s,m,h): 73885.9 1231.43 20.524
> - Total wall clock time (s,m,h): 73885.9 1231.43 20.524
1168,1177c1168,1174
< - fourwf(pot) 3003.864 40.4 3004.089 40.4 22079
< - projbd 2030.388 27.3 2030.565 27.3 36206
< - nonlop(apply) 1041.758 14.0 1041.648 14.0 22079
< - fourwf(den) 281.831 3.8 281.837 3.8 3264
< - vtowfk(ssdiag) 204.163 2.7 204.165 2.7 -1
< - nonlop(forces) 153.787 2.1 153.775 2.1 3264
< - forces 67.505 0.9 67.506 0.9 24
< - fourdp 44.322 0.6 44.332 0.6 330
< - getghc-other 41.512 0.6 41.255 0.6 -1
< - 57 others 96.387 1.3 96.374 1.3
---
> - nonlop(apply) 51494.610 69.7 51494.717 69.7 22079
> - nonlop(forces) 10453.660 14.1 10453.713 14.1 3264
> - projbd 5534.380 7.5 5534.500 7.5 36206
> - fourwf(pot) 3041.498 4.1 3041.495 4.1 22079
> - nonlop(forstr) 1271.925 1.7 1271.934 1.7 272
> - vtowfk(ssdiag) 779.864 1.1 779.856 1.1 -1
> - 60 others 588.176 0.8 587.959 0.8
1179c1176
< - subtotal 6965.517 93.7 6965.546 93.7
---
> - subtotal 73164.113 99.0 73164.174 99.0

1184,1185c1181,1182
< .Delivered 0 WARNINGs and 1 COMMENTs to log file.
< +Overall time at end (sec) : cpu= 7433.6 wall= 7433.6
---
> .Delivered 1 WARNINGs and 1 COMMENTs to log file.
> +Overall time at end (sec) : cpu= 73885.9 wall= 73885.9
------

While fourwf(pot) runs at about the same speed (still slightly slower
with v.5.3.x), v.5.3.4 needed much more CPU time for
nonlop(apply), nonlop(forces), nonlop(forstr) and projbd... and others (?).
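
To put numbers on the comparison, here is a minimal Python sketch that computes the per-routine slowdown from the timing lines of the diff above. It assumes that the first numeric column is the CPU time in seconds, that "<" lines come from v.5.2.3 and ">" lines from v.5.3.4, and it only covers routines listed on both sides. The names DIFF, cpu_times and slowdowns are made up for this illustration and are not part of ABINIT.

```python
import re

# Timing lines copied from the diff above ("<" = v.5.2.3, ">" = v.5.3.4).
DIFF = """\
< - fourwf(pot) 3003.864 40.4 3004.089 40.4 22079
< - projbd 2030.388 27.3 2030.565 27.3 36206
< - nonlop(apply) 1041.758 14.0 1041.648 14.0 22079
< - vtowfk(ssdiag) 204.163 2.7 204.165 2.7 -1
< - nonlop(forces) 153.787 2.1 153.775 2.1 3264
---
> - nonlop(apply) 51494.610 69.7 51494.717 69.7 22079
> - nonlop(forces) 10453.660 14.1 10453.713 14.1 3264
> - projbd 5534.380 7.5 5534.500 7.5 36206
> - fourwf(pot) 3041.498 4.1 3041.495 4.1 22079
> - vtowfk(ssdiag) 779.864 1.1 779.856 1.1 -1
"""

# Side marker, routine name, then the first numeric column (assumed: CPU seconds).
LINE = re.compile(r"^([<>]) - (\S+)\s+([\d.]+)")


def cpu_times(diff_text):
    """Return two dicts {routine: cpu seconds} for the "<" and ">" sides."""
    old, new = {}, {}
    for line in diff_text.splitlines():
        m = LINE.match(line)
        if m:
            side, name, cpu = m.groups()
            (old if side == "<" else new)[name] = float(cpu)
    return old, new


def slowdowns(old, new):
    """Ratio new/old for every routine present on both sides."""
    return {name: new[name] / old[name] for name in old if name in new}


old, new = cpu_times(DIFF)
ratios = slowdowns(old, new)
# Print the routines from the largest to the smallest slowdown.
for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {old[name]:10.3f} s -> {new[name]:10.3f} s  (x{ratio:.2f})")
```

With these figures the two nonlop routines come out roughly 49 and 68 times slower, projbd and vtowfk(ssdiag) about 3 to 4 times slower, and fourwf(pot) essentially unchanged, while the total CPU time grows by a factor of about 10 (73885.9 s against 7433.6 s).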

I would appreciate your comments very much...

Regards,
Masayoshi

On 2007/04/17, at 11:00, Masayoshi Mikami wrote:

Dear all,

When I ran large jobs with v.5.3.4, I noticed that
they take much longer than before (e.g. with v.5.2.4).
So I checked with tests_cpu (on MacOSX/Tiger/gfortran,
keeping the same setup for both versions) and noticed a
difference in the timings (please take a look below my signature).
Does this happen only on my side?
(If so, sorry for this post...)

(BTW, the reference data seem to be unchanged ... ?)

Cheers,
Masayoshi

For example, diff_B6 in v.5.2.4 (the last part):
-------------------------------------------
116,123c116,123
< - fourwf(pot) 2.120 48 44.167 1769472 0.025
< - fourwf(den) 0.240 8 30.000 884736 0.034
< - fourdp 1.520 23 66.087 884736 0.075
< - nonlop(apply) 0.290 48 6.042 113238 0.053
< - nonlop(forces) 0.000 8 0.000 113238 0.000
< - nonlop(forstr) 0.000 4 0.000 113238 0.000
< - projbd 0.210 64 3.281 226476 0.014
< - xc:pot/=fourdp 1.060 3 353.333 884736 0.399
---
> - fourwf(pot) 7.262 48 151.286 1769472 0.085
> - fourwf(den) 0.684 8 85.449 884736 0.097
> - fourdp 4.881 23 212.211 884736 0.240
> - nonlop(apply) 0.570 48 11.881 113238 0.105
> - nonlop(forces) 0.105 8 13.184 113238 0.116
> - nonlop(stress) 0.104 4 25.879 113238 0.229
> - projbd 0.223 64 3.479 226476 0.015
> - xc:pot/=fourdp 2.299 3 766.276 884736 0.866
125c125
< +Overall time at end (sec) : cpu= 8.0 wall= 12.6
---
> +Overall time at end (sec) : cpu= 20.7 wall= 20.7
-------------------------------------------

On the other hand, diff_B6 in v.5.3.4
-------------------------------------------
116,123c116,123
< - fourwf(pot) 3.232 48 67.333 1769472 0.038
< - fourwf(den) 0.311 8 38.875 884736 0.044
< - fourdp 2.968 23 129.043 884736 0.146
< - nonlop(apply) 0.354 48 7.375 113238 0.065
< - nonlop(forces) 0.047 8 5.875 113238 0.052
< - nonlop(forstr) 0.038 4 9.500 113238 0.084
< - projbd 0.160 64 2.500 226476 0.011
< - xc:pot/=fourdp 2.364 3 788.000 884736 0.891
---
> - fourwf(pot) 7.262 48 151.286 1769472 0.085
> - fourwf(den) 0.684 8 85.449 884736 0.097
> - fourdp 4.881 23 212.211 884736 0.240
> - nonlop(apply) 0.570 48 11.881 113238 0.105
> - nonlop(forces) 0.105 8 13.184 113238 0.116
> - nonlop(stress) 0.104 4 25.879 113238 0.229
> - projbd 0.223 64 3.479 226476 0.015
> - xc:pot/=fourdp 2.299 3 766.276 884736 0.866
125c125
< +Overall time at end (sec) : cpu= 13.7 wall= 13.7
---
> +Overall time at end (sec) : cpu= 20.7 wall= 20.7
-------------------------------------------
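
The two diff_B6 listings can be compared directly with a short Python sketch. It assumes that the "<" lines are the timings measured on this machine (the ">" lines are identical in both listings, so they look like the shared reference data) and that the first numeric column is the CPU time in seconds. The names B6 and version_ratios are hypothetical and used only for this illustration.

```python
# (routine, CPU seconds with v.5.2.4, CPU seconds with v.5.3.4),
# copied from the "<" lines of the two diff_B6 listings above.
B6 = [
    ("fourwf(pot)", 2.120, 3.232),
    ("fourwf(den)", 0.240, 0.311),
    ("fourdp", 1.520, 2.968),
    ("nonlop(apply)", 0.290, 0.354),
    ("nonlop(forces)", 0.000, 0.047),
    ("nonlop(forstr)", 0.000, 0.038),
    ("projbd", 0.210, 0.160),
    ("xc:pot/=fourdp", 1.060, 2.364),
]


def version_ratios(rows):
    """Return {routine: t_new / t_old}; None when the old time is reported as 0.000."""
    out = {}
    for name, t_old, t_new in rows:
        out[name] = t_new / t_old if t_old > 0 else None
    return out


for name, ratio in version_ratios(B6).items():
    label = "n/a (old time reported as 0.000)" if ratio is None else f"x{ratio:.2f}"
    print(f"{name:16s} {label}")

# Overall CPU time reported at the end of each run (seconds).
print(f"overall          x{13.7 / 8.0:.2f}")
```

On these numbers fourdp and xc:pot/=fourdp take about twice as long with v.5.3.4, fourwf(pot) about 1.5 times as long, and projbd is slightly faster; the overall CPU time goes from 8.0 s to 13.7 s.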







