- From: Marc Torrent <marc.torrent@cea.fr>
- To: forum@abinit.org
- Subject: Re: [abinit-forum] v.5.3.x is slower than v.5.2.x ?!
- Date: Wed, 09 May 2007 14:41:25 +0200
Dear Masayoshi,
First of all, a remark concerning the memory need for useylm=1:
the additional amount of memory is justified by the fact that the non-local form factors (ffnl Abinit variable) are now discretized by (l,m,n) quantum numbers instead of (l,n).
You must also add the memory needed by the spherical harmonics.
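To give a feel for the size of this effect, here is a minimal counting sketch. It only counts index combinations: each (l,n) channel expands into 2l+1 values of m. The function name and the example projector counts are hypothetical, and the real layout of the ffnl array in Abinit is not reproduced here.

```python
def count_channels(nproj_per_l):
    """Count form-factor channels for both indexing schemes.

    nproj_per_l[l] is the (assumed) number of projectors n for angular
    momentum l. Returns (number of (l,n) pairs, number of (l,m,n) triples).
    """
    ln = sum(nproj_per_l)
    # Each l carries 2l+1 values of m.
    lmn = sum((2 * l + 1) * n for l, n in enumerate(nproj_per_l))
    return ln, lmn


# Hypothetical pseudopotential with one projector each for l = 0, 1, 2.
ln, lmn = count_channels([1, 1, 1])
print(ln, lmn)  # 3 9
```

The total job memory grows far less than this ratio suggests, since the form factors are only one of many arrays.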
Now, concerning your problem of cpu time:
I'm currently looking at the differences between 5.2.4 and 5.3.0, but that will need more time... it will be the subject of a later mail.
Between 5.2.3 and 5.2.4:
You found that the cpu time increases (1.618 h -> 3.489 h). This is surprising, because these two versions of Abinit do not differ for ground-state calculations...
Only a few comments, new input keywords and tests have been added.
Just verify that your input file does not contain the userid keyword (because Pierre-Matthieu changed some lines related to it in 5.2.4).
The "major evolution" between 5.2.3 and 5.2.4 is the build system!
Do you think it would be possible for you to compile Abinit 5.2.4 with the 5.2.3 build system? (Just try copying the /src directory from 5.2.4 into 5.2.3?)
Another question: are you sure that you ran Abinit in the same environment, in particular with the same computer load?
Can you reproduce the difference between 5.2.3 and 5.2.4 several times?
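One way to check reproducibility is to pull the wall clock line out of several output files and compare the spread. The sketch below assumes the output line format shown in this thread; the helper names are hypothetical and reading the files is left to the user.

```python
import re

# Matches the first number (seconds) on the "Total wall clock time" line.
WALL_RE = re.compile(r"Total wall clock time \(s,m,h\):\s*(-?[\d.]+)")


def wall_seconds(text):
    """Return the wall clock time in seconds found in the text, or None."""
    m = WALL_RE.search(text)
    return float(m.group(1)) if m else None


def relative_spread(times):
    """(max - min) / mean of repeated timings; 0.0 means identical runs."""
    mean = sum(times) / len(times)
    return (max(times) - min(times)) / mean


line = "- Total wall clock time (s,m,h): 5824.1 97.07 1.618"
print(wall_seconds(line))  # 5824.1
```

If the spread between repeated runs of one version is small compared with the gap between versions, machine load is unlikely to be the explanation.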
That's all for today,
sorry,
Marc
Masayoshi Mikami wrote:
Dear all,
I was away from the office, so my reply is later than expected...
I am going to report on the following two things:
- put useylm=1 in your input and see if Ylm version of nonlop has the same behaviour as Legendre polynomials one.
- try to follow the changes by testing v5.2.3, v5.2.4, v5.3.0, v5.3.2 and v5.3.4...
(1) with/without "useylm 1" for v.5.3.4
No effect... (it is even slower with useylm 1). diff between "without" and "with":
< Total energy(eV)= -8.52960977840336E+03 ; Band energy (Ha)= -2.0397416022E+01
---
> Total energy(eV)= -8.52960977840312E+03 ; Band energy (Ha)= -2.0397416022E+01
(snip)
< - Total cpu time (s,m,h): 73885.9 1231.43 20.524
< - Total wall clock time (s,m,h): 73885.9 1231.43 20.524
---
> - Total cpu time (s,m,h): -121662.4 -2027.71 -33.795
> - Total wall clock time (s,m,h): 93085.6 1551.43 25.857
(snip)
1168,1174c1170,1176
< - nonlop(apply) 51494.610 69.7 51494.717 69.7 22079
< - nonlop(forces) 10453.660 14.1 10453.713 14.1 3264
< - projbd 5534.380 7.5 5534.500 7.5 36206
< - fourwf(pot) 3041.498 4.1 3041.495 4.1 22079
< - nonlop(forstr) 1271.925 1.7 1271.934 1.7 272
< - vtowfk(ssdiag) 779.864 1.1 779.856 1.1 -1
< - 60 others 588.176 0.8 587.959 0.8
---
> - nonlop(forces) 11423.508 -9.4 11423.488 12.3 3264
> - projbd 5688.608 -4.7 5688.702 6.1 36206
> - fourwf(pot) 3073.283 -2.5 3073.392 3.3 22079
> - nonlop(forstr) 2151.584 -1.8 2151.586 2.3 272
> - vtowfk(ssdiag) 801.072 -0.7 801.067 0.9 -1
> - nonlop(apply) -146198.784 120.2 68549.316 73.6 22079
> - 60 others 649.352 -0.5 649.152 0.7
1176c1178
< - subtotal 73164.113 99.0 73164.174 99.0
---
> - subtotal -122411.377 100.6 92336.703 99.2
Please do not be alarmed by the negative CPU time.
(Maybe some variable exceeds the limit of its range?)
We should look at the wall time in this output.
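A quick arithmetic check on the numbers in this message supports the overflow guess: wherever the CPU time is negative, it differs from the wall time by the same amount. The sketch below only subtracts the posted values; nothing about the timer implementation in Abinit is assumed.

```python
# (cpu_seconds, wall_seconds) pairs copied from the diffs in this message.
pairs = {
    "5.3.4 useylm=1, total": (-121662.4, 93085.6),
    "5.3.4 useylm=1, nonlop(apply)": (-146198.784, 68549.316),
    "5.3.2, total": (-140661.8, 74086.2),
    "5.3.2, nonlop(apply)": (-163092.779, 51655.317),
}

offsets = {name: wall - cpu for name, (cpu, wall) in pairs.items()}
for name, off in offsets.items():
    print(f"{name}: wall - cpu = {off:.1f} s")

# For comparison only: 2**31 ticks at 10**4 ticks per second.
print(2**31 / 1e4)  # 214748.3648
```

Every offset comes out at about 214748 s, which is what one would expect if a fixed-range counter wrapped around once. The closeness to 2**31 / 10**4 may hint at a 32-bit counter, but that is speculation.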
"useylm 1" apparently needs more memory ...
< P This job should need less than 565.407 Mbytes of memory.
---
> P This job should need less than 633.511 Mbytes of memory.
(2) testing v5.2.3, v5.2.4, v5.3.0, v5.3.2 and v5.3.4...
I have noticed a big CPU time difference between v.5.2.3 and v.5.2.4 !
(please let me correct my former comment, pardon !)
The summary is like this:
v.5.2.3:- Total wall clock time (s,m,h): 5824.1 97.07 1.618
v.5.2.4:- Total wall clock time (s,m,h): 12559.0 209.32 3.489
v.5.3.0:- Total wall clock time (s,m,h): 13056.1 217.60 3.627
v.5.3.2:- Total wall clock time (s,m,h): 74086.2 1234.77 20.580
v.5.3.3:- Total wall clock time (s,m,h): 74144.5 1235.74 20.596
v.5.3.4:- Total wall clock time (s,m,h): 74040.8 1234.01 20.567
(NB: To test this, I recompiled all the binaries on Itanium2/Linux(2.4.24)
with ifort 8.1 (l_fc_pc_8.1.019). All the tests were run with "abinis".)
So the transition between v.5.2.3 and v.5.2.4, as well as
the other transition between v.5.3.0 and v.5.3.2, seems quite big !
Here is a more detailed memo:
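The size of each transition can be read off as a ratio of consecutive wall clock times. This is a small sketch that uses only the six totals listed in the summary.

```python
# Total wall clock time in seconds, copied from the summary.
wall = {
    "5.2.3": 5824.1,
    "5.2.4": 12559.0,
    "5.3.0": 13056.1,
    "5.3.2": 74086.2,
    "5.3.3": 74144.5,
    "5.3.4": 74040.8,
}

versions = list(wall)
# Slowdown factor between each pair of consecutive versions.
step = {
    (prev, cur): wall[cur] / wall[prev]
    for prev, cur in zip(versions, versions[1:])
}
for (prev, cur), factor in step.items():
    print(f"{prev} -> {cur}: x{factor:.2f}")
print(f"5.2.3 -> 5.3.4 overall: x{wall['5.3.4'] / wall['5.2.3']:.1f}")
```

Two steps stand out, roughly a factor of 2.2 at 5.2.3 -> 5.2.4 and roughly a factor of 5.7 at 5.3.0 -> 5.3.2. The other steps change the total by a few percent or less.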
< .Version 5.2.3 of ABINIT
---
> .Version 5.2.4 of ABINIT
(snip)
< Total energy(eV)= -8.52960977840337E+03 ; Band energy (Ha)= -2.0397416022E+01
---
> Total energy(eV)= -8.52960977840336E+03 ; Band energy (Ha)= -2.0397416022E+01
(snip)
1160,1161c1160,1161
< - Total cpu time (s,m,h): 5824.1 97.07 1.618
< - Total wall clock time (s,m,h): 5824.1 97.07 1.618
---
> - Total cpu time (s,m,h): 12559.0 209.32 3.489
> - Total wall clock time (s,m,h): 12559.0 209.32 3.489
1168,1177c1168,1175
< - fourwf(pot) 2606.561 44.8 2606.631 44.8 22079
< - projbd 1381.013 23.7 1381.014 23.7 36206
< - nonlop(apply) 799.226 13.7 799.310 13.7 22079
< - fourwf(den) 231.073 4.0 231.079 4.0 3264
< - vtowfk(ssdiag) 156.961 2.7 156.959 2.7 -1
< - nonlop(forces) 125.116 2.1 125.128 2.1 3264
< - forces 53.086 0.9 53.088 0.9 24
< - getghc-other 32.156 0.6 31.870 0.5 -1
< - fourdp 32.090 0.6 32.082 0.6 330
< - 57 others 80.349 1.4 80.341 1.4
---
> - projbd 5564.227 44.3 5564.303 44.3 36206
> - fourwf(pot) 3050.735 24.3 3050.726 24.3 22079
> - nonlop(apply) 2149.281 17.1 2149.324 17.1 22079
> - vtowfk(ssdiag) 336.827 2.7 336.827 2.7 -1
> - fourwf(den) 273.986 2.2 274.021 2.2 3264
> - nonlop(forces) 213.336 1.7 213.283 1.7 3264
> - getghc-other 65.512 0.5 65.414 0.5 -1
> - 59 others 191.804 1.5 191.829 1.5
1179c1177
< - subtotal 5497.630 94.4 5497.502 94.4
---
> - subtotal 11845.706 94.3 11845.727 94.3
1184,1185c1182,1183
< .Delivered 6 WARNINGs and 1 COMMENTs to log file.
< +Overall time at end (sec) : cpu= 5824.1 wall= 5824.1
---
> .Delivered 1 WARNINGs and 1 COMMENTs to log file.
> +Overall time at end (sec) : cpu= 12559.0 wall= 12559.0
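To see which routines account for the 5.2.3 -> 5.2.4 slowdown, the CPU times in the diff above can be compared routine by routine. The sketch below uses only routines that appear in both listings. The last column of each timing line is presumably the call count, and it is identical in both versions.

```python
# CPU seconds per routine, copied from the 5.2.3 / 5.2.4 diff above.
cpu_523 = {
    "fourwf(pot)": 2606.561,
    "projbd": 1381.013,
    "nonlop(apply)": 799.226,
    "fourwf(den)": 231.073,
    "vtowfk(ssdiag)": 156.961,
    "nonlop(forces)": 125.116,
}
cpu_524 = {
    "projbd": 5564.227,
    "fourwf(pot)": 3050.735,
    "nonlop(apply)": 2149.281,
    "vtowfk(ssdiag)": 336.827,
    "fourwf(den)": 273.986,
    "nonlop(forces)": 213.336,
}

ratio = {name: cpu_524[name] / cpu_523[name] for name in cpu_523}
for name, r in sorted(ratio.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} x{r:.2f}")
```

projbd slows down the most (about a factor of 4), followed by nonlop(apply), while fourwf(pot) changes by less than 20%. Since the call counts are unchanged, the cost per call went up rather than the amount of work.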
In passing, ...
< .Version 5.2.4 of ABINIT
---
> .Version 5.3.0 of ABINIT
1160,1161c1162,1163
< - Total cpu time (s,m,h): 12559.0 209.32 3.489
< - Total wall clock time (s,m,h): 12559.0 209.32 3.489
---
> - Total cpu time (s,m,h): 13056.1 217.60 3.627
> - Total wall clock time (s,m,h): 13056.1 217.60 3.627
1168,1175c1170,1178
< - projbd 5564.227 44.3 5564.303 44.3 36206
< - fourwf(pot) 3050.735 24.3 3050.726 24.3 22079
< - nonlop(apply) 2149.281 17.1 2149.324 17.1 22079
< - vtowfk(ssdiag) 336.827 2.7 336.827 2.7 -1
< - fourwf(den) 273.986 2.2 274.021 2.2 3264
< - nonlop(forces) 213.336 1.7 213.283 1.7 3264
< - getghc-other 65.512 0.5 65.414 0.5 -1
< - 59 others 191.804 1.5 191.829 1.5
---
> - projbd 5577.244 42.7 5577.338 42.7 36206
> - fourwf(pot) 3043.856 23.3 3043.865 23.3 22079
> - nonlop(apply) 2145.970 16.4 2146.085 16.4 22079
> - vtowfk(ssdiag) 778.078 6.0 778.083 6.0 -1
> - fourwf(den) 273.408 2.1 273.436 2.1 3264
> - nonlop(forces) 213.192 1.6 213.174 1.6 3264
> - forces 95.838 0.7 95.837 0.7 24
> - getghc-other 66.057 0.5 65.945 0.5 -1
> - 58 others 145.294 1.1 145.273 1.1
1177c1180
< - subtotal 11845.706 94.3 11845.727 94.3
---
> - subtotal 12338.938 94.5 12339.036 94.5
1182,1183c1185,1186
< .Delivered 1 WARNINGs and 1 COMMENTs to log file.
< +Overall time at end (sec) : cpu= 12559.0 wall= 12559.0
---
> .Delivered 2 WARNINGs and 1 COMMENTs to log file.
> +Overall time at end (sec) : cpu= 13056.1 wall= 13056.1
Then,
< .Version 5.3.0 of ABINIT
---
> .Version 5.3.2 of ABINIT
(snip)
1162,1163c1162,1163
< - Total cpu time (s,m,h): 13056.1 217.60 3.627
< - Total wall clock time (s,m,h): 13056.1 217.60 3.627
---
> - Total cpu time (s,m,h): -140661.8 -2344.36 -39.073
> - Total wall clock time (s,m,h): 74086.2 1234.77 20.580
1170,1178c1170,1176
< - projbd 5577.244 42.7 5577.338 42.7 36206
< - fourwf(pot) 3043.856 23.3 3043.865 23.3 22079
< - nonlop(apply) 2145.970 16.4 2146.085 16.4 22079
< - vtowfk(ssdiag) 778.078 6.0 778.083 6.0 -1
< - fourwf(den) 273.408 2.1 273.436 2.1 3264
< - nonlop(forces) 213.192 1.6 213.174 1.6 3264
< - forces 95.838 0.7 95.837 0.7 24
< - getghc-other 66.057 0.5 65.945 0.5 -1
< - 58 others 145.294 1.1 145.273 1.1
---
> - nonlop(forces) 10460.239 -7.4 10460.209 14.1 3264
> - projbd 5551.615 -3.9 5551.733 7.5 36206
> - fourwf(pot) 3043.078 -2.2 3043.082 4.1 22079
> - nonlop(forstr) 1271.740 -0.9 1271.742 1.7 272
> - vtowfk(ssdiag) 775.022 -0.6 775.023 1.0 -1
> - nonlop(apply) -163092.779 115.9 51655.317 69.7 22079
> - 60 others 587.954 -0.4 587.956 0.8
1180c1178
< - subtotal 12338.938 94.5 12339.036 94.5
---
> - subtotal -141403.131 100.5 73345.062 99.0
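Because the CPU column is unreliable for 5.3.2, the wall clock column is the one to compare. The sketch below checks how much of the extra wall time between 5.3.0 and 5.3.2 falls in the nonlop routines. One caveat: nonlop(forstr) is not listed separately for 5.3.0 (it is hidden in the "others" line), so it is counted as zero there, which slightly overestimates the nonlop increase.

```python
# Wall clock seconds copied from the 5.3.0 -> 5.3.2 diff above.
total = {"5.3.0": 13056.1, "5.3.2": 74086.2}
nonlop = {
    "5.3.0": {"apply": 2146.085, "forces": 213.174},
    "5.3.2": {"apply": 51655.317, "forces": 10460.209, "forstr": 1271.742},
}

nonlop_sum = {v: sum(parts.values()) for v, parts in nonlop.items()}
share = {v: nonlop_sum[v] / total[v] for v in total}
total_increase = total["5.3.2"] - total["5.3.0"]
nonlop_increase = nonlop_sum["5.3.2"] - nonlop_sum["5.3.0"]

for v in total:
    print(f"{v}: nonlop share of wall time = {share[v]:.1%}")
print(f"fraction of the increase in nonlop: {nonlop_increase / total_increase:.3f}")
```

The nonlop share rises from about 18% to about 86% of the wall time, and practically the whole increase sits in the nonlop routines. projbd and fourwf(pot) are essentially unchanged at this step.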
And,
< .Version 5.3.2 of ABINIT
---
> .Version 5.3.3 of ABINIT
(snip)
1162,1163c1162,1163
< - Total cpu time (s,m,h): -140661.8 -2344.36 -39.073
< - Total wall clock time (s,m,h): 74086.2 1234.77 20.580
---
> - Total cpu time (s,m,h): 74144.5 1235.74 20.596
> - Total wall clock time (s,m,h): 74144.5 1235.74 20.596
1170,1176c1170,1176
< - nonlop(forces) 10460.239 -7.4 10460.209 14.1 3264
< - projbd 5551.615 -3.9 5551.733 7.5 36206
< - fourwf(pot) 3043.078 -2.2 3043.082 4.1 22079
< - nonlop(forstr) 1271.740 -0.9 1271.742 1.7 272
< - vtowfk(ssdiag) 775.022 -0.6 775.023 1.0 -1
< - nonlop(apply) -163092.779 115.9 51655.317 69.7 22079
< - 60 others 587.954 -0.4 587.956 0.8
---
> - nonlop(apply) 51699.264 69.7 51699.243 69.7 22079
> - nonlop(forces) 10459.982 14.1 10459.994 14.1 3264
> - projbd 5587.185 7.5 5587.384 7.5 36206
> - fourwf(pot) 3044.780 4.1 3044.884 4.1 22079
> - nonlop(forstr) 1271.329 1.7 1271.329 1.7 272
> - vtowfk(ssdiag) 775.432 1.0 775.425 1.0 -1
> - 60 others 586.740 0.8 586.618 0.8
1178c1178
< - subtotal -141403.131 100.5 73345.062 99.0
---
> - subtotal 73424.712 99.0 73424.877 99.0
1183,1184c1183,1184
< .Delivered 19 WARNINGs and 1 COMMENTs to log file.
< +Overall time at end (sec) : cpu= -140661.8 wall= 74086.2
---
> .Delivered 4 WARNINGs and 1 COMMENTs to log file.
> +Overall time at end (sec) : cpu= 74144.5 wall= 74144.5
And, finally,
< .Version 5.3.3 of ABINIT
---
> .Version 5.3.4 of ABINIT
(snip)
1162,1163c1162,1163
< - Total cpu time (s,m,h): 74144.5 1235.74 20.596
< - Total wall clock time (s,m,h): 74144.5 1235.74 20.596
---
> - Total cpu time (s,m,h): 74040.8 1234.01 20.567
> - Total wall clock time (s,m,h): 74040.8 1234.01 20.567
1170,1176c1170,1176
< - nonlop(apply) 51699.264 69.7 51699.243 69.7 22079
< - nonlop(forces) 10459.982 14.1 10459.994 14.1 3264
< - projbd 5587.185 7.5 5587.384 7.5 36206
< - fourwf(pot) 3044.780 4.1 3044.884 4.1 22079
< - nonlop(forstr) 1271.329 1.7 1271.329 1.7 272
< - vtowfk(ssdiag) 775.432 1.0 775.425 1.0 -1
< - 60 others 586.740 0.8 586.618 0.8
---
> - nonlop(apply) 51572.901 69.7 51573.041 69.7 22079
> - nonlop(forces) 10465.045 14.1 10465.066 14.1 3264
> - projbd 5597.726 7.6 5597.835 7.6 36206
> - fourwf(pot) 3042.212 4.1 3042.068 4.1 22079
> - nonlop(forstr) 1272.081 1.7 1272.077 1.7 272
> - vtowfk(ssdiag) 777.669 1.1 777.668 1.1 -1
> - 60 others 590.731 0.8 590.541 0.8
1178c1178
< - subtotal 73424.712 99.0 73424.877 99.0
---
> - subtotal 73318.364 99.0 73318.296 99.0
1183,1184c1183,1184
< .Delivered 4 WARNINGs and 1 COMMENTs to log file.
< +Overall time at end (sec) : cpu= 74144.5 wall= 74144.5
---
> .Delivered 5 WARNINGs and 1 COMMENTs to log file.
> +Overall time at end (sec) : cpu= 74040.8 wall= 74040.8
I hope this benchmark gives some hints...
Best regards,
Masayoshi