forum - Re: (Solved) Re: [abinit-forum] v.5.3.x is slower than v.5.2.x ?!

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

  • From: "Anglade Pierre-Matthieu" <anglade@gmail.com>
  • To: forum@abinit.org
  • Subject: Re: (Solved) Re: [abinit-forum] v.5.3.x is slower than v.5.2.x ?!
  • Date: Tue, 15 May 2007 11:59:55 +0200

Dear Masayoshi,

Itanium 2 processors are EPIC (explicitly parallel instruction computing) processors. This means that the compiler is in charge of all optimisation at compile time. This is very different from the usual RISC and CISC processors, where this job is shared between the compiler and the processor. This is why you must be very careful about what your compiler does when working with Itanium processors, and also why this speedup is not really amazing.

Best regards,

PMA


On 5/14/07, Masayoshi Mikami <mmikami@rc.m-kagaku.co.jp> wrote:
Dear all,

Sorry for the series of posts...
And my apologies to those who took the time
to look into this topic. In particular, Marc, I will be pleased
to owe you a bottle of special Japanese "sake" ... (next time !)

The solution : FCFLAGS_OPT="-O3 -tpp2",
which makes v.5.3.2 and later run about 10 times faster than with "-O2" (default)
on Itanium2/ifort. (This is amazing !)

The previous results (the default configuration):
v.5.2.3:- Total wall clock time (s,m,h):   5824.1     97.07    1.618  (-O3 -tpp1)
v.5.2.4:- Total wall clock time (s,m,h):  12559.0    209.32    3.489  (-O2)
v.5.3.0:- Total wall clock time (s,m,h):  13056.1    217.60    3.627  (-O2)
v.5.3.2:- Total wall clock time (s,m,h):  74086.2   1234.77   20.580  (-O2)
v.5.3.3:- Total wall clock time (s,m,h):  74144.5   1235.74   20.596  (-O2)
v.5.3.4:- Total wall clock time (s,m,h):  74040.8   1234.01   20.567  (-O2)

Then, the same test with "-O3 -tpp2"
v.5.2.3:- Total wall clock time (s,m,h):  7652.5   127.54    2.126
v.5.2.4:- Total wall clock time (s,m,h):  7529.7   125.50   2.092
v.5.3.0:- Total wall clock time (s,m,h):  7634.1   127.23   2.121
v.5.3.2:- Total wall clock time (s,m,h):  7662.8   127.71   2.129
v.5.3.3:- Total wall clock time (s,m,h):  7687.2   128.12   2.135
v.5.3.4:- Total wall clock time (s,m,h):  7723.7   128.73   2.145

So, the gap between v.5.2.3 and v.5.2.4 (about a factor of 2),
and the other gap between v.5.3.0 and v.5.3.2 (about a factor of 6)
can be overcome just with "-O3 -tpp2".
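
For anyone who wants to check the ratios, here is a minimal Python sketch that recomputes them from the wall clock times (in seconds) listed in the two tables above. The dictionary and function names are only illustrative; they are not part of ABINIT or its build system.

```python
# Wall clock times in seconds, copied from the two tables in this message.
# Names below are hypothetical helpers for this illustration only.
default_build = {
    "5.2.3": 5824.1,   # default flags for this version: -O3 -tpp1
    "5.2.4": 12559.0,  # default flags from here on: -O2
    "5.3.0": 13056.1,
    "5.3.2": 74086.2,
    "5.3.3": 74144.5,
    "5.3.4": 74040.8,
}
o3_tpp2_build = {
    "5.2.3": 7652.5,
    "5.2.4": 7529.7,
    "5.3.0": 7634.1,
    "5.3.2": 7662.8,
    "5.3.3": 7687.2,
    "5.3.4": 7723.7,
}


def speedup(version):
    """Ratio of the default-build time to the "-O3 -tpp2" time."""
    return default_build[version] / o3_tpp2_build[version]


def slowdown_between(old, new):
    """Ratio of default-build times between two versions."""
    return default_build[new] / default_build[old]


if __name__ == "__main__":
    for version in default_build:
        print(f"v.{version}: {speedup(version):.2f}x faster with -O3 -tpp2")
    print(f"v.5.2.3 -> v.5.2.4: {slowdown_between('5.2.3', '5.2.4'):.2f}x")
    print(f"v.5.3.0 -> v.5.3.2: {slowdown_between('5.3.0', '5.3.2'):.2f}x")
```

A ratio below 1 means the default build was the faster one, which is the case for v.5.2.3, whose default was already "-O3 -tpp1".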

My preconception came from the experience with v.4.6.x,
where I used the attached makefile_macros with "-O2" ...
Oh, yikes !!  Pardon !
(Still, don't you think this optimization effect is awesome ?
  So, users of the binaries might want to compile the source tar.gz
  with this higher optimization level ...)

Thus, I would like to suggest "-O3 -tpp2" as the default in the configuration
for Itanium2/ifort... Yann, what do you think ?

BTW, speaking of configuration,
knowing about "FCFLAGS_FREEFORM"/"FCFLAGS_FIXEDFORM"
might help on some old computers with old compilers that do not
recognize free/fixed source format automatically.

Bien a vous,
Masayoshi



--
Pierre-Matthieu Anglade


