
forum - Re: [abinit-forum] efficiency of PAW (and spin) bands/FFT parallelization

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )




  • From: Zeila Zanolli <zeila.zanolli@uclouvain.be>
  • To: forum@abinit.org
  • Subject: Re: [abinit-forum] efficiency of PAW (and spin) bands/FFT parallelization
  • Date: Sat, 16 Jan 2010 20:12:41 +0100

Hi,

I'm attaching the timing analysis for the serial run, the parallel run with wfoptalg 4 and nloalg 4, and the parallel run with wfoptalg 14 and without nloalg 4.
The parallel jobs were run on 7 CPUs. I asked for only 2 Broyden steps in each calculation.

Serial:                                  Max Virtual Memory = 1.058 Gb, Time = 05:05:30
Parallel with wfoptalg 4 and nloalg 4:   Max Virtual Memory = 4.702 Gb, Time = 02:39:53
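
For reference, the speedup implied by these two timings can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the helper name hms_to_seconds is made up for illustration, and the memory ratio simply divides the two reported figures without assuming how the parallel value is aggregated over processes.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# Figures reported in this message.
serial_s = hms_to_seconds("05:05:30")
parallel_s = hms_to_seconds("02:39:53")
ncpu = 7

speedup = serial_s / parallel_s      # how many times faster the parallel run is
efficiency = speedup / ncpu          # fraction of the ideal 7x speedup
memory_ratio = 4.702 / 1.058         # ratio of the two reported memory figures

print(f"speedup      = {speedup:.2f}")
print(f"efficiency   = {efficiency:.1%}")
print(f"memory ratio = {memory_ratio:.2f}")
```

So on 7 CPUs the parallel run is a bit less than twice as fast as the serial one, which is far from the ideal factor of 7.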

The run with wfoptalg 14 and without nloalg 4 stopped after Broyden step 0 because the PAW spheres were overlapping; this happened because the SCF cycle at Broyden step 0 did not converge (after 100 iterations).

Thanks for all the suggestions.

All the best,
Zeila




Attachment: timing_parall_wfoptalg4
Description: Binary data


Attachment: timing_serial
Description: Binary data


On 14 Jan 2010, at 13:34, BOTTIN Francois wrote:

Hi Zeila,

Zeila Zanolli wrote:
the mpw values in the serial and parallel output files are different:
serial: mpw = 30332
7 CPUs: mpw =  4333
where mpw is the maximum number of plane waves.
Hence it seems that the parallelization is done over the plane waves (indeed, 30332/7 ≈ 4333).
However, the input file has npfft = 1, that is, no parallelization over the plane waves.
So what is the parallelization actually performed over?
- mpw is the maximum number of plane waves (PW) held on processor 0.
As you have seen, this number is not tied to npfft (=1) but to npfft*npband (=7).
Indeed, in ABINIT with paral_kgb=1 there are two distributions,
the "plane-wave" one and the "band/FFT" one, which are used in turn.
In the first one, the PW are distributed over both the band and FFT levels (npfft*npband).
In the second one, the bands are distributed over the band level (npband)
and their PW coefficients over the FFT level (npfft).
The code goes from the first distribution to the second by performing a communication
along the band level (see the article and/or the ABINIT workshop PDF for more details).
So don't panic, mpw is well defined.
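
The two distributions described above can be illustrated with a small Python sketch. It only computes average loads per process; the function names are invented for this example, the band count of 70 is a hypothetical value (nband is not given in this thread), and the way ABINIT actually assigns plane waves to processes is not an exact even split.

```python
def plane_wave_layout(npw: int, npband: int, npfft: int) -> float:
    """Average number of plane waves per process in the "plane-wave"
    distribution, where the PW are spread over npband*npfft processes."""
    return npw / (npband * npfft)


def band_fft_layout(nband: int, npw: int, npband: int, npfft: int):
    """Average load per process in the "band/FFT" distribution:
    (bands per process, PW coefficients per band per process)."""
    return nband / npband, npw / npfft


npw_serial = 30332        # mpw reported by the serial run
npband, npfft = 7, 1      # values discussed in this thread
nband_example = 70        # hypothetical band count, for illustration only

print(plane_wave_layout(npw_serial, npband, npfft))
print(band_fft_layout(nband_example, npw_serial, npband, npfft))
```

With npband=7 and npfft=1 the first call gives about 4333 plane waves per process, in line with the mpw reported by the 7-CPU run, even though no FFT parallelization is requested.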

- Concerning Emmanuel Arras's suggestion, you can also set npkpt=2
with paral_kgb=1 in order to parallelize over the spin components.

- As you already said, I haven't seen any significant difference between your input and output files.

- This point is unrelated to your problem, but with ABINIT 6.0 you can
use wfoptalg 14 and remove nloalg 4 from your input. This should improve performance.

Let me know whether you get good (or bad) results from your timing analysis.
Good luck with the rest of your work,
Francois

-- 
##############################################################
Francois Bottin                    tel: 01 69 26 41 73
CEA/DIF                            fax: 01 69 26 70 77
BP 12 Bruyeres-le-Chatel         email: Francois.Bottin@cea.fr
##############################################################


---------------------------------------------------------------------------------------------
Dr. Zeila Zanolli

Université Catholique de Louvain (UCL)
Unité Physico-Chimie et de Physique des Matériaux (PCPM) 
Place Croix du Sud, 1 (Boltzmann)
B-1348 Louvain-la-Neuve, Belgium
Phone: +32 (0)10 47 3501 
Mobile: +32 (0)487 556699
Fax: +32 (0)10 47 3452
---------------------------------------------------------------------------------------------






