forum - MPI communicators

forum@abinit.org

Subject: The ABINIT Users Mailing List ( CLOSED )

MPI communicators


  • From: a.cote@ucl.ac.uk
  • To: forum@abinit.org
  • Subject: MPI communicators
  • Date: Thu, 18 Oct 2007 16:03:01 +0200

Dear abinit users,

I wonder if anyone has a solution to this problem. I am trying to run
a phonon calculation on both a Cray and an IBM machine. In both cases,
it crashes, complaining about the number of MPI communicators.
On an SGI machine this problem was solved by setting MPI_GROUP_MAX and
MPI_COMM_MAX to large values, but it is not possible to do that
explicitly on the other machines.

In an earlier correspondence with Michel Cote I was told that the new
version contained a fix by X. Gonze that freed most of the unused
communicators, so I was hoping it would work... alas, it doesn't.
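
For illustration only, here is a minimal Python sketch of why freeing unused communicators matters. It is not ABINIT code and not the actual fix: `CommunicatorTable`, `run_steps`, and the limit of 16 are hypothetical names and values chosen for the example. It is a toy model of a library that can hold only a fixed number of communicators at once, where `create` stands in for MPI_Comm_create and `free` for MPI_Comm_free.

```python
class CommunicatorTable:
    """Toy stand-in for an MPI library's fixed-size communicator table."""

    def __init__(self, max_comms):
        self.max_comms = max_comms  # hypothetical limit, not a real MPI value
        self.in_use = set()
        self._next_id = 0

    def create(self):
        # Plays the role of MPI_Comm_create: fails once the table is full.
        if len(self.in_use) >= self.max_comms:
            raise RuntimeError("Too many communicators")
        handle = self._next_id
        self._next_id += 1
        self.in_use.add(handle)
        return handle

    def free(self, handle):
        # Plays the role of MPI_Comm_free: returns the slot to the table.
        self.in_use.remove(handle)


def run_steps(table, n_steps, free_after_use):
    """Create one communicator per step; optionally free it afterwards."""
    for _ in range(n_steps):
        comm = table.create()
        # ... work that would use the communicator goes here ...
        if free_after_use:
            table.free(comm)
    return len(table.in_use)


if __name__ == "__main__":
    # Freeing each communicator keeps the table empty, however many steps run.
    print(run_steps(CommunicatorTable(max_comms=16), 100, True))

    # Without freeing, create() fails as soon as the table is full.
    try:
        run_steps(CommunicatorTable(max_comms=16), 100, False)
    except RuntimeError as err:
        print("failed:", err)
```

In this model, the run that frees its communicators never holds more than one at a time, while the run that does not free them hits the limit regardless of how large the limit is, given enough steps. That is why raising the maximum on a machine that allows it only postpones the failure if some communicators are never released.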

Here is the error from the log. If anyone has any idea how to fix this,
I would really appreciate it. The error is the same for versions 5.3.5
and 5.4.4.

-P-0000 ================================================================================
-P-0000 == DATASET 2 ==================================================================
-P-0000
dtsetcopy : copying area algalch the actual size () of the index ()
differs from its standard size ()
dtsetcopy : copying area istwfk the actual size (À) of the index ()
differs from its standard size (`)
dtsetcopy : copying area kberry the actual size () of the index ()
differs from its standard size ()
) of the index () differs from its standard size () (
dtsetcopy : allocated densty= T
dtsetcopy : copying area kpt the actual size (À) of the index ()
differs from its standard size (`)
dtsetcopy : copying area kptns the actual size (À) of the index ()
differs from its standard size (`)
dtsetcopy : copying area mixalch the actual size () of the index ()
differs from its standard size ()
dtsetcopy : copying area mixalch the actual size () of the index ()
differs from its standard size ()
dtsetcopy : copying area occ_orig the actual size (') of the index ()
differs from its standard size (Q)
dtsetcopy : copying area shiftk the actual size ) of the index ()
differs from its standard size ()
dtsetcopy : copying area wtk the actual size (À) of the index ()
differs from its standard size (`)
mkfilename : getwfk/=0, take file _WFK from output of DATASET 1.

getdim_nloc : deduce lmnmax = 8, lnmax = 2,
lmnmaxso= 8, lnmaxso= 2.
distrb2: enter
mpi_enreg%parareel= 0
mpi_enreg%paralbd= 1
mpi_enreg%paral_compil_respfn= 0
distrb2: exit
aborting job:
Fatal error in MPI_Comm_create: Other MPI error, error stack:
MPI_Comm_create(217): MPI_Comm_create(MPI_COMM_WORLD, group=0xc8000066,
new_comm=0x33c5178) failed
MPI_Comm_create(120): Too many communicators

...repeats a few times, with different numbers for group and new_comm, and
then:

[NID 11612]Apid 20033: initiated application termination
Application 20033 exit codes: 13
Application 20033 resources: utime 0, stime 0

I even tried running the first calculation on its own, outside the
dataset, and then reading from the WFK file, but it crashes again.

Thank you very much,
Alex


