forum@abinit.org
Subject: The ABINIT Users Mailing List ( CLOSED )
- From: "Ludwig, Christian" <ludwigc@uni-mainz.de>
- To: "forum@abinit.org" <forum@abinit.org>
- Subject: [abinit-forum] Response Function calculation Help
- Date: Mon, 7 Dec 2009 10:37:24 +0100
Hello,
I am running ABINIT in parallel on a cluster of x86 nodes. The nodes have
4 or 8 processors each, and I use one or more nodes per job. Sometimes
everything works fine, but sometimes the run fails as follows.
After a few Broyden steps I get the message
"At Broyd/MD step 3, gradients are converged :
max grad (force/stress) = 9.3796E-04 < tolmxf= 1.0000E-03 ha/bohr (free atoms)"
This is the last thing written to abinit.out. In the log file, this message
is followed by:
From Broyden minimization, approx. inverse Hessian(ndim,ndim):
1
2.61406366996692E-03 1.57678343240617E-03 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
...
...
...
13
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
2.61406366996692E-03 1.57678343240617E-03 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
0.00000000000000E+00 0.00000000000000E+00 0.00000000000000E+00
forrtl: Resource temporarily unavailable
forrtl: severe (38): error during write, unit 6, file stdout
Image PC Routine Line Source
abinip 00000000014249FA Unknown Unknown Unknown
abinip 0000000001423BFA Unknown Unknown Unknown
abinip 00000000013CC4AA Unknown Unknown Unknown
abinip 000000000138B932 Unknown Unknown Unknown
abinip 000000000138AF4E Unknown Unknown Unknown
abinip 00000000013C3F3E Unknown Unknown Unknown
abinip 00000000013C1C7A Unknown Unknown Unknown
abinip 00000000012825C3 Unknown Unknown Unknown
abinip 0000000001281F33 Unknown Unknown Unknown
abinip 000000000051E24D Unknown Unknown Unknown
abinip 000000000041FD4C Unknown Unknown Unknown
abinip 000000000040E2F5 Unknown Unknown Unknown
abinip 00000000004053F0 Unknown Unknown Unknown
abinip 000000000040030E Unknown Unknown Unknown
abinip 000000000142D65E Unknown Unknown Unknown
abinip 0000000000400229 Unknown Unknown Unknown
p4_11598: p4_error: net_recv read: probable EOF on socket: 1
p1_4097: p4_error: net_recv read: probable EOF on socket: 1
bm_list_11557: (125868.613281) wakeup_slave: unable to interrupt slave 0 pid 11534
It looks like the connection between the master and a node is lost. But why
does this happen regularly, and _always_ after convergence is reached and the
inverse Hessian is written to the log file?
Since the calculations have converged, I want to use the data. Is there a way
to extract the energy and str_relax.out from an incomplete file?
Thanks for your help.
Cheers,
Christian
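
As a first step toward reusing data from a truncated run, one could at least confirm programmatically that the output reached convergence before it was cut off. The following is a minimal sketch, not an ABINIT tool: the regular expression is built only from the convergence message quoted in the post above, and the function name `find_convergence` is hypothetical. It does not recover the energy, because the format of the energy line is not shown in the post.

```python
import re

# Pattern derived solely from the message quoted in the post; other ABINIT
# versions may word or lay out this message differently (assumption).
CONVERGED = re.compile(
    r"At Broyd/MD step\s+(\d+),\s*gradients are converged\s*:\s*"
    r"max grad \(force/stress\)\s*=\s*([0-9.]+E[+-]\d+)"
    r"\s*<\s*tolmxf=\s*([0-9.]+E[+-]\d+)"
)


def find_convergence(text):
    """Return (step, max_grad, tolmxf) for the last convergence message, or None."""
    matches = CONVERGED.findall(text)
    if not matches:
        return None
    step, max_grad, tolmxf = matches[-1]
    return int(step), float(max_grad), float(tolmxf)


# Sample text copied from the message quoted in the post.
sample = (
    "At Broyd/MD step 3, gradients are converged :\n"
    "max grad (force/stress) = 9.3796E-04 < tolmxf= 1.0000E-03 ha/bohr (free\n"
    "atoms)\n"
)

result = find_convergence(sample)
print(result)
```

To apply it to a real run, read the incomplete file into a string first, for example `find_convergence(open("abinit.out").read())`. A `None` result means the message was not found in that file.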