ROMS Revision 564 Parallel I/O problem

General scientific issues regarding ROMS

Moderators: arango, robertson

lmkli
Posts: 24
Joined: Wed Aug 02, 2006 1:21 pm
Location: TAMU

ROMS Revision 564 Parallel I/O problem

#1 Unread post by lmkli »

Hello,

I compiled ROMS revision 564 with PGI Fortran and netcdf4.4.1, and tried to activate the CPP definitions HDF5 and PARALLEL_IO.
The compilation is fine, but when I ran the executable on 6 CPUs, I got this error message:

ERROR: Abnormal termination: NetCDF OUTPUT.
REASON: Parallel operation on file opened for non-parallel access

I tried both OpenMPI and MVAPICH2 and got the same result. The case I tested is UPWELLING.
Any hints?

Thanks.

The output details are listed below.


Model Input Parameters: ROMS/TOMS version 3.5
Friday - July 1, 2011 - 11:56:13 AM
-----------------------------------------------------------------------------

Wind-Driven Upwelling/Downwelling over a Periodic Channel Node # 1 (pid= 30525) is active.
Node # 4 (pid= 30528) is active. Node # 3 (pid= 30527) is active.


Operating system : Linux
CPU/hardware : x86_64
Compiler system : pgi
Compiler command : /public/soft/mvapich215-pgi102/bin/mpif90
Compiler flags : -O3 -Mfree

Input Script : ./ocean_upwelling.in

SVN Root URL : https://www.myroms.org/svn/src/trunk
SVN Revision :

Local Root : /public/home/lmkli/romspac/roms201106
Header Dir : /public/home/lmkli/romspac/roms201106/ROMS/Include
Header file : upwelling.h
Analytical Dir: /public/home/lmkli/romspac/roms201106/ROMS/Functionals

Resolution, Grid 01: 0401x0800x016, Parallel Nodes: 6, Tiling: 002x003


Physical Parameters, Grid: 01
=============================

1440 ntimes Number of timesteps for 3-D equations.
300.000 dt Timestep size (s) for 3-D equations.
30 ndtfast Number of timesteps for 2-D equations between
each 3D timestep.
1 ERstr Starting ensemble/perturbation run number.
1 ERend Ending ensemb Node # 2 (pid= 30526) is active. Node # 5 (pid= 30529) is active.


le/perturbation run number.
0 nrrec Number of restart records to read from disk.
T LcycleRST Switch to recycle time-records in restart file.
288 nRST Number of timesteps between the writing of data
into restart fields.
1 ninfo Number of timesteps between print of information
to standard output.
T ldefout Switch to create a new output NetCDF file(s).
72 nHIS Number of timesteps between the writing fields
into history file.
1 ntsAVG Starting timestep for the accumulation of output
time-averaged data.
72 nAVG Number of timesteps between the writing of
time-averaged data into averages file.
1 ntsDIA Starting timestep for the accumulation of output
time-averaged diagnostics data.
72 nDIA Number of timesteps between the writing of
time-averaged data into diagnostics file.
0.0000E+00 nl_tnu2(01) NLM Horizontal, harmonic mixing coefficient
(m2/s) for tracer 01: temp
0.0000E+00 nl_tnu2(02) NLM Horizontal, harmonic mixing coefficient
(m2/s) for tracer 02: salt
5.0000E+00 nl_visc2 NLM Horizontal, harmonic mixing coefficient
(m2/s) for momentum.
1.0000E-06 Akt_bak(01) Background vertical mixing coefficient (m2/s)
for tracer 01: temp
1.0000E-06 Akt_bak(02) Background vertical mixing coefficient (m2/s)
for tracer 02: salt
1.0000E-05 Akv_bak Background vertical mixing coefficient (m2/s)
for momentum.
3.0000E-04 rdrg Linear bottom drag coefficient (m/s).
3.0000E-03 rdrg2 Quadratic bottom drag coefficient.
2.0000E-02 Zob Bottom roughness (m).
2 Vtransform S-coordinate transformation equation.
4 Vstretching S-coordinate stretching function.
3.0000E+00 theta_s S-coordinate surface control parameter.
0.0000E+00 theta_b S-coordinate bottom control parameter.
25.000 Tcline S-coordinate surface/bottom layer width (m) used
in vertical coordinate stretching.
1025.000 rho0 Mean density (kg/m3) for Boussinesq approximation.
0.000 dstart Time-stamp assigned to model initialization (days).
0.00 time_ref Reference time for units attribute (yyyymmdd.dd)
0.0000E+00 Tnudg(01) Nudging/relaxation time scale (days)
for tracer 01: temp
0.0000E+00 Tnudg(02) Nudging/relaxation time scale (days)
for tracer 02: salt
0.0000E+00 Znudg Nudging/relaxation time scale (days)
for free-surface.
0.0000E+00 M2nudg Nudging/relaxation time scale (days)
for 2D momentum.
0.0000E+00 M3nudg Nudging/relaxation time scale (days)
for 3D momentum.
0.0000E+00 obcfac Factor between passive and active
open boundary conditions.
14.000 T0 Background potential temperature (C) constant.
35.000 S0 Background salinity (PSU) constant.
1027.000 R0 Background density (kg/m3) used in linear Equation
of State.
1.7000E-04 Tcoef Thermal expansion coefficient (1/Celsius).
0.0000E+00 Scoef Saline contraction coefficient (1/PSU).
1.000 gamma2 Slipperiness variable: free-slip (1.0) or
no-slip (-1.0).
T Hout(idFsur) Write out free-surface.
T Hout(idUbar) Write out 2D U-momentum component.
T Hout(idVbar) Write out 2D V-momentum component.
T Hout(idUvel) Write out 3D U-momentum component.
T Hout(idVvel) Write out 3D V-momentum component.
T Hout(idWvel) Write out W-momentum component.
T Hout(idOvel) Write out omega vertical velocity.
T Hout(idTvar) Write out tracer 01: temp
T Hout(idTvar) Write out tracer 02: salt

T Aout(idFsur) Write out averaged free-surface.
T Aout(idUbar) Write out averaged 2D U-momentum component.
T Aout(idVbar) Write out averaged 2D V-momentum component.
T Aout(idUvel) Write out averaged 3D U-momentum component.
T Aout(idVvel) Write out averaged 3D V-momentum component.
T Aout(idWvel) Write out averaged W-momentum component.
T Aout(idOvel) Write out averaged omega vertical velocity.
T Aout(idTvar) Write out averaged tracer 01: temp
T Aout(idTvar) Write out averaged tracer 02: salt

T Dout(M2rate) Write out 2D momentun acceleration.
T Dout(M2pgrd) Write out 2D momentum pressure gradient.
T Dout(M2fcor) Write out 2D momentum Coriolis force.
T Dout(M2hadv) Write out 2D momentum horizontal advection.
T Dout(M2xadv) Write out 2D momentum horizontal X-advection.
T Dout(M2yadv) Write out 2D momentum horizontal Y-advection.
T Dout(M2hvis) Write out 2D momentum horizontal viscosity.
T Dout(M2xvis) Write out 2D momentum horizontal X-viscosity.
T Dout(M2yvis) Write out 2D momentum horizontal Y-viscosity.
T Dout(M2sstr) Write out 2D momentum surface stress.
T Dout(M2bstr) Write out 2D momentum bottom stress.

T Dout(M3rate) Write out 3D momentun acceleration.
T Dout(M3pgrd) Write out 3D momentum pressure gradient.
T Dout(M3fcor) Write out 3D momentum Coriolis force.
T Dout(M3hadv) Write out 3D momentum horizontal advection.
T Dout(M3xadv) Write out 3D momentum horizontal X-advection.
T Dout(M3yadv) Write out 3D momentum horizontal Y-advection.
T Dout(M3vadv) Write out 3D momentum vertical advection.
T Dout(M3hvis) Write out 3D momentum horizontal viscosity.
T Dout(M3xvis) Write out 3D momentum horizontal X-viscosity.
T Dout(M3yvis) Write out 3D momentum horizontal Y-viscosity.
T Dout(M3vvis) Write out 3D momentum vertical viscosity.

T Dout(iTrate) Write out rate of change of tracer 01: temp
T Dout(iTrate) Write out rate of change of tracer 02: salt
T Dout(iThadv) Write out horizontal advection, tracer 01: temp
T Dout(iThadv) Write out horizontal advection, tracer 02: salt
T Dout(iTxadv) Write out horizontal X-advection, tracer 01: temp
T Dout(iTxadv) Write out horizontal X-advection, tracer 02: salt
T Dout(iTyadv) Write out horizontal Y-advection, tracer 01: temp
T Dout(iTyadv) Write out horizontal Y-advection, tracer 02: salt
T Dout(iTvadv) Write out vertical advection, tracer 01: temp
T Dout(iTvadv) Write out vertical advection, tracer 02: salt
T Dout(iThdif) Write out horizontal diffusion, tracer 01: temp
T Dout(iThdif) Write out horizontal diffusion, tracer 02: salt
T Dout(iTxdif) Write out horizontal X-diffusion, tracer 01: temp
T Dout(iTxdif) Write out horizontal X-diffusion, tracer 02: salt
T Dout(iTydif) Write out horizontal Y-diffusion , tracer 01: temp
T Dout(iTydif) Write out horizontal Y-diffusion , tracer 02: salt
T Dout(iTvdif) Write out vertical diffusion, tracer 01: temp
T Dout(iTvdif) Write out vertical diffusion, tracer 02: salt

Output/Input Files:

Output Restart File: ocean_rst.nc
Output History File: ocean_his.nc
Output Averages File: ocean_avg.nc
Output Diagnostics File: ocean_dia.nc

Tile partition information for Grid 01: 0401x0800x0016 tiling: 002x003

tile Istr Iend Jstr Jend Npts

0 1 201 1 267 858672
1 202 401 1 267 854400
2 1 201 268 534 858672
3 202 401 268 534 854400
4 1 201 535 800 855456
5 202 401 535 800 851200

Tile minimum and maximum fractional grid coordinates:
(interior points only)

tile Xmin Xmax Ymin Ymax grid

0 0.50 201.50 0.50 267.50 RHO-points
1 201.50 401.50 0.50 267.50 RHO-points
2 0.50 201.50 267.50 534.50 RHO-points
3 201.50 401.50 267.50 534.50 RHO-points
4 0.50 201.50 534.50 800.50 RHO-points
5 201.50 401.50 534.50 800.50 RHO-points

0 1.00 201.50 0.50 267.50 U-points
1 201.50 401.00 0.50 267.50 U-points
2 1.00 201.50 267.50 534.50 U-points
3 201.50 401.00 267.50 534.50 U-points
4 1.00 201.50 534.50 800.50 U-points
5 201.50 401.00 534.50 800.50 U-points

0 0.50 201.50 1.00 267.50 V-points
1 201.50 401.50 1.00 267.50 V-points
2 0.50 201.50 267.50 534.50 V-points
3 201.50 401.50 267.50 534.50 V-points
4 0.50 201.50 534.50 800.00 V-points
5 201.50 401.50 534.50 800.00 V-points

Maximum halo size in XI and ETA directions:

HaloSizeI(1) = 642
HaloSizeJ(1) = 837
TileSide(1) = 273
TileSize(1) = 56784


Activated C-preprocessing Options:

UPWELLING Wind-Driven Upwelling/Downwelling over a Periodic Channel
ANA_BSFLUX Analytical kinematic bottom salinity flux.
ANA_BTFLUX Analytical kinematic bottom temperature flux.
ANA_GRID Analytical grid set-up.
ANA_INITIAL Analytical initial conditions.
ANA_SMFLUX Analytical kinematic surface momentum flux.
ANA_SSFLUX Analytical kinematic surface salinity flux.
ANA_STFLUX Analytical kinematic surface temperature flux.
ANA_VMIX Analytical vertical mixing coefficients.
ASSUMED_SHAPE Using assumed-shape arrays.
AVERAGES Writing out time-averaged nonlinear model fields.
DIAGNOSTICS_TS Computing and writing tracer diagnostic terms.
DIAGNOSTICS_UV Computing and writing momentum diagnostic terms.
DJ_GRADPS Parabolic Splines density Jacobian (Shchepetkin, 2002).
DOUBLE_PRECISION Double precision arithmetic.
EW_PERIODIC East-West periodic boundaries.
HDF5 Creating NetCDF-4/HDF5 format files.
MIX_S_TS Mixing of tracers along constant S-surfaces.
MIX_S_UV Mixing of momentum along constant S-surfaces.
MPI MPI distributed-memory configuration.
NONLINEAR Nonlinear Model.
!NONLIN_EOS Linear Equation of State for seawater.
PARALLEL_IO Parallel I/O processing.
POWER_LAW Power-law shape time-averaging barotropic filter.
PROFILE Time profiling activated .
!RST_SINGLE Double precision fields in restart NetCDF file.
SALINITY Using salinity.
SOLVE3D Solving 3D Primitive Equations.
SPLINES Conservative parabolic spline reconstruction.
TS_U3HADVECTION Third-order upstream horizontal advection of tracers.
TS_C4VADVECTION Fourth-order centered vertical advection of tracers.
TS_DIF2 Harmonic mixing of tracers.
UV_ADV Advection of momentum.
UV_COR Coriolis term.
UV_U3HADVECTION Third-order upstream horizontal advection of 3D momentum.
UV_C4VADVECTION Fourth-order centered vertical advection of momentum.
UV_LDRAG Linear bottom stress.
UV_VIS2 Harmonic mixing of momentum.
VAR_RHO_2D Variable density barotropic mode.

Process Information:

Node # 0 (pid= 30530) is active.

INITIAL: Configuring and initializing forward nonlinear model ...


Vertical S-coordinate System:

level S-coord Cs-curve Z at hmin at hc half way at hmax

16 0.0000000 0.0000000 0.000 0.000 0.000 0.000
15 -0.0625000 -0.0019442 -0.809 -0.806 -1.348 -1.589
14 -0.1250000 -0.0078455 -1.668 -1.661 -2.966 -3.687
13 -0.1875000 -0.0179119 -2.580 -2.568 -4.867 -6.321
12 -0.2500000 -0.0324983 -3.549 -3.531 -7.077 -9.535
11 -0.3125000 -0.0521190 -4.581 -4.558 -9.630 -13.397
10 -0.3750000 -0.0774659 -5.686 -5.656 -12.573 -17.996
9 -0.4375000 -0.1094327 -6.875 -6.837 -15.967 -23.445
8 -0.5000000 -0.1491465 -8.162 -8.114 -19.889 -29.890
7 -0.5625000 -0.1980075 -9.564 -9.506 -24.435 -37.512
6 -0.6250000 -0.2577387 -11.104 -11.034 -29.721 -46.531
5 -0.6875000 -0.3304460 -12.808 -12.724 -35.892 -57.218
4 -0.7500000 -0.4186931 -14.709 -14.609 -43.121 -69.903
3 -0.8125000 -0.5255915 -16.846 -16.726 -51.622 -84.987
2 -0.8750000 -0.6549105 -19.266 -19.124 -61.651 -102.953
1 -0.9375000 -0.8112096 -22.028 -21.859 -73.518 -124.388
0 -1.0000000 -1.0000000 -25.200 -25.000 -87.600 -150.000

Time Splitting Weights: ndtfast = 30 nfast = 42

Primary Secondary Accumulated to Current Step

1-0.0008094437383769 0.0333333333333333-0.0008094437383769 0.0333333333333333
2-0.0014053566728197 0.0333603147912792-0.0022148004111966 0.0666936481246126
3-0.0017877524645903 0.0334071600137066-0.0040025528757869 0.1001008081383191
4-0.0019566842408176 0.0334667517625262-0.0059592371166046 0.1335675599008453
5-0.0019122901320372 0.0335319745705535-0.0078715272486418 0.1670995344713988
6-0.0016548570247459 0.0335957175749547-0.0095263842733877 0.2006952520463536
7-0.0011849025289723 0.0336508794757796-0.0107112868023601 0.2343461315221331
8-0.0005032751608632 0.0336903762267453-0.0112145619632232 0.2680365077488784
9 0.0003887272597151 0.0337071520654408-0.0108258347035082 0.3017436598143192
10 0.0014892209965583 0.0336941944901169-0.0093366137069498 0.3354378543044362
11 0.0027955815694920 0.0336445537902317-0.0065410321374578 0.3690824080946679
12 0.0043042707117221 0.0335513677379153-0.0022367614257357 0.4026337758325831
13 0.0060106451121704 0.0334078920475245 0.0037738836864347 0.4360416678801076
14 0.0079087469427945 0.0332075372104522 0.0116826306292293 0.4692492050905598
15 0.0099910761708920 0.0329439123123590 0.0216737068001212 0.5021931174029188
16 0.0122483446563884 0.0326108764399960 0.0339220514565096 0.5348039938429148
17 0.0146692120341107 0.0322025982847830 0.0485912634906203 0.5670065921276978
18 0.0172400033810439 0.0317136245503127 0.0658312668716642 0.5987202166780105
19 0.0199444086685725 0.0311389577709445 0.0857756755402367 0.6298591744489550
20 0.0227631639997064 0.0304741441486588 0.1085388395399431 0.6603333185976138
21 0.0256737146312911 0.0297153720153352 0.1342125541712341 0.6900486906129490
22 0.0286498597812016 0.0288595815276255 0.1628624139524358 0.7189082721405745
23 0.0316613792205220 0.0279045862015855 0.1945237931729577 0.7468128583421599
24 0.0346736416507075 0.0268492068942347 0.2291974348236652 0.7736620652363947
25 0.0376471948657328 0.0256934188392112 0.2668446296893980 0.7993554840756059
26 0.0405373376992233 0.0244385123436867 0.3073819673886213 0.8237939964192926
27 0.0432936737565710 0.0230872677537126 0.3506756411451923 0.8468812641730052
28 0.0458596469320356 0.0216441452951603 0.3965352880772279 0.8685254094681655
29 0.0481720587108285 0.0201154903974257 0.4447073467880564 0.8886408998655912
30 0.0501605672561820 0.0185097551070648 0.4948679140442384 0.9071506549726560
31 0.0517471682814031 0.0168377361985254 0.5466150823256414 0.9239883911711814
32 0.0528456577069106 0.0151128305891453 0.5994607400325520 0.9391012217603266
33 0.0533610761022577 0.0133513086655816 0.6528218161348097 0.9524525304259083
34 0.0531891349131380 0.0115726061288397 0.7060109510479476 0.9640251365547480
35 0.0522156244733761 0.0097996349650684 0.7582265755213237 0.9738247715198164
36 0.0503158038019030 0.0080591141492892 0.8085423793232267 0.9818838856691056
37 0.0473537721847154 0.0063819206892258 0.8558961515079421 0.9882658063583314
38 0.0431818225418188 0.0048034616164019 0.8990779740497609 0.9930692679747333
39 0.0376397765791564 0.0033640675316746 0.9367177506289173 0.9964333355064079
40 0.0305543017255206 0.0021094083123694 0.9672720523544379 0.9985427438187773
41 0.0217382098544504 0.0010909315881854 0.9890102622088883 0.9996336754069627
42 0.0109897377911118 0.0003663245930371 1.0000000000000000 0.9999999999999998

ndtfast, nfast = 30 42 nfast/ndtfast = 1.40000

Centers of gravity and integrals (values must be 1, 1, approx 1/2, 1, 1):

1.000000000000 1.047601458608 0.523800729304 1.000000000000 1.000000000000

Power filter parameters, Fgamma, gamma = 0.28400 0.18933

Minimum X-grid spacing, DXmin = 1.00000000E+00 km
Maximum X-grid spacing, DXmax = 1.00000000E+00 km
Minimum Y-grid spacing, DYmin = 1.00000000E+00 km
Maximum Y-grid spacing, DYmax = 1.00000000E+00 km
Minimum Z-grid spacing, DZmin = 8.08965824E-01 m
Maximum Z-grid spacing, DZmax = 2.56123321E+01 m

Minimum barotropic Courant Number = 2.22358627E-01
Maximum barotropic Courant Number = 5.42494240E-01
Maximum Coriolis Courant Number = 2.47800000E-02


Maximum grid stiffness ratios: rx0 = 6.931666E-02 (Beckmann and Haidvogel)
rx1 = 8.661243E-01 (Haney)


Initial basin volumes: TotVolume = 4.7107108807E+13 m3
MinVolume = 8.4521383562E+05 m3
MaxVolume = 2.5612332106E+07 m3
Max/Min = 3.0302783777E+01


NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 00000001 - 00001440)


STEP Day HH:MM:SS KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME
C => (i,j,k) Cu Cv Cw Max Speed

0 0 00:00:00 0.000000E+00 7.333540E+02 7.333540E+02 4.710711E+13
(000,000,00) 0.000000E+00 0.000000E+00 0.000000E+00 0.000000E+00
DEF_HIS - creating history file: ocean_his.nc

DEF_VAR - error while setting parallel access flag for variable: ntimes
in NetCDF file: ocean_his.nc

Elapsed CPU time (seconds):

Node # 0 CPU: 1.283
Node # 2 CPU: 1.283
Node # 3 CPU: 1.267
Node # 1 CPU: 1.282
Node # 4 CPU: 1.283
Total: 7.681

Nonlinear model elapsed time profile:

Allocation and array initialization .............. 6.194 (80.6433 %)
Ocean state initialization ....................... 0.834 (10.8605 %)
Reading of input data ............................ 0.000 ( 0.0002 %)
Processing of input data ......................... 0.005 ( 0.0625 %)
Processing of output time averaged data .......... 0.001 ( 0.0139 %)
Computation of vertical boundary conditions ...... 0.005 ( 0.0686 %)
Computation of global information integrals ...... 0.064 ( 0.8353 %)
2D/3D coupling, vertical metrics ................. 0.323 ( 4.2037 %)
Omega vertical velocity .......................... 0.081 ( 1.0587 %)
Equation of state for seawater ................... 0.128 ( 1.6677 %)
Total: 7.636 99.4143

Nonlinear model message Passage profile:

Message Passage: 2D halo exchanges ............... 0.095 ( 1.2370 %)
Message Passage: 3D halo exchanges ............... 0.104 ( 1.3563 %)
Message Passage: 4D halo exchanges ............... 0.024 ( 0.3148 %)
Message Passage: data broadcast .................. 0.000 ( 0.0003 %)
Message Passage: data reduction .................. 0.002 ( 0.0300 %)
Total: 0.226 2.9384

All percentages are with respect to total time = 7.681

ROMS/TOMS - Output NetCDF summary for Grid 01:

Analytical header files used:

ROMS/Functionals/ana_btflux.h
ROMS/Functionals/ana_grid.h
ROMS/Functionals/ana_initial.h
ROMS/Functionals/ana_smflux.h
Node # 5 CPU: 1.282
ROMS/Functionals/ana_stflux.h
ROMS/Functionals/ana_vmix.h

ROMS/TOMS - Output error ............ exit_flag: 3


ERROR: Abnormal termination: NetCDF OUTPUT.
REASON: Parallel operation on file opened for non-parallel access

lmkli
Posts: 24
Joined: Wed Aug 02, 2006 1:21 pm
Location: TAMU

Re: ROMS Revision 564 Parallel I/O problem

#2 Unread post by lmkli »

Anybody? Any help here?

arango
Site Admin
Posts: 1356
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: ROMS Revision 564 Parallel I/O problem

#3 Unread post by arango »

I mentioned in the release that ROMS parallel I/O is broken with the latest versions of the NetCDF library. It works with NetCDF version 4.1.x. Unidata changed the parallel library again, and it seems to have problems with collective and independent operations when writing the ROMS header file. We are investigating this. I don't have the time to debug the NetCDF library again. The C code is recursive and a complete nightmare. David will be at Unidata this week, at the NetCDF workshops, and he is going to bring this to their attention.

:idea: I recommend that you use serial I/O. It is more efficient anyway. Parallel I/O requires a special computer architecture to be efficient.
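
As a rough illustration of the advice above, here is a minimal Python sketch that scans a ROMS standard-output log for the failure reported in this thread and suggests falling back to serial I/O. It is not part of ROMS. The function names `activated_cpp_options` and `suggest_io_mode` are hypothetical, and the sample text is an abbreviated excerpt of the log posted above. It assumes the log layout shown in that post: an "Activated C-preprocessing Options:" section followed by "Process Information:".

```python
import re

# Error reason exactly as printed in the log earlier in this thread.
PARALLEL_ERROR = "Parallel operation on file opened for non-parallel access"

# Abbreviated excerpt of that log, used here only as sample input.
SAMPLE_LOG = """
 Activated C-preprocessing Options:

 HDF5 Creating NetCDF-4/HDF5 format files.
 MPI MPI distributed-memory configuration.
 !NONLIN_EOS Linear Equation of State for seawater.
 PARALLEL_IO Parallel I/O processing.

 Process Information:

 ERROR: Abnormal termination: NetCDF OUTPUT.
 REASON: Parallel operation on file opened for non-parallel access
"""


def activated_cpp_options(log_text):
    """Collect option names from the 'Activated C-preprocessing Options' section.

    Lines starting with '!' (for example '!NONLIN_EOS') are skipped because
    they do not begin with an option name.
    """
    options = set()
    in_section = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Activated C-preprocessing Options"):
            in_section = True
            continue
        if in_section:
            if stripped.startswith("Process Information"):
                break  # assumed end of the options section
            match = re.match(r"([A-Z][A-Z0-9_]+)(?:\s|$)", stripped)
            if match:
                options.add(match.group(1))
    return options


def suggest_io_mode(log_text):
    """Return a short hint based on the options and the error reason."""
    options = activated_cpp_options(log_text)
    if PARALLEL_ERROR in log_text and "PARALLEL_IO" in options:
        return ("PARALLEL_IO is active and the parallel-access error occurred: "
                "consider rebuilding without PARALLEL_IO (serial I/O).")
    return "No parallel I/O failure detected."


if __name__ == "__main__":
    print(suggest_io_mode(SAMPLE_LOG))
```

The check only matches the exact error text and option name seen in this thread, so it is a convenience for spotting this one failure, not a general log parser.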

lmkli
Posts: 24
Joined: Wed Aug 02, 2006 1:21 pm
Location: TAMU

Re: ROMS Revision 564 Parallel I/O problem

#4 Unread post by lmkli »

Sorry, Arango. I must have misunderstood something about the parallel I/O options and missed the information in the release news. I hope this feature will be ready to test soon.

Thank you anyway, and sorry for bothering everyone here.
