WRT_AVG when restarting

lolhsson
Posts: 23
Joined: Wed Jun 02, 2010 9:07 pm
Location: UC Berkeley


Hi everyone!

I've run into an interesting problem: after a restart, my ROMS run seems to hang on its first attempt to write a new time record into ocean_avg.nc (one model day after restarting). I think I've set everything up normally: nrrec is -1, my ininame is my restart file, and ldefout is set to false so that output is appended to the existing files. WRT_HIS worked quite normally, putting output into ocean_his.nc; it's WRT_AVG that doesn't seem to even be called.

This is with PERFECT_RESTART on, for the curious. This isn't a wetting/drying run, though it does have source terms on for a user-defined river input.

The logfile looks like this:

It finishes initializing, writes into history, and starts trucking along:
******* 6581 00:00:00 1.971709E-03 1.416971E+04 1.416971E+04 2.395136E+14
(123,554,40) 2.000008E-02 1.213839E-02 0.000000E+00 7.773813E-01
DEF_HIS - inquiring history file: /glade/scratch/lolhsson/hadleytest/ocean_his.nc
WRT_HIS - wrote history fields (Index=1,1) into time record = 0000006
DEF_AVG - inquiring average file: /glade/scratch/lolhsson/hadleytest/ocean_avg.nc
******* 6581 00:00:30 1.971825E-03 1.416971E+04 1.416971E+04 2.395136E+14
(161,421,28) 4.630709E-03 8.797220E-03 5.644617E-02 8.100294E-01
Then, at the end of the day:
******* 6581 23:59:00 2.151098E-03 1.416963E+04 1.416963E+04 2.395127E+14
(160,421,02) 9.288979E-03 9.841658E-03 3.532372E-02 8.531022E-01
******* 6581 23:59:30 2.151190E-03 1.416963E+04 1.416963E+04 2.395127E+14
(160,421,02) 9.281945E-03 9.844583E-03 3.528618E-02 8.529353E-01
******* 6582 00:00:00 2.151282E-03 1.416963E+04 1.416963E+04 2.395127E+14
(160,421,02) 9.274921E-03 9.847489E-03 3.524867E-02 8.531803E-01
WRT_HIS - wrote history fields (Index=1,1) into time record = 0000007
And then it stops (at least in terms of output to the logfile) and appears to do nothing until the walltime runs out and the job is terminated externally.
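
For anyone who wants to check their own logfile for the same pattern, here is a minimal Python sketch (an illustration only, not part of ROMS). It counts the DEF_/WRT_ messages for the history and average files and reports any file that was inquired but never written. The function names `summarize_io` and `streams_never_written` are hypothetical, the sample text uses shortened file names, and the message prefixes are simply the ones visible in the excerpt above.

```python
import re
from collections import Counter

# Matches log lines such as "WRT_HIS - wrote history fields ..."
# or "DEF_AVG - inquiring average file: ...".
IO_LINE = re.compile(r"^\s*(DEF|WRT)_(HIS|AVG)\b")


def summarize_io(log_text):
    """Count DEF_/WRT_ messages per output stream in a ROMS-style log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_LINE.match(line)
        if match:
            counts[f"{match.group(1)}_{match.group(2)}"] += 1
    return counts


def streams_never_written(counts):
    """Return the streams that were defined/inquired but never written."""
    return [stream for stream in ("HIS", "AVG")
            if counts[f"DEF_{stream}"] > 0 and counts[f"WRT_{stream}"] == 0]


# Sample text modeled on the excerpt above (file paths shortened for illustration).
sample_log = """\
DEF_HIS - inquiring history file: ocean_his.nc
WRT_HIS - wrote history fields (Index=1,1) into time record = 0000006
DEF_AVG - inquiring average file: ocean_avg.nc
WRT_HIS - wrote history fields (Index=1,1) into time record = 0000007
"""

counts = summarize_io(sample_log)
print(dict(counts))
print(streams_never_written(counts))
```

On a real run, the same functions could be applied to the full logfile text read from disk. A stream listed by `streams_never_written` only means no write message reached the log; it does not by itself say where the run is stuck.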

I hadn't thought much of DEF_AVG inquiring ocean_avg.nc but not trying to write; it made sense to me that it had nothing to write yet, since it had to average another day's worth of data before it would try to write something new. But perhaps the problem starts there.

Has anyone ever encountered a similar problem?

UPDATE: For a first step in troubleshooting, I set ldefout back to T and had it start writing a new set of his/avg files. It did so beautifully at the end of the first day and then continued on normally. Hmmm...
