OpenMP problem with restart from file

Bug reports, work arounds and fixes

Moderators: arango, robertson

m.hadfield
Posts: 521
Joined: Tue Jul 01, 2003 4:12 am
Location: NIWA


#1 Post by m.hadfield

Attached are output files from two consecutive runs of the FLT_TEST case. In both cases ROMS (latest source) is run under OpenMP with 1x2 tiles, using Gfortran under Cygwin. The first run (rom001.log) proceeds as it should: the model is initialised analytically (NRREC=0) and runs for 1152 time steps, or 0.8 days. The second run (rom002.log) starts from the restart fields saved by the first (NRREC=-1) and initially looks OK. However, time step 1214 is followed by time step 63, and the model time jumps back accordingly. Then time step 73 is followed by time step 1226, and from there the time step counter jumps back and forth from time to time. The run finally hangs, for no obvious reason, at time step 2304.
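The step numbers quoted above can be checked with a little arithmetic. The sketch below is only an illustration in Python, not ROMS code; the function name implied_offset is made up for this example. It computes the time step implied by 1152 steps in 0.8 days, and the offset between the two step counters implied by each reported jump.

```python
# Numbers taken from the post above.
STEPS_FIRST_RUN = 1152   # time steps completed by the first run
DAYS_FIRST_RUN = 0.8     # model time covered by the first run

# Time step implied by those two figures, in seconds.
dt_seconds = DAYS_FIRST_RUN * 86400 / STEPS_FIRST_RUN


def implied_offset(last_before, first_after):
    """Offset between two step counters, assuming the step that follows
    `last_before` on one counter is reported as `first_after` on the other.
    (Hypothetical helper, for illustration only.)"""
    return abs((last_before + 1) - first_after)


# Reported jumps: (last step before the jump, first step after it).
jump_back = (1214, 63)
jump_forward = (73, 1226)

offsets = [implied_offset(*jump_back), implied_offset(*jump_forward)]

print("implied time step (s):", round(dt_seconds, 6))
print("implied counter offsets:", offsets)
print("hang step / first-run length:", 2304 / STEPS_FIRST_RUN)
```

Both jumps imply the same offset, and it equals the length of the first run. That is what one would expect if one step counter starts from the restart record and the other starts from the beginning.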

My interpretation of this is that one of the threads knows about the data read from the restart file and the other doesn't.
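To make that interpretation concrete, here is a loose analogy in Python using thread-local storage. It is not OpenMP and not ROMS code, and it makes no claim about where the actual fault lies; the names state, worker and run are hypothetical. It only shows how per-thread state that is updated by one thread and not the other leaves the two threads disagreeing about the first time step.

```python
import threading

RESTART_STEP = 1152  # last step of the first run, from the post

# Each thread sees its own independent copy of attributes set on `state`.
state = threading.local()


def worker(name, sees_restart, results):
    # Every thread begins with the analytical default.
    state.first_step = 1
    # Only a thread that "knows about" the restart data moves its counter on.
    if sees_restart:
        state.first_step = RESTART_STEP + 1
    results[name] = state.first_step


def run(second_thread_sees_restart):
    """Run two threads and return the first step each one believes in."""
    results = {}
    threads = [
        threading.Thread(target=worker, args=("thread0", True, results)),
        threading.Thread(
            target=worker,
            args=("thread1", second_thread_sees_restart, results),
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


inconsistent = run(second_thread_sees_restart=False)
consistent = run(second_thread_sees_restart=True)

print("one thread misses the restart data:", inconsistent)
print("both threads see the restart data: ", consistent)
```

In the inconsistent case the two threads differ by exactly the length of the first run, which matches the size of the jumps in rom002.log.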

This misbehaviour occurs only in OpenMP runs, not in serial or MPI runs. It occurs with Cygwin/Gfortran and AIX/xlf, but not with Linux/Gfortran. Overall it looks like a matter of misplaced OpenMP directives that different compilers interpret differently, and it is somewhat reminiscent (perhaps) of the OpenMP FLOAT_VWALK problem I encountered in 2012:

viewtopic.php?f=19&t=2584

I will look into it further when I can.
Attachments
rom002.log (190.85 KiB): Output file from second run (initialised from restart)
rom001.log (191.96 KiB): Output file from first run (initialised analytically)


#2 Post by m.hadfield

A couple of extra data points:
  • The problem I reported also occurs in the UPWELLING test case. It's not specific to float simulations (though float simulations are where I tend to use OpenMP).
  • Contrary to what I said earlier, it does occur with Linux/Gfortran, as well as Cygwin/Gfortran and AIX/xlf.


#3 Post by m.hadfield

See also the discussion on this thread:

viewtopic.php?t=3265

arango
Site Admin
Posts: 1347
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University


#4 Post by arango

Yes, I was able to reproduce this bug. I finally had time to check it in the debugger, and I have corrected it. See the ticket for details. Thank you for bringing this to my attention. Please update.


#5 Post by m.hadfield

Thanks, Hernan!
