Leak/Artifact at CPU Tiles

Discussion about modeling ice with ROMS

Moderators: arango, robertson

bilge.tutak
Posts: 20
Joined: Wed Jun 04, 2014 1:45 pm
Location: Istanbul Technical University

Leak/Artifact at CPU Tiles

#1 Unread post by bilge.tutak »

Hi All,

I am running Trond's fork of Kate's ROMS-Ice (https://github.com/trondkr/NS8KM-ROMS).

I am getting weird leak-like artifacts in the ice-related variables, especially along the CPU tile boundaries in the xi-direction. Otherwise the model runs without any significant problems (as far as I can tell).
They show up mainly in the aice (ice fraction) and hice (ice thickness) variables. I do not see them in temperature or salinity, for example.
I am attaching two figures, one with 4x4 tiling and one with 8x16 tiling.

My guess is that it has something to do with the halo (ghost) points, but I can't quite put my finger on it.

Can anybody suggest a fix for the problem?
Thank you.

Best,
Bilge.
Attachments
margi_sea_ice_tile_problem_128cpu.jpg
margi_sea_ice_tile_problem.jpg

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Leak/Artifact at CPU Tiles

#2 Unread post by kate »

That's a seven-year-old fork! I suggest you try my moldy two-year-old repo instead.

bilge.tutak
Posts: 20
Joined: Wed Jun 04, 2014 1:45 pm
Location: Istanbul Technical University

Re: Leak/Artifact at CPU Tiles

#3 Unread post by bilge.tutak »

Hi Kate,

Actually, I started with your version of the code, but for some reason I get a frazil ice problem at the very beginning of the simulation (around time step 3 to 60, depending on which ice options I enable in cpp_flags).
I retried your version just before writing this, using the exact same cpp_flag options, to make sure. No luck yet :(.

Interestingly enough, the old code (Trond's) does propagate both aice and hice in time, apart from those tiling problems :).

I will continue to look into it.

Thank you,
Bilge.

arango
Site Admin
Posts: 1347
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Leak/Artifact at CPU Tiles

#4 Unread post by arango »

That's a classic parallel collision bug in a state variable used to compute horizontal operators. Such bugs are tough to find and usually take a lot of TotalView debugging time. It implies that, somewhere in the DO-loops, the ice state variable illegally accesses global data belonging to another parallel tile. We use local arrays to evaluate the horizontal operators in such cases. I assume that you are running in distributed memory (OpenMPI). Such errors are more fatal in shared memory (OpenMP).
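
For readers unfamiliar with this class of bug, here is a minimal 1-D sketch in Python (not ROMS code; every name in it is hypothetical). It assumes a simple explicit diffusion operator and a domain split into two tiles with one halo point each. When the halo is refreshed before every step the tiled run reproduces the serial run; when the halo is left stale, the error appears first at the seam between tiles and leaks outward, which resembles the pattern in the attached figures. Whether the bug in the fork is a missing exchange or an out-of-tile access is not established in this thread.

```python
# Hypothetical 1-D illustration of a tile-boundary artifact; not ROMS code.

def laplacian_step(field, kappa=0.1):
    """One explicit diffusion step; the two end points are held fixed."""
    new = list(field)
    for i in range(1, len(field) - 1):
        new[i] = field[i] + kappa * (field[i - 1] - 2.0 * field[i] + field[i + 1])
    return new


def run_serial(field, nsteps):
    """Reference solution computed on the whole domain at once."""
    for _ in range(nsteps):
        field = laplacian_step(field)
    return field


def run_tiled(field, nsteps, exchange_halos):
    """Split the domain into two tiles, each carrying one halo point.

    Tile A owns global points [0, mid), tile B owns [mid, n).
    """
    mid = len(field) // 2
    tile_a = field[:mid + 1]   # last entry is the halo (copy of point mid)
    tile_b = field[mid - 1:]   # first entry is the halo (copy of point mid-1)
    for _ in range(nsteps):
        if exchange_halos:
            tile_a[-1] = tile_b[1]    # refresh A's halo from B's first owned point
            tile_b[0] = tile_a[-2]    # refresh B's halo from A's last owned point
        tile_a = laplacian_step(tile_a)
        tile_b = laplacian_step(tile_b)
    # Drop the halo points and stitch the owned points back together.
    return tile_a[:-1] + tile_b[1:]


if __name__ == "__main__":
    n = 16
    start = [1.0 if i < n // 2 else 0.0 for i in range(n)]
    serial = run_serial(list(start), 20)
    stale = run_tiled(list(start), 20, exchange_halos=False)
    for i, (a, b) in enumerate(zip(stale, serial)):
        print(i, abs(a - b))   # the difference is largest next to the seam
```

Comparing a run on a single tile against a tiled run in the same way is one hedged approach to narrowing down which variable and which loop are responsible.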

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Leak/Artifact at CPU Tiles

#5 Unread post by kate »

There were some serious problems with the ice code of roughly seven years ago, since fixed in my branch. Maybe you should start over with METROMS, coupling a modern ROMS with CICE. Or better yet, join me on the MOM6-SIS2 adventure.

The frazil ice code is the canary in the coal mine - the most sensitive check for anything out of bounds, such as salinity or negative Hz. So yes, learn to debug.
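
As a starting point for that kind of debugging, here is a small Python sketch that scans a 2-D slice of model output for the two symptoms mentioned above. It is not part of ROMS; the function name, the list-of-lists layout, and the salinity bounds (0 to 45) are assumptions chosen for illustration only.

```python
# Hypothetical helper; names and thresholds are assumptions, not ROMS code.

def find_suspect_cells(salt, hz, salt_min=0.0, salt_max=45.0):
    """Return (j, i, reason) for each cell with out-of-range salinity or
    non-positive layer thickness in the 2-D lists salt[j][i] and hz[j][i]."""
    suspects = []
    for j, (salt_row, hz_row) in enumerate(zip(salt, hz)):
        for i, (s, h) in enumerate(zip(salt_row, hz_row)):
            if not (salt_min <= s <= salt_max):   # a NaN also fails this test
                suspects.append((j, i, "salinity out of range"))
            if not (h > 0.0):                     # a NaN also fails this test
                suspects.append((j, i, "non-positive Hz"))
    return suspects
```

Running such a scan on the last output written before the frazil ice failure may show whether the bad values cluster along tile boundaries or somewhere else.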

bilge.tutak
Posts: 20
Joined: Wed Jun 04, 2014 1:45 pm
Location: Istanbul Technical University

Re: Leak/Artifact at CPU Tiles

#6 Unread post by bilge.tutak »

Kate, Hernan,

Thank you for the comments and suggestions.

I was hoping the tiling problem could be solved much more easily.
On the other hand, I actually started with METROMS before trying the ROMS-Ice code. Just getting METROMS to compile took a painfully long time.
In the end, I do have a METROMS setup, but I am having trouble with its forcing setup; that is maybe for another thread :roll: .

I will keep working on these cases to find the best solution.

Kate, I hope to move to more recent endeavors like MOM6-SIS2, but I am currently under a time limit, which is why I am trying to move forward with an ocean model that I am comfortable with.

Bilge.

kate
Posts: 4088
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Leak/Artifact at CPU Tiles

#7 Unread post by kate »

If you want to check out another dead branch, I've got a metroms branch on my ROMS github repo. I did fuss with the forcing.

bilge.tutak
Posts: 20
Joined: Wed Jun 04, 2014 1:45 pm
Location: Istanbul Technical University

Re: Leak/Artifact at CPU Tiles

#8 Unread post by bilge.tutak »

Thank you Kate,

I generally check the forks along with the code, but somehow I missed your fork of metroms.

I believe your commits that pass the atmospheric forcing from ROMS to CICE will be very helpful.

Bilge.
