Different upwelling results using different compilers

Report or discuss software problems and other woes

Moderators: arango, robertson

jwn4548
Posts: 8
Joined: Mon Jun 10, 2013 7:23 pm
Location: Rochester Institute of Technology

Different upwelling results using different compilers

#1 Post by jwn4548 »

I have been using the ocean_upwelling.in file to test an issue I was having with a more complicated model, in which switching between different Fortran compilers significantly impacts my results. For the upwelling case I am using ifort/mpiifort for one build and gfortran/mpif90 for the other (both compiled with -O2 in FFLAGS, because -O3 causes a segmentation fault for the ifort build).

When running the resulting oceanM executables the salinity data is consistent, but the temperature differs on the order of 10^-6 after only one timestep! This is surprising, as I would expect the difference to build from an initial discrepancy near machine precision, but that does not seem to be the case (I guess because a lot of calculations are performed before even one step is taken?).

If anyone has any ideas/questions please let me know!

kate
Posts: 4089
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Different upwelling results using different compilers

#2 Post by kate »

My best guess is that there is a literal constant like 0.1 that doesn't have a _r8 to force it into being double precision. You can test this by using a compiler flag like "-r8" (making sure of course to see exactly what it does). You might also investigate exactly what "-O2" means to your compilers. Does this happen at "-O0"? If those agree, does one match the "-O2" numbers?
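
For example, here is a minimal standalone sketch (not ROMS code; the r8 definition just mirrors the usual selected_real_kind(12,300) convention) showing how a bare literal has already lost digits by the time it lands in a double precision variable:

Code: Select all

 program literal_precision
 ! A bare literal like 0.1 is default (single) precision, so it only
 ! carries about 7 significant digits even when assigned to a real(r8).
   integer, parameter :: r8 = selected_real_kind(12,300)
   real(r8) :: a, b
   a = 0.1          ! single precision literal, rounded to ~7 digits
   b = 0.1_r8       ! double precision literal
   print *, b - a   ! nonzero, on the order of 1.0e-9
 end program literal_precision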

ptimko
Posts: 35
Joined: Wed Mar 02, 2011 6:46 pm
Location: Environment and Climate Change Canada

Re: Different upwelling results using different compilers

#3 Post by ptimko »

I'd say that Kate's best guess is correct, since a 32-bit representation of a number is only accurate to about 7 decimal places.
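
You can check that figure with Fortran's epsilon() intrinsic (a quick standalone snippet, nothing ROMS-specific):

Code: Select all

 program precision_check
 ! Machine epsilon for the default single and double precision kinds.
   print *, 'single epsilon:', epsilon(1.0)    ! ~1.19e-07, about 7 digits
   print *, 'double epsilon:', epsilon(1.0d0)  ! ~2.22e-16, about 16 digits
 end program precision_check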

I'm curious why you're getting a segmentation fault with -O3 on ifort; you might want to try compiling with -traceback to see where the code breaks.

jwn4548
Posts: 8
Joined: Mon Jun 10, 2013 7:23 pm
Location: Rochester Institute of Technology

Re: Different upwelling results using different compilers

#4 Post by jwn4548 »

I tried running the ifort build with -r8 and the gfortran build with -fdefault-real-8; in neither case did the results change from before.

As for the segmentation fault with -O3 under ifort: it occurred while running a different set of inputs (for Chesapeake Bay) and does not occur for the upwelling case, so it seems to be a separate issue.

To be thorough I checked the effect of changing -O3 to -O2 for the different builds. This also had no effect.

jwn4548
Posts: 8
Joined: Mon Jun 10, 2013 7:23 pm
Location: Rochester Institute of Technology

Re: Different upwelling results using different compilers

#5 Post by jwn4548 »

Oh, I forgot to mention that the ifort build will run with a 001x016 partition (I'm running on 16 CPUs) but will blow up if I try to partition the 16 differently. I'm not sure if this gives any clues...

arango
Site Admin
Posts: 1355
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: Different upwelling results using different compilers

#6 Post by arango »

But why do you want to run the UPWELLING test case on 16 processors? As distributed, this application has only 41x80x16 points, so running it on 16 processors is overkill. If you want to play with a lot of CPUs, use the BENCHMARK application: 512x64x30, 1024x128x30, or 2048x512x30. Notice that in that application the horizontal dimensions are powers of two, so you can distribute and balance all the tile partitions equally.

We keep getting this type of parallel overkill in this forum. There seems to be a misunderstanding about ROMS coarse-grained parallelization, ROMS spatial discretization, and tile size. If you look carefully, ROMS writes this information to standard output:

Code: Select all

 Tile partition information for Grid 01:  0041x0080x0016  tiling: 002x002

     tile     Istr     Iend     Jstr     Jend     Npts

        0        1       21        1       40    13440
        1       22       41        1       40    12800
        2        1       21       41       80    13440
        3       22       41       41       80    12800
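
For what it's worth, the bounds in that table can be reproduced with a small sketch like the one below (illustrative only; the actual ROMS partition logic is more general):

Code: Select all

 program tile_bounds
 ! Split an Lm x Mm interior grid into NtileI x NtileJ tiles; each tile
 ! gets a chunk of roughly Lm/NtileI by Mm/NtileJ points, and the last
 ! tile in each direction is clipped to the grid edge.
   integer, parameter :: Lm = 41, Mm = 80, N = 16
   integer, parameter :: NtileI = 2, NtileJ = 2
   integer :: tile, i, j, ChunkI, ChunkJ, Istr, Iend, Jstr, Jend
   ChunkI = (Lm + NtileI - 1)/NtileI          ! 21
   ChunkJ = (Mm + NtileJ - 1)/NtileJ          ! 40
   do tile = 0, NtileI*NtileJ - 1
     i = mod(tile, NtileI)
     j = tile/NtileI
     Istr = 1 + i*ChunkI
     Iend = min(Lm, Istr + ChunkI - 1)
     Jstr = 1 + j*ChunkJ
     Jend = min(Mm, Jstr + ChunkJ - 1)
     print '(6i9)', tile, Istr, Iend, Jstr, Jend, &
                    (Iend - Istr + 1)*(Jend - Jstr + 1)*N
   end do
 end program tile_bounds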

jwn4548
Posts: 8
Joined: Mon Jun 10, 2013 7:23 pm
Location: Rochester Institute of Technology

Re: Different upwelling results using different compilers

#7 Post by jwn4548 »

My mistake! I am new to ROMS and had chosen UPWELLING because of its boundary conditions; I wasn't aware that I should choose the number of CPUs in a way that corresponds to the grid dimensions.

Thank you for the response and I will play around with the benchmark tests.

ptimko
Posts: 35
Joined: Wed Mar 02, 2011 6:46 pm
Location: Environment and Climate Change Canada

Re: Different upwelling results using different compilers

#8 Post by ptimko »

With regard to choosing the number of processors to use:

You always want to make sure that the memory requirements for the job fit within the physical memory of the system you are using.

On a standalone machine you might as well compile the code under OpenMP (the shared-memory model), assuming of course that the computer you're using has enough memory for the job. You should not specify more threads than there are cores on the system. If you don't have enough physical memory on a single computer, then you will have to run the job on a cluster and use MPI.

When compiling under MPI to run on a cluster, using too many cores (thereby increasing the number of nodes required) can actually increase model execution time because of the overhead of passing information from node to node. For very large jobs you want to make sure that you select enough cores so that the amount of memory required per core, multiplied by the number of cores per node, does not exceed the physical memory available on each node.
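
As a rough illustration of that arithmetic (hypothetical numbers, not an actual ROMS memory accounting):

Code: Select all

 program memory_estimate
 ! Back-of-the-envelope memory estimate for a BENCHMARK-sized grid,
 ! assuming some number of 3-D double precision (8-byte) arrays.
   integer, parameter :: Lm = 512, Mm = 64, N = 30
   integer, parameter :: nfields = 100   ! assumed count of 3-D arrays
   integer, parameter :: ncores  = 16
   real :: total_gb
   total_gb = real(Lm)*real(Mm)*real(N)*nfields*8.0/1024.0**3
   print *, 'total    (GB):', total_gb
   print *, 'per core (GB):', total_gb/ncores  ! tiling splits the arrays
 end program memory_estimate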

If you are compiling under MPI on a standalone machine with multiple cores, specifying more cores than are physically available can also lead to problems and should be avoided.

I'm not sure exactly how ROMS stores the arrays, but IF they are mostly held in common blocks then the Linux command

Code: Select all

 size -d oceanM

should provide an estimate of the amount of memory required per MPI task.

kate
Posts: 4089
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: Different upwelling results using different compilers

#9 Post by kate »

ptimko wrote:
I'm not sure exactly how ROMS stores the arrays but IF they are mostly held in common blocks...

The myroms.org code stores nothing in common blocks. The large arrays are dynamically allocated.
