Opened 12 years ago

Last modified 12 years ago

#552 closed upgrade

IMPORTANT: OpenMP shared-memory directives Revisited

Reported by: arango
Owned by: arango
Priority: major
Milestone: Release ROMS/TOMS 3.6
Component: Nonlinear
Version: 3.6
Keywords:
Cc:

Description

This update includes a full revision of the ROMS shared-memory pragma directives using the OpenMP standard. This is a very important and delicate update that requires expertise. Luckily, I doubt that it will affect your customized code.

All the parallel loops in ROMS now use simpler directives. For example, the old strategy:

!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(numthreads)
            DO thread=0,numthreads-1
              subs=NtileX(ng)*NtileE(ng)/numthreads
              DO tile=subs*thread,subs*(thread+1)-1,+1
                ...
              END DO
            END DO
!$OMP END PARALLEL DO

is replaced with:

            DO tile=first_tile(ng),last_tile(ng),+1
              ...
            END DO
!$OMP BARRIER

In shared-memory, the parallel threads are now spawned in higher-level calling routines. For example, we now have:

!$OMP PARALLEL
      CALL main3d (RunInterval)
!$OMP END PARALLEL

This directive is less restrictive and allows MASTER, BARRIER, and other useful OpenMP pragmas inside the parallel region. If you are interested, please see the related discussion in the Forum.

This change cleans up the code and facilitates the parallelization of tricky algorithms for nesting, MPDATA, random number generation, point-sources, etc., using the shared-memory paradigm.
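
The pattern described above (threads spawned once by the caller, each thread looping over its own range of tiles, then a BARRIER) can be sketched as a small standalone program. This is only an illustration, not ROMS code: the names tile_loop_sketch, step, visits, and total are hypothetical, the source is free-form rather than the fixed-form .F files that ROMS uses, and the min() guard on the last tile is an addition so that the sketch also runs when the thread count does not divide the tile count evenly.

```fortran
program tile_loop_sketch
!$ use omp_lib
  implicit none
  integer, parameter :: ntiles = 8     ! hypothetical NtileX*NtileE
  integer :: visits(0:ntiles-1)        ! shared: one counter per tile
  integer :: total                     ! shared: written by the master thread

  visits = 0
  total = -1

  ! Threads are spawned once, here in the caller.
!$omp parallel
  call step (visits, total)
!$omp end parallel

  print *, 'tiles visited:', total
  if (any(visits /= 1)) error stop 'a tile was missed or visited twice'

contains

  subroutine step (visits, total)
    integer, intent(inout) :: visits(0:)
    integer, intent(inout) :: total
    integer :: tile, first, last, chunk, me, nthreads

    ! Serial defaults; overridden only when compiled with OpenMP.
    me = 0
    nthreads = 1
!$  me = omp_get_thread_num()
!$  nthreads = omp_get_num_threads()

    ! Each thread works on a contiguous range of tiles.
    chunk = (size(visits) + nthreads - 1)/nthreads
    first = me*chunk
    last = min(first + chunk - 1, size(visits) - 1)   ! guard: sketch only

    do tile = first, last
      visits(tile) = visits(tile) + 1
    end do
!$omp barrier

    ! MASTER is legal here because the parallel region is opened by the caller.
!$omp master
    total = sum(visits)
!$omp end master
  end subroutine step

end program tile_loop_sketch
```

Compiled without OpenMP, the directives are treated as comments and the single thread processes every tile. With OpenMP enabled (for example, -fopenmp in gfortran), the tiles are split among the threads and the barrier ensures that all tiles are done before the master thread sums them.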

WARNINGS:

  • The values of NtileX(ng) and NtileE(ng) are no longer equal to one in distributed-memory (MPI). They now have the same values as those specified in standard input for NtileI(ng) and NtileJ(ng). Notice that in the critical regions for global reduction operations we now use the following code instead:
    #ifdef DISTRIBUTE
          NSUB=1                             ! distributed-memory
    #else
          IF (DOMAIN(ng)%SouthWest_Corner(tile).and.                        &
         &    DOMAIN(ng)%NorthEast_Corner(tile)) THEN
            NSUB=1                           ! non-tiled application
          ELSE
            NSUB=NtileX(ng)*NtileE(ng)       ! tiled application
          END IF
    #endif
    
    That is, we make a special exception for distributed-memory. This change is necessary in your customized versions of ana_grid.h and ana_psource.h.
  • Notice that a few important ROMS variables in mod_scalars.F and mod_stepping.F use the THREADPRIVATE directive in shared-memory, so that each parallel thread has its own private copy of these variables and parallel collisions are avoided.
  • Two new variables (first_tile(ng) and last_tile(ng)) are introduced to specify the tile range in each parallel region:
          integer, allocatable :: first_tile(:)
          integer, allocatable :: last_tile(:)
    
    !$OMP THREADPRIVATE (first_tile, last_tile)
    
    These variables are set during the initialization of the ROMS kernel using:
    !$OMP PARALLEL
    #if defined _OPENMP
          MyThread=my_threadnum()
    #elif defined DISTRIBUTE
          MyThread=MyRank
    #else
          MyThread=0
    #endif
          DO ng=1,Ngrids
            chunk_size=(NtileX(ng)*NtileE(ng)+numthreads-1)/numthreads
            first_tile(ng)=MyThread*chunk_size
            last_tile (ng)=first_tile(ng)+chunk_size-1
          END DO
    !$OMP END PARALLEL
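
To make the warnings above concrete, here is a minimal shared-memory sketch that sets THREADPRIVATE tile ranges in one parallel region and then uses them, together with an NSUB tile count, in a critical-region reduction. It is written under several assumptions that are not in the ticket: the module name mod_tiles, the variables tile_count, my_sum, and global_sum, the critical-section name tile_sum, and the 2x4 tiling are all hypothetical; the min() guard on last_tile and the omp_set_dynamic call are additions to keep the sketch safe for any thread count; and the source is free-form rather than ROMS fixed-form.

```fortran
module mod_tiles
  implicit none
  integer, parameter :: Ngrids = 1                    ! hypothetical: one grid
  integer :: NtileX(Ngrids) = 2, NtileE(Ngrids) = 4   ! hypothetical: 8 tiles
  integer, allocatable :: first_tile(:), last_tile(:)
!$omp threadprivate (first_tile, last_tile)
end module mod_tiles

program tile_ranges
!$ use omp_lib
  use mod_tiles
  implicit none
  integer :: ng, tile, MyThread, numthreads, chunk_size
  integer :: NSUB, tile_count, my_sum, global_sum

  tile_count = 0
  global_sum = 0
  NSUB = NtileX(1)*NtileE(1)            ! tiled, shared-memory case

  ! Keep the team size fixed so THREADPRIVATE values persist between regions.
!$ call omp_set_dynamic (.false.)

  ! Initialization: every thread allocates and fills its own private copy.
!$omp parallel private(ng, MyThread, numthreads, chunk_size)
  MyThread = 0
  numthreads = 1
!$ MyThread = omp_get_thread_num()
!$ numthreads = omp_get_num_threads()
  allocate (first_tile(Ngrids), last_tile(Ngrids))
  do ng = 1, Ngrids
    chunk_size = (NtileX(ng)*NtileE(ng) + numthreads - 1)/numthreads
    first_tile(ng) = MyThread*chunk_size
    ! min() is a guard added for this sketch only.
    last_tile(ng) = min(first_tile(ng) + chunk_size - 1,                &
                        NtileX(ng)*NtileE(ng) - 1)
  end do
!$omp end parallel

  ! Later parallel region: each thread loops over its own tiles, and the
  ! thread that contributes tile number NSUB completes the global reduction.
!$omp parallel private(tile, my_sum)
  do tile = first_tile(1), last_tile(1)
    my_sum = tile + 1                   ! stand-in for a per-tile result
!$omp critical (tile_sum)
    if (tile_count == 0) global_sum = 0
    global_sum = global_sum + my_sum
    tile_count = tile_count + 1
    if (tile_count == NSUB) then
      tile_count = 0                    ! reset for the next reduction
      print *, 'global sum over all tiles:', global_sum
    end if
!$omp end critical (tile_sum)
  end do
!$omp end parallel

  ! 1 + 2 + ... + 8 = 36, whatever the number of threads.
  if (global_sum /= 36) error stop 'reduction did not cover every tile once'
end program tile_ranges
```

The counter reaches NSUB only after every tile has contributed, which is why NSUB must equal the number of tiles that pass through the critical region: NtileX(ng)*NtileE(ng) in a tiled shared-memory run, and one per process in distributed-memory.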
    

Many thanks to Sasha Shchepetkin for suggesting this strategy. Also many thanks to Mark Hadfield for his persistence and testing.
