Custom Query (964 matches)

Results (652 - 654 of 964)

Ticket Owner Reporter Resolution Summary
#785 arango Done Updated dynamic and automatic memory reporting
Description

In src:ticket:783, I introduced the reporting of dynamic memory and automatic memory estimates for a particular ROMS application. There are still some memory requirements that are not accounted for.

The reporting is now done at the end of the computations to allow for unaccounted automatic memory. A new variable, BmemMax(ng), is introduced to track the maximum automatic buffer size used in distributed-memory (MPI) exchanges. In distributed-memory applications with serial I/O, the size of the automatic, temporary buffers needed for scattering/gathering of data increases with the ROMS grid size. It can become a memory bottleneck as the number of tile partitions increases, since every parallel process allocates a full copy of the data array to process. The temporary buffers are allocated automatically on the stack or heap. The user has the option to activate INLINE_2DIO to process 3D and 4D arrays as 2D slabs and reduce the memory requirements. Alternatively, one can activate PARALLEL_IO if such hardware infrastructure is available.
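
As a rough illustration of the BmemMax bookkeeping, here is a minimal, self-contained sketch; the program, grid dimensions, and buffer layout are made up for the example and are not the actual ROMS distributed-memory exchange code:

!
!  Hypothetical sketch (not the actual ROMS code) of the BmemMax
!  bookkeeping: whenever a temporary gather/scatter buffer is
!  allocated, remember the largest size seen so that it can be
!  reported at the end of the run.
!
      PROGRAM bmem_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: r8 = SELECTED_REAL_KIND(12,300)
      INTEGER, PARAMETER :: Ngrids = 1, ng = 1
      REAL(r8) :: BmemMax(Ngrids)
      REAL(r8), ALLOCATABLE :: Awrk(:)
      INTEGER :: Im, Jm, Km, Asize
!
      BmemMax=0.0_r8
      Im=240                            ! illustrative grid dimensions
      Jm=104
      Km=40
      Asize=(Im+2)*(Jm+2)*Km            ! one full 3D field, whole grid
      ALLOCATE ( Awrk(Asize) )
!
!  Every rank holds a full copy of the field, so this allocation grows
!  with the grid size.  Track the biggest one (in bytes).
!
      BmemMax(ng)=MAX(BmemMax(ng), REAL(Asize,r8)*8.0_r8)
      DEALLOCATE (Awrk)
      PRINT '(a,f8.3)', ' MPI buffer (MB) = ', BmemMax(ng)*1.0E-6_r8
      END PROGRAM bmem_sketch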

A new subroutine, memory.F, is introduced to compute and report ROMS dynamic and automatic memory requirements. It is called from ROMS_finalize, that is, at the end of execution.

!
!  Report dynamic memory and automatic memory requirements.
!
!$OMP PARALLEL
      CALL memory
!$OMP END PARALLEL
!
!  Close IO files.
!
      CALL close_out

      RETURN
      END SUBROUTINE ROMS_finalize
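
The report itself is a per-tile table followed by a SUM row. A small, self-contained sketch of just that formatting step, using made-up per-tile numbers for one grid and omitting the MPI collection that ROMS performs, could look like:

!
!  Illustrative sketch only: write a per-tile memory table and its sum
!  in the style of the report below.  The numbers are placeholders.
!
      PROGRAM memory_report_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: r8 = SELECTED_REAL_KIND(12,300)
      INTEGER, PARAMETER :: Ntiles = 4
      REAL(r8) :: Dmem(0:Ntiles-1), Amem(0:Ntiles-1), Bmem(0:Ntiles-1)
      REAL(r8) :: Usage
      INTEGER :: tile
!
      Dmem=(/229.42_r8, 230.56_r8, 231.97_r8, 233.14_r8/)  ! dynamic
      Amem=16.94_r8                                        ! automatic
      Bmem=9.83_r8                                         ! MPI buffers
!
      WRITE (*,10) 'tile','Dynamic','Automatic','USAGE','MPI-Buffers'
      DO tile=0,Ntiles-1
        Usage=Dmem(tile)+Amem(tile)
        WRITE (*,20) tile, Dmem(tile), Amem(tile), Usage, Bmem(tile)
      END DO
      WRITE (*,30) SUM(Dmem), SUM(Amem), SUM(Dmem+Amem), SUM(Bmem)
!
 10   FORMAT (/,5x,a4,4(2x,a12),/)
 20   FORMAT (4x,i5,4(1x,f13.2))
 30   FORMAT (/,3x,'SUM',3x,4(1x,f13.2))
      END PROGRAM memory_report_sketch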

For a three-nested grid MPI application, I get:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

 Dynamic and Automatic memory (MB) usage for Grid 01:  240x104x40  tiling: 2x2

     tile          Dynamic        Automatic            USAGE      MPI-Buffers

        0           229.42            16.94           246.36             9.83
        1           230.56            16.94           247.51             9.83
        2           231.97            16.94           248.92             9.83
        3           233.14            16.94           250.08             9.83

      SUM           925.09            67.78           992.87            39.32

 Dynamic and Automatic memory (MB) usage for Grid 02:  204x216x40  tiling: 2x2

     tile          Dynamic        Automatic            USAGE      MPI-Buffers

        0           382.54            35.84           418.38            35.84
        1           380.54            35.84           416.38            35.84
        2           380.65            35.84           416.48            35.84
        3           378.66            35.84           414.50            35.84

      SUM          1522.40           143.34          1665.74           143.34

 Dynamic and Automatic memory (MB) usage for Grid 03:  276x252x40  tiling: 2x2

     tile          Dynamic        Automatic            USAGE      MPI-Buffers

        0           406.61            55.21           461.82            55.21
        1           404.30            55.21           459.50            55.21
        2           404.09            55.21           459.29            55.21
        3           401.79            55.21           456.99            55.21

      SUM          1616.79           220.83          1837.61           220.83

    TOTAL          4064.28           431.95          4496.23           403.50

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Notice that the last column reports the maximum size of the MPI buffers computed from BmemMax for each nested grid. It is the limiting factor in Grids 02 and 03, since it equals the value reported in the Automatic column.

We will research third-party memory-profiling software to see how accurate the reported memory estimates are.

#786 jcwarner Fixed Reading forcing data with DT < 1 second
Description

I am using a NetCDF forcing data file with a baroclinic DT of 0.2 sec to drive a lab test case, but ROMS does not interpolate the data correctly because in set_ngfld (and likewise in set_2dfld and set_3dfld) we have:

      fac1=ANINT(Tintrp(it2,ifield,ng)-time(ng),r8)
      fac2=ANINT(time(ng)-Tintrp(it1,ifield,ng),r8)

which rounds the time interpolation weights to the nearest whole second, losing the sub-second information.

We got it to work for baroclinic time steps smaller than one second by using:

      fac1=ANINT((Tintrp(it2,ifield,ng)-time(ng))*SecScale,r8)
      fac2=ANINT((time(ng)-Tintrp(it1,ifield,ng))*SecScale,r8)

where SecScale=1000; that is, the time interpolation weights are rounded to the nearest millisecond instead.

The following statements at full precision did not work:

      fac1=Tintrp(it2,ifield,ng)-time(ng)
      fac2=time(ng)-Tintrp(it1,ifield,ng)

because roundoff can leave fac1 with a small negative value, and the interpolation is then stopped by

      ELSE IF (((fac1*fac2).ge.0.0_r8).and.(fac1+fac2).gt.0.0_r8) THEN
...

which indicates unbounded interpolants.
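
To see the effect concretely, here is a small, self-contained sketch that evaluates the three variants above for hypothetical forcing snapshots at 100 and 160 seconds and a model time of 100.2 seconds; the variable names mirror the snippets, but the times are purely illustrative:

      PROGRAM dt_weights_sketch
      IMPLICIT NONE
      INTEGER, PARAMETER :: r8 = SELECTED_REAL_KIND(12,300)
      REAL(r8), PARAMETER :: SecScale = 1000.0_r8   ! sec -> msec
      REAL(r8) :: t1, t2, time, fac1, fac2
!
      t1=100.0_r8                     ! Tintrp(it1,ifield,ng)
      t2=160.0_r8                     ! Tintrp(it2,ifield,ng)
      time=100.2_r8                   ! current model time
!
!  Original code: rounding to the nearest second loses the 0.2 sec
!  step, so fac2 stays zero and the interpolant does not advance.
!
      fac1=ANINT(t2-time,r8)
      fac2=ANINT(time-t1,r8)
      PRINT '(a,2f12.4)', ' nearest second:      ', fac1, fac2
!
!  Updated code: rounding to the nearest millisecond keeps the
!  sub-second weights; only the normalized weights fac1/(fac1+fac2)
!  and fac2/(fac1+fac2) enter the interpolation, so the common scale
!  factor cancels.
!
      fac1=ANINT((t2-time)*SecScale,r8)
      fac2=ANINT((time-t1)*SecScale,r8)
      PRINT '(a,2f12.4)', ' nearest millisecond: ', fac1, fac2
      PRINT '(a,f12.8)',  ' fac2/(fac1+fac2)   = ', fac2/(fac1+fac2)
!
!  Full precision: roundoff can leave fac1 slightly negative when the
!  model time is a hair past t2, failing the bounded-interpolant
!  check; the millisecond rounding above folds it back to zero.
!
      time=t2+t2*EPSILON(t2)
      fac1=t2-time
      fac2=time-t1
      PRINT '(a,2es12.3)', ' full precision:      ', fac1, fac2
      END PROGRAM dt_weights_sketch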


WARNING:

Notice that we will no longer get solutions identical to those from previous versions, due to very small differences in the time-interpolated fields. This does not matter much because the differences are on the order of roundoff. However, users need to be aware of this fact.

#787 arango Done Removed checking of mass fluxes conservation in nested refinement
Description

The checking of mass fluxes between coarse and fine grids for volume conservation during refinement is no longer done by default, in order to accelerate computations. The C-preprocessing option NESTING_DEBUG must now be activated to enable the reporting of such diagnostics. The report, written to file fort.300, is very verbose and is only needed for debugging purposes.
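
Conceptually, the switch follows the usual CPP-guard pattern, as in the hypothetical fragment below; the quantities are placeholders rather than the actual nesting mass-flux checks, and the file must be passed through the C preprocessor (e.g., as a .F file compiled with NESTING_DEBUG defined) for the WRITE to unit 300 to be included:

!
!  Hypothetical illustration of the compile-time switch: the verbose
!  conservation report goes to Fortran unit 300 (file fort.300) and is
!  compiled in only when NESTING_DEBUG is defined.
!
      PROGRAM nesting_debug_sketch
      IMPLICIT NONE
      REAL :: Fcoarse, Ffine
!
      Fcoarse=1.0                     ! placeholder coarse-grid flux
      Ffine=1.0                       ! placeholder sum over fine cells
!
#ifdef NESTING_DEBUG
      WRITE (300,10) Fcoarse, Ffine, Fcoarse-Ffine
 10   FORMAT (' coarse, fine, difference = ',3(1x,1pe15.7))
#endif
      END PROGRAM nesting_debug_sketch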

Many thanks to John Warner for bringing it to my attention.
