Sorry if this is "obvious" stuff, but I've trawled the forums and am having a hard time putting the pieces together. If someone can answer a few things more or less clearly, it would help me tremendously.
Basic context: I'm a sysadmin setting up a cluster to run ROMS 2.2 (possibly 3.0 at a later time). The cluster is two-way dual-core Opteron nodes with a gigabit Ethernet interconnect, on the Rocks (RedHat/CentOS) platform, using the PGI compilers to build (Intel/ifort seemed too hellish for MPI after first attempts).
I've got a "successful" (i.e., it works but performance stinks) build of ROMS, thus:
- PGI compiler suite (latest and greatest version)
- MPICH for MPI, default config/install, compiled with PGI "by hand"
"performance stinks" means that as we add more CPUs the overall runtime *increases*. Thus, a 4-cpu job (single node) takes 30minutes with a test data set; then the same data set on 8-CPU run takes approx 60 minutes, and then 16 CPU it takes about 80-90minutes. In all cases they run as straight MPI-only job though, launched in identical manner. (Brief review of output suggests that "Halo exchange" is punishing us with the MPI scale-up? and also that 2d analysis phase in particular is suffering .. ? but alas I'm not really familiar with this, being a "sysadmin-type", not a "modeller-type")
I'm curious:
* What is the recommended MPI for best performance with ROMS 2.2? With ROMS 3.0? (One posting I've seen suggests that LAM is much better than MPICH; but it seems LAM has now been replaced by OpenMPI, and I see no mention of that anywhere. I also gather PGI may have an integrated "tuned" MPI of some kind available, not just an MPICH rebuild?)
* Is there any option (now or in the future) for "hybrid" builds, i.e., OpenMP for SMP operation within a single cluster node, but the MPI job spanning multiple nodes in the cluster? I've seen on-and-off discussion of this topic in the forum but haven't exactly seen a clear consensus.
* If anyone feels so inclined, pointers or specific build hints for a recommended, known-working MPI/PGI setup would certainly be **very** welcome. For that matter, comments on the possible benefits of migrating from ROMS 2.2 to 3.x would also not be unwelcome
:-)
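For concreteness, on the hybrid question, here is the shape of the launch I have in mind. This is a generic MPI+OpenMP invocation, not something I've confirmed ROMS supports; the binary name, input file, hostfile, and core counts are placeholders, and the `mpirun` flags are in OpenMPI style (MPICH spells them differently):

```shell
# Hypothetical hybrid launch: one MPI rank per node, OpenMP threads
# filling the cores within each node (2 sockets x 2 cores = 4 threads).
export OMP_NUM_THREADS=4
mpirun -np 4 --hostfile ./nodes ./oceanM ocean_test.in
```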
(I've tried, for example, to build my ROMS not just with MPICH, but also with LAM and OpenMPI. The OpenMPI build attempts have simply failed so far with odd link/library issues (?). My attempt to build with LAM has been semi-successful, in that I believe I now have a compiled binary that launches, but I'm not certain it actually works; more testing is needed on calling/launching it properly, and it's a slight hassle since the cluster uses passwordless SSH rather than RSH, which is LAM's default. Ugh?)
If anyone actually recommends it, I can also do builds using ifort, not just PGI, but I gather that ifort is a bit messy for building MPI ROMS (?)
And, of course, I will summarize any findings/progress on this topic and post back to this thread, in case it is of use or interest to others.
Many thanks,
--Tim Chipman