W10 NH Dycore with SL transport Performance Phase 1

This page should describe the performance assessment tests performed for this standalone feature and should provide links to all the result pages.

Date last modified:  

Contributors: Andrew Bradley, Oksana Guba

Summary

The NH dycore with SL transport is significantly faster than with Euler transport at all node counts and at resolutions ranging from 100 km to 13 km.

Performance Test 1: Standalone dycore


SL performance was measured at full scale on Edison and Cori-KNL. Results are documented in these slides. The first figure below is from those slides. The second figure shows SL performance when coupled to the preqx, theta-l hydrostatic, and theta-l nonhydrostatic dycores/modes. The first figure uses 40 tracers; the second uses only 10.


Performance Test 2: theta + SL in an FC compset


Performance for coupled runs from /wiki/spaces/COM/pages/941096965 is evaluated from timers for FC simulations on 75 Anvil nodes (2700 MPI ranks, 2 HOMME elements per rank). We compare two runs that both use the theta dycore: one with Euler transport, the other with SL transport.
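As a sanity check on the layout quoted above, the following minimal sketch works out the ranks per node and the total element count. It assumes that every rank holds exactly 2 elements and that the grid is a cubed sphere with 6 * ne^2 elements; the function name `cubed_sphere_ne` and the inference of `ne` are illustrative and not taken from the run scripts.

```python
import math


def cubed_sphere_ne(n_elements: int) -> int:
    """Return ne such that 6 * ne**2 == n_elements (cubed-sphere assumption)."""
    ne = math.isqrt(n_elements // 6)
    if 6 * ne * ne != n_elements:
        raise ValueError(f"{n_elements} is not of the form 6 * ne**2")
    return ne


# Values quoted in the text.
nodes = 75
mpi_ranks = 2700
elements_per_rank = 2  # assumed uniform across ranks

ranks_per_node = mpi_ranks // nodes
total_elements = mpi_ranks * elements_per_rank
ne = cubed_sphere_ne(total_elements)  # inferred, not stated in the source

print(f"ranks per node : {ranks_per_node}")
print(f"total elements : {total_elements}")
print(f"inferred ne    : {ne}")
```

Under these assumptions the layout is 36 ranks per node and 5400 elements, which corresponds to ne=30, the usual cubed-sphere grid for a 1-degree run.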

Run with theta dycore and Euler tracers, default settings for divergence damping: cam_run3 (total homme time)=12613, dynamics in homme=2897, adv+remap in homme=8951

Run 20, as in the page above, with theta dycore and SL transport, almost optimal configuration for divergence damping: cam_run3 (total homme time)=3654, dynamics in homme=1887, adv+remap in homme=1618

Overall, the total saving in HOMME time is more than a factor of 3 for the 1-degree run. (Once we have the most tuned configuration, we will post its timers here and possibly compare them with a default preqx run.)
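The per-timer ratios behind this statement can be reproduced with a short sketch. The numbers are the timer values quoted above; the dictionary keys and the helper `speedup` are illustrative names, and the timer units are not stated on this page, so only ratios are computed.

```python
# Timer values as quoted on this page (units not stated in the source).
euler = {"cam_run3": 12613, "dynamics": 2897, "adv+remap": 8951}
sl = {"cam_run3": 3654, "dynamics": 1887, "adv+remap": 1618}


def speedup(baseline: dict, candidate: dict) -> dict:
    """Return baseline/candidate for every timer present in baseline."""
    return {name: baseline[name] / candidate[name] for name in baseline}


ratios = speedup(euler, sl)

for name, ratio in ratios.items():
    print(f"{name:10s} {euler[name]:6d} -> {sl[name]:6d}   x{ratio:.2f}")
```

With these values the total HOMME time (cam_run3) improves by about 3.45x, dynamics by about 1.54x, and advection plus remap by about 5.53x. Part of the dynamics difference may come from the different divergence damping settings in the two runs rather than from the transport scheme.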