L7_UQ_efficiency_improvements Performance assessment phase 1
Summary
These tests compare the performance of offline land simulations run with the coupler bypass option against standard offline land simulations driven by DATM.
Performance test 1 - Global test
For the standard simulations with DATM:
Compset: I1850CLM45CN, Resolution: f09_f09, Machine: Titan, simulation length: 3 years
For the coupler bypass simulation:
Compset: I1850CLM45CBCN, Resolution: f09_f09, Machine: Titan, simulation length: 3 years
For simulations using 512 cores, the coupler bypass reduces the total run time by about 33% (3960 s vs. 5876 s), primarily by eliminating the time spent in DATM. The coupler bypass option also improves the scalability of offline runs significantly. With 2048 cores, the wall time of the standard offline simulation using DATM is essentially unchanged (5830 s): the land model run time falls from 4116 s to 2108 s, but the time spent in DATM rises from 1641 s to 3535 s. The coupler bypass simulation with 2048 cores, by contrast, is about 3x faster than coupler bypass with 512 cores and about 4.4x faster than DATM with either 512 or 2048 cores.
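The ratios quoted above can be checked directly against the TOT Run Time values in the Performance Test 1 results. The following is a minimal sketch; the dictionary layout and the helper names `speedup` and `fraction_saved` are illustrative choices, not part of any ACME tooling.

```python
# TOT Run Time in seconds, copied from the Performance Test 1 results table.
# Keys are (configuration, core count); the key names are illustrative only.
tot_seconds = {
    ("cpl_bypass", 512): 3960.409,
    ("datm", 512): 5875.978,
    ("cpl_bypass", 2048): 1334.271,
    ("datm", 2048): 5829.534,
}


def speedup(baseline, candidate):
    """Baseline wall time divided by candidate wall time (>1: candidate is faster)."""
    return tot_seconds[baseline] / tot_seconds[candidate]


def fraction_saved(baseline, candidate):
    """Fraction of the baseline wall time that the candidate avoids."""
    return 1.0 - tot_seconds[candidate] / tot_seconds[baseline]


comparisons = [
    (("datm", 512), ("cpl_bypass", 512)),
    (("cpl_bypass", 512), ("cpl_bypass", 2048)),
    (("datm", 512), ("cpl_bypass", 2048)),
    (("datm", 2048), ("cpl_bypass", 2048)),
    (("datm", 512), ("datm", 2048)),
]
for baseline, candidate in comparisons:
    print(
        f"{candidate} vs {baseline}: "
        f"{speedup(baseline, candidate):.2f}x faster, "
        f"{100 * fraction_saved(baseline, candidate):.1f}% less wall time"
    )
```

Reporting both the ratio and the fraction of time saved avoids ambiguity, since a 33% reduction in wall time corresponds to roughly a 1.5x speedup.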
Performance test 2 - Single point test
For the standard simulations with DATM:
Compset: I1850CLM45CN, Resolution: CLM1PT, Machine: OICphase2, simulation length: 100 years
For the coupler bypass simulation:
Compset: I1850CLM45CBCN, Resolution: CLM1PT, Machine: OICphase2, simulation length: 100 years
The coupler bypass simulation is about 2.4x faster than the standard DATM simulation on a single processor (CPL:RUN_LOOP of 924 s vs. 2202 s). More detailed performance data are available below. Even excluding DATM, the coupler bypass option provides a significant performance benefit within the land model itself (clm_run of 698 s vs. 1357 s). This is because the coupler bypass option also avoids the use of stream files for nitrogen deposition and fire input data, which account for nearly half of the run time within the land model loop in the standard simulation. Note that these simulations used a 1-hour timestep and wrote annual output for a subset of about 30 output variables.
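The split of the single-point savings between DATM and the land model can be worked out from three timers in the Performance Test 2 results. This is a rough sketch that assumes the timer values are wall-clock seconds and that CPL:ATM_RUN and clm_run are non-overlapping parts of CPL:RUN_LOOP; the variable names are illustrative.

```python
# Timer values copied from the Performance Test 2 results table.
run_loop = {"cpl_bypass": 923.757, "datm": 2202.374}  # CPL:RUN_LOOP
clm_run = {"cpl_bypass": 697.983, "datm": 1357.342}   # clm_run
atm_run_datm = 589.837                                 # CPL:ATM_RUN (absent with bypass)

# Overall speedup of the run loop.
loop_speedup = run_loop["datm"] / run_loop["cpl_bypass"]

# Time saved overall and inside the land model.
saved_total = run_loop["datm"] - run_loop["cpl_bypass"]
saved_in_land = clm_run["datm"] - clm_run["cpl_bypass"]

# Share of the standard run's land model time that the bypass removes.
land_fraction_removed = saved_in_land / clm_run["datm"]

# How the total saving divides between dropping DATM and the faster land model.
share_from_datm = atm_run_datm / saved_total
share_from_land = saved_in_land / saved_total

print(f"run loop speedup:            {loop_speedup:.2f}x")
print(f"land model time removed:     {100 * land_fraction_removed:.1f}%")
print(f"saving attributable to DATM: {100 * share_from_datm:.1f}%")
print(f"saving attributable to land: {100 * share_from_land:.1f}%")
```

The two shares do not sum to exactly 100% because smaller coupler timers (for example CPL:RUN and CPL:COMM) also differ between the two runs.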
Performance Test 1
Performance Test 1: Coupler bypass: Global test on Titan
Date last modified:
Contributors: Daniel Ricciuto
Provenance:
ACME git hash: 4532ebff59a9ba93332e4c024e0728918e5a84d4
Results:
Case directory location on Titan:
/lustre/atlas/proj-shared/cli112/zdr/models/ACME/cime/scripts/TEST_GLOBPERF/ (standard run with DATM)
/lustre/atlas/proj-shared/cli112/zdr/models/ACME/cime/scripts/TEST_GLOBPERF_CB/ (coupler bypass run)
512 cores
Timer | CPL BYPASS (seconds) | CPL BYPASS (seconds/mday) | CPL BYPASS (myears/wday) | DATM (seconds) | DATM (seconds/mday) | DATM (myears/wday) |
TOT Run Time | 3960.409 | 3.617 | 65.45 | 5875.978 | 5.366 | 44.11 |
LND Run Time | 3787.495 | 3.459 | 68.44 | 4115.522 | 3.758 | 62.98 |
ROF Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
ICE Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
ATM Run Time | 0 | 0 | 0 | 1640.854 | 1.498 | 157.97 |
OCN Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
GLC Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
WAV Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
CPL Run Time | 4.985 | 0.005 | 51995.99 | 85.33 | 0.078 | 3037.62 |
CPL COMM Time | 1207.525 | 1.103 | 214.65 | 1243.335 | 1.135 | 208.47 |
2048 cores
Timer | CPL BYPASS (seconds) | CPL BYPASS (seconds/mday) | CPL BYPASS (myears/wday) | DATM (seconds) | DATM (seconds/mday) | DATM (myears/wday) |
TOT Run Time | 1334.271 | 1.219 | 194.26 | 5829.534 | 5.324 | 44.46 |
LND Run Time | 1121.122 | 1.024 | 231.2 | 2108.362 | 1.925 | 122.94 |
ROF Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
ICE Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
ATM Run Time | 0 | 0 | 0 | 3534.852 | 3.228 | 73.33 |
OCN Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
GLC Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
WAV Run Time | 0 | 0 | 0 | 0 | 0 | 0 |
CPL Run Time | 2.491 | 0.002 | 104054.62 | 74.085 | 0.068 | 3498.68 |
CPL COMM Time | 581.574 | 0.531 | 445.69 | 1020.487 | 0.932 | 254 |
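The throughput columns in these results follow from the run time in seconds and the 3-year simulation length, and the same run times give a parallel efficiency for the step from 512 to 2048 cores. The sketch below assumes a 365-day (no-leap) model calendar, which is consistent with the reported seconds/mday values; the function names are illustrative.

```python
SIM_YEARS = 3                  # simulation length used in Performance Test 1
DAYS_PER_YEAR = 365            # assumption: no-leap model calendar
SECONDS_PER_WALL_DAY = 86400


def seconds_per_model_day(run_seconds):
    """Wall-clock seconds needed per simulated day (seconds/mday)."""
    return run_seconds / (SIM_YEARS * DAYS_PER_YEAR)


def model_years_per_wall_day(run_seconds):
    """Simulated years completed per wall-clock day (myears/wday)."""
    return SIM_YEARS * SECONDS_PER_WALL_DAY / run_seconds


def parallel_efficiency(t_small, t_large, cores_small=512, cores_large=2048):
    """Observed speedup divided by the ideal speedup from adding cores."""
    return (t_small / t_large) / (cores_large / cores_small)


# Reproduce the coupler bypass TOT row at 512 cores.
print(f"{seconds_per_model_day(3960.409):.3f} seconds/mday")
print(f"{model_years_per_wall_day(3960.409):.2f} myears/wday")

# Scaling from 512 to 2048 cores, using run times from the results.
print(f"CPL BYPASS TOT efficiency: {parallel_efficiency(3960.409, 1334.271):.2f}")
print(f"CPL BYPASS LND efficiency: {parallel_efficiency(3787.495, 1121.122):.2f}")
print(f"DATM TOT efficiency:       {parallel_efficiency(5875.978, 5829.534):.2f}")
print(f"DATM LND efficiency:       {parallel_efficiency(4115.522, 2108.362):.2f}")
```

An efficiency of 1.0 would mean a 4x speedup from 4x the cores; a value near 0.25 means the extra cores gave essentially no benefit.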
Performance Test 2
Performance Test 2: Coupler bypass: Single-point test on OIC
Date last modified:
Contributors: Daniel Ricciuto
Provenance:
ACME git hash: 4532ebff59a9ba93332e4c024e0728918e5a84d4
Results:
Location on OIC:
/home/zdr/models/ACME/cime/scripts/FULL_US-UMB_I1850CLM45CBCN_ad_spinup (coupler bypass)
/home/zdr/models/ACME/cime/scripts/FULL_US-UMB_I1850CLM45CN_ad_spinup (with DATM)
Timer name | CPL BYPASS (seconds) | With DATM (seconds) |
CPL:INIT | 0.147 | 0.725 |
CPL:cesm_pre_init2 | 0.007 | 0.007 |
cesm_init | 0.14 | 0.717 |
CPL:RUN_LOOP_BSTART | 0 | 0 |
CPL:RUN_LOOP | 923.757 | 2202.374 |
CPL:CLOCK_ADVANCE | 80.015 | 76.334 |
CPL:RUN | 6.42 | 24.262 |
CPL:LNDPREP | 2.307 | 17.871 |
CPL:COMM | 37.527 | 55.808 |
CPL:C2L | 18.464 | 17.894 |
CPL:LND_RUN | 792.31 | 1444.823 |
lc_lnd_import | 3.472 | 0.623 |
clm_run | 697.983 | 1357.342 |
shr_orb_decl | 1.917 | 2.114 |
decomp_vert | 11.695 | 11.645 |
begcnbal | 14.599 | 13.727 |
dyn_subgrid | 57.589 | 57.35 |
begwbal | 0.642 | 0.613 |
ndep_interp | N/A | 507.868 |
ndepdyn_strd_adv_readLBUB | N/A | 2.836 |
ndepdyn_readLBUB_setup | N/A | 0.415 |
ndepdyn_readLBUB_filemgt | N/A | 0.309 |
ndepdyn_strd_adv_tint | N/A | 124.211 |
hdmdyn_strd_adv_readLBUB | N/A | 2.28 |
hdmdyn_readLBUB_setup | N/A | 0.389 |
hdmdyn_readLBUB_filemgt | N/A | 0.249 |
hdmdyn_strd_adv_tint | N/A | 63.261 |
lnfmdyn_strd_adv_readLBUB | N/A | 296.355 |
lnfmdyn_readLBUB_setup | N/A | 0.364 |
lnfmdyn_readLBUB_filemgt | N/A | 0.414 |
lnfmdyn_strd_adv_tint | N/A | 3.673 |
pdnep_interp | N/A | 0 |
pdepdyn_strd_adv_readLBUB | N/A | 2.356 |
pdepdyn_readLBUB_setup | N/A | 0.366 |
pdepdyn_readLBUB_filemgt | N/A | 0.294 |
pdepdyn_strd_adv_tint | N/A | 122.987 |
drvinit | 2.336 | 2.191 |
canhydro | 2.901 | 2.686 |
surfrad | 5.343 | 5.535 |
bgp1 | 2.054 | 2.172 |
bgflux | 2.462 | 2.555 |
canflux | 60.01 | 60.758 |
can_iter | 45.261 | 45.74 |
uflux | 6.44 | 6.896 |
bgplake | 27.759 | 28.612 |
bgc | 15.809 | 15.817 |
soiltemperature | 37.177 | 38.237 |
SoilThermProp | 4.529 | 4.635 |
SoilTempBandDiag | 6.312 | 6.65 |
PhaseChangeH2osfc | 0.414 | 0.421 |
PhaseChangebeta | 2.305 | 2.307 |
bgp2 | 3.408 | 3.465 |
bgp2_loop_1 | 0.388 | 0.384 |
bgp2_loop_2 | 0.301 | 0.314 |
bgp2_loop_3 | 0.499 | 0.496 |
bgp2_loop_4 | 0.372 | 0.336 |
patch2col | 1.869 | 1.877 |
hydro without drainage | 32.523 | 33.459 |
hylake | 7.422 | 8.481 |
snow_init | 0.326 | 0.317 |
ecosysdyn | 200.648 | 203.608 |
CNZero | 25.435 | 25.542 |
CNDeposition | 0.15 | 0.171 |
CNFixation | 2.171 | 2.161 |
CNMResp | 2.754 | 2.733 |
PDeposition | 0.136 | 0.135 |
CNAllocation | 3.474 | 3.597 |
CNDecompAlloc | 53.611 | 54.663 |
CNAllocation | 9.272 | 9.74 |
CNAllocation | 2.895 | 3.04 |
CNPhenology | 8.134 | 8.341 |
CNGResp | 0.468 | 0.455 |
CNRootDyn | 5.865 | 5.513 |
CNUpdate0 | 0.204 | 0.184 |
CNUpdate1 | 9.879 | 10.134 |
CNSoilLittVertTransp | 43.635 | 43.744 |
CNGapMortality | 6.031 | 6.137 |
CNUpdate2 | 25.285 | 25.855 |
depvel | 0.962 | 0.961 |
ch4 | 53.6 | 54.427 |
hydro2 | 4.039 | 4.195 |
PWeathering | 1.638 | 1.629 |
PAdsorption | 1.993 | 2.04 |
PDesorption | 1.868 | 1.83 |
POcclusion | 1.861 | 1.818 |
PBiochemMin | 12.121 | 11.668 |
CNUpdate3 | 1.567 | 1.625 |
PUpdate3 | 4.335 | 4.461 |
CNPsum | 25.225 | 26.307 |
balchk | 1.98 | 2.203 |
lnd2atm | 12.901 | 12.685 |
wrtdiag | 0.107 | 10.195 |
hbuf | 29.413 | 29.942 |
clm_drv_io | 7.976 | 8.557 |
clm_drv_io_htapes | 7.266 | 7.892 |
hist_htapes_wrapup_define | 3.722 | 4.161 |
hist_htapes_wrapup_tconst | 0.085 | 0.056 |
hist_htapes_wrapup_write | 0.173 | 0.185 |
lc_lnd_export | N/A | 0.393 |
lc_clm2_adv_timestep | 0.515 | 0.434 |
accum | 15.8 | 16.486 |
CPL:L2C | 18.402 | 18.329 |
CPL:LNDPOST | 0.206 | 0.27 |
CPL:FRACSET | 1.772 | 1.895 |
CPL:ATM_RUN | N/A | 589.837 |
DATM_RUN | N/A | 554.634 |
datm_run1 | N/A | 9.802 |
datm | N/A | 533.357 |
datm_strdata_advance | N/A | 470.285 |
datm_strd_adv_readLBUB | N/A | 337.459 |
datm_readLBUB_setup | N/A | 0.854 |
datm_readLBUB_filemgt | N/A | 0.778 |
datm_strd_adv_tint | N/A | 126.035 |
datm_scatter | N/A | 53.158 |
datm_mode | N/A | 0.72 |
datm_run2 | N/A | 10.739 |
CPL:A2C | N/A | 18.6 |
CPL:ATMPOST | N/A | 0.25 |
CPL:HISTORY | 0.225 | 0.242 |
CPL:TSTAMP_WRITE | 0.359 | 1.557 |
CPL:TPROF_WRITE | 0.376 | 0.544 |
lnfmdyn_readLBUB_fbound | N/A | 4.668 |
lnfmdyn_readLBUB_bcast | N/A | 0.181 |
lnfmdyn_readLBUB_LB_copy | N/A | 20.906 |
lnfmdyn_readLBUB_UB_setup | N/A | 8.125 |
lnfmdyn_readLBUB_UB_readpio | N/A | 254.055 |
lnfmdyn_strd_adv_map | N/A | 4.98 |
cnbalchk | 1.264 | 1.318 |
surfalb | 15.773 | 16.172 |
urbsurfalb | 11.176 | 11.566 |
datm_readLBUB_fbound | N/A | 4.572 |
datm_readLBUB_bcast | N/A | 0.309 |
datm_readLBUB_LB_setup | N/A | 13.269 |
datm_readLBUB_LB_readpio | N/A | 144.641 |
datm_readLBUB_UB_setup | N/A | 12.614 |
datm_readLBUB_UB_readpio | N/A | 141.815 |
datm_strd_adv_rearr | N/A | 1.858 |
datm_readLBUB_LB_copy | N/A | 0.011 |
datm_strd_adv_fill | N/A | 0.081 |
datm_strd_adv_map | N/A | 0 |
hdmdyn_readLBUB_fbound | N/A | 0 |
hdmdyn_readLBUB_bcast | N/A | 0 |
hdmdyn_readLBUB_LB_copy | N/A | 0.001 |
hdmdyn_readLBUB_UB_setup | N/A | 0 |
hdmdyn_readLBUB_UB_readpio | N/A | 0.015 |
hdmdyn_strd_adv_map | N/A | 0 |
sync1_tprof | 0 | 0 |
t_prf | 0.329 | 0.485 |
sync2_tprof | 0 | 0 |
clm_drv_io_wrest | 0.358 | 0.289 |
datm_restart | N/A | 0.186 |
CPL:RESTART | 0.032 | 0.08 |
CPL:RUN_LOOP_BSTOP | 0 | 0 |
CPL:FINAL | 0.001 | 0.008 |
DATM_FINAL | N/A | 0 |
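To find which timers account for most of the difference between the two runs, the pipe-delimited rows above can be parsed and ranked by time saved. The following is a minimal sketch using a short excerpt of the table; the function names are hypothetical, and an `N/A` entry is treated as zero time because that timer never runs in the given configuration.

```python
# Excerpt of the timer table above: header row, then name | bypass | DATM.
TIMER_TABLE = """\
name | CPL BYPASS | with DATM |
CPL:RUN_LOOP | 923.757 | 2202.374 |
CPL:LND_RUN | 792.31 | 1444.823 |
clm_run | 697.983 | 1357.342 |
ndep_interp | N/A | 507.868 |
ecosysdyn | 200.648 | 203.608 |
wrtdiag | 0.107 | 10.195 |
CPL:ATM_RUN | N/A | 589.837 |
"""


def to_seconds(cell):
    """Convert a table cell to float, mapping 'N/A' to None."""
    return None if cell == "N/A" else float(cell)


def parse_timers(text):
    """Return (name, bypass_seconds, datm_seconds) tuples, skipping the header."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        cells = [c.strip() for c in line.split("|")]
        # Drop the empty cell produced by the trailing pipe.
        name, bypass, datm = [c for c in cells if c]
        rows.append((name, to_seconds(bypass), to_seconds(datm)))
    return rows


def time_saved(row):
    """DATM time minus bypass time; a missing timer counts as zero."""
    _, bypass, datm = row
    return (datm or 0.0) - (bypass or 0.0)


timers = parse_timers(TIMER_TABLE)
ranked = sorted(timers, key=time_saved, reverse=True)
for row in ranked:
    print(f"{row[0]:14s} saved {time_saved(row):9.3f}")
```

The timers are nested (for example, clm_run is part of CPL:LND_RUN, which is part of CPL:RUN_LOOP), so the savings listed for different rows overlap and should not be summed.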