During V2 development, we transitioned from CLUBB V1 to CLUBB V2. In CLUBB V2, a process previously controlled by a single parameter, c_k10, was modified to allow finer control via c_k10 and c_k10h. The old behavior is recovered when c_k10h = c_k10 = 0.35. Because c_k10h was a new parameter, it was not set in the namelist during V2 development and inherited a default value of c_k10h = 1.0. This caused an unexpected degradation of the simulated climate that went undetected for several months. For more detailed background, see V2 Case Studies.
Can E3SM’s NBFB tests detect this difference? And how sensitive are they to changes in c_k10h?
Additional test (harder to perform, since CLUBB V1 only exists in an older code base): Will the NBFB tests consider simulations with CLUBB V1 (c_k10=0.35) statistically similar to CLUBB V2 (c_k10h=c_k10=0.35)?
Tests were run on Compy. All tests are first run with “-g” to generate baselines from E3SM master as of 2021/11/8, then rerun with “-c” (compare to baseline) using various values of c_k10h.
RESULTS: All tests pass with roundoff-level changes to c_k10h and fail (detect statistically different results) with small changes in c_k10h. With the default thresholds, PGN is the most sensitive (detecting a statistical difference from a 1e-8 change), followed by TSC (1e-3) and MVK (2e-2). This ordering correlates with the timescale each test examines: PGN looks at physics columns after 1 timestep, TSC at time step convergence over 300 timesteps, and MVK at 1-year climatologies.
MVK_P24x1.ne4_oQU240.F2010-CICE
30 member ensemble of ~ 1 year simulations. Takes about 1.3 hours on 18 nodes.
c_k10h | Test Result | TestStatus.log metrics (rejection threshold = 13) |
---|---|---|
0.35 (default) | PASS | |
0.36 | PASS | reject 7/121 |
0.38 | FAIL | reject 21/121 |
0.40 | FAIL | reject 50/121 |
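The MVK extension in evv4esm compares the baseline and test ensembles with per-variable Kolmogorov-Smirnov tests and fails when the rejection count exceeds the threshold (13 of 121 variables here). A minimal stand-alone sketch of that decision logic — the K-S critical-value form, ensemble sizes, and synthetic data below are illustrative, not evv4esm's exact implementation:

```python
import math
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = d = 0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] == x:   # advance past ties in both samples
            i += 1
        while j < m and b[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_reject(a, b, coeff=1.358):
    """Reject at ~alpha=0.05 using the large-sample critical value."""
    n, m = len(a), len(b)
    crit = coeff * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > crit

# Hypothetical stand-in for 121 per-variable summaries from 30-member ensembles
random.seed(0)
threshold = 13
rejections = 0
for var in range(121):
    base = [random.gauss(0.0, 1.0) for _ in range(30)]
    test = [random.gauss(0.05, 1.0) for _ in range(30)]  # small shift, e.g. from c_k10h
    if ks_reject(base, test):
        rejections += 1
print("reject %d/121 -> %s" % (rejections, "FAIL" if rejections > threshold else "PASS"))
```

With only a tiny mean shift and 30 members per ensemble, most variables are not individually rejected, which is why MVK needs a relatively large parameter change before the count crosses the threshold.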
Hack to reuse the same “-c” case to run multiple experiments:
rm -f run/*.nc (otherwise we get PIO runtime errors)
add “clubb_c_k10h=0.40” to the user_nl_eam_???? files
./case.submit
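The steps above can be collected into a small script. CASEDIR and the 0.40 value are placeholders (the compare-case directory and whichever value is being tested); the guards make the script safe to dry-run outside a real case directory:

```shell
#!/bin/sh
# Re-drive an existing "-c" case with a new clubb_c_k10h value.
CASEDIR=${CASEDIR:-.}     # placeholder: path to the compare case
cd "$CASEDIR" || exit 1

rm -f run/*.nc            # stale output otherwise triggers PIO runtime errors

# Append the new value to every ensemble member's namelist.
for nl in user_nl_eam_????; do
    [ -e "$nl" ] || continue
    echo "clubb_c_k10h = 0.40" >> "$nl"
done

if [ -x ./case.submit ]; then
    ./case.submit
fi
```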
PGN_P32x1.ne4_oQU240.F2010
20 member ensemble of ~ 1 timestep simulations. Takes about 1 min on 16 nodes.
c_k10h | Test Result | TestStatus.log metrics: T test (t, p) |
---|---|---|
0.35 (default) | PASS | (0.000, 1.000) |
0.350000001d0 | PASS | (-1.424, 0.169) |
0.35000001 | FAIL | (-2.542, 0.019) |
0.36 | FAIL | (-12.564, 0.000) |
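The (t, p) metrics above come from a t-test on per-member scores. A minimal sketch of how such a statistic flags a shift — this uses Welch's two-sample form with a normal approximation for the two-sided p-value, and the per-member scores are hypothetical, not PGN's actual composite metric:

```python
import math
from statistics import NormalDist, mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic; normal approximation for the
    two-sided p-value (adequate for illustration at these sample sizes)."""
    na, nb = len(a), len(b)
    se = math.sqrt(variance(a) / na + variance(b) / nb)
    t = (mean(a) - mean(b)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical per-member RMSE-like scores: baseline vs. perturbed ensembles
base = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00, 1.02, 0.98]
pert = [1.10, 1.12, 1.08, 1.11, 1.09, 1.13, 1.07, 1.10, 1.12, 1.08]
t, p = welch_t(base, pert)
print("t = %.3f, p = %.3f -> %s" % (t, p, "FAIL" if p < 0.05 else "PASS"))
```

A consistent shift across members drives |t| up and p toward zero, matching the pattern in the table where larger c_k10h changes produce larger |t| and smaller p.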
Hack to reuse the same “-c” case to run multiple experiments:
rm -f run/*.nc (otherwise we get PIO runtime errors)
add “clubb_c_k10h=0.40” to the user_nl_eam_???? files
./case.submit
TSC_P36x1.ne4_ne4.F2010-CICE
12 member ensemble of 5 day simulations. Takes about 10min on 11 nodes.
c_k10h | Test Result (possible bug in scripts? fails for all values except when results are bfb) | Alternative test result: PASS = all values in P_min plot above PASS threshold | TestStatus.log metrics: region-by-region results (Global, Land, Ocean) |
---|---|---|---|
0.35 (default) | PASS | PASS | PASS, PASS, PASS |
0.350001 | FAIL | PASS | PASS, PASS, PASS |
0.35001 | FAIL | PASS | FAIL, PASS, PASS pmin plot |
0.3501 | FAIL | PASS | PASS, PASS, PASS pmin plot |
0.351 | FAIL | FAIL | PASS, FAIL, PASS pmin plot |
0.36 | FAIL | FAIL | FAIL, FAIL, FAIL |
The hack used in the MVK and PGN tests to reuse a “-c” case for multiple experiments does not work for TSC: during the RUN phase, the user_nl_eam_???? ensemble-member namelists are recreated by cime/scripts/lib/CIME/SystemTests/tsc.py. That script can be edited to append the contents of user_nl_eam to each user_nl_eam_???? file, after which parameters can be set in user_nl_eam.
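The append logic one might add to tsc.py could look like the sketch below. The function name, return value, and demo paths are assumptions for illustration; the exact hook point inside tsc.py's RUN-phase namelist generation is not shown here:

```python
import glob
import os
import tempfile

def append_common_namelist(case_dir, common="user_nl_eam"):
    """Append the contents of a shared user_nl_eam to each freshly
    regenerated user_nl_eam_???? member file, so hand-set parameters
    (e.g. clubb_c_k10h) survive the namelist regeneration."""
    common_path = os.path.join(case_dir, common)
    if not os.path.exists(common_path):
        return []
    with open(common_path) as f:
        extra = f.read()
    members = sorted(glob.glob(os.path.join(case_dir, common + "_????")))
    for member_nl in members:
        with open(member_nl, "a") as f:
            f.write("\n" + extra)
    return members

# Demo in a throwaway directory standing in for the case root
d = tempfile.mkdtemp()
with open(os.path.join(d, "user_nl_eam"), "w") as f:
    f.write("clubb_c_k10h = 0.40\n")
with open(os.path.join(d, "user_nl_eam_0001"), "w") as f:
    f.write("! member 1\n")
touched = append_common_namelist(d)
print(touched)
```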
link to PASS/FAIL post processing script:
https://github.com/LIVVkit/evv4esm/blob/master/evv4esm/extensions/tsc.py