This page should describe the Performance Assessment Tests performed for this stand-alone feature and should provide links to all the result pages.

Summary

Short summary of what was done and what was the result.

Performance Test 1

To see how close physgrid gets to the ideal speedup of 9/4 in the
physics computations (np4 has about 9 unique physics columns per
element, pg2 has 4), I did a 1-month run with -pecount S on cori-knl,
which provides a reasonable number of columns per core. The relevant
high-level timers are as follows:
 
ne30np4
"CPL:RUN_LOOP"      693      693 1.031184e+06   2.330065e+06  3365.724 (   296      0)  3361.610 (   544      0)
"CPL:OCNT_RUN"      693      693 1.030491e+06   6.395224e+02     2.122 (     0      0)     0.901 (   657      0)
"CPL:ICE_RUN"       693      693 1.031184e+06   5.078033e+03     9.215 (   319      0)     5.647 (   666      0)
"CPL:LND_RUN"       693      693 1.031184e+06   2.091669e+04    34.030 (   282      0)    27.238 (   669      0)
"CPL:ATM_RUN"       693      693 1.031184e+06   1.965188e+06  3226.064 (   296      0)  2668.105 (   420      0)
"a:CAM_run1"        693      693 1.031877e+06   1.307641e+06  2267.392 (   296      0)  1713.941 (   377      0)
"a:CAM_run2"        693      693 1.031877e+06   2.497877e+05   382.013 (   296      0)   349.990 (   151      0)
"a:CAM_run3"        693      693 1.031877e+06   3.823804e+05   567.488 (   372      0)   526.157 (   182      0)
"a:CAM_run4"        693      693 1.031877e+06   2.388912e+04    36.367 (     0      0)    34.452 (   401      0)
"a:UniquePoints"    693      693 1.031877e+06   2.399564e+03     4.179 (   296      0)     2.732 (   562      0)
"a:putUniquePoints" 693      693 1.031877e+06   5.007935e+03     8.052 (   296      0)     6.124 (   562      0)
 
ne30pg2
"CPL:RUN_LOOP"      693      693 1.031184e+06   1.197055e+06  1727.589 (   145      0)  1727.089 (   532      0)
"CPL:OCNT_RUN"      693      693 1.030491e+06   6.345620e+02     2.469 (     0      0)     0.880 (   518      0)
"CPL:ICE_RUN"       693      693 1.031184e+06   4.448382e+03     7.500 (   523      0)     4.345 (   648      0)
"CPL:LND_RUN"       693      693 1.031184e+06   1.419479e+04    23.461 (     0      0)    18.509 (   585      0)
"CPL:ATM_RUN"       693      693 1.031184e+06   1.119652e+06  1649.384 (    26      0)  1541.688 (   448      0)
"a:CAM_run1"        693      693 1.031877e+06   5.988779e+05   901.520 (   692      0)   787.083 (   370      0)
"a:CAM_run2"        693      693 1.031877e+06   1.315582e+05   193.712 (   396      0)   185.811 (   676      0)
"a:CAM_run3"        693      693 1.031877e+06   3.717056e+05   543.277 (   369      0)   528.668 (   545      0)
"a:CAM_run4"        693      693 1.031877e+06   1.642631e+04    25.760 (     0      0)    23.682 (   644      0)
"a:dyn_to_fv_phys"  693      693 1.031877e+06   8.753453e+03    12.869 (   396      0)    12.479 (   654      0)
"a:fv_phys_to_dyn"  693      693 1.031877e+06   2.451420e+04    40.740 (    73      0)    32.149 (   640      0)
                                                ^ timer sum
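
To avoid copying values by hand, the timer sum column can be pulled out of a line in the format shown above. The following is a minimal sketch, not part of the original test: the function name parse_timer_line is hypothetical, and it assumes that the timer sum is the fourth numeric column (the one under the caret) once the parenthesised pairs are dropped.

```python
import re

# One line copied verbatim from the ne30np4 listing above.
SAMPLE = ('"a:CAM_run1"        693      693 1.031877e+06   1.307641e+06'
          '  2267.392 (   296      0)  1713.941 (   377      0)')

# Assumption: the timer sum is the fourth numeric column (index 3),
# i.e. the column marked by the caret in the listing.
TIMER_SUM_INDEX = 3


def parse_timer_line(line):
    """Return (name, numeric_columns), or None if this is not a timer line."""
    match = re.match(r'^\s*"([^"]+)"\s+(.*)$', line)
    if match is None:
        return None
    # Drop the parenthesised pairs so only the plain numeric columns remain.
    fields = re.sub(r"\([^)]*\)", " ", match.group(2)).split()
    return match.group(1), [float(f) for f in fields]


name, columns = parse_timer_line(SAMPLE)
print(name, columns[TIMER_SUM_INDEX])
```

Lines that do not start with a quoted timer name, such as the ne30np4 and ne30pg2 labels, return None and can simply be skipped.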
 
The speedups based on the timer sum column are as follows:
    ideal speedup: (/ 9.0 4.0) 2.25
    run1, before coupler: (/ 1.307641e+06 5.988779e+05) 2.1834851478072577
    run2, after  coupler: (/ 2.497877e+05 1.315582e+05) 1.8986859047934677
Thus run1 reaches about 97% of the ideal speedup, leaving little room for improvement, while run2 reaches about 84%, leaving somewhat more.
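
The speedup arithmetic above can be reproduced with a short script. This is only a sketch: the values are copied from the timer sum column of the two listings, and the helper name speedup is hypothetical.

```python
# Timer sum values copied from the ne30np4 and ne30pg2 listings above.
NP4 = {"a:CAM_run1": 1.307641e+06, "a:CAM_run2": 2.497877e+05}
PG2 = {"a:CAM_run1": 5.988779e+05, "a:CAM_run2": 1.315582e+05}

IDEAL = 9.0 / 4.0  # ideal speedup quoted in the text


def speedup(timer):
    """Ratio of the ne30np4 timer sum to the ne30pg2 timer sum."""
    return NP4[timer] / PG2[timer]


for timer in ("a:CAM_run1", "a:CAM_run2"):
    s = speedup(timer)
    print(f"{timer}: speedup {s:.4f}, {100.0 * s / IDEAL:.1f}% of ideal")
```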
 
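The fv_phys timers (dyn_to_fv_phys and fv_phys_to_dyn) in ne30pg2, compared with the UniquePoints timers (UniquePoints and putUniquePoints) in ne30np4, show the cost of the high-order remap.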
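
One way to put a number on that comparison is to add up each pair of timers from the timer sum column. Pairing and summing them this way is an assumption about how the comparison is meant to be read, not something the listings state. The sketch also relates the ne30pg2 remap total to the ne30pg2 CPL:ATM_RUN timer sum.

```python
# Timer sum values copied from the listings above.
NP4_UNIQUE = {"a:UniquePoints": 2.399564e+03, "a:putUniquePoints": 5.007935e+03}
PG2_REMAP = {"a:dyn_to_fv_phys": 8.753453e+03, "a:fv_phys_to_dyn": 2.451420e+04}
PG2_ATM_RUN = 1.119652e+06  # "CPL:ATM_RUN" timer sum for ne30pg2

# Assumption: each pair is summed to represent the total cost of moving
# data between the dynamics and physics grids in that configuration.
np4_total = sum(NP4_UNIQUE.values())
pg2_total = sum(PG2_REMAP.values())

print(f"ne30np4 UniquePoints timers: {np4_total:.3f}")
print(f"ne30pg2 fv_phys timers:      {pg2_total:.3f}")
print(f"ratio pg2 / np4:             {pg2_total / np4_total:.2f}")
print(f"share of ne30pg2 ATM_RUN:    {100.0 * pg2_total / PG2_ATM_RUN:.2f}%")
```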

Performance Test 2

Performance Test 2: short-description-of-testing-here

Date last modified:

Contributors: (add your name to this list if it does not appear)

Provenance: (Run provenance Link, Code Tag, etc:)

Results: (link to results, data and plots)

How will XXX be tested? That is, how do we know when we have met requirement XXX? Will these unit tests be included in the ongoing testing going forward?

W16 Physics Grid Performance Phase 1