Short summary of what was done and what was the result.


Performance Test 1

Performance Test 1: short-description-of-testing-here

Date last modified:

Contributors: (add your name to this list if it does not appear)

Provenance: (Run provenance Link, Code Tag, etc:)

Results: (link to results, data and plots)


How was XXX tested? That is, how do we know when we have met requirement XXX? Will these unit tests be included in ongoing testing going forward?
To look at how close physgrid gets to ideal speedup of 9/4 in the
physics computations, I did a 1-month run with -pecount S on cori-knl,
to provide a reasonable number of columns per core. Relevant
high-level timers are as follows:
 
ne30np4
"CPL:RUN_LOOP"      693      693 1.031184e+06   2.330065e+06  3365.724 (   296      0)  3361.610 (   544      0)
"CPL:OCNT_RUN"      693      693 1.030491e+06   6.395224e+02     2.122 (     0      0)     0.901 (   657      0)
"CPL:ICE_RUN"       693      693 1.031184e+06   5.078033e+03     9.215 (   319      0)     5.647 (   666      0)
"CPL:LND_RUN"       693      693 1.031184e+06   2.091669e+04    34.030 (   282      0)    27.238 (   669      0)
"CPL:ATM_RUN"       693      693 1.031184e+06   1.965188e+06  3226.064 (   296      0)  2668.105 (   420      0)
"a:CAM_run1"        693      693 1.031877e+06   1.307641e+06  2267.392 (   296      0)  1713.941 (   377      0)
"a:CAM_run2"        693      693 1.031877e+06   2.497877e+05   382.013 (   296      0)   349.990 (   151      0)
"a:CAM_run3"        693      693 1.031877e+06   3.823804e+05   567.488 (   372      0)   526.157 (   182      0)
"a:CAM_run4"        693      693 1.031877e+06   2.388912e+04    36.367 (     0      0)    34.452 (   401      0)
"a:UniquePoints"    693      693 1.031877e+06   2.399564e+03     4.179 (   296      0)     2.732 (   562      0)
"a:putUniquePoints" 693      693 1.031877e+06   5.007935e+03     8.052 (   296      0)     6.124 (   562      0)
 
ne30pg2
"CPL:RUN_LOOP"      693      693 1.031184e+06   1.197055e+06  1727.589 (   145      0)  1727.089 (   532      0)
"CPL:OCNT_RUN"      693      693 1.030491e+06   6.345620e+02     2.469 (     0      0)     0.880 (   518      0)
"CPL:ICE_RUN"       693      693 1.031184e+06   4.448382e+03     7.500 (   523      0)     4.345 (   648      0)
"CPL:LND_RUN"       693      693 1.031184e+06   1.419479e+04    23.461 (     0      0)    18.509 (   585      0)
"CPL:ATM_RUN"       693      693 1.031184e+06   1.119652e+06  1649.384 (    26      0)  1541.688 (   448      0)
"a:CAM_run1"        693      693 1.031877e+06   5.988779e+05   901.520 (   692      0)   787.083 (   370      0)
"a:CAM_run2"        693      693 1.031877e+06   1.315582e+05   193.712 (   396      0)   185.811 (   676      0)
"a:CAM_run3"        693      693 1.031877e+06   3.717056e+05   543.277 (   369      0)   528.668 (   545      0)
"a:CAM_run4"        693      693 1.031877e+06   1.642631e+04    25.760 (     0      0)    23.682 (   644      0)
"a:dyn_to_fv_phys"  693      693 1.031877e+06   8.753453e+03    12.869 (   396      0)    12.479 (   654      0)
"a:fv_phys_to_dyn"  693      693 1.031877e+06   2.451420e+04    40.740 (    73      0)    32.149 (   640      0)
                                                ^ timer sum
 
The speedups based on the timer sum column are as follows:
    ideal speedup: (/ 9.0 4.0) 2.25
    run1, before coupler: (/ 1.307641e+06 5.988779e+05) 2.1834851478072577
    run2, after  coupler: (/ 2.497877e+05 1.315582e+05) 1.8986859047934677
Thus run1 reaches about 97% of the ideal speedup, leaving little room for
improvement, while run2 reaches about 84%, leaving somewhat more.
 
Comparing the dyn_to_fv_phys and fv_phys_to_dyn timers (ne30pg2) with the
UniquePoints and putUniquePoints timers (ne30np4) shows the cost of the
high-order remap.
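The speedup arithmetic above can be reproduced with a short script. This is a minimal sketch, not part of the original test: the values are copied by hand from the timer sum column of the listing. The function names (`speedup`, `fraction_of_ideal`, `remap_cost_ratio`) are hypothetical. Summing the two remap timers and comparing them with the sum of the two UniquePoints timers is an assumption about how the last comparison is meant to be read.

```python
# Timer sums (the column marked "^ timer sum" in the listing), copied by hand.
NP4 = {
    "a:CAM_run1": 1.307641e+06,
    "a:CAM_run2": 2.497877e+05,
    "a:UniquePoints": 2.399564e+03,
    "a:putUniquePoints": 5.007935e+03,
}
PG2 = {
    "a:CAM_run1": 5.988779e+05,
    "a:CAM_run2": 1.315582e+05,
    "a:dyn_to_fv_phys": 8.753453e+03,
    "a:fv_phys_to_dyn": 2.451420e+04,
}

# Ideal physics speedup quoted in the notes: 9/4.
IDEAL_SPEEDUP = 9.0 / 4.0


def speedup(timer: str) -> float:
    """Ratio of the ne30np4 timer sum to the ne30pg2 timer sum."""
    return NP4[timer] / PG2[timer]


def fraction_of_ideal(timer: str) -> float:
    """Measured speedup as a fraction of the ideal 9/4."""
    return speedup(timer) / IDEAL_SPEEDUP


def remap_cost_ratio() -> float:
    """Assumed comparison: summed pg2 remap timers over summed np4 UniquePoints timers."""
    pg2_remap = PG2["a:dyn_to_fv_phys"] + PG2["a:fv_phys_to_dyn"]
    np4_copy = NP4["a:UniquePoints"] + NP4["a:putUniquePoints"]
    return pg2_remap / np4_copy


if __name__ == "__main__":
    for timer in ("a:CAM_run1", "a:CAM_run2"):
        print(f"{timer}: speedup {speedup(timer):.3f}, "
              f"{100 * fraction_of_ideal(timer):.1f}% of ideal")
    print(f"remap vs UniquePoints timer-sum ratio: {remap_cost_ratio():.2f}")
```

Keeping the raw timer sums in dictionaries makes it easy to rerun the comparison when the timing files are regenerated, for example with a different -pecount setting.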


Performance Test 2

Performance Test 2: short-description-of-testing-here

Date last modified:

Contributors: (add your name to this list if it does not appear)


Provenance: (Run provenance Link, Code Tag, etc:)

Results: (link to results, data and plots)


How was XXX tested? That is, how do we know when we have met requirement XXX? Will these unit tests be included in ongoing testing going forward?