This page should describe the Performance Assessment Tests performed for this standalone feature and should provide links to all the result pages.
Summary
Short summary of what was done and what was the result.
Performance Test 1
To look at how close physgrid gets to ideal speedup of 9/4 in the physics computations, I did a 1-month run with -pecount S on cori-knl, to provide a reasonable number of columns per core. Relevant high-level timers are as follows:

ne30np4
"CPL:RUN_LOOP"      693  693  1.031184e+06  2.330065e+06  3365.724 ( 296 0)  3361.610 ( 544 0)
"CPL:OCNT_RUN"      693  693  1.030491e+06  6.395224e+02     2.122 (   0 0)     0.901 ( 657 0)
"CPL:ICE_RUN"       693  693  1.031184e+06  5.078033e+03     9.215 ( 319 0)     5.647 ( 666 0)
"CPL:LND_RUN"       693  693  1.031184e+06  2.091669e+04    34.030 ( 282 0)    27.238 ( 669 0)
"CPL:ATM_RUN"       693  693  1.031184e+06  1.965188e+06  3226.064 ( 296 0)  2668.105 ( 420 0)
"a:CAM_run1"        693  693  1.031877e+06  1.307641e+06  2267.392 ( 296 0)  1713.941 ( 377 0)
"a:CAM_run2"        693  693  1.031877e+06  2.497877e+05   382.013 ( 296 0)   349.990 ( 151 0)
"a:CAM_run3"        693  693  1.031877e+06  3.823804e+05   567.488 ( 372 0)   526.157 ( 182 0)
"a:CAM_run4"        693  693  1.031877e+06  2.388912e+04    36.367 (   0 0)    34.452 ( 401 0)
"a:UniquePoints"    693  693  1.031877e+06  2.399564e+03     4.179 ( 296 0)     2.732 ( 562 0)
"a:putUniquePoints" 693  693  1.031877e+06  5.007935e+03     8.052 ( 296 0)     6.124 ( 562 0)

ne30pg2
"CPL:RUN_LOOP"      693  693  1.031184e+06  1.197055e+06  1727.589 ( 145 0)  1727.089 ( 532 0)
"CPL:OCNT_RUN"      693  693  1.030491e+06  6.345620e+02     2.469 (   0 0)     0.880 ( 518 0)
"CPL:ICE_RUN"       693  693  1.031184e+06  4.448382e+03     7.500 ( 523 0)     4.345 ( 648 0)
"CPL:LND_RUN"       693  693  1.031184e+06  1.419479e+04    23.461 (   0 0)    18.509 ( 585 0)
"CPL:ATM_RUN"       693  693  1.031184e+06  1.119652e+06  1649.384 (  26 0)  1541.688 ( 448 0)
"a:CAM_run1"        693  693  1.031877e+06  5.988779e+05   901.520 ( 692 0)   787.083 ( 370 0)
"a:CAM_run2"        693  693  1.031877e+06  1.315582e+05   193.712 ( 396 0)   185.811 ( 676 0)
"a:CAM_run3"        693  693  1.031877e+06  3.717056e+05   543.277 ( 369 0)   528.668 ( 545 0)
"a:CAM_run4"        693  693  1.031877e+06  1.642631e+04    25.760 (   0 0)    23.682 ( 644 0)
"a:dyn_to_fv_phys"  693  693  1.031877e+06  8.753453e+03    12.869 ( 396 0)    12.479 ( 654 0)
"a:fv_phys_to_dyn"  693  693  1.031877e+06  2.451420e+04    40.740 (  73 0)    32.149 ( 640 0)
                                            ^ timer sum

The speedups based on the timer sum column are as follows:

ideal speedup:        (/ 9.0 4.0)                    2.25
run1, before coupler: (/ 1.307641e+06 5.988779e+05)  2.1834851478072577
run2, after coupler:  (/ 2.497877e+05 1.315582e+05)  1.8986859047934677

Thus, there's a little room for improvement in run2, but not much in run1. The fv_phys vs UniquePoints timers show the cost of high-order remap.
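The arithmetic above can be reproduced with a short Python sketch. It recomputes the two speedups from the timer sum column and adds two derived quantities that are not part of the original analysis: the fraction of the ideal 9/4 speedup that each phase achieves, and the ratio of the ne30pg2 remap timers to the ne30np4 UniquePoints timers. The dictionary and function names are hypothetical, and pairing a:dyn_to_fv_phys + a:fv_phys_to_dyn against a:UniquePoints + a:putUniquePoints is an assumption about which timers are comparable.

```python
# Timer sums copied from the timing output above (the column marked "timer sum").
NP4 = {
    "a:CAM_run1": 1.307641e+06,
    "a:CAM_run2": 2.497877e+05,
    "a:UniquePoints": 2.399564e+03,
    "a:putUniquePoints": 5.007935e+03,
}
PG2 = {
    "a:CAM_run1": 5.988779e+05,
    "a:CAM_run2": 1.315582e+05,
    "a:dyn_to_fv_phys": 8.753453e+03,
    "a:fv_phys_to_dyn": 2.451420e+04,
}

# Ideal speedup quoted in the text: 9/4.
IDEAL_SPEEDUP = 9.0 / 4.0


def speedup(timer):
    """Ratio of the ne30np4 timer sum to the ne30pg2 timer sum."""
    return NP4[timer] / PG2[timer]


def fraction_of_ideal(timer):
    """Measured speedup as a fraction of the ideal 9/4."""
    return speedup(timer) / IDEAL_SPEEDUP


def remap_cost_ratio():
    """Assumed comparison: pg2 remap timers vs. np4 UniquePoints timers."""
    pg2_remap = PG2["a:dyn_to_fv_phys"] + PG2["a:fv_phys_to_dyn"]
    np4_copy = NP4["a:UniquePoints"] + NP4["a:putUniquePoints"]
    return pg2_remap / np4_copy


if __name__ == "__main__":
    for timer in ("a:CAM_run1", "a:CAM_run2"):
        print(f"{timer}: speedup {speedup(timer):.4f}, "
              f"fraction of ideal {fraction_of_ideal(timer):.3f}")
    print(f"remap / UniquePoints timer ratio: {remap_cost_ratio():.2f}")
```

Only the timer sums are used here; comparing the per-process max and min columns instead would also expose load imbalance, which this sketch does not examine.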
Performance Test 2
Performance Test 2: short-description-of-testing-here
Date last modified:
Contributors: (add your name to this list if it does not appear)
Provenance: (Run provenance Link, Code Tag, etc:)
Results: (link to results, data and plots)
How was XXX tested, i.e., how do we know when we have met requirement XXX? Will these unit tests be included in the ongoing testing going forward?