OP-E6.4 Visualizing Detailed Profiling Data and Examples of Use
Abstract
This work is an attempt to visualize how our computer simulations engage with parallel machines. For a hand-picked set of E3SM timers, the code saves the exact start and stop times on each MPI process and writes them to files. A tool then analyzes the data and creates a large, zoomable PNG image in which lines or rectangles mark when the code is inside a given timer region. Each MPI process contributes a stream of data points that is plotted along a horizontal line, with color indicating the timer, to show when execution was inside a timed event. More detail can be obtained by requesting that more timers save this information, although there is certainly a limit. This detailed profiling data has many potential uses, including building a basic understanding of where time is spent, debugging an issue, and improving performance, especially where MPI communication or load balance can be improved. Several examples of how this data has been used will be presented.
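
As a rough illustration of the plotting step described in the abstract, the following is a minimal sketch, not the actual tool. It assumes the saved timings have already been read into tuples of (MPI rank, timer name, start, stop); that record layout, the timer names, the palette, and the function names `rasterize` and `encode_png` are all hypothetical. Each rank becomes one horizontal band, each timer gets a color, and the result is encoded as a PNG using only the Python standard library.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not the tool's file format):
# (mpi_rank, timer_name, start_seconds, stop_seconds)
BACKGROUND = (255, 255, 255)
PALETTE = [(214, 39, 40), (31, 119, 180), (44, 160, 44), (255, 127, 14)]


def rasterize(records, n_ranks, width, row_height=4):
    """Return rows of RGB pixels: one band per rank, colored by timer."""
    t0 = min(r[2] for r in records)
    t1 = max(r[3] for r in records)
    scale = width / (t1 - t0) if t1 > t0 else 0.0
    names = sorted({r[1] for r in records})
    colors = {n: PALETTE[i % len(PALETTE)] for i, n in enumerate(names)}
    bands = [[BACKGROUND] * width for _ in range(n_ranks)]
    for rank, name, start, stop in records:
        # Map the time interval to pixel columns; draw at least one pixel.
        x0 = min(int((start - t0) * scale), width - 1)
        x1 = min(max(x0 + 1, int((stop - t0) * scale)), width)
        for x in range(x0, x1):
            bands[rank][x] = colors[name]
    # Repeat each band so a rank is more than one pixel tall.
    return [list(band) for band in bands for _ in range(row_height)]


def _chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_png(pixels):
    """Encode rows of (r, g, b) tuples as an 8-bit RGB PNG byte string."""
    height, width = len(pixels), len(pixels[0])
    # Each scanline starts with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + bytes(c for px in row for c in px) for row in pixels
    )
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    # Made-up sample data: two ranks, two placeholder timers.
    sample = [
        (0, "timer_a", 0.0, 1.5),
        (0, "timer_b", 1.5, 2.0),
        (1, "timer_a", 0.0, 1.0),
        (1, "timer_b", 1.2, 2.0),
    ]
    with open("timeline.png", "wb") as f:
        f.write(encode_png(rasterize(sample, n_ranks=2, width=800)))
```

In the sample data, rank 1 has a gap between its two timers that rank 0 does not. The gap shows up as background color in that rank's band, which is the kind of load-imbalance or communication-wait pattern such an image is meant to expose.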