O_30_O OpenMP_for_MPAS_Ocean Design

The Design Document page describes the algorithms, implementation, and planned testing, including unit, verification, validation, and performance testing. Please read Step 1.3 Performance Expectations, which explains the feature documentation requirements from the performance group's point of view.

Design Document


The first table in the Design Document gives an overview of this document; the Design Documents Overview page is generated automatically from this information.


  • Equations: Document the equations that are being solved and describe algorithms
  • Verification Plans: Define the tests that will be run to show that the implementation is correct and robust. Include unit tests that cover a range of inputs, as well as benchmarks.
  • Performance expectations: Explain the expected performance impact from this development
  • Validation Plans: Document which process-based, stand-alone component, and coupled model runs will be performed, and which metrics will be used to assess validity

Use the symbols below (copy and paste) to indicate if the section is in progress or done or not started.

In the table below, 4.Equ means Equations and Algorithms, 5.Ver means Verification, 6.Perf means Performance, and 7.Val means Validation; (tick) means completed, (warning) means in progress, and (error) means not done.


Overview table for the owner and an approver of this feature

1.Description: Implementation of OpenMP threading in MPAS-Ocean
2.Owner:
3.Created:
4.Equ: (tick)
5.Ver: (tick)
6.Perf: (tick)
7.Val: (tick)
8.Approver:
9.Approved Date:
V1.0 Declined

OpenMP threading in MPAS-Ocean

Requirements and Design

ACME Ocean/Ice  Group

Date: September 26, 2015  

Summary

The purpose of this design is to allow mixed MPI/OpenMP domain decomposition in MPAS-Ocean. We have used flat MPI for several years, which has allowed us to run simulations on up to 75K processors on machines like Edison. On other machines, such as Mira, where OpenMP threading is required to use computing resources efficiently, model efficiency has suffered. Success here is an implementation of OpenMP threading that allows hybrid parallelism using MPI and OpenMP.


Requirements

Requirement: bit-for-bit

Date last modified: September 25, 2015  
Contributors: Doug Jacobsen


Regardless of whether the decomposition is flat MPI or hybrid MPI/OpenMP, simulation results must be bit-for-bit identical.


Algorithmic Formulations

Design solution: element-based OpenMP 

Date last modified: September 26, 2015
Contributors: Doug Jacobsen


After scoping and prototyping several different approaches, we chose an algorithmic approach that employs loop-by-loop OpenMP directives.
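As an illustration of what a loop-by-loop directive might look like, here is a minimal sketch. It is written in C for brevity; MPAS-Ocean itself is Fortran, where the corresponding directive would be `!$omp parallel do`. The function `update_cells`, its arguments, and the data layout are hypothetical and are not taken from the MPAS-Ocean source. The sketch assumes that each cell's update is independent of every other cell, so each array element is written by exactly one thread with the same arithmetic as the serial loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not an MPAS-Ocean routine): advance a layered
 * field by one step using a precomputed tendency.
 * Layout assumption: index = cell * n_levels + level. */
void update_cells(size_t n_cells, size_t n_levels,
                  const double *tend, double dt, double *field)
{
    assert(tend != NULL && field != NULL);

    /* One directive on the loop over mesh elements. Iterations do not
     * share any output element, so no reduction or atomic is needed
     * and the per-element arithmetic is the same as in serial. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n_cells; ++i) {
        for (size_t k = 0; k < n_levels; ++k) {
            size_t idx = (size_t)i * n_levels + k;
            field[idx] += dt * tend[idx];
        }
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially, which gives a convenient reference for comparison.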


Design and Implementation

Implementation: element-based OpenMP

Date last modified: September 26, 2015
Contributors: Doug Jacobsen


This section should detail the plan for implementing the design solution for requirement XXX. In general, this section is software-centric with a focus on software implementation. Pseudo code is appropriate in this section. Links to actual source code are appropriate. Project management items, such as svn branches, timelines and staffing are also appropriate. How do we typeset pseudo code?


Planned Verification and Unit Testing 

Verification and Unit Testing: bit-for-bit comparison across MPAS-Ocean test cases

Date last modified: September 26, 2015
Contributors: Doug Jacobsen


Because we require bit-for-bit identical results, verification is straightforward: we will confirm that flat MPI and hybrid MPI/OpenMP runs produce bit-for-bit identical output across the full suite of MPAS-Ocean test cases.
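A bit-for-bit check on two output fields could look like the following sketch. The function name `first_bitwise_mismatch` is hypothetical, and the sketch assumes both runs' fields have already been read into memory as arrays of `double`; how the output files are read is outside its scope. It compares raw bit patterns with `memcmp` rather than using `==`, so that `+0.0` and `-0.0` count as different and identical NaN patterns count as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index of the first element whose
 * bit pattern differs between a and b, or n if all n elements match. */
size_t first_bitwise_mismatch(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < n; ++i) {
        /* Compare the stored bytes, not the numeric values. */
        if (memcmp(&a[i], &b[i], sizeof(double)) != 0) {
            return i;
        }
    }
    return n;
}
```

Returning the index of the first mismatch, rather than a yes/no answer, makes it easier to trace a difference back to a particular cell and loop.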

Planned Validation Testing 

Validation Testing: not required

Date last modified: September 26, 2015
Contributors: Doug Jacobsen


Because we require bit-for-bit identical results, the threaded model reproduces the existing flat MPI solutions exactly, so validation is not a part of this process.

Planned Performance Testing 

Performance Testing: threads-per-MPI-task sweep on Edison and Mira

Date last modified: September 26, 2015
Contributors: Doug Jacobsen


Performance will be evaluated on, at a minimum, Edison and Mira. On both platforms, we will measure performance across a broad range of threads per MPI task. An example of our testing can be found in the file below. This testing will be repeated once the implementation is complete.
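One way to set up such a sweep is to enumerate every combination of MPI tasks per node and threads per task that exactly fills a node. The sketch below does this for a given core count. The type `hybrid_layout`, the function `enumerate_layouts`, and the choice to fill each node exactly are assumptions made for illustration; they do not describe the actual test configuration used on Edison or Mira.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one hybrid configuration on a node. */
typedef struct {
    unsigned tasks_per_node;    /* MPI tasks placed on each node   */
    unsigned threads_per_task;  /* OpenMP threads per MPI task     */
} hybrid_layout;

/* List every layout with tasks_per_node * threads_per_task equal to
 * cores_per_node, ordered by increasing thread count. At most
 * `capacity` entries are written to `out`; the return value is the
 * total number of layouts found, which may exceed `capacity`. */
size_t enumerate_layouts(unsigned cores_per_node,
                         hybrid_layout *out, size_t capacity)
{
    assert(cores_per_node > 0);
    assert(out != NULL || capacity == 0);

    size_t count = 0;
    for (unsigned threads = 1; threads <= cores_per_node; ++threads) {
        if (cores_per_node % threads != 0) {
            continue;  /* skip thread counts that leave cores idle */
        }
        if (count < capacity) {
            out[count].tasks_per_node = cores_per_node / threads;
            out[count].threads_per_task = threads;
        }
        ++count;
    }
    return count;
}
```

The first entry is always the flat MPI case (one thread per task), which serves as the baseline against which the threaded layouts are compared.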