Development Reference

This document provides details on E3SM development conventions and practices.

 


Introduction

All E3SM developers should consider themselves stewards of the repository history. Our goal with defining a workflow is to improve the utility of the repository and create a useful history that can provide a tangible benefit to other developers.

As always, it’s a good idea to understand what you’re doing before you do it. This document should not take the place of understanding, but can be used to learn and remember.

How to use this document

This document expands on portions of the Development Getting Started Guide, providing more detail on the E3SM development workflow.  You should read the Quick Guide first, and use this document as needed.

Specific pieces of information will be colored as follows:


important -- Items colored in red mean they are important, and should not be ignored.

one time -- Items colored in green are commands to be issued once per machine.

repo once -- Items colored in orange are commands to be issued once per local repository.

common -- Items colored in bold black are commands that will be commonly used.


 Project Life Cycle  


Before getting into the git commands related to our workflow, here is an overview of the life cycle of a feature. This can be used to get a big picture that will be broken up in the following steps.



In the above image, red dots represent merge commits of features into an integration branch. Orange dots represent commits a developer makes as part of a feature branch. Blue commits represent merge commits incorporating a completed feature into the master branch. Purple commits represent tagged versions of the full model.


NOTE:

Within this diagram, blue dots have associated feature branches that are omitted for the sake of readability. However, each of them followed the same development cycle as the feature that is being focused on in the diagram. Also, red commits are made to next. Both next and master should only be modified by integrators or gatekeepers. 
 

Basic Development


Creating a new branch-new feature

New development should be carried out on a branch. New developments include the addition of a new feature (github-username/component/feature) or fixing a bug (github-username/component/bug-fix).

Your branch will typically start from master.  It may start from another developer's branch or a maintenance branch.   Never start a branch from "next".

When creating a branch, please name it based on guidelines contained in Branch, Tag, and Version name conventions. You should also create a set of baselines for tests you plan to use to verify your code changes, and store these baselines in a location specific to your new branch. This procedure is described in Testing.


NOTE: 

 As seen above, E3SM uses a naming convention for branches. Branch names are compound, with parts separated by a “/”, to describe which large-scale part of E3SM the branch will modify. The “/” does not denote a directory; it is only part of the branch name. This lets other developers know what each branch is developing and which parts of the model it should be modifying. This practice will be referred to as “namespacing a branch”.


When creating a branch, multiple levels of namespacing are allowed. In E3SM, a branch should always be namespaced first by the GitHub user in charge of the branch. Next comes the main component the feature is being developed for. Finally comes a description of the new development. For example, a branch that is implementing a new parameterization within EAM would be named:

joeuser/eam/new-parameterization

And a branch that is modifying the HOMME dynamics would be:

janeuser/homme/new-se-feature

Branch names should be as long as needed to clearly convey what the developer is working on, but they shouldn’t be overly long and descriptive. The following example is a branch name that is too long:

joeuser/cam/this-is-my-awesome-new-feature-that-does-something-cool-and-we-might-not-use-in-the-future


The following diagram depicts two branches that are created: one to fix a bug (github-username/component/bug-fix) and another to add a new feature (github-username/component/feature).

 

The following commands can be used to create the feature branches:


git branch github-username/component/feature

The above command assumes you are on the head of the master branch and that is where you want to start the new development.

If you want to start development from a specific tag, include the tag name as an additional argument.

git branch github-username/component/feature v1.0

When creating a new feature branch, best practice suggests you should branch from a tested version of the code.   HOWEVER, while the model is in "version 0" mode, always start development from the HEAD of master.
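The branch-creation commands above can be tried safely in a scratch repository. The following is a minimal sketch, not a record of the real E3SM repository: the temporary directory, the placeholder identity, the empty commits, and the branch name joeuser/eam/from-tag are all assumptions made for illustration.

```shell
# Build a throwaway repository so nothing real is touched.
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"
git tag v1.0                                 # stand-in for a tagged version
git commit -q --allow-empty -m "Later work on master"

# Branch from the current HEAD of master.
git branch joeuser/eam/new-parameterization

# Branch from the tag instead of from HEAD.
git branch joeuser/eam/from-tag v1.0

# List the new branches; the checked-out branch is still master.
git branch --list 'joeuser/*'
```

Note that neither command changes which branch is checked out; both only create a new branch pointer.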

Creating a new branch-bug fix

Typically, a bug fix is applied to the commit that introduced the bug.  In that case, the branch creation has an additional argument:

 git branch github-username/component/bug-fix commit-with-bug

The last argument is the hash of the commit that first introduced the bug.   If that was a tagged version, you can use the tag name as for the feature branch above.

It is ok to fix multiple bugs on a single bug fix branch if it makes sense to do so (e.g. one related set of code changes can fix multiple bugs.)

In order to find the hash of the commit that introduced the bug, a developer can use:

git blame file-with-bug

This will print the file with the bug (through a pager, not a text editor), annotating each line with the commit that last touched it. This can be traced back to find the commit that introduced the bug.
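As a rough illustration of going from a suspicious line to a bug-fix branch, the sketch below uses a scratch repository. The file params.txt, its contents, and the branch name are hypothetical; the -L option restricts the annotation to a line range and -l prints the full commit hash.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity

printf 'a = 1\nb = 2\n' > params.txt
git add params.txt
git commit -q -m "Add parameter file"

# The second commit changes line 2; pretend this is where the bug crept in.
printf 'a = 1\nb = 3\n' > params.txt
git commit -q -am "Change b"
bug_commit=$(git rev-parse HEAD)

# Annotate only line 2 of the file.
git --no-pager blame -l -L 2,2 params.txt

# Take the hash that blame reports for that line (first field of the output).
blamed=$(git blame -l -L 2,2 params.txt | cut -d' ' -f1)

# Start the bug-fix branch at the commit that introduced the bug.
git branch joeuser/eam/bug-fix "$blamed"
```

In a real history the line may have been touched several times, so the commit reported by blame is a starting point for tracing, not necessarily the final answer.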

EXCEPTIONS:   You can start a bug-fix branch from the HEAD of master if any of the following are true:

  • The bug existed in code imported from CESM such as in the initial version added to the repo.
  • The bug was introduced prior to v0.3 (the version where the testing is working)


Creating a new branch-on a maintenance branch

The E3SM repository will have one or more active maintenance branches usually to keep adding bug fixes and machine updates to old versions.   See E3SM permanent branches for the list of maintenance branches, their purpose and what kind of changes they will accept.

If the feature, machine update or bug fix you are working on needs to go on a maintenance branch, switch to that branch before making your feature branch.  For example:

git checkout maint-1.0

This will put your working directory at the head of the maintenance branch.  You can then start a new branch as you do for master.   When you make a pull request, specify the maintenance branch as the "base".
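A minimal sketch of starting work from a maintenance branch follows. The scratch repository, the empty commits, and the branch name joeuser/eam/maint-fix are assumptions for illustration; only maint-1.0 comes from the example above.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Release 1.0"
git branch maint-1.0                         # stand-in for a maintenance branch
git commit -q --allow-empty -m "New development on master"

# Switch to the maintenance branch, then create the new branch from it.
git checkout -q maint-1.0
git checkout -q -b joeuser/eam/maint-fix

# The new branch starts at the head of maint-1.0, not at master.
git --no-pager log --oneline -1
```

Because the branch starts from maint-1.0, it does not contain the later commit made on master.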

If you need to fix a bug on a maintenance branch and the bug was introduced by a commit on the branch, you can follow the same directions as for a bug-fix branch.

If you are developing something for both a maintenance branch AND master, do the maintenance development first.  You can then ask the integrator to merge maint to master.  Or you can make 2 pull requests, one for maint and one for master.

If your development depends on something that was added to master AFTER the maintenance branch started, it can not be added to the maintenance branch.   Master "contains" the content on the maintenance branches (if they are not too old) and one can merge from maint to master.  We can not merge from master to maint.

IMPORTANT:  We do not have "next" branches for maintenance branches to test multiple features against each other.  To compensate, all development on a maintenance branch should be done sequentially.  That is, one developer makes their branch, completes the feature, tests it and then has it integrated to the branch before the next development starts.   This should not be a problem because maintenance branches are not supposed to have heavy development.   You should always test feature branches but this is more important for maintenance features since there is no "next" testing and you don't want to break the maintenance branch by integrating a broken feature.

Changing Branches

Creating a new branch will not change the current branch that is being worked on in the current working directory.

Because the git branch command does not modify the current branch in the current working directory, new commits will not be made to the new branch. In order to change the branch that is being worked on, the git checkout command is used. For example:

git checkout github-username/component/feature

will change the current branch to be github-username/component/feature.

NOTE: 
Creating a new branch and swapping to it can be done in a single command via:

git checkout -b github-username/component/feature master
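The difference between creating a branch and switching to it can be seen in a scratch repository. This is a sketch under assumptions: the repository, identity, and branch names are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

# Creating a branch does not switch to it.
git branch joeuser/eam/feature
git symbolic-ref --short HEAD    # prints: master

# git checkout changes the current branch.
git checkout -q joeuser/eam/feature
git symbolic-ref --short HEAD    # prints: joeuser/eam/feature

# Create a second branch from master and switch to it in one step.
git checkout -q -b joeuser/eam/second-feature master
git symbolic-ref --short HEAD    # prints: joeuser/eam/second-feature
```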


Committing new files and changes as you go

While working on a feature or a bug fix, you will be editing source code files. After editing a file, you might want to commit those changes to your local repository in order to checkpoint your work.  When you make a change that you want to commit you "add" it to the git stage with this command:

git add path/to/file/in/repo

The next commit that is created will contain all changes in the stage by default.  If you create a new file, also use git add to make the repo aware of it.

A commit can have changes from multiple files. In order to commit only the changes in the staging area, the following command can be used:

git commit

NOTE:   The git add command will only stage the changes in the file at the time you issue the command. If the file is subsequently modified you need to run git add again if you want those changes to be committed.   Use git status frequently to see the status of staged and un-staged files.

git commit will open up an editor where a commit message can be written.   Please refer to the Commit and PR message template for more instructions on the E3SM commit message guidelines.
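The staging behavior described in the note above can be checked with a short experiment. The sketch below assumes a scratch repository and a hypothetical file notes.txt; it passes -m to git commit only so that no editor opens, which is a convenience for the demonstration and not a substitute for a proper commit message.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity

printf 'first\n' > notes.txt
git add notes.txt                 # stages the file as it is right now
printf 'second\n' >> notes.txt    # edit made AFTER staging

git status --short                # prints: AM notes.txt (staged, plus unstaged edits)

git commit -q -m "Add notes file"

# The commit holds only what was staged; the later edit is still uncommitted.
git show HEAD:notes.txt           # prints: first
git status --short                # prints:  M notes.txt
```

Running git add notes.txt again before committing would have included the second line as well.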

It is tempting to do development with lots of little commits like this:

f8ef2b5f75 fix
d2e3616d55 progress 
e58c74745f more progress
0be335ead0 almost working

If you make "checkpoint" commits like this, be prepared to rewrite the commit history using "git rebase" (see https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) so that each commit has a substantive message as required in Commit and PR message template.
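Interactive rebase needs an editor, so it is hard to show in a copy-and-paste example. The sketch below uses a different technique, git reset --soft, which gives the same result as squashing in the simple case where all checkpoint commits belong in one substantive commit. The scratch repository, the file work.txt, and the final commit message are assumptions for illustration. Like a rebase, this rewrites history, so it is only appropriate for commits that nobody else has built on.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b joeuser/eam/new-parameterization
for msg in "fix" "progress" "more progress" "almost working"; do
    printf '%s\n' "$msg" >> work.txt
    git add work.txt
    git commit -q -m "$msg"
done
git --no-pager log --oneline master..    # four checkpoint commits

# Move the branch pointer back to master but keep every change staged.
git reset -q --soft master

# Record all the work as one commit (use the message template in practice).
git commit -q -m "Add new parameterization to EAM"
git --no-pager log --oneline master..    # one substantive commit
```

When the checkpoints should become several commits rather than one, git rebase -i remains the more flexible tool.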

Utilizing the repository history

 

Seeing as all of the E3SM developers are stewards of the repository history, it would be useful for developers to understand how to make use of the history they are maintaining. 

The most useful command for making use of the history is the git log command. This command has an abundance of optional arguments, and this section can be used to get an idea of what you can do with git log.

The basic usage of this command gives the equivalent of an svn log: a list of all commits and their commit messages. By default, git log only shows commits that are reachable in the history of the HEAD. Optional arguments can be used to look at different (or all) branches.

 One extremely useful version of git log gives a command line graph of the history.

 

git log --oneline --graph --decorate

  

git log can also be limited to local branches that match a certain naming convention. For example:

 

 git log --branches='*pattern*'

 

 The --branches option can be replaced with --remotes to only show remote tracking branches that match a pattern.

 

 A specific commit has three pieces of information. The first is the commit’s tree (i.e. the files and directories contained in the commit), the second is the commit’s message (the commit message issued when creating the commit), and the third is a list of the commit’s parents. There can be multiple parents for a single commit, which would imply the commit was a merge commit. Git stores the order of the parents, with the first parent always being the commit the merge was initiated from. This is useful to note, because within our workflow first parent commits on master should always be merge commits bringing in features or bug fixes. git log can be used to narrow your view to only see first parent commits as well, via:

 

git log --first-parent

 

 As you develop a new feature, you might migrate files from one place to another, or even rename the file. In order to have git show you all renames of a file, you can use the following command:

  

git log --name-only --follow -- path/to/file

 Any of these options can be combined to give more flexibility to exploring the history of a git repository and make the history more useful.
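To see what these options report, the sketch below builds a tiny history containing one rename and one merged feature branch. Everything in it (the scratch repository, file names, branch name, and commit messages) is hypothetical.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity

printf 'x\n' > old_name.txt
git add old_name.txt
git commit -q -m "Add old_name.txt"
git mv old_name.txt new_name.txt
git commit -q -m "Rename old_name.txt to new_name.txt"

git checkout -q -b joeuser/eam/feature
printf 'y\n' > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q master
git merge -q --no-ff -m "Merge branch joeuser/eam/feature" joeuser/eam/feature

# Full graph: four commits, including the one made on the feature branch.
git --no-pager log --oneline --graph --decorate

# First-parent view of master: the feature's own commit is hidden.
git --no-pager log --oneline --first-parent

# History of one file, followed across the rename.
git --no-pager log --oneline --name-only --follow -- new_name.txt
```

In this history the first-parent view lists three commits (the two direct commits and the merge), while the full log lists four.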

Advanced: Incorporating another branch on to your branch.

When developing a new feature, another developer might have created a feature you require for your work, or another developer might have made large changes to the interfaces you make use of. This section will describe how to incorporate those changes into your branch.


NOTE: 
This action can have negative consequences and cause issues in the future, so it is important to know what you’re doing prior to doing any of these.

Cherry-pick method

The first option for incorporating changes from another development line into your development history is via a cherry-pick. A cherry-pick will copy individual commits (or a range of commits) from one point in history onto the HEAD of your current development line. It should be used when the number of commits your future development efforts depend on is small (order 1-10). It can be used as follows:

git cherry-pick <commit-ish>


This method is the easiest to use and one of the least likely to create issues in future development, and allows the most flexible review modifications. One downside to this method, however, is that the commits that are cherry-picked will now occur twice in the history.
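A small worked example may make the cherry-pick behavior concrete. In this sketch, the scratch repository, file names, and commit messages are assumptions. The -x option is an optional extra that records the original hash in the new commit message, which can help reviewers recognize the duplicate later.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

# Another developer's branch: one commit we need, one we do not.
git checkout -q -b janeuser/homme/new-se-feature
printf 'interface v2\n' > interface.txt
git add interface.txt
git commit -q -m "Update shared interface"
needed=$(git rev-parse HEAD)
printf 'unrelated\n' > other.txt
git add other.txt
git commit -q -m "Unrelated work"

# Meanwhile master moves on, and our own branch starts from it.
git checkout -q master
git commit -q --allow-empty -m "Later work on master"
git checkout -q -b joeuser/eam/new-parameterization

# Copy only the needed commit onto the current branch.
git cherry-pick -x "$needed"

# The change is present under a new hash; the unrelated file is not.
git --no-pager log --oneline -1
ls
```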

Merge Method

The second option for incorporating changes into your development history is via a merge. A merge will attempt to merge an arbitrary number of other commits (or branches) into your current branch. It can be used as follows:

git merge [commit-ish1] [commit-ish2] …


In this case, you can list however many commits you want on this line, and git will attempt to merge them all into the commit you are currently working on (into your working directory).

The merge will allow you to enter a commit message. The commit message should be very descriptive. It should explain what you’re merging, and why you’re merging it.

NOTE:

A merge creates a fixed point in history. The merge commit fixes all history before it, which limits possibilities for cleaning up history during a code review. A merge should be avoided if at all possible, but in some cases it is necessary. If it is necessary, try to clean up all history prior to the merge before beginning the merge.
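The sketch below merges another developer's branch into a feature branch and then inspects the parents of the resulting merge commit. The scratch repository, file names, and the wording of the merge message are hypothetical; -m is used only so that the example runs without opening an editor.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b janeuser/homme/new-se-feature
printf 'shared interface\n' > shared.txt
git add shared.txt
git commit -q -m "Add shared interface"

git checkout -q -b joeuser/eam/new-parameterization master
printf 'own work\n' > own.txt
git add own.txt
git commit -q -m "Start new parameterization"
before=$(git rev-parse HEAD)

# Merge with a message that says what is merged and why.
git merge -q --no-ff -m "Merge janeuser/homme/new-se-feature into joeuser/eam/new-parameterization

The new parameterization calls the interface added on that branch." janeuser/homme/new-se-feature

# One hash followed by its two parents; the first parent is our previous tip.
git rev-list --parents -n 1 HEAD
git --no-pager log --oneline --graph
```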

Rebase Method

The third and final option for incorporating changes into your development history is via a rebase. A rebase allows you to modify commits. This includes operations such as squashing, re-ordering, deleting, etc. When you create a branch, the base of the branch is the commit you branched off of. A rebase allows you to modify what the base of your branch is. The following diagram can be used to visualize a rebase:



When performing a rebase, you specify the new base for your work. Rebase will then replay your work onto the new base. For example:

git rebase -i new-base

This will replay the history from the current branch on top of new-base. The -i flag in this case allows interactive modification of the commits to replay. It should be used to verify what is going to happen after the rebase.
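As an illustration of changing the base of a branch, the sketch below rebases a feature branch onto a newer master. It omits -i so that it runs without an editor; in practice the interactive form lets you check the commit list first. The scratch repository, file names, and commit messages are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b joeuser/eam/new-parameterization
printf 'feature\n' > feature.txt
git add feature.txt
git commit -q -m "Add new parameterization"

# master moves ahead after the branch was created.
git checkout -q master
printf 'update\n' > master.txt
git add master.txt
git commit -q -m "Unrelated update on master"

# Replay the feature commit on top of the new head of master.
git checkout -q joeuser/eam/new-parameterization
git rebase -q master

# The branch now sits directly on top of master.
git --no-pager log --oneline --graph
```

The replayed commit gets a new hash, which is why rebasing is best limited to commits that have not been merged or shared.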


NOTE:

Commits that occur prior to a merge cannot be modified, or rebased. In general, if you previously merged within your branch (i.e. to incorporate an “external feature”) or if your branch was previously merged into another branch, you should not rebase anymore, because it will cause conflicts during future work.

Finishing Up

Submitting a pull request (PR)

Once you think development is finished on your branch, you should push the branch to the shared repository (if you haven't already), and submit a pull request (PR) for the integration of your feature into master. 

A pull request initiates a process to merge your branch into master. The process will be documented with the subsequent code review via discussion and comments on the PR.

  1. Commit all your changes locally.
  2. Clean up any "checkpoint" commits.   After development is done, you may have commits that look like:
    f8ef2b5f75 fix
    d2e3616d55 progress
    e58c74745f more progress
    0be335ead0 almost working
    We do NOT want these in our commit history.  Use git rebase (see https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) to rewrite the history of your branch and group these in to substantive commits that follow our Commit and PR message template
  3. Push the cleaned-up branch to the github repo.
  4. Go to https://github.com/E3SM-Project/E3SM/branches and find your branch.  Click on  "New pull request".
  5. You should verify base and compare of your pull request are correct.  The "base" is master and "compare" is your branch.
    1. When updating submodules:
      1. submit a PR to submodule's repo first: e.g. a PR to https://github.com/E3SM-Project/kokkos
        1. if you can't submit a PR, request "push" access from the developers of the submodule
      2. after the PR is merged, submit a PR to https://github.com/E3SM-Project/E3SM updating submodule hash
  6. Enter a PR Title:  Make sure the title is 70 characters or less and explains the PR in a "verb noun" format like "add cool new feature". Do not just use the branch name or what is filled in by default.  (If your branch has a single commit, the title of that commit will be the default, but if your branch has multiple commits, the branch name will be the default.)  You should edit the title to make sure it is descriptive enough. DO NOT include github issue #'s or JIRA tasks in the title. Hyperlinking does not work from the title.
  7. Write a PR Description.  
    1. In the "Write" field, provide a descriptive message of what all the commits in the PR will do together.  This description will be used as the commit message when the branch is merged so follow the Commit and PR message template guidelines.
    2. If a bug fix:  after the description, close any Github issue numbers for the bugs this commit fixes using keywords like "Fixes #123".  See https://help.github.com/articles/closing-issues-using-keywords/
    3. After the description and any issue numbers,  add a keyword indicating how the PR might change answers.  
      1. [BFB] or [non-BFB] or [CC]:  Add ONE of these keys to indicate whether this commit changes testing results at roundoff level [non-BFB] or is climate changing [CC]. Use [BFB] if and only if the commit is bit-for-bit and you know all the tests will pass without regenerating baselines.
      2. [FCC]:   Add [FCC] if the commit will change climate if a flag is activated
      3. [NML]:  Add [NML] if the commit introduces changes to the namelist.
    4. The reviewer will review the description as part of the pull request. The description will be used in the commit message used when the PR is merged to next and master.     The integrator may work with you to edit the PR description.  Do not include Confluence URL's in the PR description.
  8. Give PR label(s) ('Label' pull down menu).  Add a label for the component this PR involves.  Add the "bug fix" label if this is a bug fix.  Add the label(s) for BFB, non-BFB, CC, FCC, NML as appropriate.
  9. Assign a single Integrator to manage the PR ( use the 'Assignee' pull down menu)
  10. If you want additional reviewers, add them using the Reviewer pull-down menu.  Anyone can be asked to review.
  11. When complete, click on "Create pull request".  This will start the code review and the process of moving this feature to master.  
  12. After the pull request is created, add the following in the Comments section:
    1. In the first comment, provide a link to the Design Document governing this PR.   See /wiki/spaces/CNCL/pages/25231511.
    2. In subsequent comments:
      1. Provide information to aid the Integrator in running, testing, and validating the feature (but that is too specific to be included in the general PR description). 
      2. Say how the feature was tested.  Example:  "e3sm_developer on Titan passed".  If a test is expected to fail and required redoing baselines, state that and list the tests that fail.

Your PR is not finished until it has been merged to master by the Integrator.
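The local part of these steps (reviewing what the PR will contain, checking the title length, and pushing the branch) might look like the sketch below. A bare repository in a temporary directory stands in for the GitHub remote, and the branch name, file, and title are hypothetical; the pull request itself is still created in the GitHub web interface.

```shell
work=$(mktemp -d)
remote="$work/shared.git"                    # bare repo standing in for GitHub
git -c init.defaultBranch=master init -q --bare "$remote"
git -c init.defaultBranch=master init -q "$work/clone"
cd "$work/clone"
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git remote add origin "$remote"
git commit -q --allow-empty -m "Initial commit"
git push -q origin master

git checkout -q -b joeuser/eam/new-parameterization
printf 'new scheme\n' > scheme.txt
git add scheme.txt
git commit -q -m "Add new parameterization to EAM"

# Commits the PR would contain, relative to master.
git --no-pager log --oneline master..

# Check the planned PR title against the 70-character limit.
title="Add new parameterization to EAM"
if [ "${#title}" -le 70 ]; then
    echo "title length OK (${#title} characters)"
fi

# Push the cleaned-up branch so a pull request can be opened from it.
git push -q -u origin joeuser/eam/new-parameterization
```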

This document can be used to help with pull request related issues.

Integrator Code Review (Phase 3 of /wiki/spaces/CNCL/pages/25231511)

Issuing a PR and review by an integrator is done in Phase 3 of ACME's /wiki/spaces/CNCL/pages/25231511.

Phase 3 code reviews are conducted online on GitHub using comments on the pull request.

Integrator code review steps

  1. Check the github entry for the PR and make sure it has a good title and description, correct labels and a comment with a link to the Design Document.   A PR can not be merged to next or master unless it has a Design Document with Phase 1 and 2 completed. See /wiki/spaces/CNCL/pages/25231511.
  2. Look at the code changes either on github or using:  git log --reverse -p master.. on your checked out copy of the branch.
    1. Does new code hold up to visual inspection for code quality?    Look over code changes for glaring mistakes or code style issues (e.g. useful comments, reasonable subroutine lengths, new code in an existing file follows conventions of that file).
    2. Check to see if the description of the code changes in the PR match the actual changes.  Make sure nothing unrelated to the PR was committed accidentally.
    3. Although they can't be changed, see if commit messages on the branch follow the Commit and PR message template and let the developer know if they do not.  Consider asking the developer to squash commits to clean up the history.
    4. Have tests been added or suggested that exercise this feature?
    5. Does code run on all platforms after integration into next?
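For the local inspection mentioned in step 2, the sketch below shows what the review commands report on a small branch. The scratch repository, file, and commit messages are hypothetical. The git diff --stat line is an additional convenience not named in the steps above; its three-dot form compares the branch against the point where it left master.

```shell
repo=$(mktemp -d) && cd "$repo"
git -c init.defaultBranch=master init -q
git config user.name "Joe User"              # placeholder identity
git config user.email "joeuser@example.com"  # placeholder identity
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b joeuser/eam/new-parameterization
printf 'skeleton\n' > scheme.txt
git add scheme.txt
git commit -q -m "Add scheme skeleton"
printf 'details\n' >> scheme.txt
git commit -q -am "Fill in scheme details"

# Branch commits in the order they were made, oldest first, with patches.
git --no-pager log --reverse -p master..

# Summary of files changed on the branch since it left master.
git --no-pager diff --stat master...HEAD
```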

If there are any problems,