Comment: Remove complicated instructions for updating fork

This document provides details on E3SM development conventions and practices.

NOTE:

This document does not discuss the use of forks. Developers who understand forks and prefer to work from one are welcome to do so.

 

Table of Contents

Introduction

...

important -- Items colored in red are important and should not be ignored.

one time -- Items colored in green are commands to be issued once per machine.

repo once -- Items colored in orange are commands to be issued once per local repository.

...

git checkout -b github-username/component/feature master


Committing new files and changes as you go

While working on a feature or a bug fix, you will be editing source code files. After editing a file, you might want to commit those changes to your local repository in order to checkpoint your work, or track the change you made. By default, git doesn’t track any files in the repository unless they are explicitly added to the repository. In order to add a file to the repository, you can use the following command:

git add path/to/file/in/repo

In addition to causing git to track the file, this will also stage all changes in the file. The next commit that is created will contain all changes in the stage by default. In order to commit only the changes in the staging area, the following command can be used:

git commit

NOTE:   The git add command will only stage the changes in the file at the time of the git add. If the file is subsequently modified after a git add, and then a git commit is issued, the commit will not contain changes between the git add and the git commit.   Use git status frequently to see the status of staged and un-staged files.

git commit will open up an editor where a commit message can be written.   Please refer to the Commit and PR message template for more instructions on the E3SM commit message guidelines.

An alternative way of committing all changes to any files that are currently tracked by the repository is:

git commit -a

This will again open an editor for a commit message to be entered and the Commit and PR message template should again be followed.
 
 



Committing new files and changes as you go

While working on a feature or a bug fix, you will be editing source code files. After editing a file, you might want to commit those changes to your local repository in order to checkpoint your work.  When you make a change that you want to commit you "add" it to the git stage with this command:

git add path/to/file/in/repo

The next commit that is created will contain all changes in the stage by default.  If you create a new file, also use git add to make the repo aware of it.

A commit can have changes from multiple files. In order to commit only the changes in the staging area, the following command can be used:

git commit

NOTE:   The git add command will only stage the changes in the file at the time you issue the command. If the file is subsequently modified you need to run git add again if you want those changes to be committed.   Use git status frequently to see the status of staged and un-staged files.
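The note above can be checked in a throwaway repository. The following is a minimal sketch, not part of the E3SM workflow itself; the file name notes.txt, the identity settings, and the commit message are placeholders chosen for illustration.

```shell
set -e
cd "$(mktemp -d)"                      # scratch directory, safe to delete
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"

echo "first draft" > notes.txt
git add notes.txt                      # stages the file as it is right now
echo "second draft" >> notes.txt       # edited again after staging
git commit -q -m "Add notes file"

# The commit holds only the staged snapshot; the later edit is still un-staged.
committed=$(git show HEAD:notes.txt)
state=$(git status --porcelain)
echo "committed content: $committed"
echo "git status:        $state"
```

Running git add notes.txt again before the commit would have included the second line as well.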

git commit will open up an editor where a commit message can be written.   Please refer to the Commit and PR message template for more instructions on the E3SM commit message guidelines.

It is tempting to do development with lots of little commits like this:

f8ef2b5f75 fix
d2e3616d55 progress 
e58c74745f more progress
0be335ead0 almost working

If you make "checkpoint" commits like this, be prepared to rewrite the commit history using "git rebase" (see https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) so that each commit has a substantive message, as required in the Commit and PR message template.
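As a rough illustration of that cleanup, the sketch below builds four checkpoint commits in a scratch repository and squashes them into one. Normally you would edit the rebase todo list by hand in your editor; here GIT_SEQUENCE_EDITOR and GIT_EDITOR are set only so the example runs unattended. The branch name, file name, and final commit message are hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"

echo "base" > model.txt
git add model.txt
git commit -q -m "Add model file"

git checkout -q -b github-username/component/feature
for msg in "fix" "progress" "more progress" "almost working"; do
    echo "$msg" >> model.txt
    git commit -q -a -m "$msg"         # checkpoint commits with weak messages
done

# Stand-in for editing the todo list by hand: keep the first "pick"
# and turn every later "pick" into "squash".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
    git rebase -q -i master

# Replace the concatenated checkpoint messages with one substantive message.
git commit -q --amend -m "Add tuning entries to model file"
count=$(git rev-list --count master..HEAD)
echo "commits on branch after cleanup: $count"
```

Only rewrite history this way on your own feature branch, and not after it has been merged to next.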

Utilizing the repository history

 

Seeing as all of the E3SM developers are stewards of the repository history, it would be useful for developers to understand how to make use of the history they are maintaining. 

The most useful command for making use of the history is the git log command. This command has an abundance of optional arguments, and this section can be used to get an idea of what you can do with git log.

The basic usage of this command gives the equivalent of an svn log: a list of all commits and their commit messages. By default, git log only shows commits that are reachable in the history of the HEAD. Optional arguments can be used to look at different (or all) branches.

 One extremely useful version of git log gives a command line graph of the history.

 

git log --oneline --graph --decorate

  

git log can also be limited to the local branches whose names match a pattern, for example:

 

 git log --branches='*pattern*'

 

 The --branches option can be replaced with --remotes to only show remote tracking branches that match a pattern.

 

 A specific commit has three pieces of information. The first is the commit’s tree (i.e. the files and directories contained in the commit), the second is the commit’s message (the commit message issued when creating the commit), and the third is a list of the commit’s parents. There can be multiple parents for a single commit, which would imply the commit was a merge commit. Git stores the order of the parents, with the first parent always being the commit the merge was initiated from. This is useful to note, because within our workflow first parent commits on master should always be merge commits bringing in features or bug fixes. git log can be used to narrow your view to only see first parent commits as well, via:

 

git log --first-parent

 

 As you develop a new feature, you might migrate files from one place to another, or even rename the file. In order to have git show you all renames of a file, you can use the following command:

  

git log --name-only --follow -- path/to/file

 Any of these options can be combined to give more flexibility to exploring the history of a git repository and make the history more useful.

Advanced: Incorporating another branch on to your branch.

When developing a new feature, another developer might have created a feature you require for your work, or another developer might have made large changes to the interfaces you make use of. This section will describe how to incorporate those changes into your branch.


NOTE: 
This action can have negative consequences and cause issues in the future, so it is important to know what you’re doing prior to doing any of these.

Cherry-pick method

The first option for incorporating changes from another development line into your development history is via a cherry-pick. A cherry-pick will copy individual commits (or a range of commits) from one point in history onto the HEAD of your current development line. It should be used when the number of commits your future development efforts depend on is small (on the order of 1 to 10). It can be used as follows:

git cherry-pick <commit-ish>


This method is the easiest to use and one of the least likely to create issues in future development, and allows the most flexible review modifications. One downside to this method, however, is that the commits that are cherry-picked will now occur twice in the history.
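The duplication mentioned above is easy to observe in a scratch repository. This sketch assumes two hypothetical branches, one holding a helper commit from another developer and one holding your feature work.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Another developer's branch with the commit we depend on.
git checkout -q -b other-developer/component/helper
echo "helper" > helper.txt
git add helper.txt
git commit -q -m "Add helper routine"
picked=$(git rev-parse HEAD)

# Our own feature branch, started from master.
git checkout -q -b github-username/component/feature master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature work"

git cherry-pick "$picked"              # copies the commit onto our HEAD
copy=$(git rev-parse HEAD)

# The same change now exists as two distinct commits in the repository.
copies=$(git log --all --format=%s | grep -c "^Add helper routine$")
echo "original=$picked copy=$copy copies=$copies"
```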

Merge Method

The second option for incorporating changes into your development history is via a merge. A merge will attempt to merge an arbitrary number of other commits (or branches) into your current branch. It can be used as follows:

git merge [commit-ish1] [commit-ish2] …


In this case, you can list however many commits you want on this line, and git will attempt to merge them all into the branch you are currently working on.

The merge will allow you to enter a commit message. The commit message should be very descriptive. It should explain what you’re merging, and why you’re merging it.

NOTE:

A merge creates a fixed point in history. The merge commit fixes all history before it, which limits possibilities for cleaning up history during a code review. A merge should be avoided if at all possible, but in some cases it is necessary. If it is necessary, try to clean up all history prior to the merge before beginning the merge.
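A minimal sketch of such a merge in a scratch repository follows. The branch names and the wording of the merge message are illustrative only; the point is that the message says what is merged and why, and that the result is a commit with two parents.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b other-developer/component/helper
echo "helper" > helper.txt
git add helper.txt
git commit -q -m "Add helper routine"

git checkout -q -b github-username/component/feature master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature work"

# Say what is being merged and why it is needed.
git merge -q --no-ff \
    -m "Merge other-developer/component/helper into feature branch" \
    -m "The feature calls the helper routine introduced on that branch." \
    other-developer/component/helper

parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $parents parents"
```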

Rebase Method

The third and final option for incorporating changes into your development history is via a rebase. A rebase allows you to modify commits. This includes operations such as squashing, re-ordering, and deleting commits. When you create a branch, the base of the branch is the commit you branched off of. A rebase allows you to change what the base of your branch is. The following diagram can be used to visualize a rebase:


[Image: diagram of a rebase]


When performing a rebase, you specify the new base for your work. Rebase will then replay your work onto the new base. For example:

git rebase -i new-base

This will replay the history of the current branch on top of new-base. The -i flag in this case allows interactive modification of the commits to replay. It should be used to verify what the rebase is going to do before it happens.
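The effect of changing the base can be sketched in a scratch repository. The example below omits -i so that it runs unattended; in real use the interactive form shown above lets you inspect the commits first. The branch name feature and the file names are placeholders.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
old_base=$(git rev-parse HEAD)

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# master moves on after the branch was created.
git checkout -q master
echo "update" > update.txt
git add update.txt
git commit -q -m "Update master"

# Replay the feature commit on top of the new tip of master.
git checkout -q feature
git rebase -q master

new_base=$(git merge-base master feature)
echo "old base: $old_base"
echo "new base: $new_base"
```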

...

A pull request initiates a process to merge your branch into master. The process will be documented with the subsequent code review via discussion and comments on the PR.

  1. Commit all your changes locally.
  2. Clean up any "checkpoint" commits.   After development is done, you may have commits that look like:
    f8ef2b5f75 fix
    d2e3616d55 progress
    e58c74745f more progress
    0be335ead0 almost working
    We do NOT want these in our commit history.  Use git rebase (see https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) to rewrite the history of your branch and group these into substantive commits that follow our Commit and PR message template.
  3. Push the cleaned-up branch to the github repo.
  4. Go to https://github.com/E3SM-Project/E3SM/branches and find your branch.  Click on  "New pull request".
  5. You should verify that the base and compare of your pull request are correct.  The "base" is master and "compare" is your branch.
    1. When updating submodules:
          1. submit a PR to the submodule's repo first: e.g. a PR to https://github.com/E3SM-Project/kokkos
            1. if you can't submit a PR, request "push" access from the developers of the submodule
          2. after the PR is merged, submit a PR to https://github.com/E3SM-Project/E3SM updating the submodule hash
      1. Enter a PR Title:  Make sure the title is 70 characters or less and explains the PR in a "verb noun" format like "add cool new feature". Do not just use the branch name or what is filled in by default.  (If your branch has a single commit, the title of that commit will be the default, but if your branch has multiple commits, the branch name will be the default.)  You should edit the title to make sure it is descriptive enough. DO NOT include github issue #'s or JIRA tasks in the title. Hyperlinking does not work from the title.
      2. Write a PR Description.  
        1. In the "Write" field, provide a descriptive message of what all the commits in the PR will do together.  This description will be used as the commit message when the branch is merged so follow the Commit and PR message template guidelines.
        2. If a bug fix:  after the description, close any Github issue numbers for the bugs this commit fixes using keywords like "Fixes #123".  See https://help.github.com/articles/closing-issues-using-keywords/
        3. After the description and any issue numbers,  add a keyword indicating how the PR might change answers.  
          1. [BFB] or [non-BFB] or [CC]:  Add ONE of these keys to indicate whether this commit is bit-for-bit [BFB], changes testing results at roundoff level [non-BFB], or is climate changing [CC]. Use [BFB] if-and-only-if the commit is bit-for-bit and you know all the tests will pass without regenerating baselines.
          2. [FCC]:   Add [FCC] if the commit will change climate if a flag is activated
          3. [NML]:  Add [NML] if the commit introduces changes to the namelist.
        4. The reviewer will review the description as part of the pull request. The description will be used in the commit message when the PR is merged to next and master.     The integrator may work with you to edit the PR description.  Do not include Confluence URLs in the PR description.
      3. Give PR label(s) ('Label' pull down menu).  Add a label for the component this PR involves.  Add the "bug fix" label if this is a bug fix.  Add the label(s) for BFB, non-BFB, CC, FCC, NML as appropriate.
      4. Assign a single Integrator to manage the PR ( use the 'Assignee' pull down menu)
      5. If you want additional reviewers, add them using the Reviewer pull-down menu.  Anyone can be asked to review.
      6. When complete, click on "Create pull request".  This will start the code review and the process of moving this feature to master.  
      7. After pull request is created, add  the following in the Comments section:
        1. In the first comment, provide a link to the Design Document governing this PR.   See /wiki/spaces/CNCL/pages/25231511.
        2. In subsequent comments.  
          1. Provide information to aid the Integrator in running, testing, and validating the feature (but that is too specific to be included in the general PR description). 
          2. Say how the feature was tested.  Example:  "e3sm_developer on Titan passed".  If a test is expected to fail and required redoing baselines, state that and list the tests that fail.

      Your PR is not finished until it has been merged to master by the Integrator.

      This document can be used to help with pull request related issues.

      ...

      1. Check the github entry for the PR and make sure it has a good title and description, correct labels and a comment with a link to the Design Document.   A PR can not be merged to next or master unless it has a Design Document with Phase 1 and 2 completed. See /wiki/spaces/CNCL/pages/25231511.
      2. Look at the code changes either on github or using:  git log --reverse -p master.. on your checked out copy of the branch.
        1. Does new code hold up to visual inspection for code quality?    Look over code changes for glaring mistakes or code style issues (e.g. useful comments, reasonable subroutine lengths, new code in an existing file follows conventions of that file).
        2. Check to see if the description of the code changes in the PR match the actual changes.  Make sure nothing unrelated to the PR was committed accidentally.
        3. Although they can't be changed, see if commit messages on the branch follow the Commit and PR message template and let the developer know if they do not.  Consider asking the developer to squash commits to clean up the history.
        4. Have tests been added or suggested that exercise this feature?
        5. Does code run on all platforms after integration into next?

      ...

      If they do happen to run into this situation, you may be requested to help with the process. The easiest way to help is to create a new branch at the HEAD of the branch your pull request was submitted from. This branch should have the same name as the other branch with -resolved appended to the end. After the branch is created, you can merge E3SM-Project/E3SM/master into it, and push the resolved version onto the shared repository. This gives the reviewer a version of the code that is merged and resolved, but allows the reviewer to ensure the history maintains the standards.

       


      Keeping your fork up-to-date.

      You will need to keep the master branch within your fork of E3SM up to date with the main version.  Update as often as you want, but always before you start new development or if you want to try a test PR on your fork.

      We'll refer to the version of E3SM at https://github.com/E3SM-Project/e3sm as the "upstream" version.

      First you'll need to add the upstream repo as a remote in a local clone of your fork.

      Go to any local clone of your fork and do the following:

      ...

      Code Block
      MacBook-Air.local[104]: git remote -v
      origin	git@github.com:rljacob/E3SM.git (fetch)
      origin	git@github.com:rljacob/E3SM.git (push)
      upstream	git@github.com:E3SM-Project/E3SM.git (fetch)
      upstream	git@github.com:E3SM-Project/E3SM.git (push)

      Now you can proceed to update your local copy of master.

      First fetch the branches (including master) from the upstream

      git fetch upstream

      Check out your fork's version of master.

      git checkout master

      Merge in the upstream version of master

      git merge --ff-only upstream/master

      There should be NO conflicts because you should never add separate PR's to your own fork's version of master.  Your fork's master should only be changed by this procedure.

      Finally, push your updated master to your fork on github

      ...


       


      Never use the github automerge!   DO NOT PRESS THE GREEN BUTTON!

       



      Merges to master will be done locally by integrators.  See Integrator Guide for more info.

      Additional Topics


      General information for developers.

      The list of "nevers":

      • Never routinely merge from master to your feature branch.  See below for more info.
      • Never use "cherry-pick" on "next"
      • Never do production simulations using "next"
      • Never commit directly to "master" or "next".  Only commit to your feature branch.
      • Never rebase commits on your branch after it is merged to next (unless told to do so by integrator).
      • Never start new development from "next".

      Changing the url for your remote

      After the rename of ACME-Climate/ACME to E3SM-Project/E3SM, you should change the URL of "origin" in each clone.

      First verify that "origin" is still pointing to ACME-Climate/ACME.  It will unless you changed it yourself but you can check by running "git remote -v".

      You'll see output like:

      origin git@github.com:ACME-Climate/ACME.git (fetch)
      origin git@github.com:ACME-Climate/ACME.git (push)

      Change the URL for origin with this command:

      git remote set-url origin git@github.com:E3SM-Project/E3SM.git
      Verify with "git remote -v":

      origin git@github.com:E3SM-Project/E3SM.git (fetch)
      origin git@github.com:E3SM-Project/E3SM.git (push)

      If you have clones of other repositories that were on ACME-Climate, you'll need to update their URLs as well with a similar command:
      git remote set-url origin git@github.com:E3SM-Project/<repo name>.git
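The steps above can be rehearsed without network access, since changing a remote URL only edits the local repository configuration. This sketch uses an empty scratch repository in place of a real clone.

```shell
set -e
cd "$(mktemp -d)"
git init -q .                          # stands in for an existing clone

# Start from the old URL, as a clone made before the rename would have.
git remote add origin git@github.com:ACME-Climate/ACME.git
before=$(git remote get-url origin)

# Point origin at the renamed repository.
git remote set-url origin git@github.com:E3SM-Project/E3SM.git
url=$(git remote get-url origin)

git remote -v                          # shows the fetch and push URLs
echo "before: $before"
echo "after:  $url"
```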

      Working with Remote repositories

       

      Git is a distributed content versioning system (DCVS). When you clone a repository, you make a local repository that is essentially a backup of whatever you cloned. When you commit, the commit only occurs in your local repository. In order to allow other people to make use of the commits you make, or to get your commits incorporated into master, they need to be “pushed” to a remote. 

      A remote is a repository that you would like to communicate with, that is not in the local working directory. When you clone a repository, a default remote is created named “origin” that points to whatever you cloned. A remote is really just an alias for the location of another repository you want to read from or write to. 

      Before communicating with remotes, you might want to add or remove remotes. In order to add and remove remotes the git remote command can be used. The two uses are as follows:

      git remote add remote-name protocol:address/to/repo # Creates a remote


      git remote remove remote-name # Removes a remote

        

      In order to communicate with remotes, there are three actions: push, pull, and fetch.

      The push command can be used to write some commits from your local repository to a remote repository. For example: 

      git push <remote-name> <local-branch>:<remote-branch>


      can be used to write a local branch to a remote branch on the remote pointed to by the name provided. An explicit example would be: 

      git push E3SM-Project/E3SM feature:github-username/component/feature

       

      In this case, we’re writing the local branch feature to the remote named E3SM-Project/E3SM, where it will be called github-username/component/feature.

       git push is the command used when writing history to remotes. In order to read history from remotes, the git pull and git fetch commands can be used.

       The git fetch command will update the local repository with the most recent history from a specific remote, without modifying any local branches in the repository. It can be used as follows: 

      git fetch <remote-name>

        

      The git pull command is a combination of two operations. The first is a git fetch in order to update the history in the local repository of the branches that exist on the remote repository. The second is a git merge, which merges the fetched changes into the current branch. Use it only if you know what you’re doing, because it can easily cause unexpected problems. It can be used as follows:


      git pull <remote-name> <branch-name>
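The push and fetch commands can be tried safely against a local bare repository standing in for the shared one. In this sketch the directory names shared.git and work, the remote name origin, and the branch names are all assumptions made for the example.

```shell
set -e
root=$(mktemp -d)
cd "$root"
git init -q --bare shared.git          # stands in for the shared repository
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Developer"       # placeholder identity
git config user.email "developer@example.com"
git remote add origin "$root/shared.git"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q origin master

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Write local branch "feature" to the remote under a different name.
git push -q origin feature:github-username/component/feature

# Refresh remote-tracking branches without touching any local branch.
git fetch -q origin

local_tip=$(git rev-parse feature)
remote_tip=$(git -C "$root/shared.git" rev-parse refs/heads/github-username/component/feature)
tracked_tip=$(git rev-parse origin/github-username/component/feature)
echo "local=$local_tip remote=$remote_tip tracked=$tracked_tip"
```

The fetch leaves the local feature branch exactly where it was, which is the difference from git pull described above.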


      Keeping your fork up-to-date.

      You will need to keep the master branch within your fork of E3SM up to date with the main version.  Update as often as you want, but always before you start new development or if you want to try a test PR on your fork.

      Use the "Sync" button on your fork.


      Avoid Routine Merges From Master

      ...

      • "frequent pulling of the \[master] into a development branch will add a certain amount of randomness to that branch; this randomness is not particularly helpful for somebody who is trying to get a feature working. It also increases the chances that another developer who ends up in the middle of the series while running a bisect operation will encounter unrelated bugs."
      • A branch has a specific purpose. A topic branch 'add-frotz' would be about adding a new 'frotz' feature and shouldn't do anything else.  When you merge from master, you declare that all the other unrelated changes done on 'master' in preparation for the next release somehow bring 'add-frotz' closer to the goal of the 'frotz' topic. That is usually not true.
      • Unnecessary merges and similar repository clutter reduces the ability to summarize, audit, notice bugs in review, and find bugs after the fact.  Keeping clean history is not difficult.  It requires a little bit of discipline on the part of integrators and developers, but it's a small price to pay for the time saved and improved quality/reliability.

      There are some cases where merging from master (or better, a tagged version of master) can make sense.  The rule of thumb for any merge is that if you can clearly describe what you are merging and why that merge is necessary for your branch to be completed, then it's fine to merge.    For example, you might need a feature on master (some crucial functionality, not just build system updates) to complete the feature on your branch.   Integrators may ask you to merge from master to help resolve conflicts during a PR integration.  But before merging from master, try one of the other ways to get features from other people's development described in "Advanced: Incorporating another branch on to your branch".

      ...