2016-06-09 ACME All-Hands Speed-Dating Sessions

Please take notes from the questions and discussions and post them on this page.

| Session Time | Duration | Start - End Time | Session / Confluence Page | Plenary Room | BreakOut #1 | BreakOut #2 | Poster Room |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Thursday, Jun 9th, 2016 | | | | | | | |
| | ~ 3h | 8:00 - 11:05 am | Cross-Group Speed Dating. Session Chair: Shaocheng Xie | | | | |
| | 30 min + 5 min change rooms | 8:00 - 8:30 am | | A-P: Atmosphere - Performance Groups | W-L: Workflow - Land Groups | O-S: Ocean/Ice - SE/CPL Groups | Posters Take Down |
| | 30 min + 5 min change rooms | 8:35 - 9:05 am | | A-L: Atmosphere - Land Groups | W-S: Workflow - SE/CPL Groups | O-P: Ocean/Ice - Performance Groups | |
| | 30 min + 5 min change rooms | 9:10 - 9:40 am | | A-S: Atmosphere - SE/CPL Groups | W-O: Workflow - Ocean/Ice Groups | L-P: Land - Performance Groups | |
| | 20 min | 9:40 - 10:00 am | Break | | | | |
| | 30 min + 5 min change rooms | 10:00 - 10:30 am | | A-O: Atmosphere - Ocean/Ice Groups | L-S: Land - SE/CPL Groups | P-W: Performance - Workflow Groups | |
| | 30 min + 5 min change rooms | 10:35 - 11:05 am | | A-W: Atmosphere - Workflow Groups | L-O: Land - Ocean/Ice Groups | P-S: Performance - SE/CPL Groups | |

 

Workflow Group Notes:

Workflow - Land

  • EDEN: developed for CESM; exists at ORNL but is not currently being worked on. Need to scope the funding for it and see if it can be worked on as part of Workflow.
  • LMWG Diagnostics should be accessible through the web browser. Brian Smith left and not all diagnostics were implemented; someone needs to check if the existing diagnostics 
  • ILAM: Bitbucket repo, open source, no license yet, has a DOI. Would like to integrate ILAM into Workflow and Classic Viewer.
  • Land: users need to be able to add new diagnostics and have them displayed by default for that user; a description of this is needed in the Documentation space.
  • Let's have a Documentation area in the Workflow space and aggregate all docs for running diagnostics there.
  • Peter Thornton would like to get the URL for the diagnostics from the Workflow group, even if it is not complete, to play with what is there.
  • Workflow needs obs data from the Land group. The Land group will point Workflow to the Subversion repo that holds the data.
  • Land does not have a way to check whether the obs are CF compliant. Workflow should check; Land will then correct what's needed.
  • Charlie Zender -- there is a CF checker called quick_and_dirty_cf_compliance_checker in the PreAndPostProcessingScripts repository (look in utils subdirectory) created by Susannah Burrows and Phil Rasch (pnl.gov).
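
As an illustration of the kind of check discussed above, here is a minimal Python sketch. It is not the quick_and_dirty_cf_compliance_checker script and not a full CF checker. It only looks for a `units` attribute and for either `standard_name` or `long_name` on each variable. The function name and the plain-dictionary input are assumptions made for the example; dimensionless variables may legitimately lack `units`, so expect false positives.

```python
def missing_cf_attributes(variables):
    """Report variables that lack a few commonly expected CF attributes.

    `variables` maps a variable name to a dict of its attributes
    (hypothetical input format, chosen to keep the sketch dependency-free).
    """
    problems = {}
    for name, attrs in variables.items():
        missing = []
        if "units" not in attrs:
            missing.append("units")
        # CF recommends that a variable carry at least one of these two.
        if "standard_name" not in attrs and "long_name" not in attrs:
            missing.append("standard_name or long_name")
        if missing:
            problems[name] = missing
    return problems
```

The attribute dictionaries could be read from the obs files with whatever netCDF reader is already in use; the result lists what the Land group would need to correct.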

Workflow - SE/CPL

  • SE/CPL originally used cdash for nightly tests. It was flushing the tests after a night, but cdash corrected that; results are displayed on the dashboard and it works very well. Workflow is also happy with cdash.
  • Does ctest work with batch? Might need an additional script.
  • Use Jenkins
  • Workflow is working on pushing tests to the dashboard and using Jenkins.
  • SE/CPL: 'create-test' is custom and comes with the system; it is being reworked to run tests in parallel (>70 system tests). UVCDAT has 600 tests.
  • Workflow uses 'ctest' and cmake; Andy thinks they will work on that for ACME in the summer.
  • cdash can now capture timing info and flag "fail" if a test takes longer than before.
  • Kitware is working on buildbot and GitHub integration.
  • We could have our own cdash server for more customization.
  • SE: every test has its own executable. Anshu Dubey says 'flashtest' also has a dashboard (reference to the FLASH paper:

    http://onlinelibrary.wiley.com/doi/10.1002/spe.2220/abstract;jsessionid=5702A8A123ACBCE707C0721C192D9DB2.f04t03?userIsAuthenticated=false&deniedAccessCustomisedMessage= )

  • ACME license discussion: it will be modified BSD. Need a discussion on how to license the data as well, and on restrictions; also need to settle on who holds the copyright.
  • Workflow is moving to Anaconda (python package/container manager) and Docker containers for both UV-CDAT and ESGF. SE is looking at Docker. You can run Docker using MPI on HPC. Need the right Linux distro.
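
The timing check mentioned above (flag "fail" if a test takes longer than before) can be sketched in a few lines of Python. This is only an illustration of the idea, not how cdash implements it. The function name, the dictionary inputs, and the 10% tolerance are assumptions made for the example.

```python
def flag_timing_regressions(previous, current, tolerance=0.10):
    """Return the names of tests that ran noticeably slower than before.

    `previous` and `current` map test names to wall-clock seconds.
    A test is flagged when its current time exceeds the previous time
    by more than `tolerance` (a fraction; 0.10 is an assumed default).
    """
    slower = []
    for name, seconds in current.items():
        baseline = previous.get(name)
        if baseline is None:
            continue  # no timing history for this test yet
        if seconds > baseline * (1.0 + tolerance):
            slower.append(name)
    return sorted(slower)
```

A tolerance is used because run times on shared machines vary from night to night; the right value would have to be tuned per machine.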

Workflow - Ocean/Ice

  • Ocean wants to use Anaconda/Conda. Milena asks how to use Classic Viewer.
  • UV-CDAT is now built through Anaconda; the Ocean analysis package is installable through, and sits under, Anaconda.
  • Ashish: there is Anaconda and Conda; Conda is only for installation.
  • Ocean: they do not have a name for the package yet; it will be a package in Anaconda.
  • Ocean will have their interactive Python notebooks in the repo as scripts; they will then be exported and distributed through Anaconda.
  • Ocean does not have it in a repo yet; they are working on it.
  • The Ocean package will distribute both notebooks and Python scripts.
  • Workflow asked if Ocean would like to leverage Classic Viewer to display the analysis plots.
  • Samuel Fries presented a more generic viewer. Todd would like a database on top of every analysis created and run through the viewer, to keep track of the different runs and analyses as a searchable database ('cinema', through Kitware and funded through Oscar, is very similar).
  • Obs data: they do not have it in a repo yet; not a lot of data so far, ~ a GB (Ashish: do not use GitHub for that then).
  • Rob should be able to help put the obs into the repo.
  • Publishing Ocean analysis data to ESGF: use the triage hub for publication needs.
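
To make Todd's request for a searchable database of analysis runs more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, the columns, and the helper functions are all hypothetical; nothing here reflects the actual viewer or 'cinema'.

```python
import sqlite3


def open_run_db(path=":memory:"):
    """Open (or create) a small database of analysis runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analysis_runs ("
        " id INTEGER PRIMARY KEY,"
        " case_name TEXT NOT NULL,"
        " analysis TEXT NOT NULL,"
        " plot_path TEXT NOT NULL,"
        " created TEXT NOT NULL)"
    )
    return conn


def record_run(conn, case_name, analysis, plot_path, created):
    """Store one analysis run; `created` is an ISO date string."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO analysis_runs (case_name, analysis, plot_path, created)"
            " VALUES (?, ?, ?, ?)",
            (case_name, analysis, plot_path, created),
        )


def find_runs(conn, text):
    """Return runs whose case name or analysis name contains `text`."""
    pattern = f"%{text}%"
    rows = conn.execute(
        "SELECT case_name, analysis, plot_path FROM analysis_runs"
        " WHERE case_name LIKE ? OR analysis LIKE ? ORDER BY created",
        (pattern, pattern),
    )
    return rows.fetchall()
```

The viewer could call something like `record_run` each time an analysis is produced, and a search box could sit on top of `find_runs`.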

Workflow - Performance

  • Provenance is implemented in CIME. It fires up along with the run, so even if the job dies you will get something. It creates world read/writable output, but some things are not, so someone has to clean it up (this only works at ORNL, but they will clean it up on Lustre, so it does need to be archived). On NERSC and Mira the users' case and run directories need to be cleaned up manually.
  • Ben parses this data and puts it into a database. Mark: can we then delete everything else? Pat: this is not sufficient, there is more info there.
  • Pat: you want stuff I am not collecting, and I am collecting stuff you may not need.
  • Send Pat an email on Confluence to update the documentation on provenance.
  • There are timers: profile timers per component (talk to Noel Keen about it). There are log files per simulated day, captured in the coupler. Profile timers are summaries: one text log file in the case directory, which only exists if the job finishes. While the job is running you need to look at the coupler files in the run directory to monitor the run.
  • Monitor the log file: if it does not update for some time (depending on the run this could be a minute; 15 min should always be enough), we could notify people after 15 min so they can kill the job, instead of waiting for it to die by itself because it is hung and wasting the time allocation. There could be an external job that monitors the run, so it can kill the job even if the job is hung.
  • For the future: resubmitting a job as a capability in CIME, requesting more nodes and having many jobs, so that you can launch another job after one dies without waiting in the queue again.
  • There are other log files to look at, 5 or 6 that log the running job; stderr and stdout should also be looked at.
  • Provenance on how long it takes to compile and to sit in a queue is all collected in the same place. Some is written at the beginning of the job, some at checkpoints, and some at the end of the job when it is done. All LCFs have priority jobs and may also keep track of the allocations.
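
The log-monitoring idea above (notify someone when a log has not been updated for 15 minutes) could be prototyped roughly as follows. This is a sketch under stated assumptions, not a CIME feature: the function names, the polling interval, and the `notify` callback are hypothetical, and it relies only on the log file's modification time.

```python
import os
import time

STALE_AFTER_SECONDS = 15 * 60  # the 15 min threshold suggested in the notes


def seconds_since_update(log_path, now=None):
    """Seconds elapsed since the log file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(log_path)


def is_stalled(log_path, stale_after=STALE_AFTER_SECONDS, now=None):
    """True when the log has not been touched for longer than `stale_after`."""
    return seconds_since_update(log_path, now) > stale_after


def watch(log_path, notify, poll_seconds=60, stale_after=STALE_AFTER_SECONDS):
    """Poll the log until it goes stale, then call `notify` once with a message."""
    while not is_stalled(log_path, stale_after):
        time.sleep(poll_seconds)
    notify(f"{log_path} has not been updated for more than {stale_after} s; "
           "the job may be hung")
```

Because the watcher runs outside the model job, it keeps working even when the job itself is hung; `notify` could send an email or, if that is wanted, trigger a kill of the job.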

Workflow - Atmosphere

  • Classic Viewer runs in ORNL only for now, we have a viewer that you can download to your machine to look at the diagnostics
  • Kate Evans has some different diagnostics that should also be displayed in the Classic Viewer; contact John to add them to the viewer.
  • Sam's offline viewer can display other data; it needs a JSON file to 
  • Chris Golaz: we will have 200 years of data on Edison next week. It would be great if Workflow took that to exercise the diagnostics on it and show others how to work with it; document on Confluence.
  • Kate has a top 10 plots to do for coupled runs that need to be added and created for the Edison run; these are Python scripts (contact Marcia Branstetter).
  • The colormap is now flexible in UV-CDAT. Workflow's priority is to make Chris happy; then they will check with Atmosphere for next priorities. Chris can be the person to interface between Atmosphere and Workflow for setting next priorities.
  • Chris likes the plots to be at NERSC so that he can share them with others. Dean: we need to have Classic Viewer at NERSC.
  • Phil: it does not need to go to NERSC; it can be done in other ways.
  • Classic Viewer is at ORNL. metadiags will automatically push the data and plots to CADES and enable Classic Viewer to display them; anyone can then view them. Moving data is part of the script and is done automatically. Kate: there is an authentication issue.
  • Question on checking if the data is good enough to share with others, Chris is doing it on NERSC, he creates climo files and uses AMWG to display results, then shares with others through NERSC. We would then use metadiags to move the data to CADES and publish data to ESGF.
  • There is a webinar and links to other online documentation on how to use publishing services on Workflow, see – /wiki/spaces/WORKFLOW/pages/66519289

 

 

Atmosphere Group Notes

/wiki/spaces/ATM/pages/71336556

Atmosphere-Land Notes

  • Anthropogenic non-CO2 emissions: Capture the workflow. J-F Lamarque has done this for iESM cases in the past. How do we do this moving forward?

    • Phil R.: irregular grids make this more complicated. Interpolation on the fly supported in the atm code, but not well documented, maybe not always conservative. Could make this more rigorous.

    • Phillip C-S.: Haven't paid enough attention to new scenarios for ACME. Need more attention here. Maintain compatibility with CESM in terms of file formats, etc.

    • Andy Jones: what is happening for CMIP6 to translate the emissions data for new scenarios?
  • Ruby: what is the requirement for LULC?
    • Peter: we need to look at the way LULC datasets are used in transient SP cases.
  • Latent heat of fusion and ice
    • Peter Thornton: land will add this as a task to make sure we know what the land is doing with these fluxes. Compare with atmosphere.
  • clear-sky albedo
    • Phil R. is working on this, will get information to the land group for next step sensitivity testing.
    • Shaocheng Xie: is this a high priority for V1?
  • Isotopes:
    • Bill R.: oxygen isotopes and deuterium are in BeTR and CLM5, are we bringing it into ACME?
    • Peter: Also we have carbon isotopes currently active in 
  • Precip over topography:
    • Both river flow and high-resolution precip datasets can be useful in evaluating the model precip.

Land-SE

Land needs a way to work with external libraries.   Some will have joint development within ACME.   We don't want a ton more code in our ACME git repo.   On the other hand, making a separate library means a separate build system and some of these need to be re-built daily with development.   Possibly multiple compilers (for Fortran-C interfacing)

Under iESM code base, coupling between GCAM and ALM is not in coupler.  GCAM is not really "of" the land because it provides emissions over ocean and can be run in a case without land.

Land code is getting convoluted in terms of what can be switched on and off.   Anshu:  standardize your API between the pieces.   Build system can help.

Also land is having programming style divergence.  Makes it hard to read.   Need to stick to one style.   Land should enforce internally until there is an ACME-wide style guide.

mksurfdat is getting slow.

SE needs ne16 and even ne8 versions of all IC/BC files for fast-running tests.

SE-Ocean:

Andy summarized the CMDV-software proposal.

3D field transfer is on critical path for v2 ocean-ice plans.

metis offline vs. online.  Would be good to have a "dumb" mode for online metis.  Doing everything metis can do online would require lots of MPAS mods so still want offline metis files.

MPAS uses buildbot for testing.  Still need to work on code coverage.

Performance Group Notes

/wiki/spaces/PERF/pages/73793546