...

  • SE/CPL originally used CDash for nightly tests. Results used to be flushed after each night, but CDash corrected that; the dashboard display now works very well. Workflow is also happy with CDash.
  • Does ctest work with batch systems? It might need an additional script.
  • Use Jenkins
  • Workflow is working on pushing test to dashboard and using Jenkins
  • SE/CPL - 'create-test' - it is custom and comes with the system; it is being reworked to run tests in parallel (>600 tests, >70 of them system tests). UV-CDAT has 600 tests.
  • Workflow uses ctest and CMake; Andy thinks they will work on that for ACME in the summer.
  • CDash can now capture timing info and flag a "fail" if a test takes longer than before.
  • Kitware is working on Buildbot and GitHub integration.
  • We could have our own cdash server for more customization.
  • SE - every test has its own executable. Anshu Dubey says 'FlashTest' also has a dashboard (see the FLASH paper:

    http://onlinelibrary.wiley.com/doi/10.1002/spe.2220/abstract;jsessionid=5702A8A123ACBCE707C0721C192D9DB2.f04t03?userIsAuthenticated=false&deniedAccessCustomisedMessage= )

  • ACME license discussion - it will be modified BSD. We also need a discussion on how to license the data and what restrictions apply, and need to settle on who holds the copyright.
  • Workflow is moving to Anaconda (a Python package/environment manager) and Docker containers for both UV-CDAT and ESGF. SE is looking at Docker. You can run Docker with MPI on HPC; you need the right Linux distro.
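The ctest-under-batch question above could be answered with a small wrapper script. A minimal sketch in Python, assuming a Slurm scheduler; the `sbatch` options and the build-directory path are illustrative, not a settled ACME convention:

```python
import subprocess
from textwrap import dedent

def make_ctest_batch_script(build_dir, minutes=60, nodes=1):
    """Generate a Slurm batch script that runs the ctest suite in build_dir.

    The scheduler options are illustrative; adjust them for the target
    machine's queues and accounts.
    """
    return dedent(f"""\
        #!/bin/bash
        #SBATCH --nodes={nodes}
        #SBATCH --time={minutes}
        cd {build_dir}
        # adding -D Nightly here would also submit the results to CDash
        ctest --output-on-failure
        """)

def submit(script_text):
    """Pipe the generated script into sbatch (illustrative)."""
    return subprocess.run(["sbatch"], input=script_text, text=True).returncode
```

Generating the script as text keeps the scheduler-specific part in one place, so swapping Slurm for another batch system only means changing the header lines.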

Workflow - Ocean/Ice

  • Ocean wants to use Anaconda/Conda; Milena asks how to use the Classic Viewer.
  • UV-CDAT is now built through Anaconda; the Ocean analysis package is installable through, and sits under, Anaconda.
  • Ashish: there is Anaconda and there is Conda; Conda is only for installation.
  • Ocean - they do not have a name for the package yet; it will be a package in Anaconda.
  • Ocean will have interactive Python notebooks kept in the repo as scripts; these will then be exported and distributed through Anaconda.
  • Ocean does not have it in a repo yet; they are working on it.
  • The Ocean package will distribute both notebooks and Python scripts.
  • Workflow asked if Ocean would like to leverage the Classic Viewer to display the analysis plots.
  • Samuel Fries presented a more generic viewer. Todd would like a database on top of every analysis created and run through the viewer, to keep track of the different runs and analyses as a searchable database ('Cinema' from Kitware, funded through Oscar, is very similar).
  • obs data - it is not in a repo yet; there is not a lot of data so far (~1 GB) (Ashish - do not use GitHub for that, then).
  • Rob should be able to help to put the obs into the repo
  • Publish Ocean analysis data to ESGF; use the triage hub for publication needs.

Workflow - Performance

  • Provenance is implemented in CIME and fires up along with the run, so even if the job dies you will get something. It creates world-readable/writable output, but some things are not, so someone has to clean it up. (This currently only works at ORNL, where Lustre gets purged, so the provenance does need to be archived; on NERSC and Mira the users' case and run directories need to be cleaned up manually.)
  • Ben parses this data and puts it into a database. Mark - can we then delete everything else? Pat - no, that is not sufficient; there is more info there.
  • Pat: you want stuff I am not collecting, and I am collecting stuff you may not need.
  • Send Pat email; the provenance documentation on Confluence needs updating.
  • There are timers - profile timers per component (talk to Noel Keen about them). Per-simulated-day timings are captured in the coupler log files. Profile timers are summaries: one text log file in the case directory, which only exists if the job finishes. While the job is running you need to look at the coupler files in the run directory to monitor it.
  • Monitor the log file: if it does not update for some time (depending on the run this could be a minute, but 15 minutes should always be enough), we could notify people so they can kill the job instead of waiting for it to die by itself because it is hung, wasting the time allocation. An external job could monitor the run and kill the job even if the job itself is hung.
  • For the future - resubmitting a job as a capability in CIME, requesting more nodes and queueing many jobs, so that you can launch another job after one dies without waiting in the queue again.
  • There are other log files to look at - 5 or 6 that log the running job; also look at stderr and stdout.
  • Provenance on how long it takes to compile, sit in a queue, etc. is all collected in the same place; some is written at the beginning of the job, some at checkpoints, and some at the end when the job is done. All LCFs have priority jobs and may also keep track of the allocations.
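The watchdog idea above (notify or kill once the log goes quiet) could be an external script like the following. A minimal sketch in Python; the 15-minute threshold comes from the discussion, while the log path, poll interval, and Slurm's `scancel` are illustrative assumptions:

```python
import os
import subprocess
import time

STALL_THRESHOLD_S = 15 * 60  # per the discussion, 15 minutes should always be enough

def is_stalled(log_mtime, now, threshold_s=STALL_THRESHOLD_S):
    """True if the log file has not been updated within the threshold."""
    return (now - log_mtime) > threshold_s

def watch(log_path, job_id, poll_s=60):
    """Poll a coupler log file; cancel the batch job if the log goes quiet.

    `scancel` is Slurm-specific and illustrative; substitute the kill
    command for whatever scheduler is actually in use.
    """
    while True:
        if is_stalled(os.path.getmtime(log_path), time.time()):
            subprocess.run(["scancel", str(job_id)])  # free the allocation
            return
        time.sleep(poll_s)
```

Separating the staleness test from the polling loop keeps the timing logic trivially testable, and the script can run outside the batch job so it survives a hang.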

Workflow - Atmosphere

  • The Classic Viewer runs only at ORNL for now; there is also a viewer you can download to your own machine to look at the diagnostics.
  • Kate Evans has some different diagnostics that should also be displayed in the Classic Viewer; contact John to add them to the viewer.
  • Sam's offline viewer can display other data; it needs a JSON file to ...
  • Chris Golaz - we will have 200 years of data on Edison next week; it would be great if Workflow took that to exercise the diagnostics, showed others how to work with it, and documented it on Confluence.
  • Kate has a top-10 list of plots to produce for the coupled run; these need to be added and created for the Edison run. These are Python scripts (contact Marcia Branstetter).
  • The colormap is now flexible in UV-CDAT. Workflow's priority is to make Chris happy; then they will check with Atmosphere for the next priorities. Chris can be the person interfacing between Atmosphere and Workflow to set those priorities.
  • Chris likes the plots to be at NERSC so that he can share them with others; Dean - we need to have the Classic Viewer at NERSC.
  • Phil - it does not need to go to NERSC; it can be done in other ways.
  • The Classic Viewer is at ORNL; metadiags will automatically push the data and plots to CADES and enable the Classic Viewer to display them, so anyone can then view them. Moving the data is part of the script and happens automatically. Kate - there is an authentication issue.
  • Question on checking whether the data is good enough to share with others: Chris is doing this at NERSC; he creates climo files, uses AMWG to display the results, and then shares them with others through NERSC. We would then use metadiags to move the data to CADES and publish it to ESGF.
  • There is a webinar and links to other online documentation on how to use publishing services on Workflow, see – /wiki/spaces/WORKFLOW/pages/66519289

Atmosphere Group Notes

/wiki/spaces/ATM/pages/71336556

...

Land needs a way to work with external libraries. Some will have joint development within ACME. We don't want a ton more code in our ACME git repo. On the other hand, making a separate library means a separate build system, and some of these need to be rebuilt daily during development, possibly with multiple compilers (for Fortran-C interfacing).

Under iESM code base, coupling between GCAM and ALM is not in coupler.  GCAM is not really "of" the land because it provides emissions over ocean and can be run in a case without land.

Land code is getting convoluted in terms of what can be switched on and off. Anshu: standardize your API between the pieces; the build system can help.

Land is also seeing programming-style divergence, which makes the code hard to read. They need to stick to one style; Land should enforce this internally until there is an ACME-wide style guide.

mksurfdat is getting slow.

SE needs ne16 and even ne8 versions of all IC/BC files for fast-running tests.

SE-Ocean:

Andy summarized the CMDV-software proposal.

3D field transfer is on critical path for v2 ocean-ice plans.

Metis, offline vs. online: it would be good to have a "dumb" mode for online Metis. Doing everything Metis can do online would require lots of MPAS mods, so we still want offline Metis partition files.

MPAS uses Buildbot for testing. Code coverage still needs work.

Performance Group Notes

/wiki/spaces/PERF/pages/73793546

...