...
Title
Machine learning approaches for surrogate modeling in the E3SM land model
Authors
Daniel Ricciuto, Dan Lu, Khachik Sargsyan, Vishagan Ratnaswamy, Cosmin Safta
Abstract
A variety of machine learning methods can be applied to create surrogate models. Traditional feed-forward neural networks, or multilayer perceptrons (MLPs), can be used to build approximations to quantities of interest (QoIs) for complex physical models, for example carbon fluxes in the E3SM land model. A single model output variable (e.g. gross primary productivity, GPP) is spatially gridded and therefore contains a large number of QoIs for a surrogate model to reproduce. Here we demonstrate that this high-dimensional GPP output can be accurately represented with a small number of singular values when singular value decomposition (SVD) is applied. An accurate surrogate model can then be trained using an MLP with a relatively small ensemble. Temporal variations in model outputs present additional challenges for creating accurate surrogate models, so a recurrent neural network (RNN) is also well suited to the land model. A vanilla RNN comes with its own set of issues, such as exploding and vanishing gradients; however, these issues can be mitigated with gradient clipping or, more commonly, with gates. One common gated method is long short-term memory (LSTM). While a gated RNN can handle temporal data, it typically does so in a sequential fashion, i.e. it ignores the connected (hierarchical) nature of the QoIs. To make a more physics-based model, we employ a hierarchical neural network, specifically a Tree-LSTM, that incorporates the hierarchical nature of the land model. We compare how well the Tree-LSTM predicts the QoIs of the land model, namely carbon cycle variables, in one representative grid cell against the LSTM-RNN and the MLP. We find that the Tree-LSTM outperforms the MLP and the LSTM-RNN, confirming the intuition that a physics-based neural network architecture improves predictive accuracy compared to vanilla methods.
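The SVD reduction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes a synthetic, randomly generated stand-in for the gridded GPP ensemble, and the array sizes, the 99% variance threshold, and all variable names are hypothetical choices made for illustration. It shows how a high-dimensional gridded output can be compressed to a few SVD coefficients per ensemble member, which could then serve as the training targets for a small MLP (not shown).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic stand-in for an ensemble of gridded GPP outputs:
# rows are ensemble members, columns are grid cells. Real E3SM output is not used.
n_ens, n_cells, true_rank = 200, 5000, 5
member_weights = rng.normal(size=(n_ens, true_rank))
spatial_modes = rng.normal(size=(true_rank, n_cells))
gpp = member_weights @ spatial_modes + 0.01 * rng.normal(size=(n_ens, n_cells))

# Center across the ensemble, then compute the thin SVD.
mean = gpp.mean(axis=0)
U, s, Vt = np.linalg.svd(gpp - mean, full_matrices=False)

# Keep the smallest number of modes explaining 99% of the variance
# (the threshold is an assumption, not a value from the source).
energy = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(energy, 0.99) + 1)

# Reduced representation: k coefficients per ensemble member.
# A surrogate would map model parameters to these coefficients.
coeffs = U[:, :k] * s[:k]

# Map coefficients back to the full grid and measure the reconstruction error.
gpp_hat = coeffs @ Vt[:k] + mean
rel_err = np.linalg.norm(gpp_hat - gpp) / np.linalg.norm(gpp)

print(f"retained modes: {k} of {len(s)}")
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this setup the surrogate only has to predict `k` numbers per ensemble member instead of one value per grid cell, and the full field is recovered by multiplying the predicted coefficients by the retained right singular vectors.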