Jordi Bolíbar
jordi.bolibar@univ-grenoble-alpes.fr
Institute of Environmental Geosciences (Université Grenoble Alpes)
This project is no longer maintained. You are of course free to use the code, but there are no plans to continue development or to provide support for its use.
ALPGM is a fully parameterized glacier evolution model based on data science. Glacier-wide surface mass balance (SMB) is simulated using a deep artificial neural network (i.e. deep learning) or Lasso (i.e. regularized multilinear regression). Glacier dynamics are parameterized using glacier-specific delta-h functions (Huss et al. 2008). The model has so far been implemented with a dataset of French alpine glaciers, using climate forcings for past (SAFRAN, Durand et al. 1993) and future (ADAMONT, Verfaillie et al. 2018) periods.
The machine learning SMB modelling approach is built upon widely used Python libraries (Keras, Scikit-learn and Statsmodels).
For more details regarding ALPGM and the deep learning SMB modelling approach, I encourage you to read the Bolibar et al. (2020) paper in The Cryosphere: https://www.the-cryosphere.net/14/565/2020/
ALPGM's workflow can be controlled via the alpgm_interface.py file. In this file, different settings can be configured, and each step can be run or skipped with a boolean flag. The default workflow runs as follows:
(1) First of all, the meteorological forcings are pre-processed (safran_forcings.py / adamont_forcings.py) in order to extract the necessary data closest to each glacier’s centroid. The meteorological features are stored in intermediate files to reduce computation times for future runs; this preprocessing step is skipped automatically when the files have already been generated.
(2) The SMB machine learning module retrieves the pre-processed meteorological features and assembles the spatio-temporal training dataset, comprising both climatic and topographical data. An algorithm is chosen for the SMB model, which can either be loaded from a previous training or trained again with the training dataset (smb_model_training.py). The resulting model(s) are stored in intermediate files, so that this step can be skipped in future runs.
(3) The performance of these SMB models can be evaluated by performing a leave-one-glacier-out (LOGO) cross-validation (smb_validation.py). This step can be skipped when using already established models. Basic statistical performance metrics are given for each glacier and model, as well as plots of the simulated cumulative glacier-wide SMBs compared to their reference values with uncertainties for each of the glaciers in the training dataset.
(4) The Glacier Geometry Update module starts with the generation of the glacier-specific parameterized functions, using the difference between two pre-selected digital elevation model (DEM) rasters covering the whole study area at two separate dates, as well as the glacier contours (delta_h_alps.py). These parameterized functions are then stored in individual files to be used in the final simulations.
(5) Once all the previous steps have been run and the glacier-wide SMB models as well as the parameterized functions for all the glaciers are obtained, the final simulations are launched (glacier_evolution.py). For each glacier, the initial ice thickness raster and the parameterized function are retrieved. The meteorological data at each glacier’s centroid are re-computed at an annual time step based on the glacier’s evolving topographical characteristics. These forcings are used to simulate the annual glacier-wide SMB using the machine learning model. Once an annual glacier-wide SMB value is obtained, the changes in geometry are computed using the parameterized function, thus updating the glacier’s DEM and ice thickness rasters. If all the ice thickness raster pixels of a glacier become zero, the glacier is considered to have disappeared and is removed from the simulation pipeline. For each year, multiple results are stored in data files, along with the DEM and ice thickness rasters for each glacier.
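The leave-one-glacier-out cross-validation of step (3) can be illustrated with a minimal sketch. This is not the code in smb_validation.py: the function name `leave_one_glacier_out` and the glacier labels are hypothetical and only show how every glacier is held out once while the model is trained on all the others.

```python
from collections import defaultdict


def leave_one_glacier_out(glacier_ids):
    """Yield (held_out_glacier, train_rows, test_rows) once per glacier.

    glacier_ids holds one label per training sample (one glacier-year each).
    """
    rows_by_glacier = defaultdict(list)
    for row, glacier in enumerate(glacier_ids):
        rows_by_glacier[glacier].append(row)

    for glacier in sorted(rows_by_glacier):
        test_rows = rows_by_glacier[glacier]
        held_out = set(test_rows)
        # Every sample from any other glacier goes into the training fold.
        train_rows = [r for r in range(len(glacier_ids)) if r not in held_out]
        yield glacier, train_rows, test_rows


# Placeholder labels, not glaciers from the actual dataset.
glacier_ids = ["glacier_A", "glacier_A", "glacier_B", "glacier_C", "glacier_C"]
folds = list(leave_one_glacier_out(glacier_ids))
for glacier, train_rows, test_rows in folds:
    print(glacier, "train:", train_rows, "test:", test_rows)
```

Because all years of the held-out glacier are excluded from training, the resulting error estimates how well the model transfers to a glacier it has never seen. Scikit-learn provides the same kind of split through `LeaveOneGroupOut` in `sklearn.model_selection`.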
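The annual loop of step (5) can be sketched as follows. This is a heavily simplified illustration under stated assumptions, not the logic of glacier_evolution.py: `simulate_glacier`, `smb_model` and `delta_h_weights` are hypothetical names, the raster is flattened to a list, the SMB is treated directly as metres of ice (no density conversion), and a fixed per-pixel weight stands in for the glacier-specific delta-h function.

```python
def simulate_glacier(thickness, delta_h_weights, smb_model, years):
    """Return {year: ice thickness per pixel}, stopping once no ice remains.

    thickness       : ice thickness per raster pixel (m), as a flat list
    delta_h_weights : hypothetical non-negative per-pixel weights standing in
                      for the glacier-specific delta-h function
    smb_model       : hypothetical callable (year, thickness) -> annual
                      glacier-wide SMB, treated here as metres of ice
    """
    history = {}
    for year in years:
        iced = [i for i, h in enumerate(thickness) if h > 0]
        if not iced:
            # The glacier has disappeared: drop it from the pipeline.
            break

        smb = smb_model(year, thickness)

        # Normalise the weights so that the mean thickness change over the
        # ice-covered pixels equals the glacier-wide SMB.
        mean_w = (sum(delta_h_weights[i] for i in iced) / len(iced)) or 1.0
        thickness = [
            max(0.0, h + smb * delta_h_weights[i] / mean_w) if h > 0 else 0.0
            for i, h in enumerate(thickness)
        ]
        history[year] = thickness
    return history


# Assumed toy inputs: three pixels, stronger thinning on the first pixel,
# and a constant negative SMB instead of a trained model.
history = simulate_glacier(
    thickness=[10.0, 20.0, 30.0],
    delta_h_weights=[1.5, 1.0, 0.5],
    smb_model=lambda year, thickness: -8.0,
    years=range(2015, 2100),
)
for year, pixels in history.items():
    print(year, [round(h, 2) for h in pixels])
```

In this toy setup the pixels lose their ice one after another and the loop stops on its own once the raster is empty, which mirrors the removal of vanished glaciers described in step (5).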
ALPGM simulates glacier-wide SMBs using topographical and climate data at the glacier. This repository comes with some pre-trained SMB models, but they can be retrained at will with new data.
Retraining is important when working with a different region (outside the European Alps in this case), or when expanding the training dataset in order to improve the model's performance.
Two main models can be chosen for the SMB simulations:
Deep Artificial Neural Network: A deep ANN, also known as deep learning, is a complex nonlinear statistical model optimized by gradient descent. The SMB ANN models are trained with the glacier_neural_network_keras.py script in the scripts folder. ALPGM comes with already trained glacier-wide SMB models which can be used for multiple spatiotemporal simulations. Sample weights can be used in order to balance SMB datasets to better represent extreme values. As explained in Bolibar et al. (2020), this comes at the cost of an RMSE/variance tradeoff. To use the ANN for simulations, choose the "ann_weights" or "ann_no_weights" models in alpgm_interface.py.
Lasso: The Lasso (Least absolute shrinkage and selection operator) (Tibshirani, 1996) is a shrinkage method which attempts to overcome the shortcomings of the simpler step-wise and all-possible regressions. In these two classical approaches, predictors are discarded in a discrete way, giving subsets of variables which have the lowest prediction error. However, because the selection is discrete, these different subsets can exhibit high variance, which does not reduce the prediction error of the full model. The Lasso performs a more continuous regularization by shrinking some coefficients and setting others to zero, thus producing more interpretable models (Hastie et al., 2009). Because of its properties, it strikes a balance between subset selection (like all-possible regressions) and Ridge regression (Hoerl and Kennard, 1970).
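The way the Lasso sits between subset selection and Ridge regression can be made concrete with the textbook case of orthonormal predictors, where each method reduces to a simple transformation of the least-squares coefficients (Hastie et al., 2009). The sketch below is only an illustration of that special case, not the Lasso used in ALPGM; the coefficient values are made up, and the exact scaling of `lam` depends on how the penalty is written.

```python
def best_subset(coefs, keep):
    """Discrete selection: keep the `keep` largest coefficients, zero the rest."""
    ranked = sorted(range(len(coefs)), key=lambda i: abs(coefs[i]), reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coefs)]


def ridge(coefs, lam):
    """Proportional shrinkage: every coefficient gets smaller, none becomes zero."""
    return [c / (1.0 + lam) for c in coefs]


def lasso(coefs, lam):
    """Soft-thresholding: shrink by `lam` and set small coefficients to zero."""
    shrunk = []
    for c in coefs:
        if abs(c) > lam:
            shrunk.append((abs(c) - lam) * (1.0 if c > 0 else -1.0))
        else:
            shrunk.append(0.0)
    return shrunk


# Made-up least-squares coefficients for four predictors.
ols = [3.0, -1.5, 0.4, -0.2]

print("best subset:", best_subset(ols, keep=2))
print("ridge      :", [round(c, 3) for c in ridge(ols, lam=0.5)])
print("lasso      :", lasso(ols, lam=0.5))
```

Best subset keeps two coefficients untouched and drops the others, Ridge shrinks all four but keeps every predictor, and the Lasso both shrinks the large coefficients and removes the small ones.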
All the data needed to run the French alpine glaciers case study simulations is available in this repository: the topographical and SMB data for the glaciers, the glacier-specific delta-h parameterized functions, and the initial glacier ice thickness for all the glaciers in the region (Farinotti et al. 2019). The only exception is the preprocessed SAFRAN (Durand et al. 2009) climate data files, which can be [downloaded here](https://www.dropbox.com/s/2kisbxk2ajaunh2/SAFRAN_meteo_data.rar?raw=1) separately due to their size.
Dependencies are specified in the dependency graph of this repository: https://github.com/JordiBolibar/ALPGM/network/dependencies