Project

General

Profile

Introduction

General Introduction

Snowpack models suffer from large errors and uncertainties which limit their use, in particular for spatialised applications. CrocO is an ensemble data assimilation system designed to tackle this issue. In this framework, an ensemble of models quantifies snowpack modelling errors. These errors are reduced by assimilating snowpack observations with a Particle Filter (PF). Several innovative versions of the PF are developed within CrocO to address spatialisation issues [1].

Spatialised applications of large ensembles are computationally intensive, and require the parallelization of the ensemble members and optimized data flows. For this reason, CrocO is tailored to Météo-France's research HPC systems (beaufix/belenos-hendrix).

This page provides a quick description of the CrocO assimilation sequence, a guide for installation on Météo-France HPC systems, and a user guide to launch CrocO simulations.
For technical documentation (code version, main technical developments, new files and options), have a look at CrocO_technical_doc.
Finally, a guide for developers can be downloaded for further details on the implementation.

CrocO assimilation sequence

CrocO is a sequential data assimilation system: observations are assimilated date after date, as the ensemble advances over time.
  • observation files and dates are known and prepared beforehand
  • an ensemble of simulations (OFFLINE executables) is launched between consecutive observation dates using ESCROC, the ensemble version of the Crocus snowpack model (Lafaysse et al., 2017).
  • at each observation date, a Particle Filter (SODA executable) corrects the ensemble simulation with the observations (see Fig. 1).

In the example of Fig. 1, the ESCROC ensemble is used to propagate 3 particles (the full state vectors of the ensemble members) until observation date t1. At t1, the Particle Filter resamples the particles in order to bring the ensemble closer to the observation. The ensemble is then re-initialized at t1 with these new initial states and run until the next observation date (t2). A conceptual sketch of this cycle is given below.
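
Purely as an illustration of this propagate/resample cycle, here is a minimal bash sketch; the real sequence is orchestrated by s2m and vortex, and the OFFLINE/SODA runs are replaced here by placeholder echo commands:

# Conceptual sketch only: the actual orchestration is done by s2m/vortex.
nmembers=3
for obsdate in t1 t2; do
    # propagation: each member runs the OFFLINE executable up to the next observation date
    for mb in $(seq 1 $nmembers); do
        echo "member $mb: OFFLINE propagation until $obsdate"
    done
    # analysis: SODA applies the Particle Filter and resamples the particles,
    # providing new initial states for the next propagation step
    echo "SODA analysis at $obsdate"
done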

More details

  • Snowpack modelling errors are accounted for by combining a meteorological ensemble (stochastic perturbations) with an ensemble of snowpack models (ESCROC), as described in [2] and Fig. 2:

    - a run is a unique combination of a forcing F* and an ESCROC configuration M*;
      the combination F* - M* is fixed for the duration of the experiment
    - all the runs are initialized from the same spinup X_0
    - the total number of runs is defined by the parameter nmembers
    - nmembers also defines the number of different ESCROC configurations M*
    - nforcing (<= nmembers) defines the number of different forcings F* to use;
      if nforcing < nmembers, the forcings are repeated until all runs have a forcing (see the sketch after this list)
  • In spatialised applications, the PF has to ingest a large number of observations, which leads it to replicate only one particle, an issue called degeneracy (Snyder et al., 2009). Within SODA, two alternatives are developed to tackle this issue: inflation of the observation errors (inspired by Larue et al., HESS, 2018) and k-localisation, which uses the ensemble background correlation patterns to localise the Particle Filter.
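
As announced above, here is a minimal bash sketch of the forcing repetition rule when nforcing < nmembers; the modulo assignment is one plausible scheme consistent with the stated rule, not necessarily CrocO's exact implementation:

# Illustrative sketch: cycling nforcing forcings over nmembers runs
# (assumes member i uses forcing ((i-1) % nforcing) + 1)
nmembers=5
nforcing=2
for mb in $(seq 1 $nmembers); do
    f=$(( (mb - 1) % nforcing + 1 ))
    echo "run $mb uses forcing F$f"
done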

A thorough description of CrocO is given in [1].
A guide for developers is in progress.

References

- [1] Cluzet, B., Lafaysse, M., Cosme, E., Albergel, C., Meunier, L.-F., and Dumont, M.: CrocO_v1.0: a Particle Filter to assimilate snowpack observations in a spatialised framework, submitted, https://doi.org/10.5194/gmd-2020-130.
- [2] Cluzet, B., Revuelto, J., Lafaysse, M., Tuzet, F., Cosme, E., Picard, G., Arnaud, L., and Dumont, M.: Towards the assimilation of satellite reflectance into semi-distributed ensemble snowpack simulations, Cold Regions Science and Technology, 170, 102918, 2020.
- Deschamps-Berger et al. (in prep.)
- Revuelto et al. (in prep.)

Installation on Météo-France HPC systems

Dependencies

CrocO depends on several open-source codes distributed by the CNRM (Centre National de Recherches Météorologiques) via git:

More details on these libraries can be found in the technical documentation: CrocO technical doc

  • Optionally, CrocO_toolbox features tools to prepare the observations, pre/post-process simulations and launch CrocO locally (using Météo-France's HPC system is however highly recommended, as simulations are quite computationally expensive).
    https://github.com/bertrandcz/CrocO

Prerequisites (non-exhaustive):

Standard users only need to install SURFEX on beaufix/belenos; snowtools_git and VORTEX are already installed there for them.

Install SURFEX

  • If you're not developing in SURFEX, download the cen branch directly on beaufix/belenos.
    Otherwise, you'd better install it locally and synchronize your local modifications to belenos with the rsync command (taking inspiration from rsync_SURFEX_V81_beaufix).

    Install_SURFEX

  • Compile it in the NOMPI-O2 configuration (sequential ensemble application case in the following link):

    belenos: Compile_SURFEX_on_Belenos

Don't forget to frequently update your code version by running git pull in your code repository.
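
For instance (the path below is only an example; use the location of your own checkout):

# update a local SURFEX checkout to the latest version of its branch
cd $HOME/SURFEX/cen   # example path
git pull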

Set CrocO environment

Install (developers)

Developers in snowtools_git and VORTEX (for CrocO) must properly install snowtools_git and vortex (possibly with rsync, taking inspiration from rsync_snowtools_git and rsync_vortex) following:
Install and Install_VORTEX

Install (standard users)

Standard users don't need to install VORTEX or snowtools_git: they will use the versions of snowtools_git and VORTEX maintained by the main developer.
  • To use snowtools and VORTEX, they just need to modify their .bash_profile on belenos:
# vortex
export MTOOLDIR=$WORKDIR
export VORTEX=$HOME/common/vortex/vortex-cen
export PYTHONPATH=$VORTEX
export PYTHONPATH=$PYTHONPATH:$VORTEX/bin
export PYTHONPATH=$PYTHONPATH:$VORTEX/site
export PYTHONPATH=$PYTHONPATH:$VORTEX/src
export PYTHONPATH=$PYTHONPATH:$VORTEX/project

# snowtools_git
export SNOWTOOLS_CEN=$HOME/common/snowtools_git
export PYTHONPATH=$PYTHONPATH:$SNOWTOOLS_CEN/snowtools
alias s2m="python $SNOWTOOLS_CEN/snowtools/tasks/s2m_command.py" 
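
After sourcing the profile, a quick sanity check can be run; this assumes the s2m entry point accepts the usual -h/--help flag, which may not hold on every version:

source ~/.bash_profile
# the s2m alias defined above should now resolve and print its usage
s2m -h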

  • In order to upload files directly to hendrix and sxcen, standard users must also configure file transfers with the hendrix archive and sxcen (see Install_VORTEX).

Setting new geometries (developers and standard users)

Any CrocO experiment should be associated with a given VORTEX geometry (location). Geometries are also used in the archive paths and filenames. Some of them are defined in $VORTEX/conf/geometries.ini. If you need to define a new geometry (region), you must define it in a new file $HOME/.vortexrc/geometries.ini containing the following lines. For example, the Grandes Rousses massif has a region_id of 12 and its region_name is grandes_rousses (see the filled-in entry after the template):

[region_id]
info       = Describe here your new geometry
kind       = unstructured
area       = region_name
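
Following the Grandes Rousses example above (region_id 12, region_name grandes_rousses), the filled-in entry would presumably read:

[12]
info       = Grandes Rousses massif geometry
kind       = unstructured
area       = grandes_rousses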

Fetching simulation outputs to sxcen.cnrm (developers and standard users)

For post-processing and development purposes, it is possible to fetch simulation outputs directly to sxcen.cnrm, in the NO_SAVE directory (see the --writesx argument in CrocO_user_doc), thus avoiding having to download them manually from hendrix.
To configure this, build the following directory structure and symbolic link on sxcen.cnrm:

cd /cnrm/cen/users/NO_SAVE/
mkdir <username>
cd <username>
mkdir vortex
ln -s /cnrm/cen/users/NO_SAVE/<username>/vortex/ $HOME/vortex

How to use CrocO

Once the installation is done, you're almost ready to perform a CrocO experiment on belenos/taranis, Météo-France's supercomputers.
CrocO experiments are launched by calling the s2m command on belenos/taranis. Before looking at this command, let's see what it can do and how to prepare an experiment.

Definition of a CrocO experiment

Three different types of experiments can be launched with CrocO:

  • openloop : no assimilation. Useful as a reference or to generate synthetic observations
  • synthetic : assimilation of synthetic observations (e.g. from a previous openloop run)
    (!) In that case, if the ensemble setup is the same as the openloop, you MUST remove the synthetic truth member from your ensemble (see below).
  • real : assimilation of real observations

Prepare an experiment

  • initial conditions (PGD and PREP), ensemble of forcings and observation files: bear in mind that CrocO doesn't handle their generation; you need to generate them and archive them on hendrix beforehand.
    Detailed instructions can be found in CrocO technical doc.
  • namelist: prepare a basic namelist. It will be used to configure the PF algorithm and the SURFEX I/O behaviour. s2m will parse it and populate it with its relevant arguments (e.g. -b). This namelist will then be used as a mother namelist for ESCROC's multiphysics scheme (see Multiphysics).
    • (!) Carefully check the list of output variables in PRO files (see the example below):
      bear in mind that the number of output variables considerably increases the time spent writing (during computation), as well as transfer and storage needs.
&NAM_WRITE_DIAG_SURFn
CSELECT = 'time','ASNOW_VEG','TALB_ISBA','TS_ISBA','WSN_T_ISBA','DSN_T_ISBA','SNOWDZ','WSN_VEG','SPECMOD','SNOWSSA','SNOWIMP1','SNOWIMP2'
/
  • assimilation configuration file: following CrocO technical doc, prepare a configuration file used to set the assimilation dates and, optionally, the ids of the ESCROC members to run (an illustrative sketch is given after this list).
    (!) If the ESCROC membersId are not specified in the configuration file (the default case for the ESCROC subensemble E1*), they will be randomly drawn at the beginning of the simulation and written to a copy of the configuration file (which will be archived on hendrix to ensure traceability). You must specify them if you need to ensure reproducibility (e.g. for twin experiments).
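
For illustration only, such a file could look like the sketch below; the section and key names (assimdates, members_id) are assumptions here, and the authoritative syntax is given in CrocO technical doc:

[DEFAULT]
# assimilation dates (hypothetical key name)
assimdates = 2013121506,2014011506,2014021506
# ids of the ESCROC members to run; omit to have them drawn randomly (hypothetical key name)
members_id = 3,17,42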

Launch a CrocO experiment with s2m

The CrocO assimilation sequence is run through a single s2m command (a snowtools_git command) on belenos/taranis.

First of all, if you're not familiar with the s2m command and vortex, read the following page:
Run_a_SURFEX-Crocus_experiment_without_vortex

Among the standard s2m arguments, you will only need the following (optional ones are in parentheses):

-m safran                  : forcing model option; must be set to safran
-r <region_id> : your geometry, consistent with your vortex path and PGD filenames.
-b <yyyymmddhh>            : begin date of the simulation
-e <yyyymmddhh>            : end date of the simulation
-f <xpid_forcing> or <xpid_forcing@username> : vortex xpid where the forcings are stored on hendrix
-o <xpid>                  : name of the directory (xpid) where the outputs will be stored
-n <path_to_your_namelist> : path to your namelist (no default namelist is provided)
(-x <yyyymmddhh>           : date of the spinup PREP (if it is not equal to -b))

A few supplementary arguments are necessary to run CrocO.

--croco=<your_path_to_assimilation_configuration_file> : give the path to the assimilation configuration file.
--escroc=<escrocsubensemble>                             : specify the ESCROC subensemble to use 
                                                           ("E1tartes", "E1notartes", "E2")
--obsxpid=<xpid_obs> or <xpid_obs@username>              : vortex xpid where the observations are stored on hendrix
--nmembers=N                                             : number of members to run/draw among the subensemble
--nforcing=Nf                                            : number of different forcings to use
--nnodes                                                 : number of nodes on which to parallelize
--walltime                                               : estimate of the duration of the parallelized experiment (minutes).
                                                           Your simulation will be terminated past that duration
--sensor                                                 : name of the observation sensor/synthetic experiment (free; default is MODIS)

--openloop                                               : activate openloop mode
  OR
--synth <mbid>                                           : assimilation of synthetic data; the <mbid> member (synthetic truth) is removed and replaced
  OR
--real                                                   : assimilation of real data

(--writesx                                               : activate output to sxcen.cnrm in NO_SAVE/)
(--grid                                                  : specify if you're performing gridded simulations)

Example:

s2m -n ~lafaysse/croco/OPTIONS_MOTHER_DEP.nam -r postes_12_csv -b 2013080106 -e 2014063006 -x 20160801 --escroc=E1notartes -o test0l --nmembers=35 --nforcing=35 --croco=~lafaysse/croco/conf.ini -f forcing_20132014B_31D_11_t1500_160@fructusm -m safran --real -s /home/cnrm_other/cen/mrns/lafaysse/SURFEX/cen/exe_mpi --obsxpid=obs@fructusm --sensor=12
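
For comparison, a hypothetical openloop variant of the same experiment would drop the observation-related arguments (--obsxpid, --sensor) and replace --real with --openloop, using only the flags documented above:

s2m -n ~lafaysse/croco/OPTIONS_MOTHER_DEP.nam -r postes_12_csv -b 2013080106 -e 2014063006 -x 20160801 --escroc=E1notartes -o test0l_openloop --nmembers=35 --nforcing=35 --croco=~lafaysse/croco/conf.ini -f forcing_20132014B_31D_11_t1500_160@fructusm -m safran --openloop -s /home/cnrm_other/cen/mrns/lafaysse/SURFEX/cen/exe_mpi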

Simulation outputs

Once the simulation has finished, the outputs are stored in the vortex path on hendrix
(see Simulation outputs storing in CrocO technical doc).
Now it's up to you to post-process them! :)