Cerfacs Enter the world of high performance ...

PhD: Hybrid data assimilation and deep learning system for Earth monitoring


Required Education : Master 2
Start date : 2 September 2024
Mission duration : 36 months
Deadline for applications : 15 March 2024



Essential Variables (EVs) are key indicators for describing and monitoring the evolution of Earth’s changing climate. Accordingly, the regular and accurate mapping of variables such as Leaf Area Index, Land Surface Temperature, Evapotranspiration or Soil Moisture is becoming increasingly important for environmental applications. Satellite imagery is an efficient tool for estimating EV trends, characterizing present conditions, and informing about their future evolution. In particular, the Sentinel missions provide accurate, timely and easily accessible information about the Earth’s surface. However, extracting useful information from raw satellite images is challenging: they are acquired by multimodal sensors and are nonstationary, sparse, and multivariate, with temporal gaps. The retrieval of EVs from noisy satellite observations is traditionally treated as an inverse modelling problem.

In recent years, physics-constrained data-driven methodologies have been proposed to infer the values of physical model parameters from observations. Without relying on any simulation data, these methods train a deep neural network to map satellite observations to the physical model parameters in an unsupervised manner. To do so, they treat the parameters of the inverse problem as the latent variables of a semantic variational autoencoder whose decoder is a physical model incorporating knowledge about how the data are generated. With the increasing availability of large-scale datasets, this unsupervised framework is an attractive choice for modelling and forecasting the space-time dynamics of EVs from satellite observations.
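
As a rough illustration of this idea (not the method used in the project), the sketch below runs one forward pass of an autoencoder whose decoder is a fixed physical model. Everything specific here is an assumption made for the example: the exponential toy forward model, the extinction coefficients K_BANDS, the linear encoder, and the function names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-band extinction coefficients of a toy forward model.
K_BANDS = np.array([0.3, 0.5, 0.8])


def physical_decoder(lai):
    """Toy forward model (assumption): band signal decays exponentially with LAI."""
    return np.exp(-np.outer(lai, K_BANDS))  # shape (n_pixels, n_bands)


def encoder(obs, weights, bias):
    """Linear encoder returning the mean and log-variance of the latent variable."""
    out = obs @ weights + bias  # shape (n_pixels, 2)
    return out[:, 0], out[:, 1]


def negative_elbo(obs, weights, bias, noise_std=0.05):
    """Reconstruction error through the physical decoder plus a KL penalty."""
    mu, log_var = encoder(obs, weights, bias)
    # Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    lai = np.log1p(np.exp(z))  # softplus keeps the physical parameter positive
    recon = physical_decoder(lai)
    # Gaussian reconstruction term, up to an additive constant.
    recon_term = 0.5 * np.sum((obs - recon) ** 2, axis=1) / noise_std**2
    # KL divergence between N(mu, sigma^2) and a standard normal prior.
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var)
    return np.mean(recon_term + kl), recon


# Synthetic observations generated by the same toy model, plus noise.
true_lai = rng.uniform(0.5, 4.0, size=100)
obs = physical_decoder(true_lai) + 0.05 * rng.standard_normal((100, 3))

# Untrained encoder parameters; a real system would optimise these.
weights = rng.normal(scale=0.1, size=(3, 2))
bias = np.zeros(2)

loss, recon = negative_elbo(obs, weights, bias)
```

Only the encoder would be trained in such a setup. Because the decoder is fixed, the latent variable is forced to carry the meaning of the physical parameter, and no labelled or simulated training pairs are required.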


This PhD aims to predict the future states of EVs from noisy and sparse satellite observations by combining machine learning (ML) and data assimilation (DA) strategies. On the one hand, ML can benefit from DA, which accommodates real observations that are noisy, sparse, and only indirectly related to the physical state of interest, such as EVs. DA follows a Bayesian approach to represent observation uncertainties and retains existing physical knowledge. On the other hand, DA can benefit from ML in handling more complex models and error distributions.

Since the dynamical model governing the time propagation of EVs is unknown, we will first focus on a fully data-driven physical model for forecasting EVs under hypothetical scenarios, obtained by imposing specific assumptions about future weather. Once a dynamical model has been developed, data assimilation strategies will be integrated so that real-time observations progressively adjust the trajectories of EVs over time. Within this project, we will investigate different strategies for obtaining better estimates of EVs.
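
To make the forecast-then-adjust cycle concrete, here is a minimal sketch using a scalar Kalman filter, one of the simplest data assimilation schemes. It is an illustration only: the project does not specify which DA method will be used, and the linear model coefficient m, the error variances q and r, and the function name kalman_cycle are all assumptions chosen for the example.

```python
import numpy as np


def kalman_cycle(x, p, y, m=0.95, q=0.01, r=0.04):
    """One forecast/analysis step for a scalar state.

    x, p : previous state estimate and its error variance
    y    : new observation, or NaN/None when no observation is available
    m    : hypothetical linear dynamical model (stand-in for a learned model)
    q, r : assumed model-error and observation-error variances
    """
    # Forecast: propagate the state and its uncertainty with the model.
    x_f = m * x
    p_f = m * p * m + q
    # Temporal gap: keep the forecast when there is nothing to assimilate.
    if y is None or np.isnan(y):
        return x_f, p_f
    # Analysis: weight forecast and observation by their uncertainties.
    gain = p_f / (p_f + r)
    x_a = x_f + gain * (y - x_f)
    p_a = (1.0 - gain) * p_f
    return x_a, p_a


# Sparse observation series with gaps marked as NaN.
observations = [2.0, np.nan, 1.7, np.nan, np.nan, 1.4]

x, p = 1.0, 1.0  # initial guess and its variance
trajectory = []
for y in observations:
    x, p = kalman_cycle(x, p, y)
    trajectory.append((x, p))
```

Each available observation pulls the trajectory towards the data and reduces the error variance, while during gaps the estimate follows the model alone and its variance is updated by the forecast step only.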

From a methodological point of view, this PhD proposes to study the combination of machine learning and data assimilation strategies, which is one of the hottest topics in Artificial Intelligence.


The research conducted here will be integrated into ANITI (Artificial and Natural Intelligence Toulouse Institute), in which CESBIO and CERFACS are actively involved. The PhD will be funded partly by CNES and partly by Thales Services Numériques.


We are looking for enthusiastic candidates to join our interdisciplinary research group. Candidates must hold a Master's degree in computer science (data science or similar). Applicants should preferably have a strong background in mathematics, signal and image processing, and machine learning. A good knowledge of English and of scientific programming is required.


Please send your CV and a motivation letter to:

·       Silvia Valero-Valbuena: silvia.valero-valbuena@iut-tlse3.fr

·       Selime Gürol: gurol@cerfacs.fr