🎓 PhD Defense: Théo DEFONTAINE
Friday 24 May 2024, from 10h00 to 13h00
PhD Thesis, JCA room, Cerfacs, Toulouse, France
Flood forecast lead time extension with machine learning on a scarce and heterogeneous dataset
👉https://youtube.com/live/4veabZeo77o?feature=share
In France, flood forecasting services are fairly recent, as are the data available to them and the models they use. A number of initiatives are under way at local, regional and national levels to meet the different needs of the territory. These include efforts to harmonize forecasting models, as well as efforts to update techniques and keep track of new ones. In this way, local public services work jointly with research organizations to improve flood forecasting at different scales.
In flood forecasting, most of the models in use are physically based. These can take many forms. The most accurate ones solve the shallow water equations in great detail, which requires precise data on the study area. Others, less precise, simplify or replace all or part of these equations. All of these models are calibrated empirically by the hydrologist for each catchment. Some are very simple, with few parameters to calibrate, but they are limited in their representation of the catchment. This is the case for empirical models based on a single hydraulic reach, which must be recalibrated for each new case study. The Garonne-Tarn-Lot Flood Forecasting Service uses such models for flood forecasting at the Toulouse Pont Neuf station. They rely on information from upstream stations and produce forecasts at 4 h, 6 h and 8 h lead times.
As in many fields, data-driven machine learning approaches are becoming increasingly popular, and hydrology and flood forecasting are no exception. This thesis discusses the use of machine learning models for short-term flood forecasting. For flood forecasting in Toulouse, each forecast period requires the calibration of a new model. Machine learning models free us from these expert-supervised processes.
The choice of machine learning models is constrained here by the small size of the database. We work only with flood events, and only a few of them are available in digital form. This scarcity calls for adapted measures to make the approach more robust.
We therefore use only time series data (no spatialized data). The machine learning models are given the same data as the empirical models. The learning models are a linear regression, a gradient boosting regressor and a multilayer perceptron, none of which can take ordered (sequential) data as input. Flow and rainfall data are therefore pre-processed (hydrograph shifts, moving averages, etc.) to incorporate temporal information before being passed to the models.
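As an illustration only, the pre-processing idea described above (shifted hydrograph values and moving averages turned into a flat feature vector) might be sketched as follows. This is not the thesis code: the function name build_features, its parameters, the chosen lags and window, and the sample values are all hypothetical.

```python
from statistics import mean

def build_features(upstream, rain, downstream, lead, lags=(0, 1, 2), window=3):
    """Turn hourly series into (features, target) rows for a tabular model.

    upstream, rain, downstream: equal-length lists of hourly values
    (hypothetical inputs). lead: forecast lead time, in time steps.
    """
    start = max(max(lags), window - 1)  # first index with a full history
    rows, targets = [], []
    for t in range(start, len(upstream) - lead):
        # Shifted hydrograph: upstream flow at t, t-1, t-2, ...
        lagged = [upstream[t - k] for k in lags]
        # Trailing moving average of rainfall over `window` steps
        rain_avg = mean(rain[t - window + 1 : t + 1])
        rows.append(lagged + [rain_avg])
        # Target: downstream flow `lead` steps after the forecast time t
        targets.append(downstream[t + lead])
    return rows, targets

# Made-up values, for illustration only (not real measurements)
upstream = [10, 12, 15, 20, 30, 45, 60, 55, 40, 30]
rain = [0, 2, 4, 6, 3, 1, 0, 0, 0, 0]
downstream = [20, 21, 23, 26, 32, 44, 62, 80, 78, 65]

X, y = build_features(upstream, rain, downstream, lead=2)
```

Each row of X is an unordered feature vector, so it can be passed to models that have no notion of sequence, such as a linear regression, a gradient boosting regressor or a multilayer perceptron.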
The models are first tested at a 6 h lead time, with different input data configurations. With the same configuration as the flood forecasting service model, the machine learning models perform better. Adding rainfall data has a positive, but less significant, effect. At an 8 h lead time, where no reference model is used, the models achieve decent performance. The contribution of rainfall data is more difficult to evaluate there, but more significant. The transfer of the approach to a new, more complex case is considered successful. There is still much room for improvement, and other, more flexible approaches could be explored.
Jury
- Ms. Borrell Estupina Valérie, Associate Professor (Maîtresse de conférences), Reviewer
- Mr. Lucor Didier, Research Director (DR), Reviewer
- Mr. El Moçayd Nabil, Assistant Professor, Reviewer
- Mr. Thual Olivier, Emeritus Research Director (DR émérite), Examiner
- Mr. Bousquet Nicolas, Research Scientist (CR), Examiner
- Ms. Ricci Sophie, Senior Researcher, CERFACS, Supervisor
- Mr. Lapeyre Corentin J., Research Scientist (CR), Invited member
- Mr. Marchandise Arthur, Engineer, Invited member