
Robust generalization methods for AI techniques in hybrid high-fidelity PDE solvers


Required Education : Master
Start date : 1 January 2023
Mission duration : 6 months
Deadline for applications : 1 January 2023
Salary : 600 €/month net

The Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique (CERFACS) specializes in the numerical simulation of scientific problems requiring powerful computing tools. For both research and training purposes, it brings together mathematics and physics specialists with numerics experts and engineers. This work will be performed in the Algo-COOP team ([https://cerfacs.fr/coop/](https://cerfacs.fr/coop/)), which develops and improves codes for numerical simulation on parallel computers. Numerical simulation of complex configurations is routinely used by industrial partners (Airbus, Safran, EDF, Total, CNES…) for design purposes. The progress achieved in both computing efficiency and numerical algorithms makes CFD an indispensable tool for the aerospace industry. For additional information regarding CERFACS, please refer to [https://www.cerfacs.fr](https://www.cerfacs.fr/).


Artificial intelligence has seen rapid developments and impressive achievements in recent years, in domains such as natural language processing, speech recognition and synthesis, computer vision, and image synthesis. Comparable applications in engineering and design remain rare, however. Several factors can explain this, e.g. the prior availability of precise and robust numerical methods, the expectation that ML tools match their robustness and reliability, or the lack of diverse, high-quality datasets.

One straightforward way to use machine learning (ML) in physics problems is to build so-called surrogate models, i.e. fast and cheap approximators of a phenomenon, and much work has indeed been undertaken in this direction. For now, however, these approximations cannot reach the robustness and reliability of many numerical solvers, which directly evaluate solutions to the equations that govern a given problem. Numerical solvers, in turn, are poorly suited to exploiting data, and rely mostly on the knowledge encoded in equations.

Another use of ML is in so-called hybrid systems, which seek to combine prior knowledge of a problem with a technique that leverages ever-growing datasets. These techniques aim for a “best of both worlds” approach: alleviating the shortcomings of numerical models by leveraging datasets, while keeping the guarantees of numerical solvers and encoded physical knowledge (e.g. accuracy, conservation of physical quantities, temporal stability). This is analogous to what data scientists do on a regular basis, where the choice of e.g. a neural network architecture is meant to restrict the search space to functions compatible with the process that generated the data. Encoding physical knowledge is, however, a mostly separate and challenging endeavor, and it underlies the emerging field of hybrid physics solvers.

Training a hybrid system often involves generating data with a so-called “high-fidelity” simulator, which directly solves all of the relevant physical equations governing a problem of reduced size. This data is then processed to emulate a coarser description of the problem, and supervised training is performed to model the phenomenon. The result is called an *offline* model. To be useful, however, this model must then be integrated into a simulation code and run in a hybrid manner on much larger physical cases: the simulator solves part of the physics and relies on the pre-trained model for the rest. The *online* exercise runs the solver with different models for this unsolved part and compares its performance against a reference, such as a very large simulation or experimental data.
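The offline and online stages described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the workflow used with any actual solver: the scalar equation, the cubic `unresolved_term`, the explicit Euler `integrate` helper, and the two-feature least-squares model are all hypothetical stand-ins chosen for brevity.

```python
import numpy as np


def unresolved_term(u):
    """Toy stand-in for physics only the high-fidelity solver resolves (assumption)."""
    return -0.5 * u**3


def integrate(u0, closure, dt=1e-3, n_steps=2000):
    """Explicit Euler integration of du/dt = -u + closure(u)."""
    u = np.empty(n_steps + 1)
    u[0] = u0
    for n in range(n_steps):
        u[n + 1] = u[n] + dt * (-u[n] + closure(u[n]))
    return u


# Offline stage: generate "high-fidelity" data on a reduced case (u0 = 1.0),
# then fit the unresolved term by least squares on the features [u, u^3].
u_hf = integrate(1.0, unresolved_term)
features = np.stack([u_hf, u_hf**3], axis=1)
coeffs, *_ = np.linalg.lstsq(features, unresolved_term(u_hf), rcond=None)


def learned_closure(u):
    """Pre-trained model that replaces the unresolved term in the solver."""
    return coeffs[0] * u + coeffs[1] * u**3


# Online stage: the solver handles the resolved part (-u) and calls the model
# for the rest, on a different case (u0 = 1.5), compared against a reference.
u_ref = integrate(1.5, unresolved_term)
u_hybrid = integrate(1.5, learned_closure)
online_error = np.max(np.abs(u_hybrid - u_ref))
print(f"fitted coefficients: {coeffs}, online error: {online_error:.2e}")
```

In this toy the model contains the exact functional form of the missing term, so the online run matches the reference almost perfectly. Realistic closures offer no such guarantee.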

This offline-to-online exercise is pervasive in computational physics, and is often performed with physics-inspired coefficient tuning on models of very low complexity. The possibility of training much more complex, high-capacity models for this task, including deep neural networks, is an opportunity to push this strategy further and significantly increase the predictive power of these systems. But it comes with a central difficulty: the offline training is not necessarily representative of the final task, the online run. This can be due to the difficulty of emulating the online problem in prior simulations, or to the impossibly large space of physics that a given solver can address, which the training dataset cannot densely map. The larger the models trained for the task, the higher the risk of overlearning the offline (a priori) problem or dataset, and therefore of generalizing poorly to the online problem.
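A small numerical experiment can illustrate why a good offline score does not guarantee a good online one. The sketch below is only a toy under stated assumptions: the `tanh` target, the two state ranges, and the polynomial models standing in for low- and high-capacity learners are hypothetical choices. The failure it shows is extrapolation outside the range covered offline, which is just one of the mechanisms mentioned above.

```python
import numpy as np


def unresolved_term(u):
    """Toy target that the models must learn (hypothetical choice)."""
    return np.tanh(2.0 * u)


def rmse(model, u):
    """Root-mean-square error of a model against the toy target on states u."""
    return float(np.sqrt(np.mean((model(u) - unresolved_term(u)) ** 2)))


u_offline = np.linspace(-1.0, 1.0, 200)  # states covered by the training data
u_online = np.linspace(-3.0, 3.0, 200)   # wider range of states met online

# Polynomials of degree 1 and 9 stand in for low- and high-capacity models.
targets = unresolved_term(u_offline)
models = {
    "low": np.poly1d(np.polyfit(u_offline, targets, 1)),
    "high": np.poly1d(np.polyfit(u_offline, targets, 9)),
}

offline_rmse = {name: rmse(m, u_offline) for name, m in models.items()}
online_rmse = {name: rmse(m, u_online) for name, m in models.items()}

for name in models:
    print(f"{name} capacity: offline RMSE {offline_rmse[name]:.2e}, "
          f"online RMSE {online_rmse[name]:.2e}")
```

The degree-9 fit wins on the offline range but loses badly on the wider online range, so ranking models by offline error alone would select the worse one for the online task.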


This problem is increasingly mentioned in the literature, but to date few studies report systematic attempts to quantify it, let alone remedy it. In the ML community, however, work on unsupervised or few-shot generalization is common, and applications to physics topics have recently appeared. In this internship, we will explore how this ML work could be repurposed for hybrid physics solvers, and seek principled ways of addressing the generalization problems that hybrid approaches face. This will include exploring the literature on unsupervised domain adaptation, few-shot adaptation, and domain generalization techniques robust to distribution shift. Simplified test cases will be selected, and data collected or produced with AVBP, an in-house compressible LES/DNS solver at CERFACS. Algorithms from the literature will be trained on the resulting datasets and benchmarked on their capacity to generalize from offline to online.

The internship will be performed in the Algo-COOP team, which specializes in numerical algorithms, and houses the AI initiative Helios.


You are currently in the last year of a Master’s degree, specializing in numerical physics or a related field, and have some experience with ML, or a strong interest in these technologies and the desire to learn about them. Alternatively, you major in computer science and ML and are interested in physical modeling applications. This position requires active reading of the scientific literature in the domain and the ability to learn fast. In a research lab environment, initiative, autonomy, creativity, and the ability to synthesize information are highly valued. Experience with CFD solvers and with a data processing language (Python, R, MATLAB) is a plus.


Corentin Lapeyre (lapeyre@cerfacs.fr)