Disclaimer: This tutorial is done using the software AVBP, a Cerfacs code available for academic research upon request through a Non-Disclosure Agreement. See https://avbp-wiki.cerfacs.fr/ for more information on the AVBP software.

In this tutorial, you will learn how to use PhyDLL to couple a Deep Learning (DL) model with AVBP. First of all, for the sake of reproducibility, check out the AVBP source code commit this tutorial was built upon:

cd $AVBP_HOME
git checkout caebb9e2be3c1f2f0842223b6e570a562a422a63

0. Install PhyDLL version 0.1

0.1 On Kraken

In order to fully enjoy PhyDLL’s capabilities, we first need to install the code coupler CWIPI, a software co-developed by ONERA and CERFACS. CWIPI is used by PhyDLL to handle data exchange between a voxel-based AI model (e.g., a Convolutional Neural Network, a U-net) and an unstructured solver such as AVBP. The use of CWIPI is related to the Interpolation Scheme (IS) described in Serhani et al. (2024).

First download version 0.12.0 of CWIPI, extract it, and create/enter the folder CWIPI:

wget https://w3.onera.fr/cwipi/sites/default/files/2023-10/cwipi-0.12.0.tgz
tar -zxvf cwipi-0.12.0.tgz
mkdir CWIPI
cd CWIPI

Make sure the right compilers are available. A call to module list delivers the following at the time this post was written (for reference only, since the default compilers might change over time):

 1) compiler/gcc/11.2.0   2) compiler/intel/23.2.1(default)   3) mpi/intelmpi/2021.10.0(default)

Then set the compilers:

export CC=mpiicc
export CXX=mpiicpc
export FC=mpiifort

You will also need to set the Python variables:

export PYINTERP=/softs/anaconda3-2020.07/envs/tf2.6-cuda11.2/bin/python
export CYINTERP=/softs/anaconda3-2020.07/envs/tf2.6-cuda11.2/bin/cython
export PYLIB=/softs/anaconda3-2020.07/envs/tf2.6-cuda11.2/lib/libpython3.so

Then run the following commands:

CC=$CC CXX=$CXX FC=$FC cmake -DCWP_ENABLE_PYTHON_BINDINGS=ON -DCWP_ENABLE_Fortran=ON -DCWP_BUILD_DOCUMENTATION=OFF -DPYTHON_EXECUTABLE=$PYINTERP -DCYTHON_EXECUTABLE=$CYINTERP -DPYTHON_LIBRARY=$PYLIB -DCMAKE_INSTALL_PREFIX=$PWD -DCMAKE_CWP_INSTALL_PYTHON_DIR=$PWD ../cwipi-0.12.0
make
make install

It is good practice to create a virtual environment for your Python installation of PhyDLL. If you don’t know what a virtual environment is, you can check these amazing webpages: here and here.

First load the module python/3.9.5:

module load python/3.9.5

Then create your virtual environment (do not forget to change /path/to/yourvirtualenv in the following to the path where you want your virtual environment):

python3 -m venv --system-site-packages /path/to/yourvirtualenv

Note that --system-site-packages gives the virtual environment access to the system site-packages directory, so we will not need to reinstall packages such as mpi4py.

Finally, you can activate your virtual environment:

source /path/to/yourvirtualenv/bin/activate

For this tutorial, install PyTorch and h5py following the instructions available here and here, respectively.
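
As an optional sanity check, you can verify from inside the activated virtual environment that the packages used later in this tutorial are importable. This is a minimal sketch; it only assumes that mpi4py, h5py and torch are installed as described above:

# Optional sanity check: run inside the activated virtual environment.
# These imports cover the packages used later in this tutorial.
import mpi4py
import h5py
import torch

print("mpi4py:", mpi4py.__version__)
print("h5py  :", h5py.__version__)
print("torch :", torch.__version__)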

You can finally install PhyDLL version 0.1 on Kraken. Move to the directory where you want to install PhyDLL and run the following commands:

git clone https://gitlab.com/cerfacs/phydll
cd phydll
git checkout release/0.1
mkdir ../PHYDLL
export CWIPI_DIR=/path/to/CWIPI
make FC=mpiifort BUILD=../PHYDLL
pip install -U -e .

Now, open your .bashrc in your $HOME and add the following lines:

export PHYDLL_DIR="path/to/your/PHYDLL"
export CWIPI_DIR="path/to/your/CWIPI"
export PYTHONPATH="$CWIPI_DIR/lib/python3.9/site-packages"

These variables will be needed both to compile AVBP with CWIPI and PhyDLL and to run your jobs with PhyDLL; adapt the paths accordingly.
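
Once these variables are in place (and your virtual environment is activated), a quick way to check that the CWIPI Python bindings are visible through PYTHONPATH is the small snippet below. This is only a sketch: it assumes the binding module built above is named cwipi, as suggested by the Cython sources shipped with cwipi-0.12.0.

# Minimal check that the CWIPI Python bindings are found via PYTHONPATH.
# Assumes the module built above is called "cwipi" (see cwipi-0.12.0/Cython).
import cwipi
print("cwipi bindings loaded from:", cwipi.__file__)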

0.2 On Calypso

We detail here how to install CWIPI and PhyDLL on the grace partition of Calypso. Make sure that you have read the previous subsection on the Kraken installation (even if you are not planning to run on that machine) to learn more about CWIPI and virtual environments.

First download version 0.12.0 of CWIPI, extract it, and create/enter the folder CWIPI:

wget https://w3.onera.fr/cwipi/sites/default/files/2023-10/cwipi-0.12.0.tgz
tar -zxvf cwipi-0.12.0.tgz
mkdir CWIPI
cd CWIPI

The installation of CWIPI and PhyDLL on Calypso is a little more tedious, since the processor architecture of the grace partition differs from that of the login nodes. You therefore first have to connect to a grace node before installing the software:

salloc -p grace --gres=gpu:0

This will connect you to a grace node. Note that --gres=gpu:0 requests no GPU, since the compilation only needs CPUs.

Load the following modules with:

module load python/3.12.10 avbp/7.X_gcc124_ompi417 tools/cmake

To install CWIPI version 0.12.0, you will need to replace, in cwipi-0.12.0/Cython/setup.py, the lines:

from distutils.core import setup
from distutils.sysconfig import get_python_lib

for

from setuptools import setup

Also, in cwipi-0.12.0/Cython/cwipi.pyx.in, you will need to add noexcept to some functions. Here is the output of diff cwipi.pyx.in cwipi.pyx.in.ori, where cwipi.pyx.in.ori designates the original file and cwipi.pyx.in is the file modified for this tutorial:

1310c1310
<                    void *distant_field) noexcept:
---
>                    void *distant_field):
1490c1490
<                       void *distant_field) noexcept:
---
>                       void *distant_field):
1668c1668
<                           double *weights) noexcept:
---
>                           double *weights):
1709c1709
<                             double *projected_uvw) noexcept:
---
>                             double *projected_uvw):

Add the noexcept to the corresponding lines: 1310, 1490, 1668 and 1709.

Once it is done, export the following variables:

export CC=mpicc
export CXX=mpicxx
export FC=mpif90
export PYINTERP=/softs/local_arm/python/3.12.10/bin/python3
export CYINTERP=/softs/local_arm/python/3.12.10/bin/cython
export PYLIB=/softs/local_arm/python/3.12.10/lib/libpython3.so

You can then install CWIPI through

 CC=$CC CXX=$CXX FC=$FC cmake -DCWP_ENABLE_PYTHON_BINDINGS=ON -DCWP_ENABLE_Fortran=ON -DCWP_BUILD_DOCUMENTATION=OFF -DPYTHON_EXECUTABLE=$PYINTERP -DCYTHON_EXECUTABLE=$CYINTERP -DPYTHON_LIBRARY=$PYLIB -DCMAKE_INSTALL_PREFIX=$PWD -DCMAKE_CWP_INSTALL_PYTHON_DIR=$PWD ../cwipi-0.12.0
make
make install

You can now create your virtual environment using the instructions here. For this tutorial, we used the Singularity image pytorch25.02.sif, so you will need to change:

singularity run --nv -B /scratch/ /softs/local_arm/singularity/images/pytorch24.05.sif /bin/bash

for

singularity run --nv -B /scratch/ /softs/local_arm/singularity/images/pytorch25.02.sif /bin/bash

In order to install PhyDLL, you will need version 3.1.6 of mpi4py. Once you are in your virtual environment, run:

pip3 install mpi4py==3.1.6

You will also need h5py:

pip3 install h5py

You can then proceed to the installation of PhyDLL:

git clone https://gitlab.com/cerfacs/phydll
cd phydll
git checkout release/0.1
mkdir ../PHYDLL
export CWIPI_DIR=/path/to/CWIPI
make FC=mpif90 BUILD=../PHYDLL
pip install -U -e .

Now, open your .bashrc in your $HOME and add the following lines:

export PHYDLL_DIR="path/to/your/PHYDLL"
export CWIPI_DIR="path/to/your/CWIPI"
export PYTHONPATH="$CWIPI_DIR/lib/python3.12/site-packages"

1. Implementing PhyDLL in AVBP

If you want to learn more about how to couple a Fortran solver with a Python Deep Learning model before diving into AVBP-tailored coupling, you can look at this page or at the minimal example here. Now the fun begins: fasten your seat belts, ladies and gentlemen!

The first task to implement PhyDLL in AVBP is to create a variable that we call enable_phydll. When set to YES_, AVBP will send and receive fields through PhyDLL. Open the file SOURCES/COMMON/mod_parallel_defs.f90 with your favorite text editor and replace the line

  INTEGER :: enable_avbpdl = NO_                                        !> Activate the neural net (it is automatically activated by AVBP-DL coupling)

for

! PhyDLL
  INTEGER :: enable_phydll = NO_                                        !> Activate PhyDLL coupling

AVBPDL was the previous way of coupling Deep Learning with AVBP; we are replacing it here so we can use PhyDLL instead.

Note that, by default, PhyDLL is not used; it is only triggered through the batch script that submits your job.

Once you have done that, open the file SOURCES/MAIN/slave.f90 and replace the two occurrences of enable_avbpdl with enable_phydll. I am not an expert in MPI, but from what I understood, the lines:

! Global MPI barrier for AVBP-DL
  IF ( enable_phydll==YES_ ) THEN
    CALL MPI_BARRIER ( global_comm, ierror )
  END IF

make the processes wait until all of them have finished the previous instruction.

Now, open SOURCES/MAIN/metric.f90 and add the following lines in the subroutine metric, before IMPLICIT NONE (line 79 in the commit used for this tutorial):

#ifdef PHYDLL
    USE phydll,                      ONLY: phydll_define, phydll_what_is_dlmesh
    USE mod_parallel_defs,           ONLY: enable_phydll
#endif

Also add to the local variables of this subroutine:

INTEGER :: phydll_dlmesh
INTEGER :: i
INTEGER, DIMENSION(grid%data_n%nnode) :: local_node_to_global
INTEGER, DIMENSION(grid%ncell) :: local_element_to_global

Still in the same file, add the following block after CALL send_coupling_meshes and before IF ( imove>=0 ) THEN:

#ifdef PHYDLL
    IF ( enable_phydll == YES_ ) THEN
      CALL phydll_what_is_dlmesh(phydll_dlmesh)

      SELECT CASE (phydll_dlmesh)
      CASE (0) ! Non-context aware phydll coupling
        CALL phydll_define(field_size=grid%data_n%nnode)

      CASE (1) ! dMPI phydll coupling
        DO i= 1, grid%data_n%nnode
          local_node_to_global(i) = grid%pmesh%LocalNodeToGlobal(i)
        END DO
        DO i= 1, grid%ncell
          local_element_to_global(i) = grid%pmesh%LocalElmntToGlobal(i)
        END DO
        CALL phydll_define(dim=ndim, ncell=grid%ncell, nnode=grid%data_n%nnode, &
          nvertex=grid%nvert_max, ntcell=grid%ntcell, ntnode=grid%ntnode, &
          element_to_node=grid%pmesh%element_to_node_ptr, node_coords=grid%pmesh%node_coords, &
          local_node_to_global=local_node_to_global, local_element_to_global=local_element_to_global)

      CASE(2) ! CWIPI phydll coupling
          CALL phydll_define(dim=ndim, nnode=grid%data_n%nnode, ncell=grid%ncell, nvertex=grid%nvert_max, node_coords=grid%pmesh%node_coords, element_to_node=grid%pmesh%element_to_node_ptr)
      END SELECT
    END IF
#endif

These lines define the kind of coupling between AVBP and the DL model.
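
For reference, the counterpart of this choice on the Python side is the PhyDLL constructor used in the main.py of Section 4. The minimal sketch below simply reuses the parameter values from that script; it is shown here only to connect the Fortran SELECT CASE above with the DL-side setup.

# Python (DL) side: same constructor call as in the main.py of Section 4.
# The DL instance declares its coupling options and the number of fields
# exchanged in each direction.
from phydll.phydll import PhyDLL

phydll = PhyDLL(coupling_scheme="DS",
                mesh_type="phymesh",
                phy_nfields=11,   # fields sent by AVBP (physical side)
                dl_nfields=1)     # fields returned by the DL model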

Now open SOURCES/COUPLING/interf_mpi_init.f90 and add in the subroutine interf_mpi_init before IMPLICIT NONE:

  USE mod_parallel_defs,        ONLY: enable_phydll
#ifdef PHYDLL
  USE phydll,                   ONLY: phydll_init
#endif

Still in the same file, add before #ifndef CWIPI:

! Detect phydll coupling
#ifdef PHYDLL
  CALL phydll_init(global_comm, avbp_mpi_comm, enable_phydll)
#endif

This initializes the PhyDLL instance. Finally, in the same file, change

#ifndef CWIPI
  IF ( ncoupling>0 ) THEN
    message = "This AVBP exec does not have coupling support. Please recompile with CWIPI"
    CALL print_error_and_quit ( err_message=message )
  END IF
  avbp_mpi_comm = MPI_COMM_WORLD
#endif

for

#ifndef CWIPI
  IF ( enable_phydll == NO_ ) THEN
    IF ( ncoupling>0 ) THEN
      message = "This AVBP exec does not have coupling support. Please recompile with CWIPI"
      CALL print_error_and_quit ( err_message=message )
    END IF
  avbp_mpi_comm = MPI_COMM_WORLD
  END IF
#endif

We have now initialized the PhyDLL instance; it also has to be finalized. Go to SOURCES/COUPLING/interf_mpi_end.f90 and add in the subroutine interf_mpi_end, before IMPLICIT NONE:

#ifdef PHYDLL
  USE phydll,            ONLY: phydll_finalize
  USE mod_parallel_defs, ONLY: enable_phydll
#endif

At the beginning of the subroutine, add:

USE mod_param_defs
USE mod_coupling,      ONLY: finalize_couplings

Also delete USE mod_coupling, ONLY: finalize_couplings after #ifdef CWIPI.

Still in the same file, change

#ifdef CWIPI
  IF ( color==1 ) THEN
    CALL miscog_end
    CALL cwipi_finalize_f ()
  END IF

  CALL finalize_couplings
#endif

for

#ifdef CWIPI
  IF ( color==1 ) THEN
    CALL miscog_end
    CALL cwipi_finalize_f ()
  END IF
#endif

#ifdef PHYDLL
  IF ( enable_phydll == YES_ ) THEN
    CALL phydll_finalize()
  END IF
#endif

  CALL finalize_couplings

You will then have to replace all remaining occurrences of enable_avbpdl with enable_phydll. VSCode users can easily do that using the “Search” tool (shortcut ⌘⇧F) combined with “Replace All”. Here is the output of the command grep -ri enable_avbpdl in the folder SOURCES:

MAIN/compute.f90:  USE mod_parallel_defs,                         ONLY: enable_avbpdl
MAIN/compute.f90:    IF ( kstage==1 .AND. enable_avbpdl==YES_ ) THEN
MAIN/compute.f90:        message = "kstage==1 .AND. enable_avbpdl==YES_ not supported by with openACC"
COUPLING/mod_avbpdl_couple.f90:    USE mod_parallel_defs, ONLY: avbp_mpi_comm, enable_avbpdl
COUPLING/mod_avbpdl_couple.f90:    enable_avbpdl = YES_
LES/efcy_neuralnet.f90:    USE mod_parallel_defs,    ONLY: enable_avbpdl
LES/efcy_neuralnet.f90:    IF ( enable_avbpdl==NO_ ) THEN
PARSER/control_hpc.f90:  USE mod_parallel_defs,    ONLY: ncproc, enable_avbpdl
PARSER/control_hpc.f90:        IF ( enable_avbpdl==YES_ ) THEN
PARSER/control_hpc.f90:    IF ( enable_avbpdl==YES_ ) THEN

Almost there! We now have to change the makefiles for every host; we will only do it for Kraken and Calypso in this tutorial. Open HOST/KRAKENGNU/makefile.h and add after the CWIPI paragraph:

# PHYDLL
PHYDLL_INC  = $(PHYDLL_DIR)/include
PHYDLL_LIB  = $(PHYDLL_DIR)/lib
PHYDLL_LINK = -L$(PHYDLL_LIB) -Wl,-rpath=$(PHYDLL_LIB) -lphydll -Wl,-rpath=$(cwpdir)/lib
ifdef PHYDLL_DIR
  FFLAGS += -DPHYDLL -I$(PHYDLL_INC)
  LDFLAGS += $(PHYDLL_LINK)
endif

Add the same lines to HOST/KRAKEN/makefile.h.

If you are running on Calypso, do the following:

cd $AVBP_HOME/HOST
mkdir -p CALYPSOARM/BIN
cp  /scratch/share/makefile.h_ARM CALYPSOARM/makefile.h

You can now add the lines that include PhyDLL in CALYPSOARM/makefile.h.

And now the final step: open SOURCES/MakeFile and add:

phydll:
        @if [ ! -z  $(PHYDLL_DIR) ]; then echo    "      With PhyDLL support: $(PHYDLL_DIR)" ; fi

2. Implementing the fields to send and to receive

Congrats for reaching this point! You have now implemented the PhyDLL parts of the code that you will not have to touch again. However, this is not the case for the fields that you want to exchange between AVBP and your DL model, as these are case-dependent: you will have to implement them yourself in AVBP.

As an example, we will show here how to call a DL model that returns values of turbulent viscosity at each grid point. The reference values of turbulent viscosity which will compose the training dataset come from the Smagorinsky model. The fields to send from AVBP will then be the 9 velocity gradients, the volume of the node and the mass density. These fields will be the 11 inputs of the DL model that will output the turbulent viscosity. AVBP will then receive the values of turbulent viscosity returned by the DL model.

2.1 Implementing the fields to send from AVBP

In this tutorial, we will need the velocity and its gradients. However, by default, these quantities are not allocated in AVBP. We need to add them in the subroutine data_n_allocate_les of SOURCES/COMMON/mod_data_defs.f90. After the lines:

        SELECT CASE ( iles )
        CASE ( 11 )
!         Filetred Smagorinsky model
          CALL alloc_check ( list_array, self%w_filt, "w_filt", neq, self%nnode, already_allocated )  ! Use alloc_check here because this array is potentially already used by other models
        CASE ( 12 )
!         Dynamic Smagorinsky model
          CALL alloc ( list_array, self%csmagdyn, "csmagdyn", self%nnode )
          CALL alloc ( list_array, self%velfilt, "velfilt", ndim, self%nnode )
          CALL alloc ( list_array, self%mij, "mij", ndimsij, self%nnode )
          CALL alloc ( list_array, self%mmlm, "mmlm", 2, self%nnode )
          CALL alloc_check ( list_array, self%sij, "sij", ndimsij, self%nnode, already_allocated )    ! Use alloc_check here because this array is potentially already used by other models
          CALL alloc_check ( list_array, self%vel, "vel", ndim, self%nnode, already_allocated )       ! Use alloc_check here because this array is potentially already used by other models
          CALL alloc_check ( list_array, self%volfilt, "volfilt", self%nnode, already_allocated )     ! Use alloc_check here because this array is potentially already used by other models

add:

        CASE (100)
!         Neural net
          CALL alloc ( list_array, self%velocity, "velocity", ndim, self%nnode )
          CALL alloc ( list_array, self%grad_velocity, "grad_velocity", ndim, ndim, self%nnode )

We will also need to deallocate them later in the subroutine data_n_deallocate_les. After the line:

      IF ( ALLOCATED(self%T_wall_les) )          CALL dealloc ( list_array, self%T_wall_les, "T_wall_les" )

add:

      IF ( ALLOCATED(self%velocity) )            CALL dealloc ( list_array, self%velocity, "velocity" )
      IF ( ALLOCATED(self%grad_velocity) )       CALL dealloc ( list_array, self%grad_velocity, "grad_velocity" )

Finally, since these quantities are also allocated when save_solution.additional = postproc is set in the run.params, we need to change the lines:

      IF ( ionlinepost .OR. ipostproc_advanced ) THEN

        CALL alloc_check ( list_array, self%vorticity, "vorticity", self%nnode, already_allocated )   ! Use alloc_check here because this array is potentially already used by other models
        IF ( .NOT.already_allocated ) THEN
          !$ACC ENTER DATA CREATE(self%vorticity) IF(alloc_acc)
        END IF
        CALL alloc ( list_array, self%velocity, "velocity", ndim, self%nnode )
        !$ACC ENTER DATA CREATE(self%velocity) IF(alloc_acc)
        IF ( ipostproc_advanced ) THEN
          CALL alloc ( list_array, self%grad_velocity, "grad_velocity", ndim, ndim, self%nnode )

for

      IF ( ionlinepost .OR. ipostproc_advanced ) THEN

        CALL alloc_check ( list_array, self%vorticity, "vorticity", self%nnode, already_allocated )   ! Use alloc_check here because this array is potentially already used by other models
        IF ( .NOT.already_allocated ) THEN
          !$ACC ENTER DATA CREATE(self%vorticity) IF(alloc_acc)
        END IF
        CALL alloc_check ( list_array, self%velocity, "velocity", ndim, self%nnode, already_allocated )
        !$ACC ENTER DATA CREATE(self%velocity) IF(alloc_acc)
        IF ( ipostproc_advanced ) THEN
          CALL alloc_check ( list_array, self%grad_velocity, "grad_velocity", ndim, ndim, self%nnode, already_allocated )

Go now to SOURCES/MAIN, create a file named neuralnet_fields_to_send.f90 and copy/paste the following code:

!--------------------------------------------------------------------------------------------------
!     Copyright (c) CERFACS (all rights reserved)
!--------------------------------------------------------------------------------------------------


!--------------------------------------------------------------------------------------------------
!     MODULE mod_neuralnet_fields_to_send
!>    @brief   Set the fields to send from AVBP to your Deep Learning model through PhyDLL
!!    @details Set the fields to send from AVBP to your Deep Learning model through PhyDLL
!!    @authors A. Larroque
!!    @date    2025-02-25
!--------------------------------------------------------------------------------------------------
MODULE mod_neuralnet_fields_to_send

  USE mod_param_defs

  CONTAINS

!--------------------------------------------------------------------------------------------------
!     SUBROUTINE neuralnet_set_fields_to_send
!>    @brief   Set the fields to send from AVBP to your Deep Learning model through PhyDLL
!!    @authors A. Larroque, L. Drozda
!!    @date    2025-02-25
!--------------------------------------------------------------------------------------------------
  SUBROUTINE neuralnet_set_fields_to_send ( grid )

    USE mod_grid,                    ONLY: grid_t
    USE mod_input_param_defs,        ONLY: ndim, itimers
    USE mod_timers,                  ONLY: timer_start, timer_stop
    USE mod_misc_defs,               ONLY: avbpdl_timer_send_pre
    USE mod_ngrad,                   ONLY: ngrad
    USE mod_pmesh_transfer_combines, ONLY: OPERATION_ADD
#ifdef PHYDLL
    USE phydll,                      ONLY: phydll_set_phy_field, phydll_send_phy_fields
    USE mod_parallel_defs,           ONLY: enable_phydll
#endif

    IMPLICIT NONE

!   IN/OUT
    TYPE(grid_t) :: grid

!   LOCAL
    INTEGER :: kgroup
    INTEGER :: nvert
    INTEGER :: ng1
    INTEGER :: ng2
    INTEGER :: ng3
    INTEGER :: nglen
    INTEGER :: ierror
    INTEGER :: n

    CALL timer_start ( avbpdl_timer_send_pre, itimers )
    CALL fill_zero ( ndim*ndim*grid%nnode,grid%data_n%grad_velocity(1,1,1) )

    !$ACC PARALLEL LOOP DEFAULT(PRESENT) IF ( using_acc )
    DO n = 1, grid%nnode
      grid%data_n%rhoinv(n) = one / grid%data_n%w(1,n)
    END DO

    !$ACC PARALLEL LOOP DEFAULT(PRESENT) IF(using_acc)
    DO n = 1, grid%nnode
      grid%data_n%velocity(1,n) = grid%data_n%w(2,n) * grid%data_n%rhoinv(n)
      grid%data_n%velocity(2,n) = grid%data_n%w(3,n) * grid%data_n%rhoinv(n)
      IF ( ndim==3 ) THEN
        grid%data_n%velocity(3,n) = grid%data_n%w(4,n) * grid%data_n%rhoinv(n)
      END IF
    END DO

    DO kgroup=1,grid%ngroup

      nvert = grid%igtype(2,kgroup)
      ng1   = grid%ngbegin(1,kgroup)
      ng2   = grid%ngbegin(2,kgroup)
      ng3   = grid%ngbegin(3,kgroup)
      nglen = grid%nglength(kgroup)

!     Compute the velocity gradient
      CALL ngrad ( grid,ndim,nvert,nglen,grid%ielno(ng2),grid%bnd%snc(ng3),grid%data_n%velocity,grid%data_n%grad_velocity )

    END DO

!   declare that we are to exchange the gradients at 'shared' interfaces:
    CALL partition_declare_update ( grid,ndim*ndim,grid%data_n%nnode,grid%data_n%grad_velocity,operation_add )

!   exchange messages
    CALL partition_update ( grid,ierror )

    CALL cons_tens ( grid%data_n%nnode,ndim,ndim,grid%data_n%volninv,grid%data_n%grad_velocity,grid%data_n%grad_velocity )

#ifdef PHYDLL
    IF (enable_phydll == YES_) THEN
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(1,1,:), label="grad_ux", index=1)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(1,2,:), label="grad_uy", index=2)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(1,3,:), label="grad_uz", index=3)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(2,1,:), label="grad_vx", index=4)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(2,2,:), label="grad_vy", index=5)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(2,3,:), label="grad_vz", index=6)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(3,1,:), label="grad_wx", index=7)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(3,2,:), label="grad_wy", index=8)
      CALL phydll_set_phy_field(field=grid%data_n%grad_velocity(3,3,:), label="grad_wz", index=9)
      CALL phydll_set_phy_field(field=grid%data_n%w(1,:), label="rho", index=10)
      CALL phydll_set_phy_field(field=grid%data_n%volw(:), label="nodal volume", index=11)
      CALL phydll_send_phy_fields(loop=.true.)
    END IF
#endif

    CALL timer_stop ( avbpdl_timer_send_pre, itimers )

  END SUBROUTINE neuralnet_set_fields_to_send

END MODULE mod_neuralnet_fields_to_send

As these are the fields that will be sent from AVBP to the DL model, you will have to adapt this subroutine to your needs every time you want to send other fields. You can find the available variables in SOURCES/COMMON/mod_data_defs.f90 and in SOURCES/COMMON/mod_grid.f90, and then change the phydll_set_phy_field calls accordingly. As an important note, the gradients here are not computed as in the Smagorinsky subroutine, which goes through gather and scatter operations.

Now, open SOURCES/MAIN/compute.f90, replace at the beginning of the subroutine compute:

USE mod_efcy_neuralnet,                        ONLY: efcy_neuralnet_set_fields_to_send

for

USE mod_neuralnet_fields_to_send,              ONLY: neuralnet_set_fields_to_send

and change, at the beginning, the block:

! ----------------------------------------------------------------------
! Sends species-based progress variable to Python (AVBP-DL)
! ----------------------------------------------------------------------
    IF ( kstage==1 .AND. enable_phydll==YES_ ) THEN
      IF ( using_acc ) THEN
        message = "kstage==1 .AND. enable_phydll==YES_ not supported by with openACC"
        CALL print_error_and_quit ( err_routine=routine, err_message=message )
      END IF
      CALL efcy_neuralnet_set_fields_to_send ( grid )
      CALL send_coupling ( coupling(avbpdl) )
    END IF

for

! ----------------------------------------------------------------------
! Sends species-based progress variable to Python (PhyDLL)
! ----------------------------------------------------------------------
    IF ( kstage==1 .AND. enable_phydll==YES_ ) THEN
      IF ( using_acc ) THEN
        message = "kstage==1 .AND. enable_phydll==YES_ not supported by with openACC"
        CALL print_error_and_quit ( err_routine=routine, err_message=message )
      END IF
      CALL neuralnet_set_fields_to_send ( grid )
    END IF

That’s it, your fields are ready to be sent from AVBP to your DL model.

2.2 Implementing the fields to receive from the DL model

That’s the hard part: to my knowledge, there is no magic formula to easily adapt the relevant part of AVBP to your DL model. In our case, the Smagorinsky model is triggered by the run.params in the RUN-CONTROL section through LES_model = smago. The run.params is parsed in AVBP in SOURCES/PARSER/control_run.f90. By looking for the smago occurrence, we find the following block:

          CASE ( 'no' )
            iles = 0
          CASE ( 'smago', 'smagorinsky' )
            iles = 1
          CASE ( 'smagofilt' )
            iles = 11
          CASE ( 'smagodyn', 'dynamic_smagorinsky' )
            iles = 12
          CASE ( 'wale' )
            iles = 2
          CASE ( 'k-equation' )
            iles = 3
          CASE ( 'sigma' )
            iles = 40

This is where the LES model is chosen. Thus, we add the possibility of replacing the LES model with a DL model:

          CASE ( 'no' )
            iles = 0
          CASE ( 'smago', 'smagorinsky' )
            iles = 1
          CASE ( 'smagofilt' )
            iles = 11
          CASE ( 'smagodyn', 'dynamic_smagorinsky' )
            iles = 12
          CASE ( 'wale' )
            iles = 2
          CASE ( 'k-equation' )
            iles = 3
          CASE ( 'sigma' )
            iles = 40
          CASE ( 'neuralnet' )
            iles = 100

The iles value is chosen arbitrarily here, but remember which value corresponds to the neuralnet case. From now on, it is possible to replace the LES model with a DL model by setting LES_model = neuralnet in the run.params.

Just after, we add neuralnet to the message:

            message = "Input for this flag should be either 'no', 'smago', 'smagofilt', 'smagodyn', 'wale', 'k-equation', 'sigma' or 'neuralnet'."

Still in the same file, we add to the block:

    SELECT CASE ( iles )
    CASE ( 0 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = no"
    CASE ( 1 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = smago"
    CASE ( 2 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = wale"
    CASE ( 3 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = k-equation"
    CASE ( 11 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = smagofilt"
    CASE ( 12 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = smagodyn"
    CASE ( 40 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = sigma"

the new case created:

    CASE ( 100 )
      WRITE ( 50,FMT='(T3,A)' ) "LES_model = neuralnet"

To keep everything consistent, still in the same file, we change:

    CALL add_keyword ( queue_1, key_name='LES_model',                               key_type='character',    key_status=required_key, key_comment='no/smago/smagofilt/smagodyn/wale/k-equation/sigma' )

for

    CALL add_keyword ( queue_1, key_name='LES_model',                               key_type='character',    key_status=required_key, key_comment='no/smago/smagofilt/smagodyn/wale/k-equation/sigma/neuralnet' )

We now create the subroutine to receive the fields: open a new file named visturb_neuralnet.f90 in SOURCES/LES/ and copy/paste the following code:

!--------------------------------------------------------------------------------------------------
!     Copyright (c) CERFACS (all rights reserved)
!--------------------------------------------------------------------------------------------------


!--------------------------------------------------------------------------------------------------
!     MODULE mod_visturb_neuralnet
!>    @brief   Compute the turbulent viscosity 'vis_turb' using a Deep Learning model.
!!    @details Compute the turbulent viscosity 'vis_turb' using a Deep Learning model.
!!    @authors A. Larroque, L. Drozda
!!    @date    14-11-2024
!--------------------------------------------------------------------------------------------------
MODULE mod_visturb_neuralnet

  USE mod_param_defs

  CONTAINS

  !--------------------------------------------------------------------------------------------------
!     SUBROUTINE visturb_neuralnet
!>    @brief   Compute the turbulent viscosity 'vis_turb' using the standard visturb_neuralnet model.
!!    @authors A. Larroque, L. Drozda
!!    @date    14-11-2024
!--------------------------------------------------------------------------------------------------
  SUBROUTINE visturb_neuralnet (grid)

    USE mod_grid,                    ONLY: grid_t
    USE mod_error,                   ONLY: print_error_and_quit
    USE mod_parallel_defs,           ONLY: enable_phydll
#ifdef PHYDLL
    USE phydll,                      ONLY: phydll_recv_dl_fields, phydll_apply_dl_field
#endif

    IMPLICIT NONE

!   IN/OUT
    TYPE(grid_t) :: grid

!   LOCAL
    INTEGER :: ierror
    CHARACTER(LEN=strl) :: message

!   ERROR: phydll=OFF and LES_model=neuralnet
    IF ( enable_phydll==NO_) THEN
      message = "You are using les_neuralnet (LES_model=neuralnet) without enabling PhyDLL."
      CALL print_error_and_quit ( err_message=message )
    END IF


#ifdef PHYDLL
    IF (enable_phydll == YES_) THEN
      CALL phydll_recv_dl_fields()
      CALL phydll_apply_dl_field(field=grid%data_n%vis_turb, label="vis_turb", index=1)
    END IF
#endif

  END SUBROUTINE visturb_neuralnet

END MODULE mod_visturb_neuralnet

And now, we have to call this subroutine at the right moment and place in the code. In order to know where to implement it, we look at the call path of the subroutine smagorinsky in SOURCES/LES/smagorinsky.f90. This subroutine computes the turbulent viscosity through the Smagorinsky model. It is called in SOURCES/MAIN/COMPUTE/compute_smagorinsky.f90 in the subroutine compute_smagorinsky, where the efficiency model is also computed. This subroutine compute_smagorinsky is then called in SOURCES/MAIN/COMPUTE/compute_visturb_efcy.f90 in the subroutine compute_visturb_efcy, where we notably find the following block:

  IF ( iles==1 ) THEN

    CALL compute_smagorinsky ( grid )

  ELSE IF ( iles==2 ) THEN

    CALL compute_wale ( grid )

  ELSE IF ( iles==3 ) THEN
    IF ( using_acc ) THEN
      message = "iles==3 not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF

    CALL k_equation ( grid )

  ELSE IF ( iles==40 ) THEN

    CALL compute_sigma ( grid )

  ELSE IF ( iles==11 ) THEN
    IF ( using_acc ) THEN
      message = "iles==11 not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF

    CALL compute_filtered_smagorinsky ( grid )

  ELSE IF ( iles==12 ) THEN
    IF ( using_acc ) THEN
      message = "iles==12 not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF

    CALL compute_smagorinsky_dyn ( grid )

  END IF

This is where the turbulent viscosity is computed according to the iles number. Remember when we edited SOURCES/PARSER/control_run.f90 to define an iles number associated with neuralnet? Now is the moment to use it! Reasoning by analogy with the Smagorinsky case, we create in SOURCES/MAIN/COMPUTE a file named compute_visturb_neuralnet.f90 and copy/paste the following code:

!     ===============================================================
!     Copyright (c) CERFACS (all rights reserved)
!     ===============================================================

SUBROUTINE compute_visturb_neuralnet ( grid )

  USE mod_param_defs
  USE mod_efcy_dyn,                ONLY: efcy_dyn
  USE mod_efcy_fct,                ONLY: efcy_fct
  USE mod_efcy_men,                ONLY: efcy_men
  USE mod_error,                   ONLY: print_error_and_quit
  USE mod_grid,                    ONLY: grid_t
  USE mod_input_param_defs,        ONLY: efcy_model
  USE mod_visturb_neuralnet,       ONLY: visturb_neuralnet

  IMPLICIT NONE

! IN/OUT
  TYPE(grid_t) :: grid

! LOCAL
  CHARACTER(LEN=strl) :: message
  CHARACTER(LEN=strl), PARAMETER :: routine = "compute_visturb_neuralnet"

!
! visturb_neuralnet model
!
  CALL visturb_neuralnet ( grid )


! Efficiency function
  SELECT CASE ( efcy_model )
  CASE ( EFCY_MODEL_COLIN1 )
    IF ( using_acc ) THEN
      message = "EFCY_MODEL_COLIN1 not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF
!   Model of Colin et al.
    CALL efcy_fct ( grid )
  CASE ( EFCY_MODEL_COLIN2 )
!   Model of Colin et al.
    CALL efcy_dyn ( grid )
  CASE ( EFCY_MODEL_CHARLETTE, EFCY_MODEL_CHARLETTE_DYN )
    IF ( using_acc ) THEN
      message = "EFCY_MODEL_CHARLETTE not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF
!   Model of Charlette et al.
    CALL efcy_men ( grid )
  CASE ( EFCY_MODEL_CHARLETTE_DYN_SAT )
    IF ( using_acc ) THEN
      message = "EFCY_MODEL_CHARLETTE_DYN_SAT not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF
!   Model of Charlette et al. (saturated)
    CALL efcy_men_saturated ( grid )
  CASE ( EFCY_MODEL_CHARLETTE_DYN_SAT_MOURIAUX )
    IF ( using_acc ) THEN
      message = "EFCY_MODEL_CHARLETTE_DYN_SAT_MOURIAUX not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF
!   Model of Charlette et al. (saturated), Corrected (S. Mouriaux)
    CALL efcy_men_saturated_mouriaux ( grid )
  END SELECT


END SUBROUTINE compute_visturb_neuralnet

Then, in SOURCES/MAIN/COMPUTE/compute_visturb_efcy.f90, add the following block just before the final END IF:

  ELSE IF ( iles==100 ) THEN
    IF ( using_acc ) THEN
      message = "iles==100 not supported with openACC"
      CALL print_error_and_quit ( err_routine=routine, err_message=message )
    END IF

    CALL compute_visturb_neuralnet ( grid )

Congrats, you are now able to send fields from AVBP and receive them from a DL model. As a small tip, you can use valgrind to check the call path of a subroutine. More information here.

2.3 Implementing the gradients in the solution files (normally optional but required for this tutorial)

In order to train our DL model, we will use the .h5 solution files written in SOLUT. However, in the AVBP version used for this tutorial, the velocity gradients are not stored in the solution files, so we will have to modify AVBP for this purpose.

Go to SOURCES/COMMON/mod_data_defs.f90 and add after the line

    REAL(pr), DIMENSION(:,:,:), ALLOCATABLE :: grad_velocity            !< Velocity Gradient

the lines:

    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_ux                      !< Velocity Gradient of u wrt x
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_uy                      !< Velocity Gradient of u wrt y
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_uz                      !< Velocity Gradient of u wrt z
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_vx                      !< Velocity Gradient of v wrt x
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_vy                      !< Velocity Gradient of v wrt y
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_vz                      !< Velocity Gradient of v wrt z
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_wx                      !< Velocity Gradient of w wrt x
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_wy                      !< Velocity Gradient of w wrt y
    REAL(pr), DIMENSION(:), ALLOCATABLE :: grad_wz                      !< Velocity Gradient of w wrt z

Still in the same file, after the lines

        IF ( ipostproc_advanced ) THEN
          CALL alloc_check ( list_array, self%grad_velocity, "grad_velocity", ndim, ndim, self%nnode, already_allocated )

add the lines:

          CALL alloc ( list_array, self%grad_ux, "grad_ux", self%nnode )
          CALL alloc ( list_array, self%grad_uy, "grad_uy", self%nnode )
          CALL alloc ( list_array, self%grad_uz, "grad_uz", self%nnode )
          CALL alloc ( list_array, self%grad_vx, "grad_vx", self%nnode )
          CALL alloc ( list_array, self%grad_vy, "grad_vy", self%nnode )
          CALL alloc ( list_array, self%grad_vz, "grad_vz", self%nnode )
          CALL alloc ( list_array, self%grad_wx, "grad_wx", self%nnode )
          CALL alloc ( list_array, self%grad_wy, "grad_wy", self%nnode )
          CALL alloc ( list_array, self%grad_wz, "grad_wz", self%nnode )

Finally, after the lines

    IF ( ionlinepost .OR. ipostproc_advanced ) THEN
      IF ( ALLOCATED(self%vorticity) )          CALL dealloc ( list_array, self%vorticity, "vorticity" )
      IF ( ALLOCATED(self%velocity) )           CALL dealloc ( list_array, self%velocity, "velocity" )
      IF ( ALLOCATED(self%grad_velocity) )      CALL dealloc ( list_array, self%grad_velocity, "grad_velocity" )

add the lines:

      IF ( ALLOCATED(self%grad_ux) )            CALL dealloc ( list_array, self%grad_ux, "grad_ux" )
      IF ( ALLOCATED(self%grad_uy) )            CALL dealloc ( list_array, self%grad_uy, "grad_uy" )
      IF ( ALLOCATED(self%grad_uz) )            CALL dealloc ( list_array, self%grad_uz, "grad_uz" )
      IF ( ALLOCATED(self%grad_vx) )            CALL dealloc ( list_array, self%grad_vx, "grad_vx" )
      IF ( ALLOCATED(self%grad_vy) )            CALL dealloc ( list_array, self%grad_vy, "grad_vy" )
      IF ( ALLOCATED(self%grad_vz) )            CALL dealloc ( list_array, self%grad_vz, "grad_vz" )
      IF ( ALLOCATED(self%grad_wx) )            CALL dealloc ( list_array, self%grad_wx, "grad_wx" )
      IF ( ALLOCATED(self%grad_wy) )            CALL dealloc ( list_array, self%grad_wy, "grad_wy" )
      IF ( ALLOCATED(self%grad_wz) )            CALL dealloc ( list_array, self%grad_wz, "grad_wz" )

Now, open SOURCES/CFD/mod_postproc_compute.f90 and just before the END IF associated with the IF ( ipostproc_advanced ) THEN, add:

      grid%data_n%grad_ux = grid%data_n%grad_velocity(1,1,:)
      grid%data_n%grad_uy = grid%data_n%grad_velocity(1,2,:)
      grid%data_n%grad_uz = grid%data_n%grad_velocity(1,3,:)
      grid%data_n%grad_vx = grid%data_n%grad_velocity(2,1,:)
      grid%data_n%grad_vy = grid%data_n%grad_velocity(2,2,:)
      grid%data_n%grad_vz = grid%data_n%grad_velocity(2,3,:)
      grid%data_n%grad_wx = grid%data_n%grad_velocity(3,1,:)
      grid%data_n%grad_wy = grid%data_n%grad_velocity(3,2,:)
      grid%data_n%grad_wz = grid%data_n%grad_velocity(3,3,:)

Finally, open SOURCES/IO/output_solut_list_variables.f90 and just after

   IF ( ipostproc_advanced ) THEN

add the lines:

      CALL add_1dvar2solut ( "grad_ux" )                                                 ! Grad of u wrt x
      CALL add_1dvar2solut ( "grad_uy" )                                                 ! Grad of u wrt y
      CALL add_1dvar2solut ( "grad_uz" )                                                 ! Grad of u wrt z
      CALL add_1dvar2solut ( "grad_vx" )                                                 ! Grad of v wrt x
      CALL add_1dvar2solut ( "grad_vy" )                                                 ! Grad of v wrt y
      CALL add_1dvar2solut ( "grad_vz" )                                                 ! Grad of v wrt z
      CALL add_1dvar2solut ( "grad_wx" )                                                 ! Grad of w wrt x
      CALL add_1dvar2solut ( "grad_wy" )                                                 ! Grad of w wrt y
      CALL add_1dvar2solut ( "grad_wz" )                                                 ! Grad of w wrt z

That’s it! You have implemented everything you need in AVBP for this tutorial. Congrats and virtual high five! At this point, you should compile AVBP following the installation instructions available here. If you get errors or do not manage to compile the code, you can also take a look at the AVBP branch WIP/phydll_from_dev to compare; note, however, that this branch contains additional PhyDLL-related work.

Note that on the grace partition of Calypso, you will have to do the following so that the AVBP build is compatible with the OpenMPI version of the PyTorch container:

salloc -p grace --gres=gpu:0
module load avbp/7.X_gcc124_ompi417
export ACC_MODE=FALSE
cd $AVBP_HOME/SOURCES
make -j 8

3. Building a DL model from the AVBP solutions

As a test case, we will take Homogeneous Isotropic Turbulence (HIT) with the Smagorinsky LES model. The mesh contains 35937 points, so we have 35937 samples. You can get the case on the AVBP branch tuto_phydll here. Copy/paste this folder in your scratch (or wherever you would like to run your jobs). Enter the folder $SCRATCH/HITLES/RUN.

Go to run.params, set simulation_end_iteration = 2, and run your AVBP job. We will train a DL model on a dataset composed of the two AVBP-generated solutions.

Let’s create a PyTorch-compatible dataset using these solutions. We recommend handling the data using a Jupyter Notebook, which can be created from the CERFACS JupyterHub platform. The CERFACS JupyterHub addresses can be found here (you have to select the corresponding machine), and instructions on how to use the Python virtual environment you created in Sec. 0 as the Notebook’s kernel are available here.

import h5py
import numpy as np
import torch

from torch.utils.data import Dataset

class CustomDataset(Dataset):
    def __init__(self, previous_solution, next_solution):
        with h5py.File(previous_solution, "r") as solut:
            grad_ux = solut["Additionals"]["grad_ux"][:] / 10**6
            grad_uy = solut["Additionals"]["grad_uy"][:] / 10**6
            grad_uz = solut["Additionals"]["grad_uz"][:] / 10**6
            grad_vx = solut["Additionals"]["grad_vx"][:] / 10**6
            grad_vy = solut["Additionals"]["grad_vy"][:] / 10**6
            grad_vz = solut["Additionals"]["grad_vz"][:] / 10**6
            grad_wx = solut["Additionals"]["grad_wx"][:] / 10**6
            grad_wy = solut["Additionals"]["grad_wy"][:] / 10**6
            grad_wz = solut["Additionals"]["grad_wz"][:] / 10**6
            rho = solut["GaseousPhase"]["rho"][:]
            volw = solut["Additionals"]["volw"][:] * 10**16

        with h5py.File(next_solution, "r") as solut:
            vis_turb = solut["Additionals"]["vis_turb"][:]

        inputs = np.c_[grad_ux, grad_uy, grad_uz, grad_vx, grad_vy, grad_vz, grad_wx, grad_wy, grad_wz, rho, volw]
        outputs = vis_turb*10**5

        self.inputs = torch.from_numpy(inputs).float()
        self.outputs = torch.from_numpy(outputs).float().unsqueeze(1)

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):        
        return self.inputs[idx], self.outputs[idx]

dataset = CustomDataset("RUN/SOLUT/solut_00000001.h5", "RUN/SOLUT/solut_00000002.h5")

You may need to change the paths of the solutions accordingly. Note that we rescaled the gradients grad_..., the nodal volume volw and the turbulent viscosity vis_turb to stabilize the DL model training later. To quickly find where the quantities of interest are stored in the AVBP solutions, you can use h5nav.
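
If you prefer staying in Python, a small h5py snippet can also list the contents of a solution file (the group names Additionals and GaseousPhase used above should appear); adapt the path to one of your own solutions:

import h5py

# List every group/dataset path in an AVBP solution file to locate quantities.
with h5py.File("RUN/SOLUT/solut_00000001.h5", "r") as solut:
    solut.visit(print)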

That’s it! You can now follow the rest of the classical PyTorch workflow here. However, since I am in a good mood, I will also give you the rest of the code and the parameters that we used to train our model.

from torch.utils.data import DataLoader
from torch import nn

dataloader = DataLoader(dataset, batch_size=256)

device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

class NeuralNetwork(nn.Module):
    def __init__(self):
        n_inputs = 11
        n_outputs = 1
        n_neurones = 256
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(n_inputs, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_outputs),
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

learning_rate = 1e-3
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

epochs = 100
timely_loss_fcn = []
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(dataloader, model, loss_fn, optimizer)
print("Done!")

We can take a look at the results of our model:

from matplotlib import pyplot as plt

x_test = dataset.inputs.to(device)
y_test = dataset.outputs.to(device)

model.eval()
with torch.no_grad():
    y_pred = model(x_test)

y_pred_np = y_pred.cpu().detach().numpy() / 10**5
y_test_np = y_test.cpu().detach().numpy() / 10**5

min_y = np.min(np.r_[y_pred_np, y_test_np])
max_y = np.max(np.r_[y_pred_np, y_test_np])


plt.scatter(y_test_np, y_pred_np, color="b", alpha=0.05)
plt.plot([min_y, max_y], [min_y, max_y], "k")
plt.xlabel("Vis turb AVBP")
plt.ylabel("Vis turb NN")

Personally, I had something like this: Turbulent viscosity AVBP vs AI

Note that the predictions y_pred_np and the test values y_test_np are divided by 10**5 here to undo the scaling used for the DL training. Well, that’s far from perfect (a euphemism), but it will be good enough for this tutorial. Save the weights of your model now:

torch.save(model.state_dict(), 'model.pth')

You will now be able to use your model with PhyDLL. Note that you may also be interested in the AVBP dataloader created by none other than Fernando Gonzalez here.

4. Running your hybrid AI/AVBP solver

Copy your run folder (cp -r RUN RUN_DL) and enter the RUN_DL folder (cd RUN_DL). Open the run.params and set LES_model = neuralnet. Move or copy the model.pth created in the previous section into RUN_DL.

We now need to write the Python script that loads the model and sends the fields from your DL model to AVBP. Create a main.py and copy/paste the following:

import numpy as np
import torch
from torch import nn
from phydll.phydll import PhyDLL

class NeuralNetwork(nn.Module):
    def __init__(self):
        n_inputs = 11
        n_outputs = 1
        n_neurones = 256
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(n_inputs, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_outputs),
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits

class MLP():

    def __init__(self, model_path):
        device = ("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
        print(f"Using {device} device")

        model = NeuralNetwork()
        model.load_state_dict(torch.load(model_path, weights_only=True, map_location=device))
        model.eval()
        self.model = model
        self.model.to(device)
        self.device = device

    def predict(self, fields):
        fields[0:9,:] /= 10**6
        fields[-1,:] *= 10**16

        X = torch.from_numpy(fields.T).float().to(self.device)

        with torch.no_grad():
            Y = self.model(X)

        Y = Y.cpu().detach().numpy().astype(np.float64)
        Y /= 10**5

        return Y.T

model_path = "model.pth"

def main():
    """
    Main coupling routine
    """
    phydll = PhyDLL(coupling_scheme="DS",
                    mesh_type="phymesh",
                    phy_nfields=11,
                    dl_nfields=1)
    # Temporal
    phydll.mpienv.glcomm.Barrier()
    use_temporal_loop = True

    model = MLP(model_path)

    if use_temporal_loop:
        phydll.set_predict_func(model.predict)
        phydll.temporal()
    else:
        while phydll.fsignal:
            phy_fields = phydll.receive_phy_fields()
            dl_fields = model.predict(phy_fields)
            phydll.send_dl_fields(dl_fields)

    # Finalize
    phydll.finalize()

if __name__ == "__main__":
    main()

NeuralNetwork is obviously the model we created; you will need to change it when you change your model. MLP is the class that manages this model. Note that predict applies the same input scaling as during training and scales the output back to physical units. Finally, main runs the coupled inference with AVBP. Note that, in the future, you will have to change phy_nfields and dl_nfields according to your case.
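
Before submitting the coupled job, you can sanity-check the MLP wrapper offline, for instance in your notebook after copying the NeuralNetwork and MLP classes above. This is only a shape check with random inputs; the value 35937 is the number of mesh points mentioned in Section 3, and model.pth is assumed to be in the current folder.

import numpy as np

# Offline shape check of MLP.predict (NeuralNetwork and MLP defined as above).
mlp = MLP("model.pth")
nnode = 35937                           # number of mesh points of the HIT case
fields = np.random.rand(11, nnode)      # 11 physical fields, one row per field
out = mlp.predict(fields)
print(out.shape)                        # expected: (1, nnode), i.e. vis_turb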

This script might not be the best (yes, I am not perfect, sorry about that!). You can however find example Python scripts for your models in phydll/scripts. These scripts use methods like set_gpus that I do not master (I told you I was not perfect!), so it is up to you to investigate them.

PhyDLL also relies on a phydll.yml file for various parameters. Create a file named phydll.yml and copy/paste the following:

Output:
    debug_level : 3

Coupling:
  coupling_frequency : 1
  save_fields_frequency : 1        # (optional, default=0) Frequency to save exchanged fields in ./cwipi

The debug_level sets the verbosity, coupling_frequency is the frequency of the coupling between AVBP and the DL model, and save_fields_frequency is the frequency at which the fields sent and received by PhyDLL are saved.

Almost there! If you are on Kraken, create a script named batch_job and copy/paste the following:

#!/bin/bash
#SBATCH --job-name=first_run
#SBATCH --partition=debug
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=36
#SBATCH --time=00:02:00
#SBATCH --output=output_test.o
#SBATCH --error=error_test.e
#SBATCH --exclusive

# LOAD MODULES ##########
module purge
module load avbp
module list
#########################
exec="/path/to/your/avbp/exec"

# NUMBER OF TASKS #######
export PHY_TASKS_PER_NODE=35
export DL_TASKS_PER_NODE=1
export TASKS_PER_NODE=$(($PHY_TASKS_PER_NODE + $DL_TASKS_PER_NODE))
export NP_PHY=$(($SLURM_NNODES * $PHY_TASKS_PER_NODE))
export NP_DL=$(($SLURM_NNODES * $DL_TASKS_PER_NODE))
#########################

# EXTRA COMMANDS ########
source /path/to/your/virtual/environment/bin/activate
#########################

# ENABLE PHYDLL #########
export ENABLE_PHYDLL=TRUE
#########################

# PLACEMENT FILE ########
python $PHYDLL_DIR/scripts/placement4mpmd.py --Run impi --NpPHY $NP_PHY --NpDL $NP_DL
#########################

# MPMD EXECUTION ########
mpirun -l --machinefile machinefile_$SLURM_NNODES-$NP_PHY-$NP_DL -np $NP_PHY $exec : -np $NP_DL python main.py
#########################

If you are on Calypso and planning to run on the grace partition, the script batch_job looks like this:

#!/bin/bash
#SBATCH --job-name=first_run
#SBATCH --gres=gpu:1
#SBATCH --partition=grace
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
#SBATCH --time=00:01:00
#SBATCH --output=output_test.o
#SBATCH --error=error_test.e
#SBATCH --exclusive

# LOAD MODULES ##########
module purge
module load avbp/7.X_gcc124_ompi417
module list
#########################
exec="path/to/your/avbp/exec"

# NUMBER OF TASKS #######
export PHY_TASKS_PER_NODE=71
export DL_TASKS_PER_NODE=1
export TASKS_PER_NODE=$(($PHY_TASKS_PER_NODE + $DL_TASKS_PER_NODE))
export NP_PHY=$(($SLURM_NNODES * $PHY_TASKS_PER_NODE))
export NP_DL=$(($SLURM_NNODES * $DL_TASKS_PER_NODE))
#########################

# ENABLE PHYDLL #########
export ENABLE_PHYDLL=TRUE
#########################

# MPMD EXECUTION ########
mpirun -mca coll ^ucc --tag-output -np $NP_PHY $exec : -np $NP_DL singularity exec --nv -B /scratch/ /softs/local_arm/singularity/images/pytorch25.02.sif /path/to/your/virtual/environment/bin/python main.py

#########################

Note that on Calypso, there is no machinefile or rankfile involved, as I did not manage to make it run with one. Obviously, in both cases, you will need to change path/to/your/avbp/exec and path/to/your/virtual/environment/ accordingly. These scripts can also be generated with PhyDLL, as shown here.

You are now ready to submit your job:

sbatch batch_job

If everything went well, you should have at the end of output_test.o the line: (PhyDLL) -----> Finalizing. If that’s not the case … oh dear! Well, I assume that you have this line. You can now inspect your two solutions: the one generated with Smagorinsky and the one generated by your DL model. Open ParaView on the machine (once again, if you do not know how to do it, have a look at the CSG intranet here and select the corresponding machine). Load your two solutions: RUN/SOLUT/solut_00000002.h5 and RUN_DL/SOLUT/solut_00000002.h5. Personally, side by side, that looks something like this:

Turbulent viscosity Paraview

On the left is the turbulent viscosity computed with Smagorinsky and, on the right, the one computed with the DL model. They are pretty similar in the end.
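
If you want a more quantitative comparison than eyeballing ParaView, the short script below computes the relative L2 difference between the two turbulent viscosity fields. It is a minimal sketch, run from the case folder and assuming both runs kept their SOLUT directory:

import h5py
import numpy as np

# Relative L2 difference between the Smagorinsky and the DL turbulent viscosity.
with h5py.File("RUN/SOLUT/solut_00000002.h5", "r") as solut:
    vis_smago = solut["Additionals"]["vis_turb"][:]
with h5py.File("RUN_DL/SOLUT/solut_00000002.h5", "r") as solut:
    vis_dl = solut["Additionals"]["vis_turb"][:]

rel_diff = np.linalg.norm(vis_dl - vis_smago) / np.linalg.norm(vis_smago)
print(f"Relative L2 difference: {rel_diff:.3%}")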

You can also check whether your a priori inference matches the a posteriori inference. In your Jupyter notebook, copy/paste the following:

import torch
import h5py
import numpy as np

from torch import nn
from matplotlib import pyplot as plt

# We select the device
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

# We create again the same model
class NeuralNetwork(nn.Module):
    def __init__(self):
        n_inputs = 11
        n_outputs = 1
        n_neurones = 256
        super().__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(n_inputs, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_neurones),
            nn.ReLU(),
            nn.Linear(n_neurones, n_outputs),
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork()
model.load_state_dict(torch.load('RUN_DL/model.pth', weights_only=True, map_location=device))
model.to(device)

# A priori
solut_priori = "INIT/solut_00000001.h5"
with h5py.File(solut_priori, "r") as solut:
    grad_ux = solut["Additionals"]["grad_ux"][:] / 10**6
    grad_uy = solut["Additionals"]["grad_uy"][:] / 10**6
    grad_uz = solut["Additionals"]["grad_uz"][:] / 10**6
    grad_vx = solut["Additionals"]["grad_vx"][:] / 10**6
    grad_vy = solut["Additionals"]["grad_vy"][:] / 10**6
    grad_vz = solut["Additionals"]["grad_vz"][:] / 10**6
    grad_wx = solut["Additionals"]["grad_wx"][:] / 10**6
    grad_wy = solut["Additionals"]["grad_wy"][:] / 10**6
    grad_wz = solut["Additionals"]["grad_wz"][:] / 10**6
    rho = solut["GaseousPhase"]["rho"][:]
    volw = solut["Additionals"]["volw"][:] * 10**16

x_prio = np.c_[grad_ux, grad_uy, grad_uz, grad_vx, grad_vy, grad_vz, grad_wx, grad_wy, grad_wz, rho, volw]
x_prio = torch.from_numpy(x_prio).float().to(device)

model.eval()
with torch.no_grad():
    y_prio = model(x_prio)
y_prio_np = y_prio.cpu().detach().numpy() / 10**5

# A posteriori
solut_posteriori = "RUN_DL/SOLUT/solut_00000002.h5"
with h5py.File(solut_posteriori, "r") as solut:
    y_post_np = solut["Additionals"]["vis_turb"][:]

print(y_prio_np.shape, y_post_np.shape)

min_y = np.min(np.r_[y_prio_np.flatten(), y_post_np])
max_y = np.max(np.r_[y_prio_np.flatten(), y_post_np])

plt.scatter(y_prio_np, y_post_np, color="blue", alpha=0.05)
plt.plot([min_y, max_y], [min_y, max_y], "k")
plt.xlabel("Vis turb a priori (from model)")
plt.ylabel("Vis turb a posteriori (from inference in AVBP)")

Personally, I had:

Viscosity a priori, a posteriori

You managed to reach this point! Virtual high five! You deserve it! Scream to the world that you coupled AVBP with your DL model through PhyDLL! Now, it is up to you to build your own couplings between AVBP and DL models.

Anthony Larroque is a postdoctoral researcher in computer science engineering with a focus on CFD related topics.
Luciano Drozda is a research scientist working on scientific Machine Learning (SciML).
