The OASIS Coupler Forum

No nout.000000 and debug.01.0000? files when running the tutorial coupled model (Example)

Posted by Anonymous at May 26 2025

Hi,
I'm new to OASIS3-MCT. I have compiled OASIS3-MCT version 5.2 with ifort and MPICH.
Now I am trying to run the tutorial coupled model (tutorial_oa), but I don't get the "debug" files.
I checked the namcouple file, and $NLOGPRT is set to 30 3, i.e.
$NLOGPRT
# Amount of information written to OASIS3-MCT log files (see User Guide)
  30 3

Posted by Anonymous at May 26 2025

Hi,
I assume your atmos has 1 rank and your ocean has 2 ranks, but you are not using rank 0 for OASIS (maybe due to $NBMODEL). Then you would need to run mpirun -np 3?
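That is, one MPMD launch covering all the ranks, something like (with your actual executable names):

mpirun -np 1 ./atmos : -np 2 ./ocean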

The OASIS3-MCT debug files debug.01.00000? and debug.02.00000? (where ? goes from 0 to 3, one per process) are written by each process of the ocean and atmos components, respectively.
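So for your 4 + 4 tutorial run, once the run completes, you should find in the rundir something like:

nout.000000
debug.01.000000  debug.01.000001  debug.01.000002  debug.01.000003
debug.02.000000  debug.02.000001  debug.02.000002  debug.02.000003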

https://gitlab.com/cerfacs/oasis3-mct/-/blob/OASIS3-MCT_5.0/examples/tutorial_communication/tutorial_communication.pdf

:)

Best,

Posted by Anonymous at May 27 2025

Hi,
Thanks for your reply.
But I'm not sure I understand it.
I followed "https://gitlab.com/cerfacs/oasis3-mct/-/blob/OASIS3-MCT_5.0/examples/tutorial_communication/tutorial_communication.pdf" step by step.
But I didn't get the debug files. I only got ocean.out.100, ocean.out.101, ocean.out.102, ocean.out.103 and atmos.out.100, atmos.out.101, atmos.out.102, atmos.out.103.
Here is the "run_tutorial_oa" file. I'm using "yhrun", maybe that's the problem?

#
############### User's section #######################################
#
## - Define architecture and coupler 
arch=intel    # training, belenos, nemo_lenovo, mac,
              # kraken, gfortran_openmpi_openmp_linux,
              # pgi_openmpi_openmp_linux,
              # pgi20.4_openmpi_openmp_linux (does not work with 4.0),
              # gnu1020_openmpi_openmp_linux (does not work with 4.0)
#
# - Define number of processes to run each executable
    nproc_exe1=4
    nproc_exe2=4
#
############### End of user's section ################################
#
# - Define rundir
    rundir=${srcdir}/work_${casename}_${nproc_exe1}_${nproc_exe2}_oa
#
echo '*****************************************************************'
echo '*** '$casename' : '$run
echo ''
echo 'Rundir       :' $rundir
echo 'Architecture :' $arch
echo 'Host         : '$host
echo 'User         : '$user
echo ''
echo $exe1' runs on '$nproc_exe1 'processes'
echo $exe2' runs on '$nproc_exe2 'processes'
echo ''
######################################################################
### 1. Create rundir and copy everything needed
#
\rm -fr $rundir
mkdir -p $rundir
cp -f $datadir/*nc  $rundir/.
cp -f $srcdir/$exe1 $rundir/.
cp -f $srcdir/$exe2 $rundir/.
cp -f $datadir/namcouple_LAG $rundir/namcouple
cd $rundir
######################################################################
### 2. Definition of mpirun command and batch script
#
if [ $arch == training ]; then
    MPIRUN=/usr/local/intel/impi/2018.1.163/bin64/mpirun
elif [ $arch == gfortran_openmpi_openmp_linux ]; then
    MPIRUN=/usr/lib64/openmpi/bin/mpirun
elif [ $arch == pgi_openmpi_openmp_linux ]; then
    MPIRUN=/usr/local/pgi/linux86-64/18.7/mpi/openmpi-2.1.2/bin/mpirun
elif [ $arch == gnu1020_openmpi_openmp_linux ]; then
    MPIRUN=/usr/local/openmpi/4.1.0_gcc1020/bin/mpirun
elif [ $arch == pgi20.4_openmpi_openmp_linux ]; then
    MPIRUN=/usr/local/pgi/linux86-64/20.4/mpi/openmpi-3.1.3/bin/mpirun
elif [ $arch == intel ]; then
    MPIRUN=/usr/bin/yhrun
elif [ $arch == belenos ] ; then
   (( nproc = $nproc_exe1 + $nproc_exe2 ))
  cat <<EOF > $rundir/run_$casename.$arch
#!/bin/bash
#SBATCH --exclusive
#SBATCH --partition=normal256
#SBATCH --time=00:10:00
#SBATCH --job-name=spoc     # job name
#SBATCH -N 1                # number of nodes
#SBATCH -n $nproc                # number of procs
#SBATCH -o $rundir/$casename.o
#SBATCH -e $rundir/$casename.e
ulimit -s unlimited
cd $rundir
module load intelmpi/2018.5.274
module load intel/2018.5.274
module load netcdf-fortran/4.5.2_V2
#
export KMP_STACKSIZE=1GB
export I_MPI_WAIT_MODE=enable
#
time mpirun -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2
#
EOF
#
elif [ ${arch} == nemo_lenovo ] ; then
  MPIRUN=mpirun
  (( nproc = $nproc_exe1 + $nproc_exe2 ))
  cat <<EOF > $rundir/run_$casename.$arch
#!/bin/bash -l
# Job name
#SBATCH --job-name spoc
# Job time limit
#SBATCH --time=00:10:00
#SBATCH --partition debug
#SBATCH --output=$rundir/$casename.o
#SBATCH --error=$rundir/$casename.e
# Number of nodes and processes
#SBATCH --nodes=1 --ntasks-per-node=$nproc
#SBATCH --distribution cyclic
cd $rundir
ulimit -s unlimited
#SPOC module purge
#SPOC module -s load compiler/intel/2015.2.164 mkl/2015.2.164 mpi/intelmpi/5.0.3.048
#
time $MPIRUN -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2
#
EOF
elif [ ${arch} == kraken ] ; then
  (( nproc = $nproc_exe1 + $nproc_exe2 ))
  cat <<EOF > $rundir/run_$casename.$arch
#!/bin/bash -l
#SBATCH --partition debug
# Job name
#SBATCH --job-name spoc
# Job time limit
#SBATCH --time=00:10:00
#SBATCH --output=$rundir/$casename.o
#SBATCH --error=$rundir/$casename.e
# Number of nodes and processes
#SBATCH --nodes=1 --ntasks-per-node=$nproc
#SBATCH --distribution cyclic

cd $rundir

ulimit -s unlimited
module purge
module load compiler/intel/23.2.1
module load mpi/intelmpi/2021.10.0
module load lib/netcdf-fortran/4.4.4_phdf5_1.10.4

time mpirun -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2
EOF

fi

######################################################################
### 3. Model execution or batch submission
#
if [ $arch == training ] || [ $arch == gfortran_openmpi_openmp_linux ] || [ $arch == gnu1020_openmpi_openmp_linux ] || [ $arch == pgi_openmpi_openmp_linux ] || [ $arch == pgi20.4_openmpi_openmp_linux ]; then
    export OMP_NUM_THREADS=1
    echo 'Executing the model using '$MPIRUN 
    $MPIRUN -oversubscribe -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2
elif [ $arch == belenos ]; then
    echo 'Submitting the job to queue using sbatch'
    sbatch $rundir/run_$casename.$arch
    squeue -u $user
elif [ ${arch} == nemo_lenovo ] || [ ${arch} == kraken ]; then
    echo 'Submitting the job to queue using sbatch'
    sbatch $rundir/run_$casename.$arch
    squeue -u $user
elif [ ${arch} == mac ]; then
    echo 'Executing the model using mpirun'
    ulimit -s unlimited
    mpirun --oversubscribe -np $nproc_exe1 ./$exe1 : -np $nproc_exe2 ./$exe2
elif [ ${arch} == intel ]; then
    echo 'Executing the model using yhrun'
    yhrun -N 1 -n $nproc_exe1 -p deimos ./$exe1 : -N 1 -n $nproc_exe2 -p deimos ./$exe2 --partition debug
fi
echo $casename 'is executed or submitted to queue.'
echo 'Results are found in rundir : '$rundir 
#
######################################################################

Posted by Anonymous at May 27 2025

Hi,
What are the last lines of your ocean.out.100, ocean.out.101, ocean.out.102, ocean.out.103 and atmos.out.100, atmos.out.101, atmos.out.102, atmos.out.103 files?
I suspect your run has not finished properly and the debug files have not been created yet.
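For instance, from the rundir:

tail -n 3 ocean.out.10? atmos.out.10?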
  Regards,
 Sophie

Posted by Anonymous at May 27 2025

Hi,
I don't think it's related to yhrun etc. as you mention here.
Can you just add these lines in your script?
export OASIS_DEBUG=3
export CPL_LOG=cplout_your_debug

Please check whether you get cplout_your_debug.01.00000 (or similar files) or not.
Best,

Posted by Anonymous at May 28 2025

Dear Sophie,
Here is the ocean.out.103 file:

I am ocean process with rank :           3
 in my local communicator gathering            8 processes
 ----------------------------------------------------------
 Local partition definition
 il_extentx, il_extenty, il_size, il_offsetx, il_offsety, il_offset =
         182          18        3276           0          54        9828
 End of initialisation phase
 Timestep, field min and max value
           0   1.00010726631575        2.85860918572124
        3600   2.00021453263151        5.71721837144248
        7200   3.00032179894726        8.57582755716372
       10800   4.00042906526301        11.4344367428850
 End of the program

And here is the atmos.out.104 file:
I am atmos process with rank :           4
 in my local communicator gathering            8 processes
 ----------------------------------------------------------
 Local partition definition
 il_extentx, il_extenty, il_size, il_offsetx, il_offsety, il_offset =
          96           9         864           0          36        3456
 End of initialisation phase
 Timestep, field min and max value
           0   1.00067973297724        2.84700434729930
        1800   2.00135946595447        5.69400869459860
        3600   3.00203919893170        8.54101304189791
        5400   4.00271893190894        11.3880173891972
        7200   5.00339866488618        14.2350217364965
        9000   6.00407839786341        17.0820260837958
       10800   7.00475813084064        19.9290304310951
       12600   8.00543786381788        22.7760347783944
 End of the program
Best,

Posted by Anonymous at May 28 2025

Hi,
I'm not sure where to add "export OASIS_DEBUG=3" and "export CPL_LOG=cplout_your_debug".

But I tried adding them to the .bashrc file and to the run_tutorial_oa file.

Could you please let me know which file they should go in?

Best,

Posted by Anonymous at May 28 2025

Hi,
You added them in ~/.bashrc and sourced it?
And you don't get cplout_your_debug.01.00000, right?
I will check it tomorrow with mpiifort and mpiifx.
In the meantime, please show your namcouple, and also try:

export KMP_WARNINGS=TRUE
export KMP_VERBOSE=1
export KMP_AFFINITY=verbose
export OMP_NUM_THREADS=1
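For what it's worth, these exports (and the OASIS_DEBUG / CPL_LOG ones) are probably best placed directly in run_tutorial_oa, just before the yhrun line, rather than only in ~/.bashrc, since a batch launcher may not propagate your login environment:

export OASIS_DEBUG=3
export CPL_LOG=cplout_your_debug
export OMP_NUM_THREADS=1
yhrun -N 1 -n $nproc_exe1 -p deimos ./$exe1 : ...

(keeping the rest of your yhrun line as it is)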

Best,

Posted by Anonymous at May 29 2025

Hi,
I added them in ~/.bashrc and sourced it, and I don't get cplout_your_debug.01.00000.

Here is my namcouple file.

# This is a typical input file for OASIS3-MCT.
#
# Any line beginning with # is ignored.
#
#########################################################################
 $NFIELDS
# The number of fields described in the second part of the namcouple.
             2
###########################################################################
 $RUNTIME
# The total simulated time for this run in seconds
  14400
###########################################################################
 $NLOGPRT
# Amount of information written to OASIS3-MCT log files (see User Guide)
  30 3 1
###########################################################################
 $STRINGS
#
# Everything below has to do with the fields being exchanged.
#
######################################################
#
# Field 1: ocean to atmos 
#
#   First line:
# 1) and 2) Symbolic names for the field in the source and target component models
# 3) Not used anymore but still required for parsing
# 4) Exchange frequency for the field in seconds
# 5) Number of transformations to be performed by OASIS3-MCT
# 6) Coupling restart file names
# 7) Field status: EXPORTED, EXPOUT, INPUT, OUTPUT
FIELD_SEND_OCN FIELD_RECV_ATM 1 3600  1  fdocn.nc EXPOUT
#
#   Second line:
# 1)-2) and 3)-4) Source and target grid first and second dimensions (optional)
# 5) and 6) Source and target grid prefix (4 characters)
# 7) LAG index if needed
182 149 96 72 torc  lmdz  LAG=+3600
#
#   Third line:
# Periodicity (P or R) and number of overlapping grid points for the source and target grids.
P  2  P  0
#
# List of analyses (here only MAPPING)
MAPPING
#
# Specific parameters for each analysis (here only the name of the remapping file for MAPPING)
my_remapping_file_bilinear.nc
#
######################################################
#
# Field 2: atmos to ocean
#
FIELD_SEND_ATM FIELD_RECV_OCN  1 7200  1  fdatm.nc EXPOUT
#
96 72 182 149 lmdz torc LAG=+1800
#
P  0  P  2
#
# List of analyses (here only SCRIPR)
SCRIPR
#
# Specific parameters for SCRIPR, here specifying the parameters of the BILINEAR interpolation to be used
BILINEAR LR SCALAR LATLON 1
#

Posted by Anonymous at June 10 2025

Hi,
it seems you have not fixed it yet. A DEBUG question for Sophie: what compiler flags is she using?
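For reference, a typical ifort debugging set would be something like:

-g -O0 -traceback -check all -fpe0

(just an example; I don't know what was actually used here)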
Best,
Subhadeep