Cerfacs Enter the world of high performance ...

Multi-architecture parallelism using Kokkos/C++ library

  From Monday 9 December 2024 to Wednesday 11 December 2024

  Training    

Cerfacs is Qualiopi certified for its training activities

Duration: 2.5 days / 17 hours

Face-to-face training session

Satisfaction index

In December 2023, 90% of the participants were satisfied or very satisfied

(results collected from 11 respondents out of 14 participants, a response rate of 79%)

Testimony

Qualified trainer, comprehensive theoretical support, varied practical exercises (R., 2023)

Abstract

About 15 years ago, Nvidia introduced the CUDA architecture, i.e. the use of graphics processing units for general-purpose high-performance computing tasks. Today, it is still very challenging to write new applications, or simply to refactor existing ones, with portability and performance in mind, that is, to make efficient use of the available HPC hardware resources, whether multicore CPUs (x86, ARM64, …) or GPUs (Nvidia / AMD / Intel).

This training provides a dedicated introduction to the open-source Kokkos C++ library, which has been developed mainly in the US (Sandia and Oak Ridge labs) for about a decade and is funded by the Department of Energy under the Exascale Computing Project (ECP). We will present a theoretical and practical introduction to the Kokkos programming model, illustrating its advantages over alternative shared-memory programming models such as OpenMP and OpenACC.

By the end of the training, participants are expected to be able to integrate the Kokkos library into their own HPC projects.

Objective of the training

Provide a theoretical and practical introduction to the Kokkos programming model, from abstract concepts, i.e. parallel programming patterns (parallel for, reduce and scan loops) and data containers, to an overview of the ecosystem (profiling tools, linear algebra, Python bindings, …).

Learning outcomes

On completion of this course, you will be able to:

  • build, install and integrate Kokkos into an existing application, or write a Kokkos application from scratch
  • write Kokkos computing kernels and manage memory data containers on heterogeneous platforms (CPU/GPU)
  • assess performance using profiling tools
  • refactor existing code for performance portability

Teaching methods

The training alternates between theoretical presentations and practical work. A multiple-choice questionnaire serves as the final evaluation. The training room is equipped with computers, and the work can be done in groups of two.

Referent teacher: Pierre KESTENER

Target participants

This course is for anyone wishing to learn how to write parallel HPC code that runs across a variety of hardware architectures and achieves performance portability using a modern C++ library (Kokkos).

Prerequisites

  • Be an employee of a European company; a certificate from the employer is required
  • Have completed at least 5 years of higher education, or be a Master 2 trainee
  • Knowledge of the C++ language
  • Some knowledge of parallel programming: multithreading and/or OpenMP basics
  • Interest in HPC application development
  • The training can take place in French or English depending on the audience; level B2 of the CEFR is required.

To verify that the prerequisites are met, the following questionnaire must be completed. You need at least 75% correct answers to be admitted to this training session. If you do not reach this score, your registration will not be validated. You have only two attempts to complete it.

Questionnaire: https://forms.gle/HZwCM9Jbt2bM8LT

Registration

I certify that I obtained at least 75% of correct answers, I register

Deadline for registration: 15 days before the starting date of each training

Before signing up, you may wish to inform us of any particular constraints (schedules, health, unavailability…) at the following e-mail address: training@cerfacs.fr

Fee

This training course, financed as part of the European EuroCC2 project, is free of charge and reserved for employees of companies from European Union member states. It normally costs 1360 € excluding VAT.

However, registration requires the payment of a deposit of 200 €. This sum will be returned to you at the end of the course if you have actually attended. If not, it will be retained as compensation for the harm caused by needlessly leaving other people on the waiting list.

Program

Monday 9 December, from 2pm to 5pm

  • Refresher on hardware architecture basics (CPU/GPU) and on performance measurements (memory bandwidth, FLOPs): all that is needed to understand the difficulty of writing portable and performant code. Practical exercise.
  • General introduction to the Kokkos C++ library: its origin, an overview of concepts and software abstractions, the abstract machine model (host/device), and the Kokkos parallel programming model
  • Examples of production codes using C++/Kokkos
  • Practical exercise: how to build and install the Kokkos C++ library for the CPU OpenMP backend and the GPU/CUDA backend; how to choose a compiler; how to integrate Kokkos into a CMake-based project; how to write a modulefile to simplify use of the library.
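As an illustration of the performance measurements mentioned in the first item above, here is a minimal sketch in standard C++ of how achieved memory bandwidth and arithmetic intensity can be computed. The function names are hypothetical, and the roofline-style bound in the last function is an assumption about how such figures are typically combined, not material taken from the course.

```cpp
#include <cassert>

// Achieved memory bandwidth in GB/s (1 GB = 1e9 bytes) for a kernel that
// moved `bytes` bytes to or from memory in `seconds` seconds.
// Hypothetical helper, for illustration only.
double bandwidth_gb_per_s(double bytes, double seconds) {
    assert(seconds > 0.0);
    return bytes / seconds / 1.0e9;
}

// Arithmetic intensity: floating-point operations per byte of memory traffic.
double arithmetic_intensity(double flops, double bytes) {
    assert(bytes > 0.0);
    return flops / bytes;
}

// Roofline-style upper bound on attainable GFLOP/s: a kernel cannot run
// faster than the compute peak, nor faster than what the memory system
// can feed it (peak bandwidth times arithmetic intensity).
double attainable_gflops(double peak_gflops, double peak_bw_gb_per_s,
                         double intensity) {
    double memory_bound = peak_bw_gb_per_s * intensity;
    return memory_bound < peak_gflops ? memory_bound : peak_gflops;
}
```

For example, a double-precision update `y[i] = a * x[i] + y[i]` performs 2 floating-point operations per element and, ignoring cache effects, moves 24 bytes (two reads and one write of 8 bytes each), so `arithmetic_intensity(2.0, 24.0)` is about 0.083 FLOP per byte. Such a kernel is limited by memory bandwidth rather than by compute on most current CPUs and GPUs.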

Tuesday 10 December, from 2pm to 5pm

  • Writing C++/Kokkos computing kernels using parallel programming patterns: for loops, reduce loops, scan loops.
  • Overview of the execution space and memory space concepts: why we need them and how to use them. The concept of execution policy.
  • Using hardware-aware data containers: multidimensional arrays (Kokkos::View) and hash maps (Kokkos::UnorderedMap)
  • Profiling tools (Kokkos::Tools)
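For readers unfamiliar with the three patterns named in the first item above, the following is a minimal serial sketch in standard C++, not Kokkos code. The function names are hypothetical. In Kokkos, the corresponding entry points are `Kokkos::parallel_for`, `Kokkos::parallel_reduce` and `Kokkos::parallel_scan`; the sketch only shows what each pattern computes.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// "for" pattern: iteration i writes only out[i], so all iterations are
// independent and could run in any order, or in parallel.
std::vector<double> axpy(double a, const std::vector<double>& x,
                         const std::vector<double>& y) {
    assert(x.size() == y.size());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = a * x[i] + y[i];
    }
    return out;
}

// "reduce" pattern: every iteration contributes to one result (here a sum),
// so a parallel version must combine per-thread partial results.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}

// "scan" pattern: out[i] is the sum of counts[0..i] (inclusive prefix sum).
// Each output depends on all previous inputs, which is why scans need a
// dedicated parallel algorithm.
std::vector<int> prefix_sum(const std::vector<int>& counts) {
    std::vector<int> out(counts.size());
    std::partial_sum(counts.begin(), counts.end(), out.begin());
    return out;
}
```

A typical use of the scan pattern is to turn per-item counts into write offsets: given counts `{1, 2, 3, 4}`, `prefix_sum` returns `{1, 3, 6, 10}`.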

Wednesday 11 December, from 2pm to 5pm

  • Advanced use of Kokkos/C++: hierarchical parallelism (using teams of threads). Examples of use.
  • Coupling MPI and Kokkos for distributed applications. Introduction to Kokkos/RemoteSpace (experimental feature for accessing memory on remote nodes)
  • Overview of the Kokkos ecosystem: linear algebra (KokkosKernels), pykokkos (Python bindings)
  • Kokkos for Fortran users: Kokkos/FLCL (Fortran Language Compatibility Layer)

Final examination

A final exam will be conducted during the training.
