Introduction

In High-Performance Computing (HPC), hardware evolves rapidly without reaching complete standardization, so the cost incurred by code development and aging, known as technical debt, keeps increasing. Additionally, it is becoming harder for software to reach full maturity in terms of validity and performance. This draws experts' attention to these aspects, at the expense of managing technical debt. In areas such as physical modeling, numerical methods, applied mathematics and high-performance computing, the expertise needed to develop and master a code requires more time and more specialized skills than ever. This makes any waste of experts' time even more unacceptable.

Regarding the HPC domain, Reed et al. [1] suggested that high-performance technologies evolve faster than people can keep up with. For example, GPU-powered (Graphics Processing Unit) supercomputers have risen to prominence since 2010, yet many legacy HPC applications born before this era are still figuring out how to adapt to GPU-based supercomputers at an affordable cost. Meanwhile, codemetrics, a set of measurements that estimate code complexity, can provide accurate information to development teams. Codebase analysis is not the only focus of codemetrics [2, 3]: they also target the team involved [4], namely its perception of the code and how its members navigate and retrieve information from it.
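To make the notion of a codemetric concrete, here is a minimal sketch of McCabe's cyclomatic complexity [2], approximated as one plus the number of decision points. It relies on Python's standard `ast` module and is written for Python sources purely for illustration. The function name `cyclomatic_complexity` and the choice of counted node types are assumptions of this sketch, and production tools differ in such details.

```python
import ast

# Node types counted as decision points (an assumption of this sketch;
# real tools differ on details such as comprehensions or assert).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" holds three values, i.e. two extra branches
            complexity += len(node.values) - 1
    return complexity


# Hypothetical snippet used only to exercise the function.
SAMPLE = """
def clip(x, lo, hi):
    if x < lo or x > hi:
        for _ in range(3):
            pass
    return x
"""

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

On the sample, the `if`, the `or` and the `for` each add one decision point to the base value of one.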

Codemetrics

Codemetrics is a recent field of investigation, well established in mainstream software development but seldom applied to HPC software. After a short state of the art of existing solutions, the need for HPC-specific codemetrics is detailed.

Existing solutions

Existing solutions address mainstream development processes, focusing particularly on the detection and management of complex code segments. Some of these solutions, such as CodeScene (codescene.com) and Doxygen (doxygen.nl), offer graphical representations of code: a network view that illustrates interdependencies within the codebase.

Tools such as those provided by SonarSource (sonarsource.com) and CodeScene give users qualitative insights into their codebases, such as maintainability, duplication rate and coverage. Codee (codee.com) offers solutions tailored to the unique challenges posed by legacy systems: different coding standards and versions of programming languages are covered. These types of tools have already been used in academic studies [5, 6] to bridge the gap between the computed metrics and the overall perception that a set of developers has of given code snippets.

Why HPC needs tailored codemetrics

Regrettably, mainstream codemetrics yield highly unfavorable results when applied to HPC software. A codebase originating in the 1990s, amalgamating the efforts of numerous PhD students, and continually adjusted to keep pace with the latest hardware advancements, inevitably exhibits significant complexity, limited contributor engagement, code bloat, and instances of dead code, among other issues.

The current approach concentrates on well-established, and consequently successful, HPC Computational Fluid Dynamics (CFD) software, assessed through codemetrics. To the best of our knowledge, there has been no systematic comparison between communities and projects within the HPC realm.

A community-aware analysis will be employed to elucidate the human aspect of the development process. The technical debt of HPC codes will be evaluated from both a historical standpoint and a structural perspective.

First, building on the Git version control system, the in-house tool Anubis (gitlab.com/cerfacs/anubis) gives access to the history of the codebase and its evolution. It allows for a deeper understanding of developers' relationships with the codebase and provides a visualization of workforce dynamics.

A second tool, Maraudersmap (gitlab.com/cerfacs/maraudersmap), creates a geographical representation of the codebase. Through the depiction of a callgraph, which illustrates a network of code blocks and their interdependencies, we can discern inadequate structural practices and streamline the tracking of complexity, size, and dependencies.

Given that HPC software projects rarely adhere to stringent structural designs, the resulting networks are vast and intricate, necessitating extensive filtering before any meaningful features can be discerned, as in Fig. 1.

fig_1

Callgraph of the HPC software Neko [7]. Hardware-specific, low-level code (cpu, gpu, sx) emerges as green clusters.
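The idea of building a callgraph and then filtering it can be sketched in a few lines. The example below is not the Maraudersmap implementation: it is a simplified illustration that parses Python source with the standard `ast` module (HPC codes are typically Fortran or C++), records direct calls between top-level functions, and hides nodes by name prefix. The helper names `build_callgraph` and `filter_callgraph` and the sample source are hypothetical.

```python
import ast


def build_callgraph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the names it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        callees = graph.setdefault(func.name, set())
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callees.add(node.func.id)
    return graph


def filter_callgraph(graph, hide_prefixes=()):
    """Drop hidden nodes and any edge pointing outside the kept nodes."""
    hidden = tuple(hide_prefixes)
    keep = {name for name in graph if not name.startswith(hidden)}
    return {name: {c for c in graph[name] if c in keep} for name in sorted(keep)}


# Hypothetical source mimicking hardware-specific backends.
SAMPLE = """
def gpu_axpy(): pass
def cpu_axpy(): pass
def solve():
    print("x")
    cpu_axpy()
    gpu_axpy()
def main():
    solve()
"""

if __name__ == "__main__":
    full = build_callgraph(SAMPLE)
    print(filter_callgraph(full, hide_prefixes=("gpu_",)))
```

Filtering removes both the hidden backend and calls to names defined elsewhere (here `print`), which is the kind of pruning needed before a large graph becomes readable.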

The approach assesses technical deficiencies such as structural complexities and comprehension challenges, alongside human shortcomings including entrenched habits, deviations, and divergent practices. Special attention is paid to gradual shifts occurring over several years, which are often challenging to detect within the timeframe of a developer’s perspective.
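Gradual shifts of this kind can be made visible by aggregating the Git history per contributor and per year. The sketch below is not the Anubis implementation. It assumes the history was exported with `git log --numstat --date=short --pretty=format:"@%an|%ad"`, so that each commit starts with a line `@author|YYYY-MM-DD` followed by tab-separated `added`, `deleted` and `path` fields. The function name and the sample log are hypothetical.

```python
from collections import defaultdict


def churn_by_author_year(log_text: str) -> dict:
    """Sum added + deleted lines per (author, year) from git log text."""
    churn = defaultdict(int)
    key = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # Commit header written by the assumed --pretty format
            author, date = line[1:].rsplit("|", 1)
            key = (author, int(date[:4]))
        elif line.strip() and key is not None:
            added, deleted, _path = line.split("\t", 2)
            if added != "-":  # binary files are reported as "-"
                churn[key] += int(added) + int(deleted)
    return dict(churn)


# Hypothetical log excerpt in the assumed format.
SAMPLE_LOG = (
    "@alice|2012-05-01\n"
    "10\t2\tsrc/solver.f90\n"
    "-\t-\tdoc/logo.png\n"
    "\n"
    "@bob|2014-01-15\n"
    "100\t80\tsrc/solver.f90\n"
    "\n"
    "@alice|2012-11-30\n"
    "5\t0\tsrc/io.f90\n"
)

if __name__ == "__main__":
    for (author, year), lines in sorted(churn_by_author_year(SAMPLE_LOG).items()):
        print(author, year, lines)
```

Plotting such totals year by year gives a first view of workforce dynamics, for instance when a contributor stops touching a part of the code.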

Case studies

The primary objective is to gather data from authentic projects encompassing diverse HPC codes characterized by varying standards, teams, and management approaches. Ultimately, the goal is to uncover commonalities among these communities to pinpoint the most effective and enduring strategies employed by teams in the HPC field. Naturally, a wide array of development standards was observed. Legacy systems frequently keep parts written in older versions of programming languages, such as Fortran 77 and 90.

Eight large codes (> 100k lines) have been analyzed: Alya, AVBP, Yales2, Neko, NekRS, Nek5000, MesoNH, Oasis-MCT. The first interesting angle is to compare a code and its community with their own state several years earlier, as shown in Fig. 2.

fig_2

Lines of code of the HPC-CFD solver AVBP over time, with colors representing the date of the last edit. A complete refactoring occurred between 2012 and 2014.
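A snapshot of such a view, the number of lines grouped by year of last edit, can be sketched from the output of `git blame --line-porcelain <file>`, which repeats an `author-time <epoch seconds>` header for every line of the file. The code below is an illustrative assumption about how such a count could be obtained, not the method used to produce the figure. The function name and the abbreviated sample are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def lines_by_last_edit_year(porcelain: str) -> Counter:
    """Count source lines per year of last edit from blame porcelain text."""
    years = Counter()
    for line in porcelain.splitlines():
        # Source lines start with a tab, so they never match this header
        if line.startswith("author-time "):
            stamp = int(line.split()[1])
            years[datetime.fromtimestamp(stamp, tz=timezone.utc).year] += 1
    return years


# Abbreviated, hypothetical porcelain output (real output has more headers).
SAMPLE_BLAME = (
    "abc123 1 1 2\nauthor alice\nauthor-time 1340000000\n\tprogram main\n"
    "abc123 2 2\nauthor alice\nauthor-time 1340000000\n\t  call solve()\n"
    "def456 3 3 1\nauthor bob\nauthor-time 1400000000\n\tend program main\n"
)

if __name__ == "__main__":
    print(sorted(lines_by_last_edit_year(SAMPLE_BLAME).items()))
```

Repeating the count on successive revisions of the repository would give the evolution over time, and a large refactoring would show up as a sudden replacement of old lines by recent ones.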

The second approach involves comparing two codes with similar technologies and sizes. These comparisons yield valuable insights into how a code evolves and is monitored over time. The challenges faced by developers were often associated with suboptimal programming practices and code maintenance.

Conclusions

In summary, the comparison provides valuable insights into how real teams effectively manage technical debt in High-Performance Computing (HPC) software development. By examining both technical and human aspects, the analysis enables a comprehensive understanding of codebase evolution and workforce dynamics. Given the rising cost of achieving optimal performance on the latest hardware, we believe that closely monitoring the evolution of our software and community is essential to sustaining HPC.

Acknowledgement

Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Germany, Italy, Slovenia, Spain, Sweden, and France under grant agreement No 101092621.

fig_3 fig_4

Disclaimer

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European High Performance Computing Joint Undertaking (JU) and Germany, Italy, Slovenia, Spain, Sweden, and France. Neither the European Union nor the granting authority can be held responsible for them.

References

  1. Reed et al. (2022) - Reinventing High Performance Computing: Challenges and Opportunities, Daniel Reed, Dennis Gannon, Jack Dongarra, 2022, arXiv:2203.02544 [cs.DC].

  2. McCabe (1976) - A Complexity Measure, T.J. McCabe, IEEE Transactions on Software Engineering, vol. SE-2, no. 4, pp. 308-320, 1976, doi:10.1109/TSE.1976.233837.

  3. Tornhill (2015) - Your Code as a Crime Scene, Adam Tornhill, Pragmatic Bookshelf, vol. 1, 2015.

  4. Himayat and Ahmad (2023) - Software Understandability using Software Metrics: An Exhaustive Review, Saif Himayat, Dr Ahmad, SSRN Electronic Journal, 2023, doi:10.2139/ssrn.4447189.

  5. Lenarduzzi et al. (2023) - Does Cyclomatic or Cognitive Complexity Better Represents Code Understandability? An Empirical Investigation on the Developers Perception, Valentina Lenarduzzi, Terhi Kilamo, Andrea Janes, 2023, arXiv:2303.07722 [cs.SE].

  6. Lavazza et al. (2023) - An empirical study on software understandability and its dependence on code characteristics, Luigi Lavazza, Sandro Morasca, Marco Gatto, Empirical Software Engineering, vol. 28, no. 6, p. 155, 2023, doi:10.1007/s10664-023-10396-7.

  7. Jansson et al. (2021) - Neko: A Modern, Portable, and Scalable Framework for High-Fidelity Computational Fluid Dynamics, Niclas Jansson, Martin Karp, Artur Podobas, Stefano Markidis, Philipp Schlatter, 2021, arXiv:2107.01243 [cs.MS].



Thibault Marzlin is an engineer working on COOP Python tools.
Antoine Dauptain is a research scientist focused on computer science and engineering topics for HPC.
