Should scientific software be linted? A look at pylint

Photo fotografirende. A book is much more than a pile of sheets with words. There are paragraphs, margins, chapters, page numbers, tables, a cover, authors, an edition, etc. The same goes for code, which is meant to be read. Following some rules helps the reader a lot.

Scientific software is most often not written by computer scientists, but by people trained in other sciences. What’s more, getting the result right can be a significant challenge, requiring deep domain knowledge and focus. Unfortunately, this can lead to scientific correctness being the only criterion for “good” code. As software engineering practitioners know, this is a recipe for disaster.

While software engineering is not something scientists can hope to master in a short time, some automatic tools can significantly increase the quality of the code produced, for little effort. Linting is one of them. While many issues cannot be detected by automated linters, linted code is more readable, more homogeneous, and easier to maintain: a great step in the right direction.

In this post, we’ll try to show you how, through linting, a low effort can significantly increase your code quality.

We’ll focus on the Python language, and its popular linting package pylint. If you want to read more, but on the topic of large legacy scientific codes in Fortran, jump to scanning Fortran HPC codes.

Who is code written for?

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. ― Martin Fowler

Scientific software is read, not just by computers, but by humans, a lot. Notably, scientists need to know the details of the data they produce, and they aren’t likely to blindly trust a code base. What’s more, science is about exploring and inventing, so code is often tweaked and modified for new studies.

When your code is read by fellow scientists, there’s a chance you’ll get some questions. Probably a lot of them, if your code isn’t considered “readable” by your readers. That’s either going to take up a lot of your time to answer, or possibly discourage people from using your code. That’s too bad, they could have cited you!

Linters are a shortcut for this future interaction. You can think of them as an extremely pedantic friend. Many questions about your coding style, potential bugs and suspicious constructs will be dealt with as you write code, instead of waiting for someone to shout at you about them later.

But how does one define “readability”? What’s readable for me can be obscure to someone else. How can I ever get into the future readers’ heads? The simple answer is: through standards. Linters are the result of endless arguments between thousands of people. They are the pinnacle of compromise when it comes to coding style. In Python, the most consensual writing style is defined by the PEP-8 standard, which the popular pylint package enforces. It comes with a handy metric, the pylint score: you get a 10/10 for perfectly conforming to the standard, less if you stray from it.

With this handy metric, increasing code quality, and making sure as many people as possible will be able to read the code in the future, is simple: we just need to increase the pylint score!

Getting started with pylint

pylint isn’t part of Python’s standard library, so we first need to install it.

$ pip install pylint

Then, linting our code with it is as easy as:

$ pylint my_module.py 
************* Module my_module
my_module.py:38:0: C0303: Trailing whitespace (trailing-whitespace)
my_module.py:62:28: C0303: Trailing whitespace (trailing-whitespace)
my_module.py:66:0: C0305: Trailing newlines (trailing-newlines)
my_module.py:1:0: C0114: Missing module docstring (missing-module-docstring)
my_module.py:5:0: C0103: Constant name "h5file" doesn't conform to UPPER_CASE naming style (invalid-name)

------------------------------------------------------------------
Your code has been rated at 6.88/10 (previous run: 6.88/10, +0.00)

All right, 6.88/10, not bad! But there’s still room for improvement, and now we have a measure of it, as well as pointers to the locations of the offending statements.

Technical details

The exact formula for the score is:

10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)

Notice that:

The maximum is 10, but there is no lower bound. The score can be negative, in particular if the authors were not paying attention to the PEP-8.

The size of the code weights the errors, through the number of statements. Statements are the lines of code, unwrapped, without docstrings, comments, or blank lines. Therefore, scores for small and large modules are comparable.
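
To make the formula concrete, here is a minimal sketch that simply transcribes it into a Python function. The name `pylint_score` and the message counts below are illustrative assumptions, not part of pylint’s API. In particular, the count of 16 statements is a guess, chosen so that five convention messages land on the 6.88/10 of the run shown earlier. Recent pylint releases may ship a different default expression (it is exposed through the `evaluation` option), so check the one your version uses.

```python
def pylint_score(error, warning, refactor, convention, statement):
    """Transcription of the score formula quoted above (hypothetical helper)."""
    penalty = float(5 * error + warning + refactor + convention) / statement
    return 10.0 - penalty * 10


# Five convention messages in a module of 16 statements (assumed count).
print(f"{pylint_score(0, 0, 0, 5, 16):.2f}")

# Errors weigh five times more: three errors in ten statements
# push this formula below zero.
print(f"{pylint_score(3, 0, 0, 0, 10):.2f}")
```

The second call illustrates why, with this expression, nothing prevents a negative score: the penalty grows with the number of messages per statement, while the bonus is capped at 10.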

Are all pylint rules relevant to scientific software crafting?

PEP-8 is very thorough, and running pylint on your code can be overwhelming at first. In practice, many Python veterans like to say that the standard is more of a guide than a rule of law. So you could be tempted to ignore some of the warnings that pylint gives you. Here are some that we believe shouldn’t be overlooked, even for scientific software.

Bad naming (invalid-name)

Naming conventions might seem futile, but they are essential to code readability.

Names should be at least 3 characters long to be meaningful. Indeed, u often stands for “velocity along the x axis” in Computational Fluid Dynamics. But a one-letter variable could mean many things, depending on the context: u also stands for temperature in some heat equation examples. Confusion tends to decrease with the number of letters.

Always stick to lower case. The only exceptions are constants (UPPERCASE) and class names (CamelCase), and they are very rare. Homogeneous naming rules like this bring strong visual cues:

- if you see a blurb = CamelCase(...) somewhere, you instantly know that this builds an object, not your everyday assignment.
- if you see a GRAVITY somewhere, you instantly know that this constant will never change.
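
As an illustration of these visual cues, here is a small sketch mixing the three naming styles. The class `FallingBody` and the function `describe_fall` are made up for the example; only the conventions themselves come from PEP-8.

```python
import math

GRAVITY = 9.81  # UPPER_CASE: a constant that is never reassigned (m/s^2)


class FallingBody:
    """Point mass dropped without initial velocity (illustrative class)."""

    def __init__(self, initial_height):
        self.initial_height = initial_height  # in m

    def fall_time(self):
        """Return the time needed to reach the ground, in s."""
        return math.sqrt(2.0 * self.initial_height / GRAVITY)

    def impact_velocity(self):
        """Return the speed when hitting the ground, in m/s."""
        return GRAVITY * self.fall_time()


def describe_fall(initial_height):
    """Build an object (CamelCase call) and query it (snake_case names)."""
    falling_body = FallingBody(initial_height)
    return f"{falling_body.fall_time():.2f} s"
```

Reading `describe_fall`, the CamelCase call tells you an object is being built, and inside the class the upper-case `GRAVITY` tells you the value comes from a fixed constant.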

Missing docstrings (missing-function-docstring)

Python code can of course be commented using the # symbol. However, docstrings are a special type of comment that is well codified and enables lots of cool features, like automatic code documentation. In Python, inline comments are therefore discouraged in favour of clear docstrings.

Here’s an example of a docstring that’s compatible with Sphinx, Python’s most popular documentation tool, and its autodoc feature. The function’s purpose and arguments will be displayed nicely in the documentation.

    def add_obstacle_circle(self, x_c=None, y_c=None, radius=None):
        """Add circular obstacle.

        :param x_c: x center in m
        :param y_c: y center in m
        :param radius: radius in m
        """
        ...

To see an example of what this automatic documentation yields, check out this in-house example.
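
Unlike a # comment, a docstring stays attached to the object at run time, which is what documentation tools build upon. The sketch below shows this with the standard `inspect` module; the functions `circle_area` and `summary_line` are invented for the example, and this is only an illustration of the principle, not of how Sphinx is implemented.

```python
import inspect
import math


def circle_area(radius):
    """Compute the area of a circle.

    :param radius: radius in m
    :returns: area in m^2
    """
    return math.pi * radius**2


def summary_line(func):
    """Return the first line of a function's docstring."""
    return inspect.getdoc(func).splitlines()[0]


print(summary_line(circle_area))
```

The same text is what `help(circle_area)` displays in an interactive session.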

Too much stuff? (too-many-*)

- Too many branches (13/12) (too-many-branches): if you reach 12 options, maybe you overlooked some common traits. The rule of thumb is “666”: choose between 6 options max, then again 6 options max, then again 6 options (216 combinations).

“few people can understand more than three levels of nested ifs” ― Steve McConnell, Code Complete

- Too many statements (72/50) (too-many-statements): if your function/method goes beyond 50 statements, maybe you are doing a bit too much in one single stride. Consider whether you can move one piece of this code into a separate function/method.
- Too many local variables (21/15) (too-many-locals): if you have more than 15 local variables, maybe you are redefining too many things at a low level here (code smell: primitive obsession). Maybe a better data structure (a class or dataclass) would simplify things?
- Too many nested blocks (6/5) (too-many-nested-blocks): if you have too many nested blocks (>5), maybe you can move some pieces into dedicated functions. These pieces will have their own docstrings and, if they are well written, you can abstract away some complexity when reading the current function.
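
To give an idea of the “better data structure” route, here is a minimal sketch using a dataclass. The class `CircleObstacle` and its method are hypothetical, loosely inspired by the obstacle example above: three loose local variables become one object, and the piece of logic that used them gets its own name and docstring.

```python
from dataclasses import dataclass


@dataclass
class CircleObstacle:
    """Circular obstacle described by its center and radius, all in m."""

    x_c: float
    y_c: float
    radius: float

    def contains(self, x_pos, y_pos):
        """Tell whether the point (x_pos, y_pos) lies inside the obstacle."""
        dist_sq = (x_pos - self.x_c) ** 2 + (y_pos - self.y_c) ** 2
        return dist_sq <= self.radius**2
```

A caller now handles a single `obstacle` variable and writes `obstacle.contains(x_pos, y_pos)`, instead of carrying `x_c`, `y_c` and `radius` around and repeating the distance test.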

Unnecessary “else” after “return” (no-else-return)

This one is a bit tricky. We recommend reading this post on Else after return. As a rule of thumb, try to comply with pylint and use an implicit else, and if you end up with something too cryptic, go explicit.
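
For readers who have never met this message, here is a minimal sketch of the two forms. Both functions are invented for the example and behave identically; only the second one is in the style pylint asks for.

```python
def clip_explicit(value, upper):
    """Version with an explicit else: pylint reports no-else-return."""
    if value > upper:
        return upper
    else:
        return value


def clip_implicit(value, upper):
    """Version with the implicit else that pylint asks for."""
    if value > upper:
        return upper
    return value
```

On such a short function the implicit form is easy to read. With several early returns in a long function, the explicit form can be the clearer one, which is when disabling the message locally makes sense.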

Do we care about spacings and trailing space?

Yes, and no excuses. Tools such as autopep8 can take care of these menial things for you (and can do a lot more). If you are in a bad mood, you can also use black, the most uncompromising tool of all (almost no customization allowed).

$ black flame_metric.py
reformatted flame_metric.py
All done! ✨ 🍰 ✨
1 file reformatted.

Disabling rules locally

If you think pylint is too nagging and some rules really don’t apply to you, you can easily disable them at various levels. Read the message control section to see the different levels where disabling/enabling rules can take place.

As an example, to disable the too-many-branches warning in a single function, add a specific comment at the top of the function body (the same comment on its own line before the definition would apply to the rest of the module):

    def solve_climate_crisis(world):
        """Unacceptable population control policy."""
        # pylint: disable=too-many-branches

        if country == "groland":
            ...
        elif country == "mordor":
            ...
        elif ...:
            ...
        ...
        return new_world

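The levels described in the message control section can be sketched as follows. Both functions are invented for the example; the comments describe the behaviour as we understand it from the pylint documentation, so double-check against your version.

```python
def kinetic_energy(m, v):  # pylint: disable=invalid-name
    """Return the kinetic energy in J of a mass m (kg) moving at v (m/s)."""
    # The trailing comment on the def line silences the message
    # for the short argument names m and v.
    return 0.5 * m * v**2


def reynolds_number(density, velocity, length, viscosity):
    """Return the (dimensionless) Reynolds number."""
    # A comment on its own line applies from here to the end of
    # the enclosing block, here the function body.
    # pylint: disable=invalid-name
    Re = density * velocity * length / viscosity
    return Re
```

Keeping the comment as close as possible to the offending code documents the exception for the next reader, and leaves the rule active everywhere else.
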
What about this 79-character limit on a single line?

This is the oldest argument, I guess. Sticking to the 79 characters of PEP-8 has nice advantages: code fits better in 3-column merge tools, is easier to copy/paste into a slide, and can be easily inserted into printable documents. However, in the age of 4K screens, many choose to relax this constraint slightly, which in turn reduces the number of line breaks you have to make.

Even if you configure a limit of 100 or 110, exceptions can be made, as stated before. When a line is too long but wrapping would make it awkward to read, you can just disable the line limit locally:

this = is * a * reaallly(long, line) - spanning * (over / 100 * characters)  # pylint: disable=line-too-long

Here again, autopep8 and black can do the wrapping for you. Be aware that black handles indentation in some ways that pylint does not like, due to a different interpretation of PEP-8.
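
When wrapping by hand, the usual Python idiom is to break inside parentheses rather than with backslashes. Here is a minimal sketch; the function `total_energy` is invented for the example and the layout shown is one common choice among several.

```python
def total_energy(mass, velocity, height, gravity=9.81):
    """Return kinetic plus potential energy, in J."""
    # Inside parentheses, Python lets an expression span several lines.
    return (
        0.5 * mass * velocity**2
        + mass * gravity * height
    )
```

Each term gets its own line, which often reads better than one long line even when the limit would have allowed it.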

Going too far

Have you heard of Goodhart’s law? When a measure becomes a target, it ceases to be a good measure. So if the pylint score becomes your only target, bad things can happen to your code. For example:

Too many arguments (6/5) (too-many-arguments)

is a clue that this function is using a lot of inputs:

def awesome_func(arg1_path, arg2_start, arg3_end, arg4_step, arg5_version, arg6_binary):
    """
    Read a series of data files PATH_{ITER}.h5 
    where ITER runs from START to END by STEP

    arg1_path : string, full path to data files
    arg2_start : integer, START of indexes
    arg3_end : integer, END of indexes
    arg4_step : integer, STEP of indexes
    arg5_version : string, either "legacy" or "V7*"
    arg6_binary: boolean, True if binary
    """
    (...)

The input is made of plenty of little primitive bits. pylint is saying:

Your input is scattered over many arguments: maybe these arguments are too primitive - or maybe your function does too many actions and could be split?

A smart answer

You could, for example, make a FileBundle class that points to a set of files. As a bonus, you could enrich the information about these files, such as whether they are in binary format and what their version is. Now, your function expects a single argument arg1_filebundle of type FileBundle:

def awesome_func(arg1_filebundle):
    """
    Read a series of files from a FileBundle

    arg1_filebundle : FileBundle object pointing to the files to read
    """
    (...)
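
What such a class could look like is sketched below. Everything here is an assumption made for illustration: the field names, the defaults, the `filenames` method, and the choice to include END in the range. The body given to `awesome_func` is a placeholder as well, since the original one is elided.

```python
from dataclasses import dataclass


@dataclass
class FileBundle:
    """Set of data files PATH_{ITER}.h5 (hypothetical sketch)."""

    path: str  # full path prefix of the data files
    start: int  # first index
    end: int  # last index, assumed inclusive here
    step: int = 1
    version: str = "legacy"  # either "legacy" or "V7*"
    binary: bool = False

    def filenames(self):
        """Return the list of file names covered by the bundle."""
        indexes = range(self.start, self.end + 1, self.step)
        return [f"{self.path}_{index}.h5" for index in indexes]


def awesome_func(arg1_filebundle):
    """Placeholder body: only list the files that would be read."""
    return arg1_filebundle.filenames()
```

For instance, `awesome_func(FileBundle("run", 0, 20, 10))` lists three file names. The six primitive arguments have not disappeared, but they now live in one place, with names, defaults and a docstring of their own.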

A dangerous answer

You could also think: “let’s replace 2 arguments by a single *args, so we’re back to 5 arguments.”

def awesome_func(arg1_path, arg2_start, arg3_end, arg4_step, *args):
    """
    Read a series of data files PATH_{ITER}.h5
    where ITER runs from START to END by STEP

    arg1_path : string, full path to data files
    arg2_start : integer, START of indexes
    arg3_end : integer, END of indexes
    arg4_step : integer, STEP of indexes

    Extra positional arguments

    *args = [arg5_version, arg6_binary]
    arg5_version : string, either "legacy" or "V7*"
    arg6_binary : boolean, True if binary
    """

Technically, that will indeed make the warning go away and raise the pylint score, but obviously it doesn’t respect the intent of the rule. *args is not meant to shorten a list of positional arguments at all (read about *args and **kwargs here).

So, simply put, the tool is automatic, but you’re still expected to exercise good judgment when interpreting its result. That’s why seasoned developers rarely look for a 10/10 score. Either they ignore some rules explicitly, or they settle for a certain bar, like 8/10.
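
Settling for a bar can itself be automated. pylint has a `--fail-under` option for this purpose; the sketch below shows the same idea by hand, reading the score from the last line of a report. The helper names and the regular expression are assumptions made for the example, written against the report format shown earlier in this post.

```python
import re

# Matches the score in a line such as
# "Your code has been rated at 6.88/10 (previous run: ...)".
RATING_PATTERN = re.compile(r"rated at (-?\d+(?:\.\d+)?)/10")


def extract_score(report):
    """Return the score found in a pylint report, as a float."""
    match = RATING_PATTERN.search(report)
    if match is None:
        raise ValueError("no rating line found in the report")
    return float(match.group(1))


def passes_bar(report, minimum=8.0):
    """Tell whether the reported score reaches the chosen bar."""
    return extract_score(report) >= minimum
```

With the report shown at the beginning of this post, `passes_bar` would answer no for a bar of 8/10 and yes for a bar of 6/10.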

Key Takeaways

While tedious for the lonely programmer, a linter is critical for a community, and scientific programming is not an exception.

A linter is for you if:

you cannot bear any more criticism on your style.
you want people to read and contribute to your code.
you like this “what could be improved next?” feeling.
you like to boast about your high scores, but are too lazy to do it with test coverage.

It is not for you if:

you will put this code in the trashcan tonight.
your coding style is already flawless (ha!).
you do not want other people to read your code.

(Photo Dim Hou) Handwritten notes are enough for your own perusal, and can be shared with a friend or two. But if you expect to have more readers, you will eventually move to stricter formatting rules. The same goes for code…

Antoine Dauptain is a research scientist on computer science and engineering for HPC. He is the assistant team leader of COOP.

Corentin Lapeyre was a research scientist focused on AI for physical modeling at Cerfacs until Feb. 2024. He now works at NVIDIA.