Using conda at Cerfacs
By Victor Xing - 2020/06/15
What is conda?
conda is a language-agnostic package and environment manager. For Python, it can replace the combined use of pip and venv. Using virtual environments is heavily encouraged at Cerfacs because they offer project-specific, user-controlled, clean work environments. For the most part, conda and pip+venv are interchangeable, and choosing one or the other comes down to user preference. I personally like the flexibility of conda and its ability to easily control the Python version of virtual environments.
There are many conda tutorials out there. This post will focus on how to use it on the Kraken and Nemo internal clusters at Cerfacs.
Setting up conda on the internal clusters
At the time of writing, conda versions 4.4.10 and 4.5.1 are installed on Kraken and Nemo, respectively.
- Load the python3.6/anaconda module
module load python3.6/anaconda
This will enable all conda commands and activate a default base environment containing a few basic packages.
- To use project-specific packages, you should create your own virtual environment.
An important caveat is that by default, conda stores its cache (containing package builds that can be reused for later installs) and its environments in the home directory. Since the home directory on the clusters has limited disk space and a limited number of inodes, you should specify a better-suited location, for instance a dedicated pyenvs directory in the scratch space. Create a ~/.condarc file and add the following lines:
envs_dirs:
- /scratch/coop/<yourusername>/pyenvs
pkgs_dirs:
- /scratch/coop/<yourusername>/pyenvs/conda-pkgs
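The same configuration can also be written from the shell. The following is only a minimal sketch: write_condarc is a hypothetical helper (not a conda command), and it overwrites the target file, so check for an existing ~/.condarc before using it.

```shell
#!/bin/sh
# Sketch: write a .condarc that points conda at a scratch directory.
# write_condarc is a hypothetical helper, not a conda command.
# Warning: it overwrites the target file if it already exists.
write_condarc() {
    condarc_path="$1"   # file to write, e.g. "$HOME/.condarc"
    env_root="$2"       # e.g. "/scratch/coop/$USER/pyenvs"
    # Make sure the environment and package cache directories exist.
    mkdir -p "$env_root/conda-pkgs"
    cat > "$condarc_path" <<EOF
envs_dirs:
- $env_root
pkgs_dirs:
- $env_root/conda-pkgs
EOF
}

# Example call (commented out so that sourcing this file changes nothing):
# write_condarc "$HOME/.condarc" "/scratch/coop/$USER/pyenvs"
```

After loading the module, conda info should list the new locations under its envs directories and package cache entries.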
- Create your virtual environment. In this example, it is named toto, runs on Python 3.7, and contains the tensorflow package version 2.1.0 with all its dependencies.
conda create --name=toto python=3.7 tensorflow=2.1.0
This will create a /scratch/coop/<yourusername>/pyenvs/toto directory with your new environment. To create an environment at a different location from the one specified in envs_dirs, go to that directory and use --prefix, which takes the path of the environment, instead of --name.
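As an illustration of the --prefix variant, here is a minimal sketch. create_env_at is a hypothetical wrapper and the path in the example is an assumption; only conda create --prefix and conda activate with a path are actual conda usage.

```shell
#!/bin/sh
# Sketch: create an environment at an explicit path instead of by name.
# create_env_at is a hypothetical helper wrapping conda create.
create_env_at() {
    env_path="$1"   # target directory for the environment
    shift           # remaining arguments are package specs
    conda create --prefix "$env_path" "$@"
}

# Example (hypothetical path), to be run after loading the module:
# create_env_at "$PWD/toto" python=3.7 tensorflow=2.1.0
# conda activate "$PWD/toto"
```

An environment created this way has no name, so it is activated by passing its path to conda activate.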
When creating the environment, list all the packages you already know you will need so that conda can resolve their dependencies together.
- Activate your environment
conda activate toto
- You can still install additional packages later; conda will ask you to confirm any changes needed to resolve dependency conflicts
conda install numpy=1.19
Notes
- Conda uses its own package channels, so some PyPI packages may not be available directly via conda install. In this case, you can use pip in your conda environment and it will work just the same as usual. Only use pip if the package is not available in any conda channel.
- To enable your virtual environment (conda or venv) inside Jupyter notebooks, activate your environment and install the ipykernel package. Then run the following command
python -m ipykernel install --user
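If you work with several environments, it may help to give each kernel its own name so that they stay distinguishable in the notebook menu. The following is a sketch under that assumption: register_kernel is a hypothetical helper, while --name and --display-name are options of the ipykernel install command.

```shell
#!/bin/sh
# Sketch: register the active environment as a named Jupyter kernel.
# register_kernel is a hypothetical helper, not part of conda or ipykernel.
register_kernel() {
    kernel_name="$1"   # internal kernel name, e.g. toto
    # --name sets the kernel's identifier; --display-name is the label
    # shown in the notebook kernel menu.
    python -m ipykernel install --user \
        --name "$kernel_name" \
        --display-name "Python ($kernel_name)"
}

# Example, to be run with the environment activated and ipykernel installed:
# register_kernel toto
```

Running jupyter kernelspec list afterwards shows the kernels that are currently registered.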