
Python Packages with Virtual Environments

tip

This page describes how you can create virtual environments without containers in $SCRATCH or /home/$USER. We strongly encourage you to create your computational environments within Overlay files attached to Apptainer containers as described in this section. Please note that creating both kinds of conda environments is strongly discouraged, as it can lead to packages from one environment being accidentally used in another.

In order to be able to install new Python packages and make your work reproducible, please use virtual environments. There is more than one way to create a private environment in Python.

Create project directory and load Python module

## find the Python version you need
module avail python
## create a directory for your project and cd there
mkdir /scratch/$USER/my_project
cd /scratch/$USER/my_project
## load a Python module (different versions are available)
module load python/intel/3.8.6

Automatic deletion of your files

This page describes the installation of packages on /scratch. One has to remember, though, that files stored in the HPC scratch file system are subject to the HPC Scratch old file purging policy:

Files on the /scratch file system that have not been accessed for 60 or more days will be purged (see HPC Storage for details).

tip

Thus you can consider the following options:

  • Reinstall your packages if some of the files get deleted
    • You can do this manually
    • You can do this automatically, for example as a step in a workflow run by pipeline software such as Nextflow
  • Pay for "Research Project Space" - for details see Research Project Space
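
To decide when a reinstall might be needed, it can help to see which files are approaching the purge threshold. The following is a minimal sketch, not an official tool: the helper name `list_stale_files` and the 50-day margin are assumptions chosen for illustration, and it relies on the access times reported by the file system, which may not match exactly what the purge process uses.

```shell
# Hypothetical helper: list regular files under a directory whose last
# access is older than a given number of days.
# Usage: list_stale_files DIR DAYS
list_stale_files() {
    # -atime +N matches files whose access age, counted in whole days,
    # is greater than N
    find "$1" -type f -atime "+$2"
}

# Example (not run here): show files that have gone more than 50 days
# without access, leaving a margin before the 60-day threshold.
# list_stale_files "/scratch/$USER/my_project" 50
```

If the list includes files inside your environment directory, reinstalling the environment is usually simpler than trying to keep individual files alive.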

Create virtual environment

It is advisable to create a private environment inside the project directory. This improves reproducibility and does not use space in /home/$USER:

## create a directory for your project (if it does not exist yet) and cd there
mkdir -p /scratch/$USER/my_project
cd /scratch/$USER/my_project

virtualenv

virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library under the venv module. You can create a new virtual environment in two ways:

  • empty
## Create an EMPTY virtual environment
virtualenv venv
  • inherit all packages from those installed on HPC already (and available in PATH after you load python module)
## Create a virtual environment that inherits system packages
virtualenv venv --system-site-packages

venv

venv is a package shipped with Python. It provides a subset of the options available in the virtualenv tool.

Create a new virtual environment in the current directory:

  • empty
## Use the venv module to create an environment called "venv"
python -m venv venv
  • inherit all packages from those installed on HPC already (and available in PATH after you load python module)
## Inherit all system packages
python -m venv venv --system-site-packages
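
If you later forget which of the two variants an environment was created with, the choice is recorded in the `pyvenv.cfg` file at the top of the environment directory, under the key `include-system-site-packages`. The sketch below reads that key; the helper name `venv_inherits_system_packages` is hypothetical, and it assumes the environment was created by venv or by a recent virtualenv release that writes the same key.

```shell
# Hypothetical helper: succeed if the environment in directory $1 was
# created with --system-site-packages.
# Relies on the "include-system-site-packages" key in pyvenv.cfg.
venv_inherits_system_packages() {
    grep -Eqi '^include-system-site-packages[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1/pyvenv.cfg"
}

# Example (not run here):
# if venv_inherits_system_packages venv; then
#     echo "venv inherits system packages"
# else
#     echo "venv is isolated"
# fi
```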

Install packages. Keep things reproducible

## activate the environment
source venv/bin/activate
## install packages
pip install <package you need>
## if a package was inherited, but you want to install it in your own env anyway
pip install <package you need> --ignore-installed
## export the list of packages (to report together with a paper and/or to reproduce the environment on another computer)
pip freeze > requirements.txt
## restore the environment from that list
pip install -r requirements.txt
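
Because files on /scratch can be purged, the saved requirements.txt is also what lets you rebuild the environment. The following is a rough sketch of an automatic rebuild; both helper names are hypothetical, the intactness check is deliberately coarse (it only looks at two key files, so a purge that removed individual package files would go unnoticed), and it assumes the python module is already loaded.

```shell
# Hypothetical helper: succeed only if the key files of the environment
# in directory $1 are present.
venv_is_intact() {
    [ -x "$1/bin/python" ] && [ -f "$1/bin/activate" ]
}

# Hypothetical helper: recreate the environment in directory $1 from the
# requirements file $2 when it is missing or incomplete.
rebuild_venv_if_needed() {
    if venv_is_intact "$1"; then
        echo "venv at $1 looks intact"
    else
        # --clear empties the target directory before creating the environment
        python -m venv --clear "$1"
        "$1/bin/pip" install -r "$2"
    fi
}

# Example (not run here):
# rebuild_venv_if_needed venv requirements.txt
```

Keeping requirements.txt somewhere that is not subject to purging, such as under version control with your project code, makes this approach more robust.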

Close an Activated Virtual Environment

If you have activated a virtual environment, you can exit it with the following command:

deactivate
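
To check whether an environment is currently active, you can inspect the `VIRTUAL_ENV` variable, which the activate script sets and `deactivate` unsets. A small sketch, with the helper name `active_venv` being hypothetical:

```shell
# Hypothetical helper: print the path of the active environment, if any.
active_venv() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "no virtual environment active"
    fi
}

active_venv
```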

Use with sbatch

When you use this environment in an sbatch script, please use the following commands:

module purge;
source venv/bin/activate;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK;
python python_script.py
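
One detail worth guarding against: Slurm sets `SLURM_CPUS_PER_TASK` only when the job requests `--cpus-per-task`, so without that option the export above would leave `OMP_NUM_THREADS` empty. The sketch below falls back to a single thread in that case; the helper name `set_omp_threads` and the default of 1 are assumptions for illustration, not site policy.

```shell
# Hypothetical helper for the top of a job script.
# Falls back to one thread when SLURM_CPUS_PER_TASK is unset or empty.
set_omp_threads() {
    OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
    export OMP_NUM_THREADS
}

set_omp_threads
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Alternatively, always request `--cpus-per-task` explicitly in the job script so the variable is defined.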

If you use MPI:

mpiexec bash -c "module purge;
source venv/bin/activate;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK;
python python_script.py"