Python Packages with Virtual Environments
This page describes how you can create virtual environments without containers in $SCRATCH or /home/$USER. We strongly encourage you to create your computational environments within overlay files attached to Apptainer containers instead, as described in this section. Please note that creating both kinds of conda environments is strongly discouraged, as it leads to packages from one environment being accidentally used in the other.
To install new Python packages and keep your work reproducible, please use virtual environments. There is more than one way to create a private environment in Python.
Create project directory and load Python module
## Find the Python version you need
module avail python
## Create a directory for your project and cd into it
mkdir /scratch/$USER/my_project
cd /scratch/$USER/my_project
## load python module (different versions available)
module load python/intel/3.8.6
Automatic deletion of your files
This page describes installing packages on /scratch. Keep in mind, though, that files stored on the HPC scratch file system are subject to the HPC Scratch old-file purging policy:
Files on the /scratch file system that have not been accessed for 60 or more days will be purged (see HPC Storage for details).
You can therefore consider the following options:
- Reinstall your packages if some of the files get deleted
  - You can do this manually
  - You can do this automatically, for example within a workflow managed by pipeline software such as Nextflow
- Pay for "Research Project Space" (for details, see Research Project Space)
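To see which files are approaching the purge threshold before anything is deleted, a minimal sketch with find follows. The function name list_purge_candidates, the 50-day warning margin, and the example path are illustrative assumptions, not part of the HPC policy. The sketch also assumes GNU find and a file system that records access times.

```shell
#!/bin/bash
# Hypothetical helper: list files whose last access is older than a warning
# threshold, so they can be refreshed or reinstalled before the purge.
# The default of 50 days is an assumed safety margin below the 60-day policy.
list_purge_candidates() {
    local dir="$1"
    local warn_days="${2:-50}"
    # -atime +N matches files last accessed more than N days ago
    # (find counts whole 24-hour periods)
    find "$dir" -type f -atime +"$warn_days"
}

# Example (assumed project path from this page):
# list_purge_candidates "/scratch/$USER/my_project"
```

If the list contains files inside venv, reinstalling the environment from requirements.txt is usually simpler than trying to restore individual files.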
Create virtual environment
It is advisable to create the private environment inside the project directory. This improves reproducibility and does not use space in /home/$USER:
## Create a directory for your project (if it does not exist yet) and cd into it
mkdir -p /scratch/$USER/my_project
cd /scratch/$USER/my_project
virtualenv
virtualenv is a tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard library as the venv module. You can create a new virtual environment in two ways:
- empty
## Create an EMPTY virtual environment
virtualenv venv
- inherit all packages already installed on the HPC (and available in PATH after you load the python module)
## Create a virtual environment that inherits system packages
virtualenv venv --system-site-packages
venv
venv is a package shipped with Python. It provides a subset of the options available in the virtualenv tool.
Create a new virtual environment in the current directory:
- empty
## Use the venv module to create an environment called "venv"
python -m venv venv
- inherit all packages already installed on the HPC (and available in PATH after you load the python module)
## Inherit all packages
python -m venv venv --system-site-packages
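If you later need to know whether an existing environment was created empty or with inherited packages, its pyvenv.cfg file records the choice. The following is a minimal sketch. The function name inherits_system_packages is hypothetical, and the sketch assumes the environment was created by venv or a recent virtualenv, both of which write an include-system-site-packages line into pyvenv.cfg.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the environment in the given directory
# was created with --system-site-packages.
inherits_system_packages() {
    local env_dir="${1:-venv}"
    # pyvenv.cfg contains a line such as "include-system-site-packages = true"
    grep -Eiq '^include-system-site-packages[[:space:]]*=[[:space:]]*true' \
        "$env_dir/pyvenv.cfg" 2>/dev/null
}

# Example (assumes an environment called "venv" in the current directory):
# if inherits_system_packages venv; then
#     echo "venv inherits system packages"
# else
#     echo "venv is isolated (or pyvenv.cfg is missing)"
# fi
```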
Install packages. Keep things reproducible
## activate
source venv/bin/activate
## install packages
pip install <package you need>
## If a package was inherited but you want to install it in your own env anyway
pip install <package you need> --ignore-installed
## export list of packages (to report together with paper and/or to reproduce environment on another computer)
pip freeze > requirements.txt
## restore
pip install -r requirements.txt
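A requirements file reproduces an environment reliably only if every entry pins an exact version, which pip freeze normally does with name==version lines. For files that were edited by hand, a minimal sketch of a check follows. The function name list_unpinned is hypothetical. Note that entries pip freeze writes for packages installed from a local path or a repository (for example lines starting with -e) do not contain == and would be flagged as well.

```shell
#!/bin/bash
# Hypothetical helper: print entries of a requirements file that do not
# pin an exact version with "==".
list_unpinned() {
    local req_file="${1:-requirements.txt}"
    # Drop blank lines and comments, then keep only lines without "=="
    grep -Ev '^[[:space:]]*(#|$)' "$req_file" | grep -v '=='
}

# Example: review the file before sharing it together with a paper
# list_unpinned requirements.txt
```

An empty result means every remaining line carries an exact version.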
Close an Activated Virtual Environment
If you have activated a virtual environment, you can exit it with the following command:
deactivate
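To confirm whether an environment is currently active, you can inspect the VIRTUAL_ENV variable, which the activate script sets and deactivate unsets. The function name active_env below is a hypothetical helper, shown only as a sketch.

```shell
#!/bin/bash
# Hypothetical helper: report which virtual environment, if any, is active
# in the current shell, based on the VIRTUAL_ENV variable.
active_env() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "no virtual environment active"
    fi
}

# Example:
# active_env
```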
Use with sbatch
When you use this environment in an sbatch script, please use:
module purge;
source venv/bin/activate;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK;
python python_script.py
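For context, the commands above can be placed in a complete job script. The sketch below writes one to a file. The file name run_venv.sbatch, the job name, and every #SBATCH resource value are placeholders chosen for illustration, and the project path is the example directory used on this page, so adjust all of them to your job.

```shell
# Write a hypothetical job script; all #SBATCH values are placeholders.
cat > run_venv.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=venv_example
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --time=01:00:00
#SBATCH --mem=4G

module purge
# Assumed project location (the example directory used on this page)
cd /scratch/$USER/my_project
source venv/bin/activate
# Match the OpenMP thread count to the CPUs requested above
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
python python_script.py
EOF

# Submit the job on the cluster with:
# sbatch run_venv.sbatch
```

Setting --cpus-per-task is what gives SLURM_CPUS_PER_TASK a value inside the job, so the export line depends on it.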
If you use MPI:
mpiexec bash -c "module purge;
source venv/bin/activate;
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK;
python python_script.py"