Conda


Conda is a package management system that was initially created for Python but now also supports several other languages, such as R or Lua. Conda can be used to quickly find, install and update packages and their dependencies using community-maintained remote repositories (also called channels). Software packages and scientific libraries are installed in “environments”, which make it possible to maintain different, often incompatible, sets of software side by side. Environments are also managed by Conda.

Please note: compared to the software modules that we provide on the cluster, many more and newer libraries are available via Conda, and you can manage them yourself, but they may not be as well optimized for the processor architectures on the cluster. In addition, by default, Conda installs packages in the user's home directory ($HOME), which has a quota on both the total size and the number of files. You should therefore change the Conda installation directory to a subdirectory of your group's $SOFTWARE directory. Otherwise, you will probably run into your $HOME quota quite quickly, with inconvenient consequences: for example, graphical logins stop working until you remove the extraneous files, and during the cleanup you may accidentally delete files that should remain.

Please note: it is not recommended to mix cluster software modules and Conda-managed software in the same work environment.

See the Conda documentation for detailed instructions. A short list of Conda commands as a PDF file can be found here.

In this section we explain how to use Conda on the cluster.

Conda usage on the cluster

On the cluster, Conda is available through the Miniconda installation, which is a small version of Anaconda that includes only the conda package manager, Python and a small number of other packages.

Creating a Conda environment

In order to create and use Conda environments on the cluster, you first need to load the Miniconda3 module:

module load Miniconda3

If you want to manage Python 2 based conda environments, load the module Miniconda2 instead. Miniconda3 is based on Python 3.

Please note: Loading the Miniconda module also initializes conda for shell interaction, so you do not need to run the conda init <shell> command. That command modifies your shell init file, e.g. $HOME/.bashrc, and should be avoided, as this may cause issues with your interactive or batch jobs. If you ran the command in the past, you may have conda entries in your shell init file. Please edit the file and remove the lines between # >>> conda initialize >>> and # <<< conda initialize <<<.
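The cleanup of the shell init file described above could be sketched as follows. The function name strip_conda_init is hypothetical (it is not part of conda), and the sketch assumes the two marker comments start at the beginning of a line. It prints the file content without the conda block, including the two marker lines, and does not modify the file itself.

```shell
# Hypothetical helper: print a shell init file without the block that
# "conda init" inserted between its two marker comments.
# The marker lines themselves are dropped as well.
strip_conda_init() {
    sed '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$1"
}

# Possible usage: keep a backup and write the cleaned content back.
#   cp ~/.bashrc ~/.bashrc.bak
#   strip_conda_init ~/.bashrc.bak > ~/.bashrc
```

Reading from the backup copy rather than from ~/.bashrc itself avoids truncating the file while it is still being read.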

If you use Conda-installed packages, the Miniconda3 module should be the only module you load in your work environment or in your job.

The following will create a conda environment named myconda and install the packages python and numpy into it:

conda create -n myconda python numpy

Note that if you have loaded the Miniconda3 module, Python 3.x will be installed unless you explicitly specify the Python version in the command above (for example, python=3.10 instead of python). By default, the Miniconda2 module installs Python 2.x.

By default, your conda environments are located in your home directory under the $HOME/.conda/envs path. We recommend changing this location. Either use the --prefix flag to specify the path to the directory you want (probably $SOFTWARE/<nameofmycondadir>), or assign the path to the environment variable $CONDA_ENVS_PATH. Another way is to edit your $HOME/.condarc file and define your conda environment locations using the key envs_dirs:. For more details about environments see the Conda documentation.
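As an illustration of the environment variable approach, the following minimal sketch defines a small helper that creates a directory and points $CONDA_ENVS_PATH at it. The function name set_conda_envs_dir and the subdirectory name conda/envs are assumptions made for this example; they are not defined by conda or by the cluster.

```shell
# Hypothetical helper (not part of conda): create a directory for conda
# environments and make conda use it via $CONDA_ENVS_PATH.
set_conda_envs_dir() {
    if [ -z "${1:-}" ]; then
        echo "usage: set_conda_envs_dir <directory>" >&2
        return 1
    fi
    mkdir -p "$1" || return 1
    export CONDA_ENVS_PATH="$1"
}

# Example call; "conda/envs" is an assumed subdirectory name:
#   set_conda_envs_dir "$SOFTWARE/conda/envs"
#   conda create -n myconda python numpy
```

The exported variable only lasts for the current shell session, so it would have to be set again in each new session or job script. For a permanent setting, the envs_dirs: key in $HOME/.condarc is likely the more convenient choice.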

Packages that can be installed using conda are provided through channels. Popular channels are Conda Forge and Bioconda, which are set by default on the cluster. If you want to install packages from a channel which is not among the default channels, you can specify it using the --channel <your-channel> (or, in short, -c <your-channel>) flag during package installation, or add it permanently to your $HOME/.condarc file under the channels: key. See the Conda documentation for more information about managing channels.

The default conda settings are defined in the cluster-wide configuration file $EBROOTMINICONDA3/.condarc:

auto_activate_base: false

envs_dirs:
  - $HOME/.conda/envs

pkgs_dirs:
  - $HOME/.conda/pkgs

channels:
  - conda-forge
  - bioconda
  - defaults

The file also defines the default location of conda's package cache directory (key pkgs_dirs:). The $CONDA_PKGS_DIRS environment variable overrides the pkgs_dirs: setting. If you want to remove packages that are not used in any environment (together with cached package tarballs and the index cache), run:

conda clean --all
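Since the quota in $HOME limits both the size and the number of files, it may be useful to check how much the package cache currently holds, for instance before and after cleaning. The helper below is a hypothetical sketch using only standard tools (du, find, wc); the name conda_cache_usage is an assumption, and the default path is the pkgs_dirs location from the cluster-wide configuration shown above.

```shell
# Hypothetical helper: report the size and file count of a directory.
# Without an argument it inspects the default package cache location.
conda_cache_usage() {
    local dir size files
    dir="${1:-$HOME/.conda/pkgs}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    size=$(du -sh "$dir" | cut -f1)          # human-readable total size
    files=$(find "$dir" -type f | wc -l)     # number of regular files
    echo "$dir: $size in $((files)) files"
}

# Possible usage:
#   conda_cache_usage                      # package cache
#   conda_cache_usage "$HOME/.conda/envs"  # environments in $HOME
```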

Execute conda info to see your current conda settings, including the default channel URLs and the location of your conda environments.

The following command lists all your environments:

conda info --envs

The active environment is marked with an asterisk (*).

To search for a package in configured channels, use the command:

conda search <package-name>

If you want to install additional packages in an existing environment, e.g. in myconda, it must first be activated:

conda activate myconda

Once in the conda environment, which may be recognized by the presence of the environment name (myconda)$ on the command prompt, you can install packages. For example, the following installs pandas, matplotlib and a specific version of scipy:

(myconda)$ conda install scipy==1.6.3 pandas matplotlib

Note that package installation has to be done on a cluster login node, as the compute nodes do not have access to the public network and thus to the conda channels.

To see all packages installed in the environment myenv, run:

conda list -n myenv

Without the -n option, packages in the current active environment are listed.

A faster solver for Conda

The default package dependency solver of Conda is known to be slow and may even fail to resolve some environments. If this affects you, you may try an alternative, faster solver for Conda using the --solver flag as follows:

conda create -n myconda --solver=libmamba package1 package2 ...

The --solver option is also available for the conda install, conda remove and conda update commands. If you want to enable the libmamba solver permanently, either add the line solver: libmamba to your $HOME/.condarc file or run the command:

conda config --set solver libmamba

To revert to the default classic solver:

conda config --remove-key solver

More on the libmamba solver can be read here.

Using your Conda environments

To use applications from your conda environments interactively or in a job script, you first need to load the Miniconda module and then activate the environment containing the applications:

module load Miniconda3
conda activate myenv

Note that we do not recommend putting the above lines in your shell init file, e.g. ~/.bashrc. This may cause issues with your interactive or batch jobs.

Here is a sample job script to run an application from the conda environment:

conda-app-job.sh
#!/bin/bash -l
#SBATCH --job-name=my-conda-application
#SBATCH --nodes=1 
#SBATCH --cpus-per-task=20
#SBATCH --mem=60G
#SBATCH --time=00:30:00
#SBATCH --mail-user=user@yourinstitute.uni-hannover.de
#SBATCH --mail-type=END
 
# Activate your conda environment
module load Miniconda3
conda activate <your_conda_env_name>
 
# Run app
<run your application>

As already mentioned, if you use conda-managed software, you should not mix it with the cluster software modules. Load only the Miniconda module in your interactive shell or in your job script.
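In a job script it can be helpful to stop early if the environment was not activated, rather than have the application fail later with a less obvious error. The following is a minimal sketch: the function name require_conda_env is hypothetical, and it assumes that conda activate sets the variable $CONDA_DEFAULT_ENV to the name of the active environment (for environments activated by path, the variable may contain the path instead).

```shell
# Hypothetical guard: succeed only if the named conda environment is active.
# Assumes "conda activate" has set $CONDA_DEFAULT_ENV.
require_conda_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        echo "conda environment '$1' is not active" >&2
        return 1
    fi
}

# Possible usage in a job script, after "conda activate myenv":
#   require_conda_env myenv || exit 1
```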

Further Reading

guide/soft/miniconda3.txt · Last modified: 2024/05/22 14:27 by zzzznana
