The following steps will help you run a Jupyter server and a Dask cluster on one of the SURF systems running SLURM, such as Spider, Snellius, or Lisa, or on the DelftBlue supercomputer at TU Delft. Find information on how to get access to the SURF infrastructure here.
This guide assumes that you have received credentials from SURF or TU Delft, that you are able to access the system via SSH, and that an SSH key pair has been set up for password-less login; see the dedicated guides for Lisa/Snellius, Spider, and DelftBlue.
This repository includes a Python script to install the components remotely on a SURF platform (Snellius/Spider/etc.), and to start Jupyter and Dask services on that platform from a local machine.
On your local machine, download the script by cloning this repository:
git clone https://github.com/RS-DAT/JupyterDaskOnSLURM.git
cd JupyterDaskOnSLURM
The script requires Python 3 and the Fabric library (and, currently, the decorator package as well), which can be installed via pip:
pip install fabric decorator
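If you prefer to keep these dependencies isolated from your system Python, you can install them in a virtual environment first (optional; a standard venv workflow):
python -m venv venv
source venv/bin/activate
pip install fabric decorator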
Running the script for the first time with the option --add_platform prompts the user for some information about the platform (e.g. the username and the path to the private SSH key) and stores it in a configuration file at .config/platforms/platforms.ini for later use:
python runJupyterDaskOnSLURM.py --add_platform
NOTE: Don't use ~ when entering a path; provide the full absolute path instead.
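For example, on a typical Linux machine you would enter the full path to your key, such as the following (replace <username> with your local user name; the key file name depends on your setup):
/home/<username>/.ssh/id_rsa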
Before installing on the platform, edit the environment.yaml file in the repository to include all conda/pip packages that are needed to run your proposed workflow.
After editing the environment.yaml file, the components can be installed on the platform from your local machine as:
python runJupyterDaskOnSLURM.py --uid <UID> --mode install
NOTE: The installation can take a while and requires user input to complete.
In order to configure access to the SURF dCache storage via the Filesystem Spec (fsspec) library (internally used by Dask and other libraries), you can use the configuration file provided in config/fsspec. Edit config/fsspec/config.json, replacing the <MACAROON> string with the actual macaroon (see this guide for information on how to generate it), then copy the file to ${HOME}/.config/fsspec on the platform:
mkdir -p ~/.config/fsspec
cp ./config/fsspec/config.json ~/.config/fsspec/
More information on how to read files from the dCache storage is provided in the documentation of the dCacheFS package.
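If you edit config.json in your local copy of the repository, one way to get it onto the platform is via ssh/scp, for example as follows (the key path, user name, and host name are placeholders; adjust them to your platform):
ssh -i /path/to/private/ssh/key <username>@spider.surfsara.nl "mkdir -p ~/.config/fsspec"
scp -i /path/to/private/ssh/key ./config/fsspec/config.json <username>@spider.surfsara.nl:~/.config/fsspec/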
You can run Jupyter Lab on the remote server by running the following command on your local system:
python runJupyterDaskOnSLURM.py --uid <UID> --mode run
A browser window should open up. Note that it might take a few seconds for the Jupyter server to start, after which you should have access to a JupyterLab interface (log in using the password set as described above).
A Dask cluster (with no workers) is started together with the JupyterLab session, and it should be listed in the menu that appears when selecting the Dask tab on the left side of the screen. Workers can be added by clicking the "scale" button on the running cluster instance and selecting the desired number of workers.
From the Dask tab in the Jupyter interface, click "shutdown" on a running cluster instance to kill all workers and the scheduler (a new cluster based on the default configurations can be re-created by pressing the "+" button).
From the Jupyter interface, select "File > Shutdown" to stop the Jupyter server and release resources.
If the job running the Jupyter server and the Dask scheduler is killed, the Dask workers will also be killed shortly after (this can be configured using the death-timeout key in the config file).
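As a quick sanity check, you can print the value that Dask currently picks up from your configuration files (run this on the platform with the jupyter_dask environment activated; the dotted key follows the dask-jobqueue configuration schema):
python -c "import dask; print(dask.config.get('jobqueue.slurm.death-timeout', None))"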
Uninstalling the components on the platform can be done from your local system as:
python runJupyterDaskOnSLURM.py --uid <UID> --mode uninstall
This will remove all associated files and folders. However, mamba will remain installed on the platform and has to be removed manually if no longer needed.
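If you do want to remove it as well, and assuming Mambaforge was installed in its default location (${HOME}/mambaforge, see below), the manual cleanup could look like this:
rm -rf ~/mambaforge
# then delete the ">>> conda initialize >>>" block that the installer added to ~/.bashrc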
NOTE: Follow these instructions if the installation via --mode install does not work.
NOTE: If you work on Spider, follow the instructions in the Container wrapper for Spider section.
Login to the SURF system from your terminal, then clone and access this repository:
git clone https://github.com/RS-DAT/JupyterDaskOnSLURM.git
cd JupyterDaskOnSLURM
Alternatively, copy your local copy of JupyterDaskOnSLURM, which has the modified environment.yaml file with the updated packages, to the platform using the scp command, as in the example below. For general usage of the scp command, you can refer to this blog post.
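A possible scp invocation, assuming Snellius and placeholder values for the key path and user name, could look like:
scp -r -i /path/to/private/ssh/key ./JupyterDaskOnSLURM <username>@snellius.surf.nl:~/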
The required packages are most easily installed via the conda package manager, and they are available from the conda-forge channel. In order to install conda (and its faster C++ implementation mamba) and to configure the conda-forge channel as the default channel, download and run the following installation script (you can skip this step if conda is already installed):
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
chmod +x Mambaforge-Linux-x86_64.sh
./Mambaforge-Linux-x86_64.sh
After accepting the license terms and selecting the installation location (the default is ${HOME}/mambaforge), type yes to initialize Mambaforge. Log out and log back in to activate it.
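Alternatively, assuming the installer added its initialization block to ~/.bashrc (the default), you can activate it in the current shell with:
source ~/.bashrc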
Create a new environment using the conda environment file in this repository - note that the base version of this file has been provided but can be updated to include relevant packages for your workflow:
mamba env create -f environment.yaml
Activate the environment and install additional dependencies using mamba/pip, as required by each use case:
mamba activate jupyter_dask
mamba install ...
pip install ...
After having created the environment, we need to configure a few settings.
Configure the password to access Jupyter:
jupyter server --generate-config
jupyter server password
and make the Jupyter config file that is created after running the previous step readable by the current user only:
chmod 400 ~/.jupyter/jupyter_server_config.py
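You can verify the resulting permissions (they should read -r-------- for your user):
ls -l ~/.jupyter/jupyter_server_config.py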
The repository directory scripts contains template job scripts for Spider and Snellius. These scripts define the requirements of the SLURM job running the Jupyter server; they can be customized depending on specific user needs.
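For instance, the resources requested for the Jupyter job are set via the #SBATCH directives at the top of these scripts; the kind of lines you might want to adjust look like this (the values are only illustrative):
#SBATCH --time=04:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G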
The repository directory config/dask contains a template Dask configuration file. This file defines the default worker settings in the Dask cluster, and it thus needs to be edited depending on the SURF system or other system on which we are running. In particular, uncomment the correct block in config/dask/config.yaml, then copy the file to ${HOME}/.config/dask:
mkdir -p ~/.config/dask
cp config/dask/config.yaml ~/.config/dask/
The default Dask settings can be further tuned depending on user needs.
In order to configure access to the SURF dCache storage via the Filesystem Spec (fsspec) library (internally used by Dask and other libraries), you can use the configuration file provided in config/fsspec. Edit config/fsspec/config.json, replacing the <MACAROON> string with the actual macaroon (see this guide for information on how to generate it), then copy the file to ${HOME}/.config/fsspec:
mkdir -p ~/.config/fsspec
cp config/fsspec/config.json ~/.config/fsspec/
More information on how to read files from the dCache storage is provided in the documentation of the dCacheFS package.
As an alternative to the deployment script, the Jupyter and Dask services can be started via the following "manual" procedure.
Login to the SURF system, then submit a batch job script based on the template
provided in scripts/jupyter_dask_<PLATFORM_NAME>.bsh
to start the Jupyter
server and the Dask scheduler on a platform, for example:
sbatch scripts/jupyter_dask_snellius.bsh
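You can check that the job has started (state R in the output) with:
squeue -u $USER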
Copy the ssh
command printed in the job stdout (file slurm-<JOB_ID>.out
). It
should look like:
ssh -i /path/to/private/ssh/key -N -L 8889:NODE:8888 [email protected]
Paste the command in a new terminal window on your local machine (modify the
path to the private key). You can now access the Jupyter session from your
browser at localhost:8889
.
On Spider, using conda environments will lead to performance issues, due to the large number of small files that conda environments contain. In such cases, one can containerize the conda environment. One way to do this is to use the hpc-container-wrapper tool, a container wrapper developed by the Finnish IT Center for Science (CSC).
To set up the container wrapper, first log in to Spider.
Then, clone the JupyterDaskOnSLURM
repository in your home directory:
git clone https://github.com/RS-DAT/JupyterDaskOnSLURM.git
change to the JupyterDaskOnSLURM
directory:
cd JupyterDaskOnSLURM
and execute the spider_container_deploy.sh
script:
bash spider_container_deploy.sh
This will run the setup and containerization of the environment.yaml
file
contained in the JupyterDaskOnSLURM
directory (please modify as needed before
running the script).
Now you are all set!
If you want to manually set up the container wrapper on Spider, follow the steps below.
First change to your home directory:
cd ~
Then, clone both the hpc-container-wrapper
and JupyterDaskOnSLURM
repositories:
git clone https://github.com/CSCfi/hpc-container-wrapper.git
git clone https://github.com/RS-DAT/JupyterDaskOnSLURM.git
Then, copy the container config file spider.yaml from the JupyterDaskOnSLURM repository to the configs directory in hpc-container-wrapper:
cp ./JupyterDaskOnSLURM/config/container/spider.yaml ./hpc-container-wrapper/configs/
Change to the hpc-container-wrapper
directory and run the
install.sh
script to install the container wrapper:
cd hpc-container-wrapper
bash install.sh spider
Next, copy the environment.yaml file from the JupyterDaskOnSLURM repository to the current directory and create a container. In the following example, we create the container under the jupyter_dask directory:
mkdir -p ./jupyter_dask
cp ../JupyterDaskOnSLURM/environment.yaml .
bin/conda-containerize new --prefix ./jupyter_dask ./environment.yaml
At the end of the installation, the tool will print the path to the executable
directory (bin
directory) of the container. For example:
export PATH="/absolute/path/to/the/container/bin:$PATH"
Then go back to your home directory and copy the Dask configuration file for Spider provided in the repository to ${HOME}/.config/dask:
cd ..
mkdir -p ~/.config/dask
cp JupyterDaskOnSLURM/config/dask/config_spider.yml ~/.config/dask/config.yml
Then add the following lines to the ~/.config/dask/config.yml file, under the slurm section of the jobqueue section. Note that you need to replace the export PATH part with the output from the container creation step:
job_script_prologue:
- 'export PATH="/absolute/path/to/the/container/bin:$PATH"' # export the container bin directory to PATH
python: python
After adding the lines, the ~/.config/dask/config.yml
file should look like this:
distributed:
... Some other configurations ...
labextension:
... Some other configurations ...
jobqueue:
slurm:
... Some other configurations ...
job_script_prologue:
- 'export PATH="/home/caroline-oku/caroline/Public/demo_mobyle/container_wrapper/hpc-container-wrapper/tmp/bin:$PATH"'
python: python
Then also configure the SLURM job file JupyterDaskOnSLURM/scripts/jupyter_dask_spider_container.bsh: replace the following part with the PATH export from the container creation step:
# CHANGE THIS TO THE ABSOLUTE PATH TO THE CONTAINER BIN
export PATH="/absolute/path/to/the/container/bin:$PATH"
Now you have reached the point where the deployment script would leave you: the Jupyter server with the Dask plugin can be started using the jupyter_dask_spider_container.bsh script.
sbatch JupyterDaskOnSLURM/scripts/jupyter_dask_spider_container.bsh
After the job starts, there will be an example ssh
command printed in the job stdout (file slurm-<JOB_ID>.out
). It should look like:
ssh -i /path/to/private/ssh/key -N -L 8889:NODE:8888 [email protected]
You can execute this command in a new terminal window on your local machine
(modify the path to the private key). You can now access the Jupyter session
from your browser at localhost:8889
.
Follow the steps below to install and configure the components on DelftBlue.
- Use module to load miniconda:
module load miniconda3/4.12.0
- Clone the repository:
git clone https://github.com/RS-DAT/JupyterDaskOnSLURM.git
cd JupyterDaskOnSLURM
- If required, modify the environment.yaml file to include relevant packages for your workflow. Then, create an environment using conda:
conda env create -f environment.yaml
- Activate the environment and install additional dependencies using conda/pip, as required by each use case:
conda activate jupyter_dask
conda install ...
pip install ...
- Configure the password to access Jupyter:
jupyter server --generate-config
jupyter server password
and make the Jupyter config file that is created after running the previous step readable by the current user only:
chmod 400 ~/.jupyter/jupyter_server_config.py
The repository directory scripts contains template job scripts for Spider, Snellius and DelftBlue. These scripts define the requirements of the SLURM job running the Jupyter server; they can be customized depending on specific user needs. The DelftBlue template, for example, looks as follows:
#!/bin/bash
### Initialize the server
#SBATCH --partition=compute ## one of 'compute', 'gpu', 'memory'
#SBATCH --ntasks=1
#SBATCH --time=23:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4G
#SBATCH --account=innovation ## replace with research-<faculty>-<department> to be able to request more resources.
source ~/.bashrc
conda activate jupyter_dask
node=`hostname -s`
port=`shuf -i 8400-9400 -n 1`
if [ -z ${lport:+x} ]; then lport="8889" ; else lport=${lport}; fi
echo "Run the following on your local machine: "
echo "ssh -i /path/to/private/ssh/key -N -L ${lport}:${node}:${port} ${USER}@login.delftblue.tudelft.nl"
jupyter lab --no-browser --port=${port} --ip=${node}
- Configure the Dask settings. The repository directory config/dask contains templates for the Dask configuration files. The configuration file defines the default worker settings in the Dask cluster, and it thus needs to be edited depending on the system on which we are running. In particular, the file corresponding to DelftBlue is config/dask/config_delftblue.yml. Copy the file to ${HOME}/.config/dask:
mkdir -p ~/.config/dask
cp -r config/dask/config_delftblue.yml ~/.config/dask/config.yml
The default Dask settings can be further tuned depending on user needs.
On DelftBlue, you need to start the Jupyter and Dask services manually using this procedure.
- Login to DelftBlue, then submit a SLURM job from the login node, based on the template provided in scripts/jupyter_dask_delftblue.bsh, to start the Jupyter server and the Dask scheduler:
sbatch scripts/jupyter_dask_delftblue.bsh
Copy the ssh
command printed in the job stdout (file slurm-<JOB_ID>.out
). It
should look like:
ssh -i /path/to/private/ssh/key -N -L 8889:NODE:8888 [email protected]
Paste the command in a new terminal window on your local machine (modify the
path to the private key). You can now access the Jupyter session from your
browser at localhost:8889
.