6. Interactive Computing on Cyclone with Jupyter Notebooks
6.1. Overview
This tutorial introduces participants to running Jupyter Notebooks directly on Cyclone’s compute nodes, enabling interactive workflows for data analysis, AI model development, and other computational tasks. Participants will gain an understanding of the benefits of using Jupyter Notebooks in an HPC environment and learn the step-by-step process to launch and access them. By the end of the tutorial, users will be equipped with the knowledge to set up and interact with Jupyter Notebooks efficiently on Cyclone.
6.1.1. Why Jupyter Notebooks on HPC?
Jupyter Notebooks offer a highly interactive environment that seamlessly combines code execution, visualizations, and narrative explanations, making them ideal for tasks like data exploration, visualization, and AI model development. Their intuitive, web-based interface simplifies complex workflows, lowering the learning curve for users across various expertise levels.
Leveraging Jupyter Notebooks on HPC systems amplifies these benefits by providing access to powerful compute resources, such as CPUs and GPUs, that can handle large-scale datasets and perform demanding AI training or numerical simulations. This integration enables users to work interactively and efficiently, tackling computational challenges beyond the capabilities of local machines.
6.2. Learning Objectives
By the end of this tutorial, participants will be able to:
- Understand the advantages of using Jupyter Notebooks on HPC systems for interactive computing.
- Follow the steps to configure and launch Jupyter Notebooks on Cyclone’s compute nodes.
- Establish secure SSH tunnels to access notebooks from a local browser.
- Optimize resource allocation for Jupyter Notebook sessions using SLURM scripts.
6.3. Prerequisites
- T01 - Introduction to HPC Systems: This tutorial will give you some basic knowledge of HPC systems and basic terminology.
- T02 - Accessing and Navigating Cyclone: This tutorial will give you some basic knowledge of how to connect, copy files and navigate the HPC system.
6.4. Workflow Steps
Running Jupyter Notebooks on an HPC system involves allocating resources using a SLURM script and establishing a secure connection to access the notebook interface in your local browser or VS Code.
To launch and access a notebook from Cyclone's compute nodes, the following workflow must be followed:
- Create a Clean Environment: Create a clean `conda` environment with Jupyter Notebook and the relevant dependencies.
- Write a SLURM Script: Create a SLURM job script specifying the resources required for your Jupyter session, such as CPUs, memory, or GPUs.
- Submit the Script: Use the `sbatch` command to submit the script to the HPC scheduler, which will allocate the requested resources and launch the Jupyter Notebook server.
- Create an SSH Tunnel: Establish a secure SSH tunnel to forward the notebook's port from the remote HPC system to your local machine, enabling browser access.
- Open the Notebook: Use the forwarded port to access the Jupyter Notebook interface in your web browser, enabling an interactive and powerful environment for your tasks.
6.5. Initial Setup
First, establish a connection to Cyclone using SSH:
ssh username@cyclone.hpcf.cyi.ac.cy
⚠️ Replace `username` with your actual Cyclone username. If you encounter connection issues, refer back to Tutorial 02 - Accessing and Navigating Cyclone.
Next, create an environment with the necessary dependencies.
⚠️ During these steps you might see this in your terminal: Proceed ([y]/n)? Just type the letter y and then press Enter to continue.
First, create a simple `conda` environment:
module load Anaconda3
conda create --name notebookEnv
Your terminal should look something like this:
(base) [gkosta@front02 ~]$ module load Anaconda3
(base) [gkosta@front02 ~]$ conda create --name notebookEnv
Retrieving notices: ...working... done
Channels:
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /nvme/h/gkosta/.conda/envs/notebookEnv
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate notebookEnv
#
# To deactivate an active environment, use
#
# $ conda deactivate
To activate the environment, type:
conda activate notebookEnv
You should now see the name of the environment before your username:
(base) [gkosta@front02 ~]$ conda activate notebookEnv
(notebookEnv) [gkosta@front02 ~]$
Once the environment is active, run the following to install the dependencies required by Jupyter Notebook:
(notebookEnv) [gkosta@front02 ~]$ conda install -c conda-forge notebook
Proceed ([y]/n)? y
Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
⚠️ This installation might take a few minutes. Be patient and don't interrupt the process.
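To confirm the installation succeeded before moving on, you can ask for the version (the exact number printed depends on the package version installed):
(notebookEnv) [gkosta@front02 ~]$ jupyter notebook --version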
6.6. Launching Jupyter on a Compute Node
We'll use a pre-configured Slurm script to launch our Jupyter server. Let's break down the key components:
Step 1: Slurm Environment Setup
We first set up the basic SLURM directives so our job can be submitted using `sbatch`:
#!/bin/bash -l
#SBATCH --job-name=jupyter_test
#SBATCH --partition=gpu # Partition
#SBATCH --nodes=1 # Number of nodes
#SBATCH --gres=gpu:1 # Number of GPUs
#SBATCH --ntasks-per-node=1 # Number of tasks
#SBATCH --cpus-per-task=10 # Number of cpu cores
#SBATCH --mem=20G # Total memory per node
#SBATCH --output=job.%j.out # Stdout (%j=jobId)
#SBATCH --error=job.%j.err # Stderr (%j=jobId)
#SBATCH --time=1:00:00 # Walltime
#SBATCH -A <your_project_id> # Accounting project
In this instance, we're requesting resources from 1 node (`--nodes=1`) in the GPU partition (`--partition=gpu`) with:
- 1 GPU (`--gres=gpu:1`)
- 1 hour of walltime (`--time=1:00:00`)
- 20GB of RAM (`--mem=20G`)
- 10 CPU cores (`--cpus-per-task=10`)
The job name is `jupyter_test` and the usage will be deducted from the account `your_project_id`.
⚠️ Remember to replace `your_project_id` with your allocated project budget.
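If you don't need a GPU, you can request CPU-only resources instead. A minimal sketch of the directives, assuming your allocation has access to a `cpu` partition (the partition name is an assumption; check `sinfo` for the ones available to you):
#SBATCH --partition=cpu           # CPU-only partition (name assumed; verify with sinfo)
#SBATCH --nodes=1                 # Number of nodes
#SBATCH --ntasks-per-node=1       # Number of tasks
#SBATCH --cpus-per-task=10        # Number of cpu cores
#SBATCH --mem=20G                 # Total memory per node
#SBATCH --time=1:00:00            # Walltime
The rest of the script stays the same; simply drop the `--gres=gpu:1` line.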
Step 2: Activate the conda Environment
We load the Anaconda3 module and activate the environment created previously:
# Load any necessary modules and activate environment
module load Anaconda3
conda activate notebookEnv
Step 3: Configure the Jupyter Server
This piece of the Slurm script initialises some basic variables so we can securely connect to our Jupyter server:
# Add our environment as a notebook kernel
python -m ipykernel install --user --name=notebookEnv
# Compute node hostname
HOSTNAME=$(hostname)
# Generate random ports for Jupyter
JUPYTER_PORT=$(shuf -i 10000-60000 -n 1)
# Generate a random password for Jupyter Notebook
PASSWORD=$(openssl rand -base64 12)
# Hash the password using Jupyter's built-in function
HASHED_PASSWORD=$(python -c "from jupyter_server.auth import passwd; print(passwd('$PASSWORD'))")
Let's look at the above code snippet step by step:
First, we start by adding our environment as a notebook kernel. This is done so we can efficiently manage our Python packages. You can add more environments for different use cases; for example, you can have a `conda` environment for PyTorch and one for TensorFlow, as sketched below.
# Add our environment as a notebook kernel
python -m ipykernel install --user --name=notebookEnv
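If you later want a second environment available from the same Jupyter server, you register it the same way. A hypothetical example, with an illustrative environment name and display name that are not part of this tutorial's setup:
# Create a separate environment (e.g. for PyTorch) and register it as an extra kernel
conda create --name torchEnv
conda activate torchEnv
conda install -c conda-forge ipykernel
python -m ipykernel install --user --name=torchEnv --display-name="Python (PyTorch)"
Each registered kernel then shows up as a separate option in the notebook's kernel list.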
Then, we retrieve the hostname or IP of the compute node:
# Compute node hostname
HOSTNAME=$(hostname)
Next, we generate a random port number so we're less likely to pick a port that is already in use (a defensive variant for the port is sketched after the snippet). Additionally, we generate a random password, hashed before use, to prevent unauthorised access to your Jupyter server and HPC resources.
# Generate random ports for Jupyter
JUPYTER_PORT=$(shuf -i 10000-60000 -n 1)
# Generate a random password for Jupyter Notebook
PASSWORD=$(openssl rand -base64 12)
# Hash the password using Jupyter's built-in function
HASHED_PASSWORD=$(python -c "from jupyter_server.auth import passwd; print(passwd('$PASSWORD'))")
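ℹ️ Picking a random port makes a clash unlikely but doesn't rule one out. If you want to be defensive, a small sketch like the following (assuming the `ss` utility is available on the compute node) re-draws until an unused port is found:
# Re-draw the port while it shows up among the listening sockets
while ss -tln | grep -q ":${JUPYTER_PORT} "; do
    JUPYTER_PORT=$(shuf -i 10000-60000 -n 1)
done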
Step 4: Launching the Jupyter Server
We launch the Jupyter server with the variables we just generated. Feel free to change the `--notebook-dir` option to point at whatever directory you want.
# Run Jupyter notebook
jupyter notebook --port=$JUPYTER_PORT --NotebookApp.password="$HASHED_PASSWORD" --notebook-dir="$HOME" --no-browser > jupyter.log 2>&1 &
The `jupyter` command creates a blocking process, meaning it keeps control of our bash session until that process ends. So we redirect its output to the `jupyter.log` file and leave it running as a background process.
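Because the server's output goes to `jupyter.log`, you can inspect its startup messages at any time from the job's directory, for example:
tail -n 20 jupyter.log
This is the first place to look if the notebook doesn't come up as expected.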
Step 5: Generating Connection Commands
Since we want to connect from our personal machine (a laptop, for example) to the Jupyter server running on the compute node, we'll need an SSH tunnel. This tunnel first creates a jump connection through the login node to our assigned compute node, and then binds the port our server is running on to a port on our local machine. We've prepared code which automatically generates this command for you:
LOGIN_HOST="cyclone.hpcf.cyi.ac.cy"
# Prepare the message to be displayed and saved to a file
CONNECTION_MESSAGE=$(cat <<EOF
==================================================================
Run this command to connect on your jupyter notebooks remotely
ssh -N -J ${USER}@${LOGIN_HOST} ${USER}@${HOSTNAME} -L ${JUPYTER_PORT}:localhost:${JUPYTER_PORT}
Jupyter Notebook is running at: http://localhost:$JUPYTER_PORT
Password to access the notebook: $PASSWORD
==================================================================
EOF
)
# Print the connection details to both the terminal and a txt file
echo "$CONNECTION_MESSAGE" | tee ./connection_info.txt
wait

The trailing wait keeps the batch job alive for as long as the background Jupyter process is running; without it, the script would finish immediately and SLURM would terminate the job, killing the server.
The Complete Script
The complete script for Steps 1-5 is listed below for your convenience:
#!/bin/bash -l
#SBATCH --job-name=jupyter_test
#SBATCH --partition=gpu # Partition
#SBATCH --nodes=1 # Number of nodes
#SBATCH --gres=gpu:1 # Number of GPUs
#SBATCH --ntasks-per-node=1 # Number of tasks
#SBATCH --cpus-per-task=10 # Number of cpu cores
#SBATCH --mem=20G # Total memory per node
#SBATCH --output=job.%j.out # Stdout (%j=jobId)
#SBATCH --error=job.%j.err # Stderr (%j=jobId)
#SBATCH --time=1:00:00 # Walltime
#SBATCH -A <your_project_id> # Accounting project
# Load any necessary modules and activate environment
module load Anaconda3
conda activate notebookEnv
# Add our environment as a notebook kernel
python -m ipykernel install --user --name=notebookEnv
# Compute node hostname
HOSTNAME=$(hostname)
# Generate random ports for Jupyter
JUPYTER_PORT=$(shuf -i 10000-60000 -n 1)
# Generate a random password for Jupyter Notebook
PASSWORD=$(openssl rand -base64 12)
# Hash the password using Jupyter's built-in function
HASHED_PASSWORD=$(python -c "from jupyter_server.auth import passwd; print(passwd('$PASSWORD'))")
# Run Jupyter notebook
jupyter notebook --port=$JUPYTER_PORT --NotebookApp.password="$HASHED_PASSWORD" --notebook-dir="$HOME" --no-browser > jupyter.log 2>&1 &
sleep 5
LOGIN_HOST="cyclone.hpcf.cyi.ac.cy"
# Prepare the message to be displayed and saved to a file
CONNECTION_MESSAGE=$(cat <<EOF
==================================================================
Run this command to connect on your jupyter notebooks remotely
ssh -N -J ${USER}@${LOGIN_HOST} ${USER}@${HOSTNAME} -L ${JUPYTER_PORT}:localhost:${JUPYTER_PORT}
Jupyter Notebook is running at: http://localhost:$JUPYTER_PORT
Password to access the notebook: $PASSWORD
==================================================================
EOF
)
# Print the connection details to both the terminal and a txt file
echo "$CONNECTION_MESSAGE" | tee ./connection_info.txt
wait
To create the script:
[gcosta@front02 ~]$ cd $HOME
[gcosta@front02 ~]$ mkdir tutorial_06
[gcosta@front02 ~]$ cd tutorial_06
[gcosta@front02 tutorial_06]$ touch launch_notebook.sh
[gcosta@front02 tutorial_06]$ nano launch_notebook.sh # copy the Bash code above
[gcosta@front02 tutorial_06]$ chmod +x launch_notebook.sh # make the script executable
Step 6: Job Submission
Now that everything is configured, let's submit this slurm script and see what it does.
Submit `launch_notebook.sh` from inside the `tutorial_06` directory using the following command:
[gcosta@front02 tutorial_06]$ sbatch launch_notebook.sh
Submitted batch job 1034638
In this instance, `1034638` is your job id. To view the status of your job you can use the `squeue` command:
squeue -u $USER
The output will look like this:
[gcosta@front02 tutorial_06]$ sbatch launch_notebook.sh
Submitted batch job 1034638
[gcosta@front02 tutorial_06]$ squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1034638 gpu jupyter_ gkosta R 38:09 1 gpu01
ℹ️ Under the `ST` column you can see the status of your job. In this case `R` means it's running. If you see `CF`, your node is in its configuration state; waiting 5 minutes should be enough for it to get ready and for your job to start. If you see `PD`, your job is pending resource allocation, meaning there aren't enough resources available yet and your job has been placed in the queue.
When you're sure your job is running, you should see some new files generated in your directory:
[gcosta@front02 tutorial_06]$ ls -l
total 5
-rw-r--r-- 1 gkosta p166 382 Dec 17 13:09 connection_info.txt
-rw-r--r-- 1 gkosta p166 81 Dec 17 13:48 job.1034638.err
-rw-r--r-- 1 gkosta p166 474 Dec 17 13:09 job.1034638.out
-rw-r--r-- 1 gkosta p166 8308 Dec 17 13:50 jupyter.log
-rwxr-xr-x 1 gkosta p166 1977 Dec 17 12:47 launch_notebook.sh
- `job.1034638.out` is your job's output stream redirection
- `job.1034638.err` is your job's error stream redirection
- `jupyter.log` is your Jupyter server's log output
- `connection_info.txt` contains the information on how to access the Jupyter server on the compute node.
ℹ️ The only file we are interested in here is `connection_info.txt`, which will be described in detail in the next section. Unless you are debugging something, the remaining files shouldn't concern you.
6.7. Connect to the Jupyter Server
We'll look at two different ways to run notebooks on the Jupyter server we just launched on a compute node of Cyclone:
- Browser
- VSCode
6.7.1. Locating the Connection Information
Before we connect to the Jupyter server, we need to create the SSH tunnel to securely forward ports from Cyclone to our local machine. The connection info is stored in a text file named `connection_info.txt`. The file is located in the directory the Slurm script `launch_notebook.sh` was executed from (i.e. in `$HOME/tutorial_06`). To view its content you can use your VSCode editor if you're following from Tutorial 03 - Setting up and Using Development Tools, or simply use the `cat` command from your terminal:
[gcosta@front02 tutorial_06]$ cat ./connection_info.txt
==================================================================
Run this command to connect on your jupyter notebooks remotely
ssh -N -J gkosta@cyclone.hpcf.cyi.ac.cy gkosta@gpu01 -L 11083:localhost:11083
Jupyter Notebook is running at: http://localhost:11083
Password to access the notebook: s23un9qxYjpenFnE
==================================================================
6.7.2. Establishing the SSH Tunnel
Locate the tunneling command in `connection_info.txt`. It looks like:
ssh -N -J <username>@cyclone.hpcf.cyi.ac.cy <username>@<batch-node> -L <port>:localhost:<port>
In my case, the command for SSH Tunneling would be:
ssh -N -J gkosta@cyclone.hpcf.cyi.ac.cy gkosta@gpu01 -L 11083:localhost:11083
In other words, running this command creates a secure connection for user `gkosta` from compute node `gpu01` to our local machine through Cyclone's login node, via port `11083`. Note that in your case, the command will be adjusted with your own username, allocated compute node and port.
Now, open a new terminal and run your own SSH Tunneling command on your local machine:
gkosta@gkosta-dell:~$ ssh -N -J gkosta@cyclone.hpcf.cyi.ac.cy gkosta@gpu01 -L 11083:localhost:11083
ℹ️ The SSH command is blocking, meaning nothing will be printed when you run the above command. You may, however, be prompted for your key's passphrase. Otherwise, your cursor will stay there blinking with the connection established. Minimise the window and you are ready for the next step.
‼️ The SSH command should be run on a fresh local terminal, NOT the one already connected to Cyclone.
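💡 If you'd rather not keep a terminal window occupied, ssh can background the tunnel itself via the `-f` flag; this is the same command as above (adjust username, node and port to your own values):
ssh -f -N -J gkosta@cyclone.hpcf.cyi.ac.cy gkosta@gpu01 -L 11083:localhost:11083
You can stop it later with the `pkill` command shown in the Notes and Troubleshooting section.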
6.7.3. Connecting via Browser
With the SSH tunnel running, our local machine is now connected to the compute node via port `11083` in the above example. To launch the Jupyter Notebook in a browser, just pick your favourite browser and copy in the link printed in `connection_info.txt`, in our case `http://localhost:11083`.
You should reach a page asking for the password looking like this:
The password can again be found in `connection_info.txt`. Once we input the password, which in this example is `s23un9qxYjpenFnE`, and press the Log In button, we're in!
Let's create a new notebook! Click the New button:
Now we can see several options:
- Notebook kernels: Various Python kernels available to be used.
  - Python 3 (ipykernel): Default Python kernel.
  - notebookEnv: Our custom kernel we added.
  ℹ️ Note that both kernels have the same Python interpreter, i.e., the one in our `conda` environment.
- Terminal: Launches a terminal session on the compute node. You can use this for running `htop` or `nvidia-smi` to view hardware utilisation (see the example after this list).
- Console: Launches a Python interactive shell.
- New File: Creates a new file; this might be a text file, a Python script or whatever you want.
- New Folder: Creates a new folder.
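For instance, inside such a terminal session you could check the node's utilisation with:
nvidia-smi    # GPU usage and memory
htop          # CPU and RAM usage per process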
If you click on any of the Python kernel options a new tab in your browser will open with a notebook:
As you can see, in this case we have selected the `notebookEnv` kernel. In other words, this notebook will now run on 1 GPU on the GPU node (`gpu01`) of Cyclone, using the environment as configured in the `notebookEnv` kernel.
6.7.4. Connecting via VSCode
Alternatively, you can use VSCode to view and run notebooks in a similar manner. To do so, we need to have some extensions installed. Searching `jupyter` in the extensions tab of VSCode should show you something like this:
Click Install on the one circled and wait for it to be installed. Once that's done, open a folder on your local machine:
For this example we have created a folder called `vs_tutorial_06`, but it can be any folder you'd like:
Right click inside the folder, and create a New File:
Name the file `example_notebook.ipynb`. Make sure to add the `.ipynb` extension at the end!
Now the notebook should open in your VSCode window. You are now ready to connect this notebook to the Jupyter server running on the compute node. We do this by pressing the Select Kernel button at the top right of your screen and selecting a remote server:
Then you will see this in the top middle of your screen:
Select Existing Jupyter Server...
Add the link that's inside your `connection_info.txt`:
Add the password, again found inside `connection_info.txt`:
And finally a display name for your connection; this can be anything you want:
Select the appropriate kernel:
That's it. Now your notebook is running remotely on the compute node! Adding a couple of cells and calling `nvidia-smi` shows us the 1 GPU running on `gpu01`:
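⚠️ When you're finished working, remember to cancel the job so the allocated resources (and your project's compute budget) are released, using the job id reported by `sbatch`:
scancel 1034638
Otherwise the job keeps running, and being charged, until the requested walltime runs out.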
6.8. Notes and Troubleshooting
6.8.1. Port Conflicts in SSH Tunnel
Error message "Address already in use" or unable to connect to the specified port.
- Check if the port is already in use:
lsof -i :PORT_NUMBER # On your local machine
- Kill any existing SSH tunnels:
pkill -f "ssh -N -J"
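- Alternatively, keep the remote port and bind a different local one; the port before `localhost` is the local side. A sketch with an arbitrary free local port (18888 here is illustrative):
ssh -N -J gkosta@cyclone.hpcf.cyi.ac.cy gkosta@gpu01 -L 18888:localhost:11083
You would then browse to http://localhost:18888 instead.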
6.8.2. SSH Authentication Issues
SSH key authentication failures
- Verify your SSH key is properly added to Cyclone:
ssh-add -l # List loaded keys
ssh-add ~/.ssh/id_rsa # Add your key if needed
- Check key permissions:
chmod 600 ~/.ssh/id_rsa
chmod 700 ~/.ssh
💡 If you are still facing SSH/connection issues, the Notes and Troubleshooting section in Tutorial 02 might be beneficial.
6.8.3. Activating Conda in Slurm Script
When initialising `conda` inside a Slurm script, such as when running `conda activate notebookEnv` in `launch_notebook.sh`, you might come across the error "Failure to initialise Conda". If `conda` fails to initialise inside the SLURM script, you will need to add the following after loading the Anaconda module (i.e. after `module load Anaconda3`):
__conda_setup="$('/nvme/h/buildsets/eb_cyclone_rl/software/Anaconda3/2023.03-1/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/nvme/h/buildsets/eb_cyclone_rl/software/Anaconda3/2023.03-1/etc/profile.d/conda.sh" ]; then
        . "/nvme/h/buildsets/eb_cyclone_rl/software/Anaconda3/2023.03-1/etc/profile.d/conda.sh"
    else
        export PATH="/nvme/h/buildsets/eb_cyclone_rl/software/Anaconda3/2023.03-1/bin:$PATH"
    fi
fi
unset __conda_setup
6.8.4. General Debugging Tips
Check the job output files for errors:
cat job.[jobid].out
cat job.[jobid].err
⚠️ Replace `[jobid]` with your Job ID.
These commands will print out the contents of the job's output files. They might contain some more information that will guide you to find the problem. Some examples:
- The `conda` environment name might be wrong.
- There may be package dependency issues inside your `conda` environment.
- The project you're requesting resources from might not have access to the partition you requested.