How to forward a port with Slurm `srun`?
Feb 1, 10:08pm

To forward a port using srun in a Slurm environment, you typically want to set up an SSH tunnel to a compute node where your job is running. Here’s how you can do it:


1. Request an Interactive Job

srun has no option for forwarding a TCP port; there is no --tcpip or --tunnel flag. Port forwarding is instead done by combining an srun job with an SSH tunnel. Start by requesting an interactive job on a compute node.

Basic Command:

bash
srun --pty --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --mem=4G --time=01:00:00 bash
  • This gives you an interactive shell on a compute node.

2. Set Up SSH Tunneling

Once your job is running, you can set up an SSH tunnel from your local machine to the compute node.

Find the Compute Node:

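srun does not print the node name on its own. Inside the interactive shell, run hostname to see which compute node you are on (e.g., node123). From the login node, squeue -u <your_username> shows the same name in its NODELIST column.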
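The lookup from the login node can be scripted. The following is a minimal sketch, assuming a single-node job as requested above; pick_node is a hypothetical helper name, and the squeue call is left as a comment because it needs a live Slurm cluster.

```shell
# Print the first node name from squeue-style output (one node list per line).
# pick_node is a hypothetical helper, not part of Slurm.
pick_node() {
  # Skip blank lines and print the first field of the first non-empty line.
  awk 'NF { print $1; exit }'
}

# Typical use on the login node (requires Slurm, so it is commented out here):
#   squeue -u "$USER" -h -t RUNNING -o "%N" | pick_node
```

For multi-node jobs, %N prints a compressed list such as node[1-2], which would need to be expanded (for example with scontrol show hostnames) before it can be used as an SSH host.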

Create the SSH Tunnel:

On your local machine, run:

bash
ssh -L <local_port>:localhost:<remote_port> -J <your_username>@<login_node> <your_username>@<compute_node>
  • Replace:
    • <local_port>: The port on your local machine (e.g., 8888).
    • <remote_port>: The port on the compute node (e.g., 8888).
    • <your_username>: Your username on the cluster.
    • <login_node>: The login node of the cluster.
    • <compute_node>: The compute node where your job is running (e.g., node123).
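To avoid retyping the placeholders, the command can be assembled by a small helper. This is only a sketch: tunnel_cmd is a hypothetical function name, it prints the command rather than running it, and it assumes the login node is the jump host (-J) and the compute node is the final destination. The -N flag, which is not in the command above, tells ssh to forward ports without opening a remote shell.

```shell
# Print (do not run) an ssh command that tunnels through the login node.
# Arguments: local_port remote_port user login_node compute_node
# tunnel_cmd is a hypothetical helper used for illustration only.
tunnel_cmd() {
  # -N: forward only, no remote command; -J: jump host (the login node).
  printf 'ssh -N -L %s:localhost:%s -J %s@%s %s@%s\n' \
    "$1" "$2" "$3" "$4" "$3" "$5"
}

tunnel_cmd 8888 8888 benjamin login.cluster.edu node123
```

Printing the command first makes it easy to check the host order before connecting.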

Example:

bash
ssh -L 8888:localhost:8888 -J benjamin@login.cluster.edu benjamin@node123
  • This forwards localhost:8888 on your machine to port 8888 on node123, using login.cluster.edu as the jump host.

3. Use the Forwarded Port

Now, any service running on <remote_port> (e.g., a Jupyter notebook) on the compute node will be accessible on your local machine at localhost:<local_port>.
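Before opening a browser, it can help to check that something is actually answering on the forwarded port. The sketch below assumes bash, since it relies on bash's /dev/tcp pseudo-device; port_open is a hypothetical helper name, and port 8888 is just the example port used above.

```shell
# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device; other shells will report failure.
# port_open is a hypothetical helper used for illustration only.
port_open() {
  # The subshell opens the connection on fd 3 and closes it again on exit.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 8888; then
  echo "tunnel is up: open http://localhost:8888"
else
  echo "nothing is listening on localhost:8888 yet"
fi
```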


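Alternative: Forward Through the Login Node Only

srun has no --tunnel option. If your cluster does not allow SSH logins to compute nodes, you can instead open the tunnel to the login node and point it at the compute node:

bash
ssh -L 8888:node123:8888 benjamin@login.cluster.edu
  • This works only if the login node can reach node123:8888 over the cluster network and the service listens on a non-loopback address.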

Notes:

  • Ensure the port you want to forward is not blocked by the cluster’s firewall.
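  • If the tunnel ends on the compute node itself (target localhost:<remote_port>, as in step 2), a service bound to localhost is reachable. Bind to 0.0.0.0 or the node's hostname only if you forward from the login node to <compute_node>:<remote_port>.
  • Many clusters allow SSH to a compute node only while you have a job running on it, so keep the srun session open while the tunnel is in use.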
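As an illustration of the bind address, here is a sketch that prints a Jupyter launch command for a chosen address and port. jupyter_cmd is a hypothetical helper name, and it assumes JupyterLab is installed on the compute node; the command is printed rather than executed.

```shell
# Print (do not run) a Jupyter launch command for a bind address and port.
# jupyter_cmd is a hypothetical helper used for illustration only.
jupyter_cmd() {
  # --no-browser: do not try to open a browser on the compute node.
  printf 'jupyter lab --no-browser --ip=%s --port=%s\n' "$1" "$2"
}

# Listen on all interfaces of the compute node, port 8888.
jupyter_cmd 0.0.0.0 8888
```

Run the printed command inside the srun shell, using the same port as <remote_port> in the tunnel.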
