How to check if current Slurm node is a master or computation one?
Feb 19, 3:37pm

To determine whether a Slurm node is a master (control) node or a compute node, you can use several methods depending on your access level and the information available on the node. Here are the most common approaches:


1. Check Slurm Configuration Files

Slurm configuration files usually define the roles of nodes. You can inspect these files if you have access:

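  • slurm.conf: This file names the controller host(s) and defines the compute nodes.
    bash
    grep -E '^(NodeName|SlurmctldHost|ControlMachine)' /etc/slurm/slurm.conf
    • The SlurmctldHost line (ControlMachine in older Slurm releases) specifies the master node. A second SlurmctldHost line, if present, names a backup controller.
    • The NodeName lines define the compute nodes, along with attributes such as CPUs, RealMemory, or State=UNKNOWN.
    • The path varies by installation; older Debian and Ubuntu packages, for example, used /etc/slurm-llnl/slurm.conf.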
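
The grep above can be turned into a small helper that prints only the controller hostnames. This is a minimal sketch, not an official Slurm tool: the function name controller_hosts_from_conf is made up for illustration, the default path is an assumption about your installation, and the pattern assumes the parameter names are written with their usual capitalization.

```shell
# Hypothetical helper: print the controller hostname(s) named in a
# slurm.conf-style file, one per line.
# Handles "SlurmctldHost=name", "SlurmctldHost=name(address)" and the
# legacy "ControlMachine=name". Commented-out lines are ignored because
# the pattern is anchored at the start of the line.
controller_hosts_from_conf() {
    # $1: path to the config file (assumed default: /etc/slurm/slurm.conf)
    sed -n -E \
        's/^[[:space:]]*(SlurmctldHost|ControlMachine)[[:space:]]*=[[:space:]]*([^[:space:](#]+).*/\2/p' \
        "${1:-/etc/slurm/slurm.conf}"
}
```

A possible use is controller_hosts_from_conf | grep -Fxq "$(hostname -s)" && echo "this host is a controller", which only works if the names in slurm.conf are short hostnames.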

2. Use Slurm Commands

If Slurm is running, you can use the following commands:

  • sinfo: Lists all nodes and their states.

    bash
    sinfo -N
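    • sinfo lists only the nodes defined as compute nodes in slurm.conf. A dedicated master node does not appear in this list; it shows up only if it is also configured as a compute node. States such as idle, mixed, or allocated describe job allocation, not the node's role.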
  • scontrol show nodes: Shows detailed information about each node.

    bash
    scontrol show nodes
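    • Look for the NodeName and State fields. Like sinfo, this command covers compute nodes only, so a host that is missing from the output is not a compute node. Use scontrol show node <name> to query a single node.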
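  • scontrol show config: Displays the running Slurm configuration, including the master node. Recent releases report it as SlurmctldHost[0]; older ones as ControlMachine.

    bash
    scontrol show config | grep -E 'SlurmctldHost|ControlMachine'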
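
To answer the "is this a compute node?" half of the question from a script, you can check whether the current host appears in the node list that sinfo prints. The sketch below is illustrative only: is_listed_compute_node is a hypothetical name, and it assumes that each NodeName equals the host's short hostname, which is common but not guaranteed.

```shell
# Hypothetical helper: succeed (exit 0) if the hostname given as $1
# appears in a list of node names read from stdin, one per line,
# such as the output of: sinfo -N -h -o '%N'
is_listed_compute_node() {
    # -F fixed string, -x whole-line match, -q quiet (exit status only)
    grep -Fxq -- "$1"
}
```

For example: sinfo -N -h -o '%N' | is_listed_compute_node "$(hostname -s)" && echo "compute node". The whole-line match avoids treating cn1 as a match for cn10.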

3. Check Hostname

  • The master node often has a distinct hostname (e.g., master, headnode, or similar), but naming is site-specific, so treat the name alone as a hint.
    bash
    hostname -s
    • Compare this with the SlurmctldHost (or ControlMachine) value reported by scontrol show config.
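
The comparison can be scripted. The following is a rough sketch under a few assumptions: the function names are invented for illustration, the input is the text printed by scontrol show config, and hosts are compared by short name (the part before the first dot), so it will not help if the configuration uses IP addresses.

```shell
# Hypothetical helper: read `scontrol show config` text on stdin and print
# the short name of each controller, one per line.
# Matches "SlurmctldHost[N] = host" and the legacy "ControlMachine = host".
controller_short_names() {
    awk -F'=' '/^(SlurmctldHost\[[0-9]+\]|ControlMachine)[ \t]*=/ {
        host = $2
        gsub(/^[ \t]+|[ \t]+$/, "", host)   # trim surrounding whitespace
        sub(/\(.*/, "", host)               # drop an optional "(address)" suffix
        sub(/\..*/, "", host)               # keep only the short hostname
        print host
    }'
}

# Hypothetical helper: succeed if host $1 is one of the controllers
# described by the `scontrol show config` text on stdin.
host_is_controller() {
    controller_short_names | grep -Fxq -- "${1%%.*}"
}
```

For example: scontrol show config | host_is_controller "$(hostname)" && echo "master node".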

4. Check Running Processes

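  • The master node runs the Slurm controller daemon, slurmctld.
    bash
    ps aux | grep '[s]lurm'
    • The [s] in the pattern keeps grep from matching its own process.
    • If you see slurmctld, you are on the master node.
    • Compute nodes run slurmd (Slurm node daemon).
    • A small cluster may run both daemons on one host. slurmdbd, if listed, is the accounting database daemon and indicates neither role.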
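
The process check can be made stricter by matching exact process names instead of any line containing "slurm". This is a small sketch rather than a definitive test: role_from_processes and its output labels are invented here, and it assumes a Linux-style ps where ps -e -o comm= prints bare command names.

```shell
# Hypothetical helper: classify a node from a list of process names on
# stdin (one per line, e.g. the output of: ps -e -o comm=).
# Prints one of: controller, compute, controller+compute, unknown.
role_from_processes() {
    awk '
        $0 == "slurmctld" { ctld = 1 }     # controller daemon present
        $0 == "slurmd"    { slurmd = 1 }   # compute-node daemon present
        END {
            if (ctld && slurmd) print "controller+compute"
            else if (ctld)      print "controller"
            else if (slurmd)    print "compute"
            else                print "unknown"
        }'
}
```

For example: ps -e -o comm= | role_from_processes. A result of unknown means only that neither daemon is running on this host right now; it could be a login node or a node whose daemon is stopped.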

5. Check Network Configuration

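  • Master nodes sometimes have a distinct network setup, such as an extra interface or external network access. This is site-specific and only a weak hint: on many clusters the externally reachable hosts are separate login nodes rather than the master node.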

6. Ask the Cluster Administrator

If you are unsure or lack permissions, the cluster administrator can confirm the role of the node.


Summary Table

Method                Command                                                                     What to look for
Slurm config files    grep -E '^(NodeName|SlurmctldHost|ControlMachine)' /etc/slurm/slurm.conf    SlurmctldHost (or ControlMachine) line
sinfo                 sinfo -N                                                                    Compute node states; a dedicated master is absent
scontrol show nodes   scontrol show nodes                                                         Compute node details; a dedicated master is absent
scontrol show config  scontrol show config | grep -E 'SlurmctldHost|ControlMachine'               SlurmctldHost (or ControlMachine) value
Hostname              hostname -s                                                                 Compare with the SlurmctldHost value
Running processes     ps aux | grep '[s]lurm'                                                     slurmctld = master, slurmd = compute
