The error "bash: nvcc: command not found" means the NVIDIA CUDA Compiler (nvcc) is either not installed on the node you are using or not in your PATH.
First, check whether CUDA is installed on the node. Run:

    ls /usr/local/cuda*

or

    find / -name "nvcc" 2>/dev/null

If CUDA is installed, you’ll see paths like /usr/local/cuda-11.8/bin/nvcc.
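The same check can be wrapped in a small function that looks for an executable nvcc under a few install roots. This is only a sketch: the function name find_nvcc is hypothetical, and the default roots /usr/local and /opt are assumptions about where your cluster might keep CUDA.

```bash
# find_nvcc [ROOT...]: print the first executable ROOT/cuda*/bin/nvcc.
# Returns 1 if nothing is found. With no arguments it searches
# /usr/local and /opt (assumed locations, adjust for your cluster).
find_nvcc() {
    [ "$#" -gt 0 ] || set -- /usr/local /opt
    for root in "$@"; do
        for candidate in "$root"/cuda*/bin/nvcc; do
            # An unmatched glob stays a literal string and fails this test.
            if [ -x "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}
```

Calling find_nvcc with no arguments checks the default roots; find_nvcc /some/other/root checks a different location. It is much faster than scanning the whole filesystem with find, but it only sees directories whose names start with "cuda".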
Most Slurm clusters use environment modules to manage software. Try:

    module avail

Look for a CUDA module (e.g., cuda/11.8, cuda/12.1). Then load it:

    module load cuda/11.8

Replace 11.8 with the version available on your cluster.
If you found nvcc but it’s not in your PATH, add it:

    export PATH=/usr/local/cuda-11.8/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

Again, replace 11.8 with your CUDA version.
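If you put these exports in a startup file such as ~/.bashrc, the same directory can end up in PATH several times. One way to avoid that is a small helper like the sketch below. The function name use_cuda is hypothetical, and it assumes the usual layout with bin/ and lib64/ directly under the CUDA directory.

```bash
# use_cuda CUDA_HOME: add CUDA_HOME/bin to PATH and CUDA_HOME/lib64 to
# LD_LIBRARY_PATH, skipping entries that are already present.
use_cuda() {
    cuda_home=$1
    if [ ! -x "$cuda_home/bin/nvcc" ]; then
        echo "no executable nvcc under $cuda_home/bin" >&2
        return 1
    fi
    case ":$PATH:" in
        *":$cuda_home/bin:"*) ;;                 # already on PATH
        *) PATH="$cuda_home/bin:$PATH" ;;
    esac
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$cuda_home/lib64:"*) ;;               # already listed
        *) LD_LIBRARY_PATH="$cuda_home/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export PATH LD_LIBRARY_PATH
}
```

For example, use_cuda /usr/local/cuda-11.8 would set up the same paths as the two export lines, and running it a second time leaves both variables unchanged.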
Check if nvcc is now available:

    which nvcc
    nvcc --version

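If your cluster provides CUDA through a Singularity container instead, run nvcc inside the container (the image path is a placeholder):

    singularity exec --nv /path/to/cuda-container.img nvcc --version
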
If you’re submitting a job, add the module load command to your Slurm script:

    #!/bin/bash
    #SBATCH --gres=gpu:1
    module load cuda/11.8
    nvcc --version
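
Compute nodes do not always have the same software as the login node, so it can help to make the job script check for nvcc and print a clear message when it is missing. The following is a sketch under a few assumptions: the helper name require_nvcc is hypothetical, cuda/11.8 stands in for whatever module your cluster offers, and the guard around module load is only there so the script also runs on machines without environment modules.

```bash
#!/bin/bash
#SBATCH --gres=gpu:1

# require_nvcc: return 0 if nvcc is on PATH, otherwise print a hint
# to stderr and return 1.
require_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        return 0
    fi
    echo "nvcc not found on this node; check 'module avail' for a CUDA module" >&2
    return 1
}

# Load the CUDA module only if the module command is available here.
# Replace cuda/11.8 with a version your cluster provides.
if command -v module >/dev/null 2>&1; then
    module load cuda/11.8
fi

# Print the compiler version only when nvcc was actually found.
if require_nvcc; then
    nvcc --version
fi
```

In a real job you would probably stop right after a failed check (for example with exit 1) rather than continue, so that the job fails early with the hint in its error log.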