CS Hydra Cluster
The CS Hydra cluster contains Ubuntu Noble (24.04) compute and GPU nodes. Here is the partition listing for the Hydra cluster:
root@gpu2301:~# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute* up infinite 1 mix echidna
compute* up infinite 28 idle smblade16a[1-12,14],smblade16b[1-8],smblade24a[1-6],typhon
gpus up infinite 5 mix gpu[1907,2002,2201,2301,2501]
gpus up infinite 1 alloc gpu2003
gpus up infinite 22 idle gpu[1601-1605,1701-1708,1801-1802,1901-1906,2001]
tstaff up infinite 3 idle node[1-3]
Connecting to Hydra
The simplest way to connect to Hydra is through the ssh.cs.brown.edu gateway or the FastX cluster. You need to set up your SSH keypair in order to use the SSH gateway or the FastX cluster.
Submit a compute job
You can submit a job using sbatch:
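For example, assuming you have a batch script named batch.script (the name is illustrative; a full template appears below), a minimal submission looks like:

```shell
# Submit the batch script to slurm; sbatch prints the assigned JobID
sbatch batch.script
```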
You can confirm that your compute job ran successfully by running:
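One way to check, sketched here with a placeholder JobID, is to query the accounting database and inspect the job's output file (the filename follows the template below):

```shell
# Check the job's final state (COMPLETED, FAILED, etc.)
sacct -j <job id>

# Inspect the job's output file
cat MySerialJob-<job id>.out
```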
By default, your job is submitted to the compute partition and will run for 1 hour if you don't specify a partition name or run time limit.
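You can override those defaults on the command line; the values here are illustrative:

```shell
# Submit to the compute partition with a 2-hour time limit
sbatch --partition=compute --time=2:00:00 batch.script
```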
Submit a gpu job
To submit a GPU job, you must use the gpus partition and request a GPU resource in your submission. Use sbatch with the following options to submit a GPU job.
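A sketch of such a submission, using the same options that appear in the batch script template below:

```shell
# Submit to the gpus partition, requesting one GPU resource
sbatch --partition=gpus --gres=gpu:1 batch.script
```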
You can confirm that your GPU job ran successfully by running:
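For instance, while the job is queued or running you can watch its state, and after completion check its output file (JobID is a placeholder):

```shell
# Show your jobs still in the queue
squeue -u $USER

# After completion, review the job's output
cat MySerialJob-<job id>.out
```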
The gpus partition contains all of the GPU hardware in the CIT datacenter. You must request at least one GPU resource in order to run a GPU job on the gpus partition.
Showing the job queue
To see the job queue, use the squeue command.
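For example:

```shell
# Show all jobs in the queue
squeue

# Show only your own jobs
squeue -u $USER
```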
Cancel a job
To cancel your job, use the scancel command, e.g. scancel <job id>.
Using slurm options in a script
The script you submit to slurm can contain slurm options in it. Here is a simple template, batch.script, to use for that:
#!/bin/bash
# This is an example batch script for slurm on Hydra
#
# The commands for slurm start with #SBATCH
# All slurm commands need to come before the program
# you want to run. In this example, 'echo "Hello World!"'
# is the command we are running.
#
# This is a bash script, so any line that starts with # is
# a comment. If you need to comment out an #SBATCH line, put
# another # in front of the #SBATCH.
#
# To submit this script to slurm do:
# sbatch batch.script
#
# Once the job starts you will see a file MySerialJob-****.out
# The **** will be the slurm JobID
# --- Start of slurm commands -----------
# set the partition to run on the gpus partition. The Hydra cluster has the following partitions: compute, gpus, debug, tstaff
#SBATCH --partition=gpus
# request 1 gpu resource
#SBATCH --gres=gpu:1
# Request an hour of runtime. The default runtime on the compute partition is 1 hour.
#SBATCH --time=1:00:00
# Request a certain amount of memory (4GB):
#SBATCH --mem=4G
# Specify a job name:
#SBATCH -J MySerialJob
# Specify an output file
# %j is a special variable that is replaced by the JobID when the job starts
#SBATCH -o MySerialJob-%j.out
#SBATCH -e MySerialJob-%j.out
#----- End of slurm commands ----
# Run a command
echo "Hello World!"
Slurm Training
Slurm training is available through the CCV Slurm workshop. Go to the CCV Help page for details.