Windows users without an SSH client can use KiTTY.
Your login and password should be provided by the training staff.
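For reference, connecting from a terminal looks like the following; the front-end address below is an assumption, please check the exact address in the Ruche documentation:
ssh <your_login>@ruche.mesocentre.universite-paris-saclay.fr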
cd $WORKDIR
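On Ruche, $WORKDIR should already be defined in your environment; you can check where it points with:
echo $WORKDIR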
We will start with the tutorials from this repository:
git clone https://github.com/CExA-project/cexa-kokkos-tutorials.git
If we have time, we will also pick some exercises from the official Kokkos tutorials:
git clone https://github.com/kokkos/kokkos-tutorials.git
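Each clone creates a directory named after the repository; for instance, to enter the first one and see the available exercises:
cd cexa-kokkos-tutorials
ls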
Please refer to the Ruche documentation for details. Ruche provides CPU nodes with Intel Xeon Gold 6230 (Cascade Lake) processors and GPU nodes with NVIDIA V100 devices.
module load gcc/11.2.0/gcc-4.8.5 \
cmake/3.28.3/gcc-11.2.0 \
cuda/12.2.1/gcc-11.2.0
export OMP_PROC_BIND=spread \
OMP_PLACES=threads
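If you want to verify how the OpenMP runtime interprets these settings, you can ask it to display its environment when your program starts (OMP_DISPLAY_ENV is a standard OpenMP variable; path/to/exe stands for whichever exercise you run):
export OMP_DISPLAY_ENV=true
path/to/exe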
Build with the OpenMP backend:
cmake -B build_openmp \
-DCMAKE_BUILD_TYPE=Release \
-DKokkos_ENABLE_OPENMP=ON
cmake --build build_openmp \
--parallel 5
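Once inside a Slurm job (see the section on running below), the number of OpenMP threads can be adjusted at run time; for example, assuming 20 cores were requested:
OMP_NUM_THREADS=20 path/to/exe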
Build with the Cuda backend:
cmake -B build_cuda \
-DCMAKE_BUILD_TYPE=Release \
-DKokkos_ENABLE_CUDA=ON \
-DKokkos_ARCH_VOLTA70=ON
cmake --build build_cuda \
--parallel 5
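If you want to double-check which backends and architecture ended up in a given configuration, you can inspect the CMake cache of the corresponding build directory, e.g.:
grep Kokkos_ build_cuda/CMakeCache.txt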
Ruche is a cluster that uses Slurm as its job submission system. You should not run executables on the login node; it is meant only for compilation and other light everyday tasks.
Warning
Beware that the login node login02 has a GPU, which may interfere with your build configuration!
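You can check which login node you are connected to with:
hostname
If the configuration picked up the GPU of login02 by mistake, the simplest fix is to delete the build directory and configure again.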
These commands run a single executable in a CPU or GPU job and redirect its standard output and error to your terminal, as if you were running it interactively.
Run on CPU:
srun --partition cpu_short --cpus-per-task 20 --pty path/to/exe
Run on GPU:
srun --partition gpu --gres gpu:1 --pty path/to/exe
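If you prefer an interactive session rather than launching a single executable, the same options can open a shell on a compute node (adjust the partition and resources as needed):
srun --partition gpu --gres gpu:1 --pty bash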
As an alternative to the one-liner commands above, you can use submission scripts.
Script for CPU:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=10G
#SBATCH --time=00:10:00
#SBATCH --partition=cpu_short
#SBATCH --job-name=kt25-exercise-cpu
#SBATCH --output=%x.%J.out
#SBATCH --error=%x.%J.err
#SBATCH --export=NONE
#SBATCH --propagate=NONE
module load gcc/11.2.0/gcc-4.8.5
export OMP_PROC_BIND=spread \
OMP_PLACES=threads
path/to/exe
Script for GPU:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=10G
#SBATCH --time=00:10:00
#SBATCH --partition=gpu
#SBATCH --gres gpu:1
#SBATCH --job-name=kt25-exercise-gpu
#SBATCH --output=%x.%J.out
#SBATCH --error=%x.%J.err
#SBATCH --export=NONE
#SBATCH --propagate=NONE
module load gcc/11.2.0/gcc-4.8.5 \
cuda/12.2.1/gcc-11.2.0
path/to/exe
Submit your job:
sbatch my_script.sh
See the documentation for how to manage your Slurm jobs; a few common commands are recalled below.
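A few standard Slurm commands (replace <job_id> with the ID printed by sbatch):
squeue -u $USER     # list your pending and running jobs
scancel <job_id>    # cancel a job
sacct -j <job_id>   # show accounting information for a job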