Commit 6fffd0f8 authored by Francois Ledoyen

add slurm inspect job

parent 4c73ef6c
Pipeline #26879 passed
......@@ -4,6 +4,18 @@ order: 1
# SLURM guide
## Inspect running job
You can inspect a running job by connecting to it with the following command, replacing `<jobid>` with the ID of the job:
```sh
srun --jobid=<jobid> --overlap --pty bash
```
This opens an interactive shell as a job step inside the already allocated
job, so you can see how your job is behaving. For distributed-memory
jobs, the shell opens on the first node used by your job.
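
As a minimal sketch of this workflow, the snippet below shows one way to look up the job ID and build the attach command. The helper `attach_cmd` is hypothetical (it is not part of SLURM or of this guide), and the optional `--nodelist` argument for landing on a node other than the first one is an assumption that may depend on your SLURM version and site configuration.

```sh
# List your own jobs to find the job ID (first column of the output):
#   squeue -u "$USER"

# Hypothetical helper: print the srun command that attaches a shell
# to a running job. $1 is the job ID; the optional $2 is one node of
# the allocation, passed to srun through --nodelist (assumption: your
# site allows selecting a node this way).
attach_cmd() {
    if [ -n "${2:-}" ]; then
        printf 'srun --jobid=%s --overlap --nodelist=%s --pty bash\n' "$1" "$2"
    else
        printf 'srun --jobid=%s --overlap --pty bash\n' "$1"
    fi
}
```

For example, `attach_cmd 123456` prints the command from this section with the job ID filled in; you can then run it from a login node. Once inside the shell, tools such as `top` show what the job's processes are doing.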
## Debug in Real-time on SLURM
### Problem
......@@ -15,7 +27,7 @@ process. It goes something like this :
1. Check for errors/change code.
1. Repeat endlessly until your code works.
### Solution
### Solution : interactive jobs (`srun --pty`)
Fortunately, there’s a better way: you can debug in real time like so:
......@@ -25,6 +37,7 @@ Fortunately, there’s a better way, you can debug in real-time like so:
srun --partition=<name> --nodes=<nnodes> --gres=gpu:<ngpus> --time=<time> --pty bash -i
```
````
1. This is how it looks once the interactive job starts :
```sh
......@@ -38,3 +51,4 @@ Fortunately, there’s a better way, you can debug in real-time like so:
### Sources
- [HPC-UiT Services User Documentation](https://hpc-uit.readthedocs.io/en/latest/jobs/interactive.html#starting-an-interactive-job)
````