Commit 0e7070c8 authored by Francois Ledoyen

add slurm inspect job

parent 4c73ef6c
Pipeline #26881 passed
@@ -4,6 +4,18 @@ order: 1
# SLURM guide
## Inspect running job
You can inspect a running job by connecting to it with the following command,
replacing `jobid` with the ID of your job:
```sh
srun --jobid=jobid --overlap --pty bash
```
This opens an interactive shell as a job step inside the already allocated
job, so you can see how your job is behaving. For distributed-memory jobs,
the shell opens on the first node used by your job.
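
If you do not know the job ID, `squeue` lists it in the first column of its output. The sketch below wraps the command above in a small helper. Note that `slurm_inspect` and the `DRY_RUN` variable are hypothetical names introduced here for illustration; they are not part of SLURM.

```sh
# Hypothetical helper (not part of SLURM): attach a shell to a running job.
# Usage: slurm_inspect <jobid>
# Set DRY_RUN=1 to print the command instead of running it.
slurm_inspect() {
    if [ -z "${1:-}" ]; then
        echo "usage: slurm_inspect <jobid>" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be executed.
        echo "srun --jobid=$1 --overlap --pty bash"
    else
        # Start an interactive shell as a job step of the given job.
        srun --jobid="$1" --overlap --pty bash
    fi
}

# Find the ID of your running jobs first (job ID is in the first column):
#   squeue -u "$USER" --states=RUNNING
# Then attach to one of them, for example:
#   slurm_inspect 123456
```

Running it with `DRY_RUN=1` is a cheap way to check the command before attaching to a real job.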
## Debug in Real-time on SLURM
### Problem
@@ -15,7 +27,7 @@ process. It goes something like this :
1. Check for errors/change code.
1. Repeat endlessly until your code works.
### Solution : interactive jobs (`srun --pty`)
Fortunately, there’s a better way: you can debug in real-time like so:
@@ -35,6 +47,7 @@ Fortunately, there’s a better way, you can debug in real-time like so:
1. Check for errors/change code continuously until code is fixed or node has
   timed out.
## Sources
- [HPC-UiT Services User Documentation](https://hpc-uit.readthedocs.io/en/latest/jobs/interactive.html#starting-an-interactive-job)
- [Innsbruck University's SLURM Tutorial](https://www.uibk.ac.at/zid/systeme/hpc-systeme/common/tutorials/slurm-tutorial.html)