Slurm2Slack

SlackSlurm posts Slurm job lifecycle notifications to Slack via Incoming Webhooks. It is designed to run as a watcher on a login node, so compute nodes do not need outbound internet access.

Requirements

Python 3.8+
Slurm commands available in PATH: squeue, sacct, scontrol
Login node can reach https://hooks.slack.com

How to get Slack Incoming Webhooks

Sign in to your Slack workspace and open the Apps page: https://api.slack.com/apps?new_app=1
Click "Create New App" and choose "From scratch".
Name your app and select the workspace.
In the left sidebar, go to "Incoming Webhooks" and activate them.
Click "Add New Webhook to Workspace", choose a channel (preferably your own private channel, not a public one), and click "Allow".

Install

pip install --user .

Quick Start

Initialize config (writes the default config file):

slackslurm init --webhook-url "https://hooks.slack.com/services/..."

Add a marker to the sbatch script header:

#SLACKSLURM project=demo mention_on_fail=@here

Run the watcher on a login node:

slackslurm watch

You can run it in a tmux session or via nohup for long-term monitoring.

Optional: send a test message:

slackslurm test

How It Works

The watcher polls squeue for active jobs owned by the current user.
It only tracks jobs whose sbatch script header contains #SLACKSLURM.
When a job disappears from squeue, the watcher queries sacct for the final state and exit code. If sacct is unavailable, it falls back to scontrol show job (best effort).
Notifications are sent via Slack Incoming Webhooks using Block Kit payloads.
Example notifications (screenshots omitted): job submitted; job finished successfully.
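The poll-and-diff cycle described above can be sketched roughly as follows. This is an illustration only, not SlackSlurm's actual code: every function name is hypothetical, and it assumes `squeue -h -u USER -o %i` to list job IDs and `sacct -n -P -X -o JobID,State,ExitCode` to fetch the final state.

```python
import getpass
import subprocess
from typing import Dict, Optional, Set


def parse_squeue_ids(output: str) -> Set[str]:
    """Turn `squeue -h -o %i` output (one job ID per line) into a set."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def finished_since(previous: Set[str], current: Set[str]) -> Set[str]:
    """Jobs seen in the previous poll that are no longer listed by squeue."""
    return previous - current


def parse_sacct_line(output: str) -> Optional[Dict[str, str]]:
    """Parse the first row of `sacct -n -P -X -o JobID,State,ExitCode`.

    -P separates fields with '|'. State can carry a suffix such as
    'CANCELLED by 1000', so only its first word is kept.
    """
    for line in output.splitlines():
        fields = line.strip().split("|")
        if len(fields) >= 3 and fields[0]:
            state = fields[1].split()[0] if fields[1] else "UNKNOWN"
            return {"job_id": fields[0], "state": state, "exit_code": fields[2]}
    return None


def poll_active_ids() -> Set[str]:
    """Query squeue for the current user's job IDs (requires Slurm)."""
    result = subprocess.run(
        ["squeue", "-h", "-u", getpass.getuser(), "-o", "%i"],
        capture_output=True, text=True, check=True,
    )
    return parse_squeue_ids(result.stdout)


def final_state(job_id: str) -> Optional[Dict[str, str]]:
    """Ask sacct for a finished job's state and exit code (requires Slurm)."""
    result = subprocess.run(
        ["sacct", "-n", "-P", "-X", "-j", job_id, "-o", "JobID,State,ExitCode"],
        capture_output=True, text=True, check=True,
    )
    return parse_sacct_line(result.stdout)
```

Keeping the parsing functions separate from the subprocess calls makes the logic testable on a machine without Slurm installed.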

Commands

slackslurm init --webhook-url URL
- Creates ~/.config/slackslurm/config.json.
- You can also set SLACKSLURM_WEBHOOK_URL instead of passing the flag.
slackslurm test
- Sends a sample message to verify webhook connectivity.
slackslurm watch [--once] [--poll-interval SECONDS]
- Runs the watcher loop. --once performs a single poll cycle.
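For reference, a message like the one `slackslurm test` sends could be built and posted along these lines. This is a hedged sketch using only the standard library; `build_payload` and `post_to_webhook` are hypothetical names, and the real message layout may differ.

```python
import json
import urllib.request
from typing import Any, Dict


def build_payload(title: str, details: str) -> Dict[str, Any]:
    """Build a small Block Kit message: a header plus a mrkdwn section."""
    return {
        "text": title,  # plain-text fallback shown in notifications
        "blocks": [
            {"type": "header", "text": {"type": "plain_text", "text": title}},
            {"type": "section", "text": {"type": "mrkdwn", "text": details}},
        ],
    }


def post_to_webhook(webhook_url: str, payload: Dict[str, Any], timeout: float = 10.0) -> int:
    """POST the payload as JSON and return the HTTP status code.

    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `post_to_webhook(url, build_payload("SlackSlurm test", "Webhook is *working*."))` with a valid webhook URL should post one message to the configured channel.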

Configuration

Default config path:

~/.config/slackslurm/config.json

You can override paths with:

SLACKSLURM_CONFIG (config file path)
SLACKSLURM_STATE (state file path)
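One plausible way to resolve these paths is shown below. The helper names are hypothetical, and treating an empty variable as unset is an assumption rather than documented behavior.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_CONFIG = "~/.config/slackslurm/config.json"
DEFAULT_STATE = "~/.local/state/slackslurm/state.json"


def resolve_path(env: Mapping[str, str], variable: str, default: str) -> Path:
    """Return the override from `env` if set and non-empty, else the default."""
    value = env.get(variable) or default
    return Path(value).expanduser()


def config_path(env: Mapping[str, str] = os.environ) -> Path:
    """Config file location, honoring SLACKSLURM_CONFIG."""
    return resolve_path(env, "SLACKSLURM_CONFIG", DEFAULT_CONFIG)


def state_path(env: Mapping[str, str] = os.environ) -> Path:
    """State file location, honoring SLACKSLURM_STATE."""
    return resolve_path(env, "SLACKSLURM_STATE", DEFAULT_STATE)
```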

Example config:

{
  "webhook_url": "https://hooks.slack.com/services/...",
  "poll_interval_seconds": 45,
  "notify_on": ["submit", "start", "end"],
  "script_marker": "#SLACKSLURM",
  "tail_log_lines_on_fail": 40,
  "max_log_chars": 1800,
  "mention_on_fail": "@here",
  "include_log_paths": true
}

Key fields:

webhook_url: Slack Incoming Webhook URL.
notify_on: choose any of submit, start, end.
script_marker: change the marker if you want a different tag.
tail_log_lines_on_fail: number of trailing stdout/stderr lines to include when a job fails.
max_log_chars: hard cap for the log snippet size.
mention_on_fail: mention string added to failed jobs.
include_log_paths: include stdout/stderr file paths in the message.
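To illustrate how tail_log_lines_on_fail and max_log_chars might interact, here is a minimal sketch. `tail_snippet` is a hypothetical helper, and cutting from the front when the character cap applies is an assumption about the intended behavior.

```python
def tail_snippet(log_text: str, max_lines: int = 40, max_chars: int = 1800) -> str:
    """Return the last `max_lines` lines, capped at `max_chars` characters.

    When the cap applies, the oldest characters are dropped so the end of
    the log (usually where the error is) survives.
    """
    if max_lines <= 0 or max_chars <= 0:
        return ""
    lines = log_text.splitlines()
    snippet = "\n".join(lines[-max_lines:])
    return snippet[-max_chars:]
```

The character cap matters because Slack limits the length of text in a single block, so a long traceback has to be trimmed before it is sent.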

Marking Jobs

Add the marker line in the header (before the first non-comment line):

#!/bin/bash
#SBATCH --job-name=train_x
#SBATCH --gres=gpu:4
#SBATCH --time=12:00:00
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

#SLACKSLURM project=robot exp=run42 mention_on_fail=@here

python train.py

Supported tag behavior:

key=value pairs are shown in the Slack message context as tags.
mention_on_fail overrides the config value for this job only.
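The header scan and tag parsing could look roughly like this. `parse_marker` is a hypothetical function written for illustration; the tool's real parser may handle quoting or edge cases differently.

```python
from typing import Dict, Optional


def parse_marker(script_text: str, marker: str = "#SLACKSLURM") -> Optional[Dict[str, str]]:
    """Return the key=value tags from the marker line, or None if absent.

    Only the header is scanned: parsing stops at the first line that is
    neither blank nor a comment.
    """
    for raw_line in script_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if not line.startswith("#"):
            break  # first command reached, header is over
        if line.split(None, 1)[0] == marker:
            tags: Dict[str, str] = {}
            for token in line.split()[1:]:
                key, sep, value = token.partition("=")
                if sep and key:
                    tags[key] = value
            return tags
    return None
```

A marker line with no tags yields an empty dictionary, which still counts as "tracked"; a script without the marker in its header yields None.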

Running as a Daemon

For long-running sessions, use tmux or nohup:

tmux new -s slackslurm
slackslurm watch

nohup slackslurm watch > ~/.local/state/slackslurm/watch.log 2>&1 &

To stop, send SIGINT or kill the process.

State Files

State is stored at:

~/.local/state/slackslurm/state.json

If you need to reset notification history, stop the watcher and delete the state file.

Troubleshooting

No messages: ensure the sbatch script has the #SLACKSLURM marker and that the watcher runs on a login node with internet access.
Missing end notifications: confirm sacct works on your cluster.
Webhook errors: run slackslurm test and check Slack app configuration.
Log tails missing: ensure StdOut/StdErr paths exist and are readable.

Security Notes

Treat the webhook URL as a secret and do not commit it to the repo.
The config file is written with 0600 permissions when possible.
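Writing a file with 0600 permissions can be done as in the sketch below. `write_private_json` is a hypothetical helper, not necessarily how SlackSlurm implements it; the "when possible" caveat is modeled here as a best-effort chmod.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


def write_private_json(path: Path, data: Dict[str, Any]) -> None:
    """Write JSON so that only the owner can read or write the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with mode 0600 from the start, so the secret is never
    # briefly readable by other users.
    fd = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    try:
        # Tighten permissions if the file already existed with a looser mode.
        os.chmod(str(path), 0o600)
    except OSError:
        pass  # best effort, e.g. on filesystems without POSIX modes
```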
