Releases: amayer1983/docksentry

v1.23.6

19 Jun 20:15

Fixed

Disk-usage monitoring no longer rides on the update-check schedule.

The disk check only ran inside a cron tick — so with a once-a-day schedule (0 18 * * *) the disk was inspected once per day, right after the update. A container that filled the disk between two daily ticks (e.g. a crash-looping service writing gigabytes of logs) crossed the warning threshold and ran the disk to 100% with no alert. Disk monitoring now runs on its own cadence, independent of the update cron — every disk_check_interval_seconds (default 300s / 5 min), starting at boot.
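
A minimal sketch of a disk check that runs on its own timer rather than inside the update cron. The names `disk_used_percent`, `start_disk_monitor`, and `on_sample` are illustrative assumptions, not Docksentry's actual code. Only the 300-second default comes from the note above.

```python
import shutil
import threading


def disk_used_percent(path="/"):
    """Return used space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def start_disk_monitor(on_sample, interval_seconds=300, path="/"):
    """Sample disk usage immediately, then every `interval_seconds`, in a daemon thread."""
    stop = threading.Event()

    def loop():
        while True:
            on_sample(disk_used_percent(path))  # first sample happens at boot
            if stop.wait(interval_seconds):      # returns True once stop is set
                return

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

Waiting on an `Event` instead of calling `time.sleep` lets the loop exit promptly on shutdown.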

A failed state-file write no longer causes the same alert to repeat endlessly.

When the disk is full, the throttle/state files for the disk warning and the weekly report can't be written — so the "already sent" marker never persisted and both fired again on every pass. The scheduler now keeps an in-memory guard for each: once a disk warning or weekly report has gone out, it won't repeat for its throttle window even if the state file can't be persisted. The guard re-arms when the disk drops back below the threshold (or the process restarts).
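
The in-memory guard could look roughly like this. It is a sketch under assumptions: `AlertGuard`, `check_disk`, and the injected `clock` are hypothetical names, and the real scheduler may structure this differently.

```python
import time


class AlertGuard:
    """In-memory 'already sent' marker that does not depend on a state file."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._last_sent = None  # None means the guard is armed

    def should_send(self):
        # Send if nothing has gone out yet, or the throttle window has elapsed.
        if self._last_sent is None:
            return True
        return self._clock() - self._last_sent >= self.window_seconds

    def mark_sent(self):
        self._last_sent = self._clock()

    def rearm(self):
        # Called when usage drops back below the threshold.
        self._last_sent = None


def check_disk(used_percent, threshold, guard, send_alert):
    """Return True if an alert was sent on this pass."""
    if used_percent < threshold:
        guard.rearm()
        return False
    if not guard.should_send():
        return False
    send_alert(f"Disk usage at {used_percent}%")
    guard.mark_sent()  # recorded in memory even if persisting to disk fails
    return True
```

Because the marker lives in process memory, a restart also re-arms it, which matches the behaviour described above.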

v1.23.5

18 Jun 19:40

Fixed

An update that leaves a container crash-looping is no longer reported as a success.

A container can pass back through running/starting between crashes — so an image whose new version exits on startup (e.g. a failed database migration) and gets revived over and over by its restart: always policy would, at the moment Docksentry happened to inspect it, look alive. The health-wait only checked the current State/Health snapshot, never the restart count, so it either reported healthy or timed out into the "slow start" branch — and in the standalone path that branch even deleted the rollback backup before recording success. The result: a service down in a tight restart loop, marked as a clean update, with the backup it needed for recovery already gone.

The health-wait now records the container's RestartCount before it starts watching and re-checks it on every poll. If the count climbs while we wait, the update is classified as a crash loop — a hard failure: roll back to the backup (standalone), leave the old container in place (compose), and report success: false with a clear "crash-restart loop" message. The backup is never removed on this path.

It also no longer declares success the instant a container first looks healthy: it confirms the container stays up, with no restarts, for crashloop_stable_seconds (default 30s) before reporting healthy. This catches a container that boots cleanly and then crashes a few seconds later — slower than a single health poll would see. Genuinely slow-but-stable services (no restarts) are unaffected and still resolve to the "starting" warning rather than a false failure.
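
The two checks above (restart count and stable window) can be sketched as one polling loop. This is an illustration only: `wait_for_stable`, the `inspect` callable, and its simplified return shape are assumptions. The real code reads Docker's State/Health and RestartCount. The 30-second stable window is the documented default, while the timeout and poll interval are placeholders.

```python
import time


def wait_for_stable(inspect, timeout=120.0, stable_seconds=30.0, poll=2.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Classify an updated container as 'healthy', 'crash_loop', or 'starting'.

    `inspect` is a hypothetical callable returning a dict such as
    {"running": bool, "restart_count": int}.
    """
    baseline = inspect()["restart_count"]  # recorded before we start watching
    start = clock()
    up_since = None
    while clock() - start < timeout:
        snap = inspect()
        if snap["restart_count"] > baseline:
            return "crash_loop"            # restarts climbed while we waited
        if snap["running"]:
            if up_since is None:
                up_since = clock()
            if clock() - up_since >= stable_seconds:
                return "healthy"           # stayed up for the whole window
        else:
            up_since = None                # not up right now: restart the window
        sleep(poll)
    return "starting"                      # slow but no restarts: warn, don't fail
```

Injecting `clock` and `sleep` keeps the loop testable without waiting in real time.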

Together with the existing restart: no → unhealthy path, both crash shapes are now covered: a container that stays exited, and one that loops.

v1.23.4 — /selfupdate latest, old-tag warning, auto-detect import fix

16 Jun 18:16

Three items from the #2 thread.

Added

/selfupdate latest and /selfupdate stable

Previously /selfupdate only accepted an exact X.Y.Z version or previous, so a container running a fixed version tag (e.g. :1.19.0) had no in-band way to rejoin the rolling :latest line. Both latest and stable are now accepted. Reported by @famewolf: his host was stuck on :1.19.0 and /selfupdate latest was rejected.

Plain /selfupdate warns when you're on an outdated fixed tag

A container running :1.19.0 checks that immutable tag and correctly reports "up to date" even when a much newer release exists, which is misleading and was genuinely dangerous (@famewolf's :1.19.0 host kept losing its config to the pre-v1.22.0 bug, with no signal it was stuck on a buggy version). Now, when the current tag is a fixed X.Y.Z and the upstream CHANGELOG advertises a newer release, /selfupdate says so and offers three ways out: /selfupdate <new>, /selfupdate latest, or switching the compose image to :latest. The warning only triggers on a plain /selfupdate: an explicit /selfupdate X.Y.Z is never second-guessed, and the check falls back silently if the CHANGELOG can't be reached.
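
A sketch of the "outdated fixed tag" test, assuming the newest version has already been read from the CHANGELOG. The function names are hypothetical, and how Docksentry actually fetches or parses the CHANGELOG is not shown in these notes.

```python
import re

FIXED_TAG = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (major, minor, patch) for an X.Y.Z tag, or None for anything else."""
    match = FIXED_TAG.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def outdated_fixed_tag(current_tag, latest_release):
    """True only when the current tag is a fixed X.Y.Z older than latest_release.

    `latest_release` may be None when the CHANGELOG could not be reached;
    in that case the check stays silent.
    """
    current = parse_version(current_tag)
    latest = parse_version(latest_release) if latest_release else None
    if current is None or latest is None:
        return False
    return latest > current  # tuple comparison, so 1.10.0 > 1.9.0
```

Comparing integer tuples rather than strings avoids ordering "1.9.0" after "1.10.0".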

Fixed

Auto-detect "Import selected" no longer imports stacks you didn't pick

Reported by @famewolf: he opened the auto-detect modal to browse, checked nothing, clicked "Import selected" — and a multi-container stack got imported anyway, because v1.21.2 pre-checked multi-container stacks by default. Nothing is checked by default now; "Import selected" imports only what you explicitly tick, and clicking it with nothing selected shows a "Nothing selected" toast instead of silently creating a group.

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

Tip: use image: amayer1983/docksentry:latest (not a pinned :X.Y.Z) so you track new releases automatically. If you're on a pinned tag, a plain /selfupdate will now tell you and offer /selfupdate latest.

v1.23.3 — Update all acts on the notification's set

15 Jun 19:40

Fixes a confirmed correctness bug from #2.

Fixed

"Update all" now updates exactly what the notification showed. Reported by @famewolf: he tapped "Update all" on a notification that listed only searxng, but it updated all five containers a later check had since written to the pending list. The button carried no reference to which notification it came from — it just re-read the global pending file at click time.

Each "Updates Available" notification now snapshots its exact container set, keyed by a short token in the button's callback data (update_all:<token>). Clicking it updates that snapshot's containers and removes only their names from pending (leaving any others). Snapshots are capped FIFO (last 20). If a snapshot is gone (evicted, or the bot restarted), you get a clear "this notification is stale, run /check" message instead of silently updating the wrong set.
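
One way to picture the snapshot mechanism, as a sketch rather than the shipped code. `SnapshotStore`, `handle_update_all`, and the 8-character hex token are assumptions. The `update_all:<token>` format and the cap of 20 come from the note above.

```python
import secrets
from collections import OrderedDict

MAX_SNAPSHOTS = 20  # FIFO cap from the release note


class SnapshotStore:
    """Maps short tokens to the container set a notification showed."""

    def __init__(self, max_size=MAX_SNAPSHOTS):
        self._snapshots = OrderedDict()
        self._max_size = max_size

    def create(self, containers):
        token = secrets.token_hex(4)  # short enough for button callback data
        self._snapshots[token] = list(containers)
        while len(self._snapshots) > self._max_size:
            self._snapshots.popitem(last=False)  # evict the oldest snapshot
        return token

    def get(self, token):
        return self._snapshots.get(token)  # None means stale or evicted


def handle_update_all(callback_data, store, pending):
    """Return (containers_to_update, remaining_pending); (None, pending) if stale."""
    token = callback_data.split(":", 1)[1]
    snapshot = store.get(token)
    if snapshot is None:
        return None, pending  # caller shows the "stale, run /check" message
    remaining = [name for name in pending if name not in snapshot]
    return snapshot, remaining
```

Since the store lives in memory, a bot restart empties it, which is why a restart also produces the stale message.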

Verified end-to-end: with [searxng] snapshotted and the pending list later overwritten with five containers, "Update all" updates only searxng.

Still open

  • Slow SIGTERM response (the bot blocks in Telegram's long-poll on shutdown). This is a tradeoff between poll frequency and shutdown speed rather than a clear-cut bug — deferred pending a decision on the right balance.

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

v1.23.2 — Fix Web UI JavaScript (broken in all browsers since v1.22.0)

15 Jun 19:30

Critical Web UI fix. Reported by @famewolf in #2 via his browser console: the backup-restore button "did absolutely nothing." The cause was far bigger than backup.

What was broken

The dsBackupImport confirm(...) string in the shared _BASE_JS block was written with a \n. _BASE_JS is a regular (non-raw) Python triple-quoted string, so Python turned that \n into a real newline in the rendered JavaScript:

if (!confirm('Restore from "' + file.name + '"?
This will overwrite...')) {   // ← SyntaxError: unescaped line break

An unescaped line break in a string literal is a hard SyntaxError that aborts the entire <script> block — so every function defined in _BASE_JS silently failed to define:

  • Tab switching
  • Theme toggle
  • Confirm dialogs (delete group, cleanup, selfupdate…)
  • Drag-and-drop reorder
  • Auto-detect modal
  • Webhook test button
  • Cron schedule preview
  • Toast notifications
  • Backup / restore

This affected every browser, from v1.22.0 through v1.23.1. Server-side features (favicon, Discord links, page rendering, the Telegram bot) were completely unaffected, which is why it wasn't obvious — and why the bot half of Docksentry kept working perfectly.

The fix

Removed the newline from the confirm string. Verified with a comment- and regex-aware scanner that no raw control character remains inside any JS string literal across /, /settings, and /groups.

Regression guard

scripts/pre-commit-check.py now parses _BASE_JS and fails the build if any raw newline/CR appears inside a JS string literal — the exact class of bug above. It passes clean now and was verified to catch a deliberately-injected break.
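
A simplified sketch of such a guard follows. It is not the contents of scripts/pre-commit-check.py: the function name is hypothetical, and this version only understands quoted strings and comments, not regex or template literals. The two sample strings show how a regular Python string turns `\n` into a real line break, and how doubling the backslash (an alternative to removing the newline, which is what the fix did) keeps it as a JS escape.

```python
def find_raw_newlines_in_js_strings(js):
    """Return 1-based line numbers where a quoted JS string contains a raw line break."""
    bad_lines = []
    i, line, quote = 0, 1, None  # quote: delimiter of the string we are inside
    while i < len(js):
        ch = js[i]
        if quote is None and js.startswith("//", i):
            end = js.find("\n", i)           # skip to the end of a line comment
            i = len(js) if end == -1 else end
            continue
        if quote is None and js.startswith("/*", i):
            end = js.find("*/", i + 2)       # skip a block comment, keep line count
            stop = len(js) if end == -1 else end + 2
            line += js.count("\n", i, stop)
            i = stop
            continue
        if quote is not None and ch == "\\":
            if js[i + 1:i + 2] == "\n":      # backslash-newline is a legal continuation
                line += 1
            i += 2                           # skip the escaped character
            continue
        if quote is None:
            if ch in ("'", '"'):
                quote = ch
        elif ch in ("\r", "\n"):
            bad_lines.append(line)           # raw break inside a literal
            quote = None
        elif ch == quote:
            quote = None
        if ch == "\n":
            line += 1
        i += 1
    return bad_lines


# In a regular Python string, \n is already a real newline when the JS is served.
BROKEN = """if (!confirm('Restore?\nThis will overwrite...')) {}"""
# Doubling the backslash keeps a JS-level \n escape instead.
ESCAPED = """if (!confirm('Restore?\\nThis will overwrite...')) {}"""
```

Running the function on `BROKEN` flags line 1, while `ESCAPED` comes back clean.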

Why our earlier testing missed it

The v1.22.0 backup feature was "tested" three ways: the backend endpoint with curl (worked), a check that the function text was present in the served HTML (it was), and a browser-style upload to the endpoint (worked). None of those execute the page's JavaScript, so none caught that the script block fails to parse. @famewolf's browser console caught in one line what our tests structurally could not. The pre-commit guard closes that gap.

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

Hard-reload the Web UI (Ctrl+Shift+R) once.

v1.23.1 — Safe rollback + atomic update mutex (audit pass)

15 Jun 12:40

Proactive audit release. After the homarr deletion (v1.23.0), we swept the codebase for the same classes of bug — destructive operations without recovery, and concurrency on shared state — and fixed the two highest-risk findings before anyone hit them.

Fixed

Rollback could strand a container or destroy your only copy

All three rollback paths in the recreate flow (run-failed, unhealthy, exception handler) did docker rm <name> (no -f) then docker rename <old> <name>. Two failure modes:

  1. If the broken new container wouldn't stop, the non-forced docker rm silently failed and the rename collided — leaving you with the broken container and the old one orphaned as <name>_old.
  2. The exception handler blindly renamed <name>_old back even when no such backup existed.

A single new _rollback_to_old() helper now serves all three sites. Its first rule is "don't make it worse": if no backup exists, it leaves your container completely alone (never destroying what might be your only copy); otherwise it force-removes the broken new container and restores the backup. Verified on a test host, including the critical no-backup case.
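
The rollback order can be sketched like this. It assumes the backup is named `<name>_old` as described above. The existence check via `docker container inspect` and the final `docker start` are assumptions about how the helper might work, not a copy of it.

```python
import subprocess


def _docker(*args):
    """Run a docker CLI command and return True on exit code 0."""
    return subprocess.run(["docker", *args], capture_output=True).returncode == 0


def rollback_to_old(name, docker=_docker):
    """Restore <name>_old as <name>; do nothing at all if no backup exists."""
    backup = f"{name}_old"
    if not docker("container", "inspect", backup):
        return False                   # no backup: leave the current container alone
    docker("rm", "-f", name)           # forced, so a container that won't stop still goes
    if not docker("rename", backup, name):
        return False
    return docker("start", name)       # assumption: the backup was stopped earlier
```

Checking for the backup before removing anything is what keeps the no-backup case safe.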

Scheduler auto-update could race a manual update

The manual update paths guarded on a plain update_running bool, but the scheduler's auto-update pass ignored it entirely — so a cron tick could recreate the very container you were mid-updating from Telegram, two recreate flows racing on the same container. Replaced with a single threading.Lock claimed atomically by all four update entry points; the scheduler skips its pass when a manual update holds the lock (retries next tick).

Bonus: the lock is released in try/finally everywhere. The old update_running = False only ran at the end, so an exception outside the loop would have left the flag stuck True and blocked every future update — that latent bug is gone too.
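
A minimal sketch of the lock pattern, assuming a module-level `update_lock` and a hypothetical `run_update` wrapper. The real entry points are not shown in these notes.

```python
import threading

update_lock = threading.Lock()


def run_update(label, do_update):
    """Run do_update() only if no other update currently holds the lock."""
    if not update_lock.acquire(blocking=False):  # atomic claim, never waits
        return f"{label}: skipped, another update is running"
    try:
        do_update()
        return f"{label}: done"
    finally:
        update_lock.release()  # released even if do_update() raises
```

A non-blocking acquire gives the scheduler its "skip this pass, retry next tick" behaviour, and the `finally` clause removes the stuck-flag failure mode of the old bool.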

Still open (confirmed, next)

  • "Update all" stale-snapshot (updates current pending, not the notification's set)
  • Slow SIGTERM response via long-poll block

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

v1.23.0 — Recover --rm containers lost during update

15 Jun 06:03

Critical data-loss fix. Reported by @famewolf in #2: his homarr container disappeared entirely during an update. Our first triage said "probably Docker daemon cleanup, not us." That was wrong — and we proved it by reproducing the exact mechanism on a test host.

The mechanism

  1. A --rm (AutoRemove=true) container that's slow to stop (homarr was wedged — our stop ran into the 90s timeout)
  2. Our docker stop eventually reaps the process
  3. Because --rm is set, Docker auto-removes the container the instant it stops
  4. Our docker kill fallback then reports cannot kill container: … No such container — the exact error family @famewolf saw
  5. The old code hit if not stop_ok: return False and walked away, leaving him with no container — even though we had its full inspect config in memory the whole time

The fix

We have always captured the container's config before stopping it. Now, after the stop step, we check whether the container still exists. If it vanished (the AutoRemove case), we recreate it directly from the captured config: the old container is already gone, so we skip the rename/rollback machinery and run the new one.

homarr would have been updated correctly instead of deleted.

Verified end-to-end on a test host: a wedged --rm container that vanishes on stop is now recreated with all labels and config preserved.

Honesty note

The first triage concluded "probably not us." When a user reports data loss during one of our operations, the burden is on us to prove we weren't involved, not to assume it. The empirical reproduction here came directly from re-investigating under that assumption.

Still open (confirmed, shipping separately)

  • "Update all" on a stale notification updates whatever is currently pending, not the set shown in that notification. Snapshot fix coming.
  • Bot is slow to respond to SIGTERM because it blocks in the Telegram long-poll. Fix coming.

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

v1.22.2 — Selfupdate history + helper false-negative fixes

14 Jun 12:41

Patch release addressing two #2 reports that didn't need additional diagnostic data to land.

Fixed

Selfupdate history (selfupdate vX → ?) placeholder now always patched

Reported by @famewolf. The post-boot fixup in main.py (since v1.17.6) that replaces the ? with the new VERSION was gated on post_selfupdate_restart, which depends on the deferred_check_file marker. That marker is only written by the auto-selfupdate path (cron + AUTO_SELFUPDATE=true); manual /selfupdate doesn't create it. So users running manual selfupdates — which is most users — saw the ? stick around forever in their history, making downgrade discovery harder ("what was I on before this update?").

Decoupled — the endswith("→ ?)") guard is itself the safety check, so running the fixup on every boot only ever touches the real placeholder and is a no-op otherwise.
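
To illustrate why the guard makes the fixup safe to run on every boot, here is a sketch. `patch_placeholder` and the exact entry text are assumptions. Only the `→ ?)` suffix check comes from the note above.

```python
def patch_placeholder(entry, version):
    """Replace a trailing '→ ?)' placeholder with the running version.

    Entries without the placeholder are returned unchanged, so calling this
    on every boot is a no-op after the first successful patch.
    """
    if entry.endswith("→ ?)"):
        return entry[:-2] + f"{version})"  # drop '?)' and append 'version)'
    return entry
```

For example, `patch_placeholder("(selfupdate v1.22.1 → ?)", "v1.22.2")` yields `(selfupdate v1.22.1 → v1.22.2)`, and a second call on that result leaves it untouched.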

Selfupdate no longer reports "❌ Unable to find image 'docker:cli' locally" when it actually worked

Reported by @NotRetarded. The helper-container launch (docker run docker:cli ...) was relying on Docker's implicit auto-pull when the image wasn't local. The auto-pull writes progress to stderr (`Unable to find image 'docker:cli' locally` + layer download lines). If the auto-pull went sideways — slow registry, transient blip, rate limit — the helper-launch subprocess surfaced that stderr as the failure message even when the update completed successfully seconds later.

Fixed: pre-pull docker:cli explicitly before launching the helper. Either the pull succeeds → helper launch is clean and silent → no false failure; or the pull genuinely fails → honest error pointing at the helper image (not at our update logic).

Still pending

@famewolf's homarr deletion report — waiting on diagnostic output (docker ps -a | grep homarr, full stop+kill stderr, journalctl -u docker.service excerpt) to determine whether it was a Docker-daemon cleanup race or a real bug in our code. See issue #2 thread for details.

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

v1.22.1 — Complete atomic-write coverage (11 more sites)

12 Jun 20:45

Honesty patch. Hours after v1.22.0 shipped, a careful re-audit found the atomic-write fix had been applied to only 2 of 13 sites. The other 11 are now patched too.

What v1.22.0 missed

Same bug class as v1.22.0 — `open(path, "w")` truncates the target file to 0 bytes immediately, before `json.dump` writes the new content. A kill between truncate and close leaves a partial file. v1.22.0 only fixed the high-symptom cases (`settings.json` + `groups.json` + 4 other dict files). The 11 sites missed:

| File | Affects |
|---|---|
| `container_store._save` | `pinned.json`, `autoupdate.json`, `ask_before_major.json` (list-format files) |
| `maintenance._write` | `maintenance.json` (active-window state) |
| `weekly_report._write_state` | weekly-report last-sent timestamp |
| `update_checker` x3 | history, pending updates, disk-warn state |
| `telegram_bot` x4 | selfupdate history, pending after single update, pending after auto-update batch, deferred-check marker |
| `web_ui` x2 | pending after Web UI single update, pending after Web UI bulk update |
| `main.py` | post-selfupdate history fixup |

For @famewolf's specific symptom (settings + groups gone) the v1.22.0 fix sufficed. But his pinned containers, autoupdate flags, history, or pending updates could have been wiped silently by the same bug class without him noticing.

Refactor

Shared `atomic_write_json(path, data, **dump_kwargs)` helper at module level in `container_store.py`. All 13 write sites now route through it instead of inlining the tmp+fsync+rename pattern. Single point of fix if we ever need to change the strategy.
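
A sketch of what a helper with that signature could look like, following the tmp+fsync+rename pattern named above. The body is an assumption, not a copy of `container_store.py`.

```python
import json
import os


def atomic_write_json(path, data, **dump_kwargs):
    """Write JSON so readers see either the old file or the complete new one."""
    tmp_path = f"{path}.tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, **dump_kwargs)
        fh.flush()               # push Python's buffer to the OS
        os.fsync(fh.fileno())    # push the OS page cache to disk
    os.replace(tmp_path, path)   # atomic swap; the target is never truncated
```

Usage mirrors a plain `json.dump` call, e.g. `atomic_write_json("/data/settings.json", settings, indent=2)`. The temp file sits next to the target so the final rename stays on one filesystem.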

Smoke-tested

  • Helper write+read roundtrip ✓
  • Helper cleans up `.tmp` after rename ✓
  • Helper forwards `indent=2` kwarg ✓
  • Helper overwrites existing file correctly ✓
  • Live Docksentry backup-export → backup-import roundtrip ✓

Lesson learned

When fixing a bug class (vs a single bug), grep for the entire pattern across the codebase before declaring the fix complete. A 30-second `grep -rn 'json.dump' app/` would have caught all 11 missed sites before v1.22.0 shipped.

Upgrade

```bash
docker pull amayer1983/docksentry:latest
docker compose up -d
```

No env vars or breaking changes.

v1.22.0 — Atomic writes + backup/restore + data-loss alert

12 Jun 19:25

Major release driven by @famewolf's #2 reports. Three of his Docksentry hosts simultaneously rebooted (likely unattended-upgrades) and came back with completely empty config — container groups gone, web setup wizard re-appearing. We found and fixed the root cause, plus added two layers of defense-in-depth.

Fixed

Persistent state survives mid-write kills

Root cause: both container_store._save_dict and config.save_persistent used open(path, "w") which truncates the file to 0 bytes immediately, before the new content is written. A kill between truncate and close (host reboot, Docker daemon restart, OOM, power loss) left a 0-byte or partial-JSON file. Next boot: parse failed → empty defaults → wizard.

Bug existed since v1.7.0.

Fix: both write paths now write to <path>.tmp, flush() + os.fsync() to push bytes through the kernel page cache to disk, then os.replace() which is POSIX-atomic — either the new file is fully visible or the old one is still there, never a partial state. Applies to settings.json, groups.json, notes.json, links.json, update_windows.json, pending_major.json.

Added

"No persisted settings" Telegram alert on boot

When BOT_TOKEN is configured via env vars but /data/settings.json is missing on startup, surface a Telegram message warning of possible data loss. Users no longer discover the wizard accidentally hours later via the Web UI — the bot tells them immediately.

Backup & Restore via Web UI

New Backup & Restore card on the Settings page:

  • ⬇ Export backup — downloads a single JSON file containing every persisted state (settings, pinned, autoupdate, ask-major flags, container groups, notes, links, update windows) with a schema_version sentinel for forward compatibility
  • ⬆ Restore backup… — reads a previously-exported file and writes each section through the now-atomic save paths

Defense-in-depth against the kind of data loss that hit @famewolf. Also useful for host migrations or routine snapshots. Backup files contain webhook URLs and bot tokens — treat them like passwords.

Container Groups ordering in update notifications, plus 👑 HEAD badge

When a container is the first (head) member of a Container Group with ≥ 2 members, it now gets a 👑 badge in the "🔄 Updates Available" Telegram message. The same message also sorts updates by group position (head first, then dependents in order, orphans at the end). Mirrors the sort handle_autoupdates already did during execution but extends it to the pre-update notification. Reported by @famewolf: with a Gluetun+dependents stack, gluetun was showing up LAST in the notification, making cascade-debugging harder.

API

  • /api/backup_export (GET) — returns a docksentry-backup-YYYYMMDD-HHMMSS.json attachment with the full state bundle. Read-only.
  • /api/backup_import (POST, multipart/form-data with file field) — accepts a backup bundle, restores each known section, returns {ok, restored: [...], errors: [...], schema_version, from_version}. Unknown / missing sections are silently skipped (forward-compatible).

Cleanup

  • _groups_html removed (dead code since v1.21.1 when the legacy Settings → Groups card became a redirect banner).

Upgrade

docker pull amayer1983/docksentry:latest
docker compose up -d

Hard-reload the Web UI (Ctrl+Shift+R) once after pulling so the new Backup/Restore card's JS lands.

Strongly recommended: after pulling, open Web UI → Settings → Backup & Restore → Export backup and save the file somewhere safe. If anything goes sideways, you can restore in one click.