diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5341da1..ddde803 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,17 @@ All notable changes to [vmbackup](https://github.com/doutsis/vmbackup) will be d
 
 Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/).
 
+## [Unreleased]
+
+### Added
+
+- **Native Slack notifications** — New `modules/slack_notification_module.sh` and `config/template/slack.conf` add first-class Slack incoming-webhook delivery alongside the existing email path. Mirrors the email module's load/send shape (`load_slack_config` / `send_slack_notification`) and is invoked at the same four call sites (`cleanup_on_exit`, `handle_sigterm`, replicate-only end, normal session end) so failure-path notifications fire even when the run aborts before the normal end-of-main email send. Independent enable/conditional flags (`SLACK_ENABLED`, `SLACK_ON_SUCCESS`, `SLACK_ON_FAILURE`) let operators run Slack-only, email-only, or both. Session totals (VMs ok/failed/skipped/excluded, total bytes, duration) are pulled from the same `sessions` row the email module uses via `sqlite_query_session_summary`. Only runtime dependency is `curl`.
+
+### Fixed
+
+- **Misleading "Failed to send email report" WARN on intentional skip** — `send_backup_report()` returns `2` when delivery is intentionally suppressed (module disabled, `EMAIL_ON_SUCCESS=no`, `EMAIL_ON_FAILURE=no`), distinct from `1` for real transport failure. All four call sites in `vmbackup.sh` used `if send_backup_report ...; then OK; else WARN; fi`, collapsing the intentional-skip case into the failure log line. New `_handle_notifier_rc()` helper interprets the three return codes correctly (`0`→info+sent-guard, `2`→debug+sent-guard, other→warn) and is now used at every email and Slack call site. Side benefit: the sent-guard flags (`_EMAIL_SENT` / `_SLACK_SENT`) are now also set on intentional skip, so later code paths don't retry a notification the operator explicitly suppressed.
+- **`cleanup_on_exit` logged "SQLite session finalized as 'incomplete'" on successful runs** — The catch-all session finalizer ran unconditionally on every exit. On a normal successful session, `main()` had already called `sqlite_session_end` with `status='success'`; `sqlite_session_end()`'s `_SQLITE_SESSION_ENDED` idempotency guard correctly suppressed the duplicate DB write, but the surrounding `vmbackup.sh` log line still claimed the session was finalized as `incomplete`. The catch-all is now gated on `_SQLITE_SESSION_ENDED != 1` so the misleading WARN is no longer emitted when the normal exit path already finalized the session.
+
 ## [0.5.6] - 2026-04-26
 
 ### Changed
diff --git a/config/template/slack.conf b/config/template/slack.conf
new file mode 100644
index 0000000..3ea9568
--- /dev/null
+++ b/config/template/slack.conf
@@ -0,0 +1,62 @@
+#!/bin/bash
+#################################################################################
+# Slack Notification Configuration for vmbackup.sh
+#
+# TEMPLATE FILE - Copy to config//slack.conf and customize
+#
+# This file configures Slack incoming-webhook posts after backup operations.
+# Independent of email — both modules can be enabled at the same time.
+#
+# Prerequisites:
+#   1. Create a Slack incoming webhook:
+#      https://api.slack.com/messaging/webhooks
+#   2. Paste the webhook URL into SLACK_WEBHOOK_URL below.
+#   3. curl must be installed (already a vmbackup runtime dependency).
+#
+# Usage: Auto-sourced by slack_notification_module.sh
+#
+# Version: 1.0
+#################################################################################
+
+#=============================================================================
+# ENABLE / DISABLE
+#=============================================================================
+
+# Master enable/disable for Slack notifications
+SLACK_ENABLED="no"
+
+#=============================================================================
+# WEBHOOK
+#=============================================================================
+
+# Slack incoming-webhook URL
+# Format: https://hooks.slack.com/services/T.../B.../...
+SLACK_WEBHOOK_URL=""
+
+#=============================================================================
+# MESSAGE FORMATTING
+#=============================================================================
+
+# Hostname displayed in the message title
+# Leave empty to use short hostname: $(hostname -s)
+SLACK_HOSTNAME=""
+
+# Title prefix (appears before host + status)
+SLACK_TITLE_PREFIX="[vmbackup]"
+
+#=============================================================================
+# CONDITIONAL POSTING
+#=============================================================================
+
+# Post to Slack on successful sessions?
+SLACK_ON_SUCCESS="yes"
+
+# Post to Slack on failed / partial sessions?
+SLACK_ON_FAILURE="yes"
+
+#=============================================================================
+# TRANSPORT
+#=============================================================================
+
+# curl --max-time for the webhook POST (seconds)
+SLACK_TIMEOUT="10"
diff --git a/modules/slack_notification_module.sh b/modules/slack_notification_module.sh
new file mode 100644
index 0000000..f6d9041
--- /dev/null
+++ b/modules/slack_notification_module.sh
@@ -0,0 +1,195 @@
+#!/bin/bash
+#################################################################################
+# Slack Notification Module for vmbackup.sh
+#
+# Posts a session summary to a Slack incoming webhook after a backup,
+# replicate-only, or pre-flight-aborted run. Designed to mirror the call
+# sites of email_report_module.sh so both can be enabled independently.
+#
+# Dependencies:
+#   - curl (transport)
+#   - lib/sqlite_module.sh (session totals; falls back to empty stats)
+#   - config//slack.conf (per-instance configuration)
+#
+# Usage:
+#   source slack_notification_module.sh
+#   load_slack_config
+#   send_slack_notification "$start" "$end" "$status"
+#
+# Status values handled: success, partial, failed, unknown
+#################################################################################
+
+SLACK_MODULE_VERSION="1.0"
+SLACK_MODULE_LOADED=0
+SLACK_MODULE_AVAILABLE=0
+
+#-------------------------------------------------------------------------------
+# load_slack_config - Load Slack configuration from instance config directory
+# Returns: 0 on success, 1 if disabled or invalid
+#-------------------------------------------------------------------------------
+load_slack_config() {
+    local script_dir="${SCRIPT_DIR:-$(dirname "$(readlink -f "$0")")}"
+    local instance="${CONFIG_INSTANCE:-default}"
+    local config_file="$script_dir/config/${instance}/slack.conf"
+
+    if [[ ! -f "$config_file" ]]; then
+        SLACK_MODULE_AVAILABLE=0
+        SLACK_ENABLED="no"
+        return 1
+    fi
+
+    # shellcheck source=/dev/null
+    if ! source "$config_file" 2>/dev/null; then
+        echo "ERROR: Failed to load Slack config: $config_file" >&2
+        SLACK_MODULE_AVAILABLE=0
+        SLACK_ENABLED="no"
+        return 1
+    fi
+
+    if [[ "${SLACK_ENABLED:-no}" != "yes" ]]; then
+        SLACK_MODULE_AVAILABLE=0
+        return 1
+    fi
+
+    if [[ -z "${SLACK_WEBHOOK_URL:-}" ]]; then
+        echo "ERROR: SLACK_WEBHOOK_URL not set in $config_file" >&2
+        SLACK_MODULE_AVAILABLE=0
+        return 1
+    fi
+
+    SLACK_HOSTNAME="${SLACK_HOSTNAME:-$(hostname -s)}"
+    SLACK_TITLE_PREFIX="${SLACK_TITLE_PREFIX:-[vmbackup]}"
+    SLACK_ON_SUCCESS="${SLACK_ON_SUCCESS:-yes}"
+    SLACK_ON_FAILURE="${SLACK_ON_FAILURE:-yes}"
+    SLACK_TIMEOUT="${SLACK_TIMEOUT:-10}"
+
+    if ! command -v curl >/dev/null 2>&1; then
+        echo "WARNING: curl not found - Slack delivery will fail" >&2
+    fi
+
+    SLACK_MODULE_AVAILABLE=1
+    SLACK_MODULE_LOADED=1
+    return 0
+}
+
+#-------------------------------------------------------------------------------
+# Helpers
+#-------------------------------------------------------------------------------
+
+# Format bytes as TiB/GiB/MiB/KiB/B (no awk; integer math is fine for ranges).
+_slack_format_bytes() {
+    local bytes="${1:-0}"
+    [[ "$bytes" =~ ^[0-9]+$ ]] || { echo "0 B"; return; }
+    if (( bytes >= 1099511627776 )); then printf '%d.%d TiB' $((bytes/1099511627776)) $(((bytes%1099511627776)*10/1099511627776))
+    elif (( bytes >= 1073741824 )); then printf '%d.%d GiB' $((bytes/1073741824)) $(((bytes%1073741824)*10/1073741824))
+    elif (( bytes >= 1048576 )); then printf '%d.%d MiB' $((bytes/1048576)) $(((bytes%1048576)*10/1048576))
+    elif (( bytes >= 1024 )); then printf '%d.%d KiB' $((bytes/1024)) $(((bytes%1024)*10/1024))
+    else printf '%d B' "$bytes"
+    fi
+}
+
+# Compute duration in Xh Ym Zs given two "YYYY-MM-DD HH:MM:SS [TZ]" strings.
+_slack_format_duration() {
+    local start_epoch end_epoch diff
+    start_epoch=$(date -d "$1" +%s 2>/dev/null) || return 1
+    end_epoch=$(date -d "$2" +%s 2>/dev/null) || return 1
+    diff=$(( end_epoch - start_epoch ))
+    (( diff < 0 )) && diff=0
+    printf '%dh %dm %02ds' $((diff/3600)) $((diff%3600/60)) $((diff%60))
+}
+
+# JSON-escape a string for inline embedding in a payload.
+_slack_json_escape() {
+    local s="$1"
+    s=${s//\\/\\\\}
+    s=${s//\"/\\\"}
+    s=${s//$'\n'/\\n}
+    s=${s//$'\r'/}
+    s=${s//$'\t'/\\t}
+    printf '%s' "$s"
+}
+
+#-------------------------------------------------------------------------------
+# send_slack_notification - Build and POST the session summary
+# Args:
+#   $1 - start_time string
+#   $2 - end_time string
+#   $3 - overall_status (success|partial|failed|unknown)
+# Returns: 0 on success, 1 on transport failure, 2 if intentionally skipped
+#-------------------------------------------------------------------------------
+send_slack_notification() {
+    local start_time="${1:-unknown}"
+    local end_time="${2:-$(date '+%Y-%m-%d %H:%M:%S %Z')}"
+    local overall_status="${3:-unknown}"
+
+    if [[ "${SLACK_MODULE_AVAILABLE:-0}" -ne 1 ]]; then
+        return 2
+    fi
+
+    case "$overall_status" in
+        success)
+            [[ "${SLACK_ON_SUCCESS:-yes}" == "yes" ]] || return 2
+            ;;
+        partial|failed|unknown)
+            [[ "${SLACK_ON_FAILURE:-yes}" == "yes" ]] || return 2
+            ;;
+    esac
+
+    local color emoji status_label
+    case "$overall_status" in
+        success) color="#36a64f"; emoji=":white_check_mark:"; status_label="SUCCESS" ;;
+        partial) color="#daa038"; emoji=":warning:"; status_label="PARTIAL" ;;
+        failed) color="#cc0000"; emoji=":rotating_light:"; status_label="FAILED" ;;
+        *) color="#888888"; emoji=":grey_question:"; status_label="UNKNOWN" ;;
+    esac
+
+    # Pull session totals from SQLite if available; otherwise leave blank.
+    local total=0 ok=0 fail=0 skip=0 excl=0 bytes=0
+    if declare -f sqlite_query_session_summary >/dev/null 2>&1; then
+        local row
+        row=$(sqlite_query_session_summary 2>/dev/null | head -1)
+        if [[ -n "$row" ]]; then
+            IFS='|' read -r total ok fail skip excl bytes _status _stype <<<"$row"
+        fi
+    fi
+
+    local size_h
+    size_h=$(_slack_format_bytes "${bytes:-0}")
+    local dur_h
+    dur_h=$(_slack_format_duration "$start_time" "$end_time" 2>/dev/null) || dur_h="n/a"
+
+    local instance="${CONFIG_INSTANCE:-default}"
+    local title="${SLACK_TITLE_PREFIX} ${SLACK_HOSTNAME} — ${status_label}"
+    local summary="VMs: ${ok:-0} ok / ${fail:-0} failed / ${skip:-0} skipped / ${excl:-0} excluded (total ${total:-0})"
+    local meta="Size: ${size_h} | Duration: ${dur_h} | Instance: ${instance}"
+
+    local payload
+    payload=$(cat </dev/null)
+
+    if [[ "$http_code" =~ ^2 ]]; then
+        return 0
+    fi
+    echo "ERROR: Slack webhook returned HTTP ${http_code:-no-response}" >&2
+    return 1
+}
diff --git a/vmbackup.sh b/vmbackup.sh
index 4bc78c3..c854d5f 100755
--- a/vmbackup.sh
+++ b/vmbackup.sh
@@ -5350,6 +5350,29 @@ _log_interrupted_chain() {
     fi
 }
 
+# Interpret a notifier return code from send_backup_report /
+# send_slack_notification: 0=sent, 1=transport failure, 2=intentionally
+# skipped (disabled, or *_ON_SUCCESS/_ON_FAILURE=no). Logs appropriately
+# and sets the named sent-guard variable so failure-path retries don't fire
+# after an intentional skip.
+# Args: $1 rc, $2 notifier label, $3 caller context, $4 guard var name
+_handle_notifier_rc() {
+    local rc="$1" label="$2" ctx="$3" guard_var="$4"
+    case "$rc" in
+        0)
+            printf -v "$guard_var" '%s' 'true'
+            log_info "vmbackup.sh" "$ctx" "$label sent"
+            ;;
+        2)
+            printf -v "$guard_var" '%s' 'true'
+            log_debug "vmbackup.sh" "$ctx" "$label skipped (disabled or per policy)"
+            ;;
+        *)
+            log_warn "vmbackup.sh" "$ctx" "Failed to send $label (rc=$rc)"
+            ;;
+    esac
+}
+
 # MEDIUM FIX #3: Cleanup handler for signal exits to remove temporary files
 cleanup_on_exit() {
     local exit_code=$?
@@ -5365,11 +5388,16 @@ cleanup_on_exit() {
         log_error "vmbackup.sh" "cleanup_on_exit" " 3. Next run will auto-cleanup stale locks and orphaned checkpoints"
     fi
 
-    # Finalize SQLite session if not already ended
-    # Normal exits finalize in main()/prune/replicate-only, but early errors,
-    # unhandled exits, or edge cases may skip those. This catch-all is safe
-    # because sqlite_session_end() has an idempotency guard (_SQLITE_SESSION_ENDED).
-    if sqlite_is_available 2>/dev/null && [[ -n "${SQLITE_CURRENT_SESSION_ID:-}" ]] && [[ "$DRY_RUN" != true ]]; then
+    # Finalize SQLite session if not already ended.
+    # main()/prune/replicate-only finalize on the normal exit path; this
+    # catch-all only fires for signal exits and early errors. Skip when the
+    # session was already finalized — otherwise the misleading "finalized as
+    # 'incomplete'" WARN is emitted even though sqlite_session_end()'s
+    # idempotency guard silently dropped the second call.
+    if sqlite_is_available 2>/dev/null && \
+       [[ -n "${SQLITE_CURRENT_SESSION_ID:-}" ]] && \
+       [[ "$DRY_RUN" != true ]] && \
+       [[ "${_SQLITE_SESSION_ENDED:-0}" != "1" ]]; then
         if [[ $exit_code -eq 130 ]] || [[ $exit_code -eq 143 ]]; then
             # Signal exit — count results from what we processed so far
             local int_total=0 int_success=0 int_failed=0 int_skipped=0 int_excluded=0
@@ -5414,19 +5442,31 @@ cleanup_on_exit() {
         # shellcheck source=/dev/null
         source "${SCRIPT_DIR}/modules/email_report_module.sh"
         if load_email_config; then
-            local _session_end_time
+            local _session_end_time _rc=0
             _session_end_time=$(date '+%Y-%m-%d %H:%M:%S %Z')
-            if send_backup_report "${session_start_time:-unknown}" "$_session_end_time" "failed"; then
-                _EMAIL_SENT=true
-                log_info "vmbackup.sh" "cleanup_on_exit" "Email report sent (cleanup path)"
-            else
-                log_warn "vmbackup.sh" "cleanup_on_exit" "Failed to send email report from cleanup path"
-            fi
+            send_backup_report "${session_start_time:-unknown}" "$_session_end_time" "failed" || _rc=$?
+            _handle_notifier_rc "$_rc" "email report (cleanup path)" "cleanup_on_exit" _EMAIL_SENT
         else
             log_debug "vmbackup.sh" "cleanup_on_exit" "Email disabled or not configured for this instance"
         fi
     fi
 
+    # Slack notification on non-zero exit (parallel to the email path above).
+    if [[ $exit_code -ne 0 ]] && \
+       [[ -n "${SQLITE_CURRENT_SESSION_ID:-}" ]] && \
+       [[ "${_SLACK_SENT:-false}" != "true" ]] && \
+       [[ "$DRY_RUN" != true ]] && \
+       [[ -f "${SCRIPT_DIR}/modules/slack_notification_module.sh" ]]; then
+        # shellcheck source=/dev/null
+        source "${SCRIPT_DIR}/modules/slack_notification_module.sh"
+        if load_slack_config; then
+            local _slack_end_time _rc=0
+            _slack_end_time=$(date '+%Y-%m-%d %H:%M:%S %Z')
+            send_slack_notification "${session_start_time:-unknown}" "$_slack_end_time" "failed" || _rc=$?
+            _handle_notifier_rc "$_rc" "Slack notification (cleanup path)" "cleanup_on_exit" _SLACK_SENT
+        fi
+    fi
+
     log_info "vmbackup.sh" "cleanup_on_exit" "Cleaning up temporary files before exit (exit code: $exit_code)"
 
     # Remove stale lock files — only those whose owning process is no longer running.
@@ -5539,17 +5579,25 @@ handle_sigterm() {
         source "${SCRIPT_DIR}/modules/email_report_module.sh"
         if load_email_config; then
             log_info "vmbackup.sh" "handle_sigterm" "Sending email report before SIGTERM exit..."
-            if send_backup_report "${session_start_time:-unknown}" "$session_end_time" "failed"; then
-                log_info "vmbackup.sh" "handle_sigterm" "Email report sent successfully"
-                _EMAIL_SENT=true
-            else
-                log_warn "vmbackup.sh" "handle_sigterm" "Failed to send email report"
-            fi
+            local _rc=0
+            send_backup_report "${session_start_time:-unknown}" "$session_end_time" "failed" || _rc=$?
+            _handle_notifier_rc "$_rc" "email report" "handle_sigterm" _EMAIL_SENT
         else
             log_debug "vmbackup.sh" "handle_sigterm" "Email disabled or not configured for this instance"
         fi
     fi
-    
+
+    if [[ "$DRY_RUN" != true ]] && \
+       [[ "${_SLACK_SENT:-false}" != "true" ]] && \
+       [[ -f "${SCRIPT_DIR}/modules/slack_notification_module.sh" ]]; then
+        source "${SCRIPT_DIR}/modules/slack_notification_module.sh"
+        if load_slack_config; then
+            local _rc=0
+            send_slack_notification "${session_start_time:-unknown}" "$session_end_time" "failed" || _rc=$?
+            _handle_notifier_rc "$_rc" "Slack notification" "handle_sigterm" _SLACK_SENT
+        fi
+    fi
+
     exit 143
 }
 
@@ -5954,13 +6002,23 @@ _run_replicate_only() {
         source "${SCRIPT_DIR}/modules/email_report_module.sh"
         if load_email_config; then
             log_info "vmbackup.sh" "main" "Sending email report to $EMAIL_RECIPIENT"
-            send_backup_report "$session_start_time" "$session_end_time" "$final_status" || true
-            _EMAIL_SENT=true
+            local _rc=0
+            send_backup_report "$session_start_time" "$session_end_time" "$final_status" || _rc=$?
+            _handle_notifier_rc "$_rc" "email report" "main" _EMAIL_SENT
         else
             log_debug "vmbackup.sh" "main" "Email disabled or not configured"
         fi
     fi
 
+    if [[ -f "${SCRIPT_DIR}/modules/slack_notification_module.sh" ]]; then
+        source "${SCRIPT_DIR}/modules/slack_notification_module.sh"
+        if load_slack_config; then
+            local _rc=0
+            send_slack_notification "$session_start_time" "$session_end_time" "$final_status" || _rc=$?
+            _handle_notifier_rc "$_rc" "Slack notification" "main" _SLACK_SENT
+        fi
+    fi
+
     log_info "vmbackup.sh" "main" "===== REPLICATE-ONLY MODE END (exit=$any_failed) ====="
     return $any_failed
 }
@@ -6907,19 +6965,27 @@ main() {
 
         if load_email_config; then
             log_info "vmbackup.sh" "main" "Sending email report to $EMAIL_RECIPIENT"
-            if send_backup_report "$session_start_time" "$session_end_time" "$overall_status"; then
-                log_info "vmbackup.sh" "main" "Email report sent successfully"
-                _EMAIL_SENT=true
-            else
-                log_warn "vmbackup.sh" "main" "Failed to send email report (backup data preserved)"
-            fi
+            local _rc=0
+            send_backup_report "$session_start_time" "$session_end_time" "$overall_status" || _rc=$?
+            _handle_notifier_rc "$_rc" "email report" "main" _EMAIL_SENT
         else
             log_debug "vmbackup.sh" "main" "Email disabled or not configured for this instance"
         fi
     else
         log_debug "vmbackup.sh" "main" "Email report module not found - skipping email notification"
     fi
-    
+
+    if [[ "$DRY_RUN" == true ]]; then
+        log_info "vmbackup.sh" "main" "[DRY-RUN] Skipping Slack notification"
+    elif [[ -f "${SCRIPT_DIR}/modules/slack_notification_module.sh" ]]; then
+        source "${SCRIPT_DIR}/modules/slack_notification_module.sh"
+        if load_slack_config; then
+            local _rc=0
+            send_slack_notification "$session_start_time" "$session_end_time" "$overall_status" || _rc=$?
+            _handle_notifier_rc "$_rc" "Slack notification" "main" _SLACK_SENT
+        fi
+    fi
+
     if (( fail_count > 0 )); then
         log_error "vmbackup.sh" "main" "Session ended with failures - exit code 1"
         exit 1
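
Reviewer note: the three-way return-code convention this patch introduces (0 = sent, 2 = intentionally skipped, anything else = failure) can be tried in isolation with the sketch below. It is a minimal illustration and not the project's code: `handle_rc_sketch`, `fake_notifier`, `DEMO_SENT`, and the `log_*` stubs are hypothetical names invented for this example, and the real helper takes an extra caller-context argument. Only the `cmd || rc=$?` capture and the `printf -v` guard assignment are taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the project's logging helpers.
log_info()  { echo "INFO  $*"; }
log_debug() { echo "DEBUG $*"; }
log_warn()  { echo "WARN  $*"; }

# Simplified copy of the pattern: interpret rc and set the named guard variable.
# Args: $1 rc, $2 label, $3 name of the guard variable to set
handle_rc_sketch() {
    local rc="$1" label="$2" guard_var="$3"
    case "$rc" in
        0)  # delivered: mark as sent so later paths do not resend
            printf -v "$guard_var" '%s' 'true'
            log_info "$label sent"
            ;;
        2)  # suppressed on purpose: also mark as sent, but log quietly
            printf -v "$guard_var" '%s' 'true'
            log_debug "$label skipped (disabled or per policy)"
            ;;
        *)  # real failure: leave the guard untouched so a later path may retry
            log_warn "Failed to send $label (rc=$rc)"
            ;;
    esac
}

# Hypothetical notifier that simply returns the code it is given.
fake_notifier() { return "$1"; }

DEMO_SENT=false
rc=0
# 'cmd || rc=$?' keeps the exit status without tripping 'set -e'.
fake_notifier 2 || rc=$?
handle_rc_sketch "$rc" "demo notification" DEMO_SENT
echo "guard=$DEMO_SENT"
```

Passing the guard variable by name keeps one helper usable for both `_EMAIL_SENT` and `_SLACK_SENT`; the trade-off is that the caller must pass a valid variable name, since `printf -v` rejects anything else.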