feat: cross-platform CI matrix, sandbox hardening, and operator tooling#535
feat: cross-platform CI matrix, sandbox hardening, and operator tooling#535somethingwithproof wants to merge 19 commits intoCacti:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR significantly expands CI and platform validation for Spine by adding a Docker-based Linux distro build matrix plus additional OS lanes, along with local reproduction tooling and platform documentation. It also introduces substantial portability/runtime work (new platform abstraction layer, new unit/integration tests, and systemd sd_notify support) and updates packaging/build scripts toward the CMake/Ninja path.
Changes:
- Add a cross-distro CI build/test matrix (10 Linux containers + macOS + Windows advisory + FreeBSD) and supporting security/quality workflows.
- Add local reproduction tooling (
scripts/test-distros.sh) and new/expanded unit + integration tests for platform and networking behavior. - Add/extend platform portability code (platform layer, systemd notify wrapper) and update packaging/build scripts toward CMake/Ninja.
Reviewed changes
Copilot reviewed 112 out of 125 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
.github/actions/install-apt-deps/action.yml |
Composite action to install apt dependencies for workflows. |
.github/cppcheck-baseline.txt |
Placeholder baseline for cppcheck outputs. |
.github/instructions/instructions.md |
Updates repo contribution/CI guardrail guidance. |
.github/nightly-leak-baseline.json |
Baseline thresholds for leak-trend gating. |
.github/perf-baseline.json |
Baseline thresholds for perf regression checks. |
.github/scripts/check-leak-trend.py |
Parses sanitizer/valgrind logs and enforces thresholds. |
.github/scripts/check-unsafe-api-additions.sh |
CI guard against introducing banned unsafe C APIs. |
.github/scripts/check-workflow-policy.py |
Workflow hygiene enforcement (pinned actions, set -euo pipefail, no `curl |
.github/scripts/clang_tidy_to_sarif.py |
Converts clang-tidy output into SARIF. |
.github/scripts/cppcheck_to_sarif.py |
Converts cppcheck output into SARIF. |
.github/workflows/codeql.yml |
Adds CodeQL scanning workflow. |
.github/workflows/coverage.yml |
Adds gcc+lcov coverage workflow and minimum coverage gate. |
.github/workflows/distro-matrix.yml |
Adds the multi-distro container build matrix plus macOS/Windows/FreeBSD lanes. |
.github/workflows/fuzzing.yml |
Adds sanitizer-based CLI fuzz smoke workflow. |
.github/workflows/integration.yml |
Adds DB integration, net-snmp compatibility builds, and Docker-based integration tests. |
.github/workflows/release-verification.yml |
Adds hardened release verification and install smoke checks. |
.github/workflows/security-posture.yml |
Adds security posture checks (TruffleHog/Semgrep/Scorecard + workflow policy gate). |
.github/workflows/weekly.yml |
Adds weekly deep checks (repro build, include graph, spell check, etc.). |
.clang-tidy |
Enables clang-tidy check set for the repo. |
.dockerignore |
Updates Docker build context exclusions (e.g., ignore build/). |
CHANGELOG |
Records platform/build/security/testing updates in the changelog. |
CMakePresets.json |
Adds CI-oriented CMake preset definitions. |
Dockerfile |
Moves container build to CMake/Ninja-based build/install. |
Dockerfile.dev |
Moves dev container build to CMake/Ninja and sanitizer-friendly flags. |
Makefile.am |
Removes autotools Makefile.am (transition away from autotools). |
SECURITY.md |
Adds a project security policy and deployment recommendations. |
VERSION |
Adds/updates the canonical project version file. |
bootstrap |
Removes legacy autotools bootstrap script. |
config/config.h.cmake.in |
Adds a CMake-generated config.h template. |
docs/platform-idioms.md |
Documents platform-specific implementation idioms. |
docs/platforms.md |
Documents supported platforms and per-distro dependency install commands. |
etc/spine.1 |
Adds a generated man page for spine. |
etc/spine.conf.dist |
Adds a distributed default config file. |
etc/systemd/spine.service |
Adds a systemd service unit for Spine with Type=notify and hardening. |
etc/systemd/spine.timer |
Adds a systemd timer unit for periodic poll cycles. |
package |
Updates packaging script to validate via CMake instead of autotools. |
packaging/README.md |
Updates packaging docs to reflect CMake/Ninja build path. |
packaging/debian/README |
Notes Debian packaging uses CMake+Ninja. |
packaging/debian/control |
Switches Debian build-deps from autotools to cmake/ninja. |
packaging/debian/rules |
Switches Debian build rules to cmake+ninja and CMake options. |
packaging/rpm/README |
Notes RPM spec uses CMake+Ninja. |
packaging/rpm/spine.spec |
Switches RPM build/install steps from autotools to CMake+Ninja. |
scripts/copyright_year.sh |
Refactors/normalizes formatting and flow of copyright updater. |
scripts/test-distros.sh |
Adds local Docker-based multi-distro build/test driver. |
scripts/verify.sh |
Updates verification script to use CMake build + scan-build lanes. |
src/common.h |
Removes older Cygwin workarounds and includes platform header. |
src/error.c |
Uses platform localtime wrapper in signal handler path. |
src/error.h |
Adds header for signal handler install/uninstall APIs. |
src/keywords.c |
Adds/refactors keyword mapping implementation into its own TU. |
src/keywords.h |
Adds header for keyword lookup APIs. |
src/locks.h |
Adds header for lock/mutex APIs. |
src/nft_popen.h |
Adds spine_build_child_env() API documentation declaration. |
src/php.h |
Adds header for PHP process APIs. |
src/ping.h |
Removes large Cygwin ICMP struct block and adds ping_init() declaration. |
src/ping_ipv6_scope.c |
Adds standalone IPv6 link-local scope-id resolver TU. |
src/ping_validate.c |
Adds standalone ICMP echo payload validator TU. |
src/ping_wire.h |
Defines on-wire ping payload layout and validator signature. |
src/poller.c |
Uses platform fd wrappers, expands buffers safely, improves OOM/DB-conn failure handling. |
src/poller.h |
Adds header for poller APIs. |
src/snmp.c |
Uses platform setenv wrapper; tightens initialization and sensitive-zeroing behavior. |
src/snmp.h |
Adds header for SNMP APIs. |
src/spine.c |
Adds systemd notify hooks, SIGHUP/SIGTERM behavior, platform init/cleanup, and portability tweaks. |
src/spine.h |
Enforces include ordering, widens SNMP protocol buffers, updates PID type, removes cygwin fields. |
src/spine_sem.h |
Adds pthread-based portable counting semaphore wrapper. |
src/sql.c |
Replaces sleep/usleep with platform sleep wrappers; improves SSL option typing and db_escape bounds logic. |
src/sql.h |
Minor macro formatting/indent changes for MySQL option-setting helper. |
src/systemd_notify.c |
Adds systemd sd_notify wrapper implementation with no-op stubs when absent. |
src/systemd_notify.h |
Adds systemd notify wrapper header and API docs. |
src/util.h |
Removes add_slashes() declaration (Cygwin-related removal). |
src/platform/platform.h |
Adds platform API header (init, sleeps, env, localtime, pid, tty detection). |
src/platform/platform_common.c |
Adds init/cleanup reference counting. |
src/platform/platform_error.h |
Adds platform error-string API header. |
src/platform/platform_error_posix.c |
Adds POSIX strerror_r wrapper implementation. |
src/platform/platform_error_win.c |
Adds Windows FormatMessageA error-string wrapper implementation. |
src/platform/platform_fd.h |
Adds portable fd read/write/wait API header. |
src/platform/platform_fd_posix.c |
Adds POSIX fd wrapper implementation (select-based wait). |
src/platform/platform_fd_win.c |
Adds Windows fd wrapper implementation (PeekNamedPipe wait). |
src/platform/platform_icmp.h |
Adds platform ICMP abstraction header. |
src/platform/platform_icmp_posix.c |
Adds POSIX ICMP abstraction implementation (forwards into ping.c helpers). |
src/platform/platform_posix.c |
Adds POSIX platform backend (env, localtime, sleep, pid, tty checks). |
src/platform/platform_process.h |
Adds portable process API header (pipe/spawn/wait/terminate). |
src/platform/platform_process_posix.c |
Adds POSIX process backend (pipe2/pipe+cloexec, posix_spawn retry). |
src/platform/platform_socket.h |
Adds portable socket API header. |
src/platform/platform_socket_posix.c |
Adds POSIX socket wrapper implementation. |
src/platform/platform_socket_win.c |
Adds Windows socket wrapper implementation and error-domain helpers. |
src/platform/platform_win.c |
Adds Windows platform backend (WSA init, sleeps incl. sub-ms, pid, tty checks). |
tests/integration/test_ipv6_transport.sh |
Adds integration test for IPv6-targeted poll path behavior. |
tests/integration/test_output_regex.sh |
Refactors helper functions/compose usage and improves robustness. |
tests/snmpv3/db/init.sql |
Updates SNMPv3 auth protocol from SHA-256 to SHA. |
tests/snmpv3/docker-compose.yml |
Updates SNMPv3 healthcheck auth protocol from SHA-256 to SHA. |
tests/snmpv3/scripts/run_test.sh |
Formatting/robustness improvements in SNMPv3 test driver script. |
tests/snmpv3/snmpd/snmpv3.conf |
Updates user creation to SHA (compatibility). |
tests/unit/Makefile |
Reworks unit test Makefile to build/run multiple platform smoke tests. |
tests/unit/test_build_fixes.c |
Removes cygwinshloc field usage in unit fixture struct. |
tests/unit/test_ping_ipv6_scope.c |
Adds unit tests for IPv6 scope-id behavior (POSIX; no-op main on Windows). |
tests/unit/test_ping_reply_validation.c |
Adds unit tests for ICMP reply payload validator. |
tests/unit/test_platform_dns.c |
Adds unit tests around getaddrinfo numeric/dual-stack behavior. |
tests/unit/test_platform_env.c |
Adds unit tests for platform setenv wrapper semantics. |
tests/unit/test_platform_error.c |
Adds unit tests for platform error-string wrapper behavior. |
tests/unit/test_platform_fd.c |
Adds unit tests for fd read/write/wait wrappers. |
tests/unit/test_platform_helpers.h |
Adds minimal assertion helpers for platform/unit tests. |
tests/unit/test_platform_process.c |
Adds unit tests for process helpers (spawn/wait/terminate, Windows UTF-8 path). |
tests/unit/test_platform_time.c |
Adds unit tests for platform init/localtime/sleep helpers. |
tests/unit/test_systemd_notify.c |
Adds unit tests for systemd notify wrapper idempotency/null-safety. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Latest push extends the matrix to cover every supported OS as a first-class citizen with explicit tier classification.
Source side: extended the BSD macro guards in `src/ping.c` and `src/platform/platform_process_posix.c` to include `DragonFly` so DragonFly inherits the same `arc4random`/`pipe2` paths as the other BSDs. `docs/platforms.md` is rewritten around the four tiers with install commands for every entry and an issue-reporting label scheme. |
|
You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool. What Enabling Code Scanning Means:
For more information about GitHub Code Scanning, check out the documentation. |
There was a problem hiding this comment.
cppcheck found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
somethingwithproof
left a comment
There was a problem hiding this comment.
This PR significantly enhances Spine's security posture by introducing platform-specific sandboxing (pledge, unveil, PR_SET_NO_NEW_PRIVS) and a comprehensive multi-distro CI matrix. The automation infrastructure is well-structured and follows modern CI/CD best practices.
🔍 General Feedback
- Sandbox Integration: Excellent addition of
spine_sandbox_*hooks. The OpenBSD implementation correctly usesunveilto restrict filesystem access andpledgeto narrow the syscall surface. - Linux Hardening: Using
PR_SET_NO_NEW_PRIVSon Linux is a great low-cost win to prevent privilege escalation viaexecve. - CI Hygiene: Workflows use pinned SHAs for actions and limited
GITHUB_TOKENpermissions (contents: read). - Matrix Coverage: The Docker-based distro matrix ensures consistent security behavior across diverse toolchains (glibc vs musl, different GCC versions).
Detailed Findings
-
LOW: Command Injection Risk in Developer Scripts (scripts/test-distros.sh, scripts/test-workflows.sh)
- These scripts pass command-line arguments directly to shell execution points (
docker run ... sh -candact -j). - While these are local developer tools, they could be exploited if a developer is tricked into running them with malicious arguments.
- Recommendation: Consider simple validation or quoting for arguments used in shell contexts.
- These scripts pass command-line arguments directly to shell execution points (
-
LOW: Supply Chain Risk (GitHub Workflows)
- The
linuxjob indistro-matrix.ymluses distro tags (e.g.,rockylinux:9) without SHA-256 digests. - Recommendation: For maximum security, pin container images by digest to prevent poisoning of the base image.
- The
| return HOST_DOWN; | ||
| } | ||
| #else | ||
| static int ping_icmp_ipv6(spine_spine_host_t *host, ping_t *ping) { |
| * | ||
| */ | ||
| int ping_tcp(spine_spine_host_t *host, ping_t *ping) { | ||
| double begin_time, end_time, total_time; |
c8e7207 to
4f66b29
Compare
This squashes the entire asynchronous architecture rewrite, including: - libuv event loop foundation and stage-based pipeline - Persistent PHP script server socket pool - PCRE2 JIT regex cache - Global OID multiplexing with WRR scheduling - Adaptive token-bucket governors - Safe libuv memory lifecycles and telemetry sink
4f66b29 to
0d064c7
Compare
The dummy 'SELECT 1' that used to live in stage_flush was test scaffolding I removed, but that was also the only thing forcing the batcher to commit on cycle end. With just the 500ms internal timer, a cycle that completed in <500ms left pending writes in memory; process exit and db_disconnect then ran before they hit MySQL. Call spine_async_batch_flush() at pipeline tear-down, before the uv_walk/force-close pass, and let uv_run drain the resulting mysql_real_query_cont callbacks. The real flush now runs at the one point in the process lifecycle where we know every poll result is in-queue and the DB handle is still live. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
ares_process_fd can call back into sock_state_cb -> spine_async_dns_upsert_watcher -> spine_async_dns_remove_watcher, which frees the current watcher mid-call. Any dereference of `watcher` after ares_process_fd is UAF. The current code happens to read fields only before the call, but there is no comment documenting that invariant and a future refactor would break it silently. Capture the fd explicitly before the call, document that runtime is the only pointer safe to use post-call, and make spine_async_dns_on_poll self-contained. No functional change. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Before uv_loop_close, spine runs db_disconnect on every MYSQL handle. An async query that arrives between the flush drain and the db close would succeed at submission, then fire its callback after the handle was freed - use-after-free. Add a process-wide shutdown fence: - spine_async_mysql_shutdown_begin() flips a volatile flag; new queries via spine_async_mysql_query return -ESHUTDOWN. - In-flight queries keep their existing uv_poll chain so the uv_run drain completes them normally. - Non-HAVE_MYSQL_ASYNC build provides the symbol as a no-op so callers in spine.c do not need the compile-time guard. Spine calls the fence immediately before the batch flush so the flush's own callbacks still run, and then no further submissions are possible while the drain is in progress. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
M2: setrlimit(RLIMIT_CORE, 0) silently ignored its return. On systems with /etc/security/limits.conf enforcing a non-zero hard minimum, it left the limit unchanged and the operator had no signal. Log to stderr on failure so post-mortem cores have a documented cause. M6: Dependabot weekly cadence for docker left a 7-day exposure window on base-image CVEs. Switch to daily for docker; github-actions stays weekly because action SHAs churn slowly and noisy. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Two #define SPINE_BZ / #undef pairs existed — one per scrub helper — each selecting explicit_bzero vs spine_volatile_bzero based on HAVE_EXPLICIT_BZERO. Collapse them into one static inline spine_bzero with a single #ifdef. Every scrub call routes through it; adding a new scrubbable field no longer requires remembering which macro pair a file is using. Also extend target_t scrub to cover snmp_engine_id and snmp_context. They are not strictly secret, but they are per-device identifiers operators prefer not to surface in a core dump alongside a password. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
The prior ordering flipped the async-mysql shutdown fence BEFORE calling spine_async_batch_flush. spine_async_batch_flush submits pending batched writes through spine_async_mysql_query; with the fence already set, those submissions returned -ESHUTDOWN and the entire pending batch was silently dropped - the same data-loss mode the dummy-flush removal was supposed to fix. Swap the order: flush first, drain callbacks, then close the gate. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
The prior flag list included -3p / -3x / --3authPassphrase / --3privPassphrase. Those are not real net-snmp options; they were added based on assumption, not documentation. Tests asserted redaction against strings that never appear in real invocations. Replace with the actual net-snmp key-carrying short flags from snmp.conf(5): -3m master auth key -3M master priv key -3k localized auth key -3K localized priv key Drop the fabricated long flags. Update tests to exercise the real flag forms. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Prior code used UV_RUN_NOWAIT in a 300-iteration spin, which returned immediately regardless of outstanding requests. The 3-second deadline was fiction: 300 iterations completed in microseconds, not seconds, and the runtime was freed while requests were still in flight. The next on_resolved dereferenced ctx->runtime (now freed) to decrement inflight - UAF. Switch to UV_RUN_ONCE driven by a wall-clock deadline (time(NULL)+3). UV_RUN_ONCE blocks until one libuv event fires or the loop has no pending work, so the 3-second budget is actually spent waiting. If the deadline still expires with requests in flight, DO NOT free runtime. The outstanding callbacks carry a pointer to it; freeing now is UAF. Leak the struct (process is terminating) and log a warning so operators can see the stuck resolver. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Three issues flagged in self-review: 1. The reap pass only fired from spine_cb_record. On an idle daemon (no poll failures), the reap never ran and old entries from a prior failure burst sat in the hash for the process lifetime. Add the same reap call to spine_cb_should_skip, which is invoked every cycle for every host. Rate limiting via SPINE_CB_REAP_INTERVAL already prevents thrash. 2. The reap sweep held the table mutex O(N). A large unhealthy population stalled other threads. Add SPINE_CB_REAP_BATCH (256) cap; unscanned entries wait for the next sweep. 3. Tests could not drive the reap deterministically because it used time(2). Add spine_cb_set_clock_for_test(fn) and thread the clock through every time(NULL) caller via spine_cb_clock(). Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Two self-review findings: - g_spine_uv_leaked_handles was a plain int. Single-threaded today, but a future refactor that walks the loop from a worker thread would race on increment. Switch to _Atomic int with relaxed-order fetch_add; this is only an operator diagnostic so no synchronisation guarantees are required beyond 'no lost increments'. - g_async_mysql_shutting_down was a plain volatile int. Worker threads on weakly-ordered architectures (ARM, POWER) might observe stale 0 after the main thread sets it. Switch to atomic_int with explicit release on the set and acquire on the read so the happens-before ordering is enforced across thread boundaries. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Prior test hardcoded 512. If spine.h BUFSIZE rises, the test stops exercising the production-width truncation path. Use BUFSIZE with a fallback for the case where the production header is not yet available to the test TU, and guard with #error if BUFSIZE is ever shrunk below the fallback. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Operators had no visibility into: - how many CB entries exist, how many have been reaped over time, and how many trips have occurred (spine_cb_get_stats) - how many async queries got refused after the shutdown fence flipped (spine_async_mysql_shutdown_refused_count) Both are cheap atomic counters. The CB counters live behind the table lock since they are read together with an exact HASH_COUNT; the mysql refusal counter uses atomic_ulong for lock-free increment on the hot submission path. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Three test files fill the gaps called out in the self-review: - test_spine_cb_reap.c: drives the age-reap path through the mock clock seam. Asserts a healthy host occupies no hash entry, a trip records one, eviction at idle-threshold drops a recovered entry, and a still-cooling entry survives the idle threshold. - test_async_mysql_shutdown.c: verifies spine_async_mysql_query returns a negative error after spine_async_mysql_shutdown_begin. API-only; works on both HAVE_MYSQL_ASYNC and fallback builds. - test_async_exec_shell.c: source-scan invariant that async_exec.c still uses /bin/sh -c unconditionally. If anyone reintroduces the dropped metachar-detect hybrid (needs_shell, async_exec_parse_argv) the test fails. Defensive against review-refresh regressions. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
d86971b to
5a33412
Compare
CRITICAL: tests/unit/test_async_exec_shell.c, test_spine_cb_reap.c, and test_async_mysql_shutdown.c were not listed in cmake/SpineTests.cmake and therefore did not build or run. Add matching add_executable + add_test entries with the correct link libraries and source-file dependencies. HIGH: test_async_exec_shell.c now reads SPINE_SOURCE_ROOT from a compile-time define set by CMake to CMAKE_SOURCE_DIR, so the source- scan works from any ctest CWD rather than relying on 'run from project root'. HIGH: add spine_async_mysql_shutdown_reset_for_test so unit tests can flip the fence more than once. Production callers must not use this; the production fence stays a one-way latch. HIGH: replace ASSERT_TRUE(count >= 0) tautology in the shutdown test with a before/after delta and build-matrix-aware expected values (HAVE_MYSQL_ASYNC bumps the counter by 1; fallback leaves it zero). HIGH: add a source-scan test that asserts spine_async_batch_flush() appears before spine_async_mysql_shutdown_begin() in spine.c. Reversing the order silently drops the pending batch at shutdown and now fails the test. HIGH: spine_uv_run_bounded caps each shutdown drain at 3 wall-clock seconds. Three previously-unbounded uv_run(UV_RUN_DEFAULT) calls could hang shutdown indefinitely on a wedged callback. MEDIUM: spine_cb_init resets spine_cb_last_reap so tests cycling init/shutdown/init under a rewound mock clock do not carry stale reap-cadence state forward. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
DRY: the 3-second per-phase drain budget was repeated 4 times in spine.c and once in async_dns.c with a magic literal. Define SPINE_SHUTDOWN_DRAIN_SECS next to spine_uv_run_bounded and point the async_dns comment at it. The helper no longer takes a seconds parameter - every caller wants the same budget. KISS: drops four repeated '(loop, 3)' call-site arguments in spine.c in favour of a single-argument helper. Behaviour unchanged. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
967aea2 to
303ff25
Compare
atoi has three silent failure modes on the DB-sourced config values: - Non-numeric input returns 0 without any signal. A typo in the DB silently resets a setting to zero. - Integer overflow is undefined behaviour on C99+. A value of "9999999999" invokes UB on 32-bit int platforms. - No distinction between "0" and "bogus". spine_atoi uses strtol with errno detection: ERANGE clamps to INT_MAX / INT_MIN rather than wrapping, non-numeric input (end == s) returns 0 preserving the original contract for callers that already treat 0 as 'use default'. 20 call sites in util.c swapped. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
a317eb4 to
cfd00ac
Compare
CRITICAL: spine_atoi (now spine_parse_int) returned INT_MAX on large negative overflow because the ERANGE branch dominated the sign check. A config value of '-9999999999999999999' silently became INT_MAX, a sign-flip that corrupts every setting passed through it. Split the ERANGE branch on sign so overflow in either direction clamps to the matching bound. Renamed to spine_parse_int so 'grep atoi' is no longer ambiguous. HIGH: new test CMake entries linked only against spine_hardening. util.c / circuit_breaker.c / async_mysql.c transitively depend on mysql, net-snmp, and libaudit; without linking them, the test binaries fail the undefined-reference stage. Add spine_mysql, spine_netsnmp, Threads, and the conditional audit target to every new test target. Call spine_require_mysql / spine_require_netsnmp first to register those INTERFACE libraries. HIGH: atomic-style consistency. g_spine_uv_leaked_handles was _Atomic int while the other atomics used atomic_int / atomic_ulong. Switch to atomic_int so every counter matches the stdatomic.h typedefs. MEDIUM: spine_uv_run_bounded now measures the deadline with uv_hrtime (monotonic nanoseconds) instead of time(2), so the 3- second budget is real sub-second resolution rather than the 0-1s jitter time(2) produces. MEDIUM: PATH_MAX fallback for AIX/HP-UX builds that leave it undefined. 4096 matches Linux default. MEDIUM: shutdown metrics log line. One SPINE_LOG at teardown prints libuv_leaked, cb_entries, cb_trips, cb_reaped, and mysql_refused_after_shutdown. Grep-friendly for operators without a telemetry endpoint. LOW: SPINE_SHUTDOWN_DRAIN_SECS moved to src/spine_shutdown.h so spine.c and async_dns.c share the constant. Raising the budget in one place updates every subsystem. LOW: test_async_mysql_shutdown uses a real uv_loop_init rather than a synthetic (uv_loop_t *)1. Future refactor that dereferences the loop before the fence check now fails cleanly against an empty loop instead of segfaulting. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
HIGH: shutdown metrics SPINE_LOG always at INFO flooded the operator log on cron-driven restarts. Emit at INFO only when any counter is non-zero (the signal worth preserving); otherwise log at DEBUG. HIGH: promote spine_parse_int from static to extern with declaration in util.h. Future config-parser TUs reuse the safety contract instead of duplicating it. LOW: hoist 1000000000ULL literal in spine_uv_run_bounded to SPINE_NS_PER_SEC in spine_shutdown.h alongside the drain-secs macro. Matches naming of surrounding timeouts; readers do not have to count zeros. Signed-off-by: Thomas Vincent <thomasvincent@gmail.com>
Consolidated PR for Cacti 1.3 platform modernization, sandbox hardening, operator tooling, and security review follow-up. Matches maintainer's consolidation pattern from #523.
Scope
A. Cross-platform CI matrix (original)
Adds a CI lane that builds spine inside 10 Linux distro containers plus macOS, Windows (advisory), and FreeBSD 14. Ships
scripts/test-distros.shfor local reproduction anddocs/platforms.mdlisting supported platforms with per-distro build-dep install commands.Matrix: rockylinux:9, rockylinux:8, almalinux:9, fedora:latest, debian:12, debian:trixie, ubuntu:22.04, ubuntu:24.04, opensuse/leap:15, alpine:3.20.
B. Sandbox and hardening (original + expanded)
--mlockflag to pin credentials against swapC. Security review follow-up (new)
Findings from an in-depth review, consolidated from closed #538/#539/#540:
poller.c: guardkill(pid, SIGKILL)againstnft_pchildreturning -1 (broadcast-kill prevention)util.c: enforce SECURITY.md-documented 0600 and owner match on spine.conf; open withO_NOFOLLOW|O_CLOEXECerror.c: async-signal-safe fatal handler (pre-formatted messages +write(2)+_exit); SIGPIPE ignored process-wideFD_CLOEXECon sockets, MySQL session, audit netlink, log fdutil.c: length-bounded config tokenizer replacingsscanf("%15s %255s"); range checks on DB_Port / CircuitBreakerThresholdping.c:strncopyinget_namebyhost(fixes 1-byte OOB write on overlength token)snmp.c: SNMPv3 contextEngineID carried as binary with explicit length (nostrlentruncation at0x00)umask(027)at startup;spine_scrub_secrets()called from every exit path; SNMP community/password redaction in command-error logs;mysql_fetch_rownull guards;snprintfaccumulator truncation guards;php_readpipebuffer-full break; circuit breaker threshold clamp; nft_popen close_cleanup null check replacingassertD. Build fixes (applied during CI iteration)
<sys/prctl.h>under__linux__for prctl calls<stdint.h>underHAVE_LANDLOCKfor uint64_t_GNU_SOURCEguard inplatform_posix.candplatform_process_posix.cforpthread_setname_np/pipe2on clangtr -dset,-escaped) so package lists with spaces are acceptedstrncpy→memcpyin strncopy/die/spine_log to silence-Wstringop-truncationdb_escapetrim_limittightened for-Wformat-truncationTests
New unit tests:
test_poller_pid_guard,test_config_perms,test_signal_handler,test_cloexec,test_get_namebyhost,test_snmp_engine_id,test_config_parser,test_startup_umask,test_scrub_secrets,test_snprintf_guard,test_redact_args,test_spine_audit. Extendedtest_circuit_breakerfor threshold clamp. Extendedtest_env_scrubfor LD_/DYLD_ prefix strip and interpreter init vars.Full suite:
ctest --test-dir build --output-on-failure→ 30/30 pass locally.Local reproduction
Build logs land under
build-reports/.Fixes
openSUSE Leap 15 default gcc 7.5 rejects
-std=c17; workflow and script installgcc13and exportCC=gcc-13.The Rocky 9
pipe2fix landed on #523 separately since it addresses a Rocky-Linux compile failure the maintainer flagged.Consolidation note
Superseded: #538 (critical-security), #539 (config-parser), #540 (memory-concurrency). All commits fast-forwarded onto this branch. Matches the consolidation pattern from #523 to avoid merge-order conflict risk across util.c / spine.c / platform layers / CI files.