Full-mod audit hardening: alias hot-reload, recursion guard, lifecycle state, atomic config writes #2

Merged: THEFricadelle merged 2 commits into main from fix/audit-hardening on Jun 10, 2026

@Blushister (Collaborator)

Summary

Follow-up to the LuckPerms audit (#1): a full-mod bug hunt covering the command-tree rewriter, alias engine, config manager, and lifecycle handling. This PR fixes 3 high-severity and 4 medium-severity bugs plus several small hardening items. All fixes are listed in CHANGELOG.md under Unreleased.

High severity

H1 - /customperm reload did not apply aliases.json changes

onConfigReload only repaired command wrappers; AliasManager was never invoked. Aliases added to the file never appeared, removed aliases stayed executable, and - worse - edited steps were ignored because the execution closure captures the step list at registration time, so a step removed for being dangerous kept running with op-4 elevation. AliasManager.applyConfig() now re-registers the union of configured and currently-registered aliases on reload (restoring shadowed commands for deletions), followed by a repair() pass.
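The stale-closure problem and the union-based re-registration can be sketched in plain Java. This is a minimal illustration, not the mod's code: the `DISPATCHER` map stands in for the real Brigadier dispatcher, and `register`, `applyConfig`, and `run` are hypothetical helpers whose signatures are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for the alias engine; names are illustrative only.
class AliasReloadSketch {
    // Alias name -> execution closure (stand-in for the live dispatcher).
    static final Map<String, Supplier<List<String>>> DISPATCHER = new HashMap<>();

    // The closure captures a copy of the step list at registration time,
    // so later edits to the config are invisible until re-registration.
    static void register(String name, List<String> steps) {
        List<String> captured = List.copyOf(steps);
        DISPATCHER.put(name, () -> captured);
    }

    // Re-register the union of configured and currently registered aliases.
    static void applyConfig(Map<String, List<String>> configured) {
        Set<String> names = new HashSet<>(DISPATCHER.keySet());
        names.addAll(configured.keySet());
        for (String name : names) {
            List<String> steps = configured.get(name);
            if (steps == null) {
                // Deleted from the config: unregister. The real fix also
                // restores any command the alias was shadowing.
                DISPATCHER.remove(name);
            } else {
                // Added or edited: replace the closure so it sees the new steps.
                register(name, steps);
            }
        }
    }

    // Returns the steps the alias would execute, or null if it is not registered.
    static List<String> run(String name) {
        Supplier<List<String>> exec = DISPATCHER.get(name);
        return exec == null ? null : exec.get();
    }
}
```

Iterating over the union rather than only the configured names is what lets deletions take effect: an alias that exists only in the dispatcher is exactly one that was removed from the file.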

H2 - No recursion guard on aliases

/customperm alias add boom boom (or a cycle a -> b -> a) recursed infinitely with an op-4 elevated source until StackOverflowError, which was then swallowed level-by-level by the catch (Throwable) in the step loop, spamming thousands of failure messages. Alias execution now tracks nesting depth in a ThreadLocal and aborts beyond depth 8 with a clear message.
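A depth guard of this kind might look like the following sketch. The class, the alias table, and the `execute` method are hypothetical; only the `ThreadLocal` counter and the limit of 8 come from the description above, and the exact boundary (whether the 8th or 9th level aborts) is an assumption.

```java
import java.util.List;
import java.util.Map;

// Hypothetical alias executor; only the depth-tracking idea mirrors the PR.
class AliasDepthGuardSketch {
    static final int MAX_DEPTH = 8;

    // Per-thread nesting depth of alias executions currently in flight.
    private static final ThreadLocal<Integer> DEPTH = ThreadLocal.withInitial(() -> 0);

    // Alias name -> steps; anything not in the map is treated as a plain command.
    private final Map<String, List<String>> aliases;
    int plainCommandsRun = 0;

    AliasDepthGuardSketch(Map<String, List<String>> aliases) {
        this.aliases = aliases;
    }

    // Returns false if any step was aborted by the recursion guard.
    boolean execute(String command) {
        List<String> steps = aliases.get(command);
        if (steps == null) {
            plainCommandsRun++; // stand-in for dispatching a real command
            return true;
        }
        int depth = DEPTH.get();
        if (depth >= MAX_DEPTH) {
            System.err.println("Alias '" + command + "' aborted: nesting deeper than " + MAX_DEPTH);
            return false;
        }
        DEPTH.set(depth + 1);
        try {
            boolean ok = true;
            for (String step : steps) {
                ok &= execute(step); // non-short-circuit: remaining steps still run
            }
            return ok;
        } finally {
            DEPTH.set(depth); // always restore, even if a step throws
        }
    }
}
```

Restoring the counter in `finally` keeps the guard correct when a step throws, so one failed alias cannot leave the thread permanently "too deep".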

H3 - Static dispatcher state leaked across server lifecycles

ORIGINAL_ROOTS/WRAPPED_NODES (CommandTreeRewriter) and SHADOWED_ORIGINALS/REGISTERED_ALIASES (AliasManager) were never cleared. Consequences: old command trees retained in memory after every /reload and embedded restart, and - after a same-JVM server restart - removing a shadowing alias restored the stale node from the previous server instance. State is now cleared at the top of every RegisterCommandsEvent (fresh dispatcher) and on ServerStoppedEvent.

Medium severity

  • M1 - Registration ordering not guaranteed. The README claims the wrapper handler runs after every other handler, but it subscribed at NORMAL priority and ServerStartedEvent did no catch-up. Mods registering commands after CustomPerm were never wrapped at boot. Now: EventPriority.LOWEST + a repair() safety net at server start.
  • M2 - Non-atomic config writes. save() used direct Files.writeString (truncate-then-write); a crash mid-save corrupted the file - and load() calls save(), so every reload was a corruption window. Writes now go through temp file + ATOMIC_MOVE (with non-atomic move fallback).
  • M3 - Restored shadowed commands were not re-wrapped. Removing an alias that shadowed an exposed command restored the original unwrapped node, silently disabling its permission nodes until the next reload. refreshAlias now runs repair().
  • M4 - Direct exposure keyed on LP presence instead of the selected backend. With LuckPerms present but below the minimum version and luckPermsFallbackMode=internal, the internal backend was active yet exposure stayed disabled, making grade command nodes useless. isDirectCommandExposureEnabled() now checks whether the LP backend was actually selected. (CI/no-LP and LP-active behaviours are unchanged.)
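The temp-file-plus-ATOMIC_MOVE pattern described in M2 can be sketched with standard `java.nio.file` calls. The class and method names are hypothetical, and details such as the temp-file naming and UTF-8 encoding are assumptions rather than the mod's actual choices.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating an atomic config write.
class AtomicConfigWriteSketch {
    static void writeAtomically(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Create the temp file in the same directory so the final move
        // stays on one filesystem, which atomic moves generally require.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.writeString(tmp, content, StandardCharsets.UTF_8);
            try {
                Files.move(tmp, target,
                        StandardCopyOption.ATOMIC_MOVE,
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a plain replace, which is no longer crash-safe.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            // No-op after a successful move; cleans up if the write or move failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

With this shape a crash during the write can leave behind a stray temp file, but the previous config stays intact until the move replaces it.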

Low severity

  • Alias step normalization strips a single leading slash instead of all of them (WorldEdit-style //wand steps now work).
  • alias addstep emits the same shadowing warning as alias add when it creates the alias.
  • Permission nodes from grade addperm|removeperm are trimmed.
  • Fatal JVM Errors thrown by alias steps are rethrown (matches the D1 policy already applied in LuckPermsService).
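The slash normalization in the first item can be illustrated as below. `normalizeStep` is a hypothetical name, and the whitespace trimming is an assumption not stated in the PR.

```java
// Hypothetical helper showing single-slash stripping for alias steps.
class AliasStepNormalizeSketch {
    static String normalizeStep(String raw) {
        String step = raw.trim(); // trimming is an assumption
        // Strip at most one leading slash so "//wand" keeps the slash
        // that is part of the WorldEdit command name.
        return step.startsWith("/") ? step.substring(1) : step;
    }
}
```

Here `normalizeStep("//wand")` returns `/wand`, while `normalizeStep("/say hi")` and `normalizeStep("say hi")` both return `say hi`.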

Known issues documented but intentionally not fixed

  • Brigadier redirect nodes are not re-wired during wrapping (acknowledged in code); /tp vs /teleport exposure remains asymmetric.
  • word() argument type prevents debugging/exposing namespaced command roots containing :.
  • Backup timestamps have second resolution (collisions skew rotation).
  • load() reports failure when the post-load save() fails even though the snapshot was applied.

Validation

  • ./gradlew compileJava compileGameTestJava test - passed.
  • ./gradlew runGameTestServer - see PR checks / comment below.

Related: #1 (LuckPerms-specific audit). Note for the merge: both PRs add a ServerStoppedEvent listener in CustomPerm.java and an Unreleased CHANGELOG section - trivial conflicts expected whichever lands second.

High severity:
- /customperm reload now applies aliases.json changes to the live
  dispatcher (additions, removals with shadowed-node restoration, and
  edited steps - the execution closure captured the step list at
  registration time and kept running stale steps with op-4 elevation).
- Guard alias execution against recursive alias chains (self-invoking
  alias or cycle) with a max nesting depth of 8 instead of recursing to
  StackOverflowError.
- Clear static dispatcher state (ORIGINAL_ROOTS, WRAPPED_NODES,
  SHADOWED_ORIGINALS, REGISTERED_ALIASES) on every RegisterCommandsEvent
  and on server stop: no more old-tree memory leaks across /reload and
  same-JVM restarts, and no restoration of stale shadowed nodes from a
  previous server instance.

Medium severity:
- Subscribe to RegisterCommandsEvent at EventPriority.LOWEST and run a
  catch-up repair() at ServerStartedEvent so commands registered by
  later mod handlers are still wrapped at boot.
- Write config files atomically (temp file + ATOMIC_MOVE) so a crash
  mid-save can no longer truncate them.
- Re-wrap a restored shadowed command immediately after alias removal.
- Key direct command exposure on the selected backend instead of mere
  LuckPerms presence, so the internal fallback can gate commands when
  LP is present but outdated/broken.

Low severity:
- Strip a single leading slash in alias steps (WorldEdit-style //wand).
- Emit the shadowing warning when alias addstep creates a new alias.
- Trim permission nodes in grade addperm/removeperm.
- Rethrow fatal JVM Errors from alias steps (D1 policy).
@Blushister (Collaborator, Author)

Local validation: ./gradlew runGameTestServer passed all 32 required GameTests (606.9 ms); ./gradlew test and compileGameTestJava also passed.

@THEFricadelle merged commit 594fa03 into main on Jun 10, 2026
1 check passed