
Stub /sys/devices/system/cpu and harden checks #20

Merged. jserv merged 1 commit into main from sysfs on May 9, 2026.


@jserv jserv commented May 9, 2026

Java GC, Go scheduler init, and libnuma probe /sys/devices/system/cpu/{online,possible,present} plus per-CPU dirs to size thread pools. macOS has no /sys, so the lack of these files made those probes fall back to suboptimal heuristics or fail outright.

ensure_syscpu_dir lazily builds /tmp/elfuse-syscpu-XXXXXX/ on first access, populated with online/possible/present cpumask range files and one empty cpuN/ directory per host CPU. With N taken from sysconf(_SC_NPROCESSORS_ONLN), each range file contains "0" for one CPU and "0-(N-1)" for N CPUs (for example "0-7" on an 8-CPU host). The cache/topology subtrees stay empty so deeper queries return ENOENT until a real consumer asks. Population is one-shot: the host CPU count does not change at runtime, so refresh is unnecessary.
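
The range-string rule is small enough to sketch in isolation. The helper names below (format_cpu_range, host_cpu_range) are hypothetical and not taken from this PR, and the sketch omits any trailing newline the real files may carry:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write "0" for one CPU or "0-(n-1)" for n CPUs.
 * Returns 0 on success, -1 if n is invalid or buf is too small. */
int format_cpu_range(long n, char *buf, size_t len)
{
    assert(buf != NULL);
    if (n < 1 || len == 0)
        return -1;

    int w = (n == 1) ? snprintf(buf, len, "0")
                     : snprintf(buf, len, "0-%ld", n - 1);

    /* snprintf reports the untruncated length; reject truncation. */
    return (w < 0 || (size_t)w >= len) ? -1 : 0;
}

/* Hypothetical helper: format the range for the host's online CPUs. */
int host_cpu_range(char *buf, size_t len)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN); /* -1 on error */
    return format_cpu_range(n, buf, len);
}
```

The same string would be written to all three of online, possible, and present, since the stub treats them as identical.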

Hardening guards on the open and stat paths:

  • syscpu_open_is_readonly rejects non-RDONLY accmode plus O_CREAT and O_TRUNC with EACCES so the stub stays read-only as a real sysfs would, covering both the bare cpu root and child paths.
  • syscpu_suffix_safe rejects any '..' component before path join so a guest open of /sys/devices/system/cpu/../../etc/passwd cannot pivot the lstat/open onto an arbitrary host file.
  • ensure_syscpu_dir tears down the partial scratch dir on any write_file/mkdir failure instead of caching a half-built state with syscpu_dir_ok=true.
  • A getpid()-vs-syscpu_owner_pid guard in syscpu_dir_cleanup keeps clone(CLONE_VM) children from rmdir'ing the parent's still-active scratch tree at exit.
  • path_prefix_match in path.c tightens the prefix test so /sys/devices/system/cpufoo no longer falls into the intercept layer.
  • syscpu_classify centralizes SYSFS_CPU prefix handling between proc_intercept_open and proc_intercept_stat as one source of truth.
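
A minimal sketch of three of these guards, written as pure predicates. The function names are shortened stand-ins rather than the PR's actual code, and the flag constants come from the host's <fcntl.h>; a real implementation would test the guest's Linux flag values, which this sketch assumes have already been translated:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>

/* True if `path` equals `prefix` or continues with '/' right after it,
 * so "/sys/devices/system/cpufoo" does not match ".../cpu". */
bool prefix_match(const char *path, const char *prefix)
{
    assert(path != NULL && prefix != NULL);
    size_t n = strlen(prefix);
    if (strncmp(path, prefix, n) != 0)
        return false;
    return path[n] == '\0' || path[n] == '/';
}

/* True if no '/'-separated component of `suffix` is exactly "..". */
bool suffix_safe(const char *suffix)
{
    assert(suffix != NULL);
    const char *p = suffix;
    while (*p) {
        while (*p == '/')               /* skip separators */
            p++;
        const char *start = p;
        while (*p && *p != '/')         /* scan one component */
            p++;
        if (p - start == 2 && start[0] == '.' && start[1] == '.')
            return false;
    }
    return true;
}

/* True if the flags ask for read-only access with no create/truncate. */
bool open_is_readonly(int flags)
{
    if ((flags & O_ACCMODE) != O_RDONLY)
        return false;
    return (flags & (O_CREAT | O_TRUNC)) == 0;
}
```

Checking whole components rather than searching for the substring ".." keeps names such as "..hidden" valid while still rejecting "cpu0/..".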

While auditing access(2) for the new stub, the previous "intercept matched, return 0" shortcut leaked false positives: a guest probing W_OK on an intercepted path received 0 even when no W bit was set in the synthesized stat. path_check_intercept_access now does proper POSIX mode-bit checking against the stat result, with standard owner/group/other selection plus a CAP_DAC_OVERRIDE-style root branch that grants RW always and X if any X bit is set. The synthetic stat fillers now populate st_uid/st_gid from proc_get_uid/proc_get_gid so the owner branch matches.

faccessat SYS 48 dispatch passed x3 to sys_faccessat even though Linux's 3-arg faccessat has no flags parameter; x3 carried whatever garbage was in the caller's register state, and translate_faccessat_flags would set AT_EACCESS or AT_SYMLINK_NOFOLLOW semi-randomly. SYS 48 now forces flags=0; SYS 439 (faccessat2) keeps x3 as before.
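
The dispatch rule can be sketched as a small selector. The enum names and faccessat_flags_for are hypothetical; only the syscall numbers 48 and 439 and the "force 0 versus pass x3" behavior come from the description above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Guest syscall numbers as named in this PR (hypothetical enum names). */
enum { GUEST_SYS_FACCESSAT = 48, GUEST_SYS_FACCESSAT2 = 439 };

/* Hypothetical helper: choose the flags value handed to the shared
 * faccessat handler. Returns 0 on success, -1 for any other syscall. */
int faccessat_flags_for(long sysno, uint64_t x3, int *flags)
{
    assert(flags != NULL);
    switch (sysno) {
    case GUEST_SYS_FACCESSAT:
        *flags = 0;           /* 3-arg form: x3 is not an argument */
        return 0;
    case GUEST_SYS_FACCESSAT2:
        *flags = (int)x3;     /* 4-arg form: x3 carries the flags */
        return 0;
    default:
        return -1;
    }
}
```

Keeping the decision at the dispatch site means the shared handler never has to guess whether its flags argument is meaningful.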


Summary by cubic

Adds a synthetic /sys/devices/system/cpu on hosts without sysfs so Java, Go, and libnuma can detect CPU count reliably. Also hardens open/stat/access handling and fixes the faccessat flags bug.

  • New Features

    • Lazy temp dir at /tmp/elfuse-syscpu-XXXXXX.
    • Populate online, possible, present with "0" or "0-(N-1)", where N comes from _SC_NPROCESSORS_ONLN.
    • Create one empty cpuN/ per host CPU; cache/ and topology/ return ENOENT.
    • Read-only tree; no refresh needed since CPU count is static.
  • Bug Fixes

    • Enforce read-only opens (reject O_WRONLY/O_RDWR/O_CREAT/O_TRUNC) and add O_NOFOLLOW.
    • Block ".." traversal and tighten prefix matching so unrelated paths don’t hit the intercept.
    • Make access(2)/faccessat honor synthetic mode bits with correct uid/gid; force flags=0 for SYS 48.
    • Robust scratch-dir teardown on init failure; only the creator process cleans it up at exit.
    • Add test-sysfs-cpu; included in the suite and skipped in the qemu reference run.

Written for commit 451b77e. Summary will update on new commits.


@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 8 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/syscall/path.c">

<violation number="1" location="src/syscall/path.c:93">
P1: Root access check does not match CAP_DAC_OVERRIDE semantics: read and write should be granted unconditionally for uid 0, with only execute requiring at least one x-bit set.</violation>
</file>


Comment thread src/syscall/path.c
@jserv jserv merged commit 223114a into main May 9, 2026
4 checks passed
@jserv jserv deleted the sysfs branch May 9, 2026 15:40