
ffi: fix sandlock_spawn failure under multi-threaded callers with restricted seccomp (#47) #49

Open
congwang-mk wants to merge 3 commits into main from dev

@congwang-mk
Contributor

Summary

Fixes #47. Sandbox(policy).run([...]) was failing with sandlock_spawn failed when called from a multi-threaded Python host (uvicorn/asyncio) on a Kubernetes pod with the RuntimeDefault seccomp profile.

Root cause: every FFI entry point built a fresh runtime with tokio::runtime::Runtime::new(), which uses the multi-thread scheduler (the same as new_multi_thread()) and eagerly spawns worker threads via pthread_create. Kubernetes' RuntimeDefault profile blocks clone3 with ENOSYS, and the multi-thread builder's eager-spawn path returned Err before glibc's fallback could help, which surfaced to the caller as a NULL handle.

Fix: switch FFI entry points to a per-thread cached current_thread Tokio runtime, which spawns no threads at construction. Three runtime shapes:

| Call site | Runtime | Why |
| --- | --- | --- |
| sandlock_run and the rest of the one-shot entry points | thread-local current_thread | One-shot block_on, no tasks need to outlive the call |
| sandlock_create (live handle) | per-handle multi_thread, 1 worker | Supervisor must keep ticking between start and wait; one persistent worker is unavoidable here |
| sandlock_create_for_run (new) | per-handle current_thread | Python's Sandbox.run() is start then wait back-to-back, so suspension across the gap is fine and avoids the one worker thread sandlock_create would have spawned |
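
For readers unfamiliar with the per-thread caching pattern used by the one-shot entry points, here is a minimal std-only sketch. It is not the code in this PR: `LocalRuntime`, `with_runtime`, and `one_shot_entry` are hypothetical names, and `LocalRuntime` is a stand-in for a Tokio current_thread runtime so that the example compiles without external crates. It only illustrates that each calling thread lazily builds one runtime, reuses it on later calls, and starts no threads when constructing it.

```rust
use std::cell::OnceCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts how many stand-in runtimes were built, so caching is observable.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a Tokio current_thread runtime.
/// Constructing it starts no OS threads.
struct LocalRuntime {
    id: usize,
}

impl LocalRuntime {
    fn new() -> Self {
        LocalRuntime { id: BUILDS.fetch_add(1, Ordering::SeqCst) }
    }

    /// Stand-in for block_on: runs the work on the calling thread.
    fn block_on<T>(&self, work: impl FnOnce() -> T) -> T {
        work()
    }
}

thread_local! {
    // One lazily initialised runtime per calling thread.
    static RUNTIME: OnceCell<LocalRuntime> = OnceCell::new();
}

/// Runs `f` with the calling thread's cached runtime, building it on first use.
fn with_runtime<T>(f: impl FnOnce(&LocalRuntime) -> T) -> T {
    RUNTIME.with(|cell| f(cell.get_or_init(LocalRuntime::new)))
}

/// Hypothetical one-shot entry point: returns the runtime id and a result.
fn one_shot_entry(input: u32) -> (usize, u32) {
    with_runtime(|rt| (rt.id, rt.block_on(|| input + 1)))
}

fn main() {
    let (first, _) = one_shot_entry(1);
    let (second, _) = one_shot_entry(2);
    assert_eq!(first, second); // same thread reuses its cached runtime

    let (other, _) = thread::spawn(|| one_shot_entry(3)).join().unwrap();
    assert_ne!(first, other); // a different thread builds its own

    println!("main thread runtime #{first}, other thread runtime #{other}");
}
```

The thread-local cell means callers never share a runtime across threads, so no locking is needed around the cached value.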

Sandbox.run() is wired to sandlock_create_for_run. The seccomp supervisor's blocking SECCOMP_IOCTL_NOTIF_RECV thread is left as-is: it spawns through pthread_create, which the reporter's diagnostic confirms works in their environment ("new_thread": "ok").
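
A thread-spawn probe in the spirit of the reporter's "new_thread" diagnostic could look like the sketch below. The reporter's actual diagnostic is not shown in this PR, so `probe_new_thread` and the output format are assumptions. The sketch relies on `std::thread::Builder::spawn`, which returns an `io::Result` instead of panicking when the OS refuses to create a thread.

```rust
use std::io;
use std::thread;

/// Hypothetical probe: tries to start and join one OS thread,
/// reporting failure as an error instead of panicking.
fn probe_new_thread() -> Result<(), io::Error> {
    let handle = thread::Builder::new()
        .name("spawn-probe".to_string())
        .spawn(|| ())?; // Err here means the OS refused to create the thread
    handle
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "probe thread panicked"))
}

fn main() {
    match probe_new_thread() {
        Ok(()) => println!("\"new_thread\": \"ok\""),
        Err(e) => println!("\"new_thread\": \"failed: {e}\""),
    }
}
```

Running a probe like this inside the restricted pod helps distinguish a general inability to create threads from a failure specific to one runtime's startup path.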

Test plan

  • cargo build --release clean on dev
  • pytest python/tests/ 249/249 pass, including the new TestSandlockRunCAbiMultiThreaded and TestSandboxRunMultiThreaded regression tests
  • Reporter retests against this branch on their k8s pod with RuntimeDefault seccomp (the only environment where the original failure was empirically observable)

Note: the new tests are not red-on-pristine on an unrestricted dev box because glibc's clone3 → clone(2) fallback masks the failure mode locally. The docstring on TestSandlockRunCAbiMultiThreaded documents this honestly.

Signed-off-by: Cong Wang <cwang@multikernel.io>

Development

Successfully merging this pull request may close these issues.

sandlock_spawn fails with ENOSYS (clone3) when called from a multi-threaded Python process (uvicorn/asyncio + Kubernetes RuntimeDefault seccomp)
