## Summary
Calling `Sandbox(policy).run(...)` from a uvicorn server process returns `exit_code=-1, error="sandlock_spawn failed"` every time. The identical call succeeds from a fresh single-threaded Python process in the same container.
## Context

I was setting up sandlock as the execution backend for an MCP tool server — following the recommendation in lobehub/lobehub#12472 to use sandlock as a self-hosted alternative to LobeHub's cloud sandbox. Because LobeHub requires Streamable HTTP MCP transport (not SSE), I wrote a thin FastMCP wrapper around `Sandbox.run()`.
The server runs as a sidecar container in a Kubernetes k3s pod.
## Environment

- Python 3.12, sandlock 0.7.0 (pip)
- uvicorn + FastMCP (Streamable HTTP transport)
- Kubernetes k3s, kernel 6.18.18, Landlock ABI v7
- Pod seccomp: `RuntimeDefault` (Kubernetes PSS `restricted`)
- Container: UID 1000, `readOnlyRootFilesystem: true`, `allowPrivilegeEscalation: false`, `capabilities: drop ALL`
## Reproduction

Any FastMCP/uvicorn server that calls `Sandbox(policy).run()` from its request handler:

```python
@mcp.tool()
async def execute_python(code: str) -> str:
    ws = pathlib.Path("/tmp/sessions/default")
    policy = Policy(fs_readable=["/usr", "/lib", "/etc"], fs_writable=[str(ws)], ...)
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(None, lambda: Sandbox(policy).run(["python3", "-c", code]))
```
Result: `Result(success=False, exit_code=-1, error='sandlock_spawn failed')`
## Diagnosis
I am not a kernel developer or Python internals expert — I figured this out in collaboration with Claude Sonnet 4.6, so please correct any mistakes in the analysis.
A diagnostic endpoint injected into the running server process revealed:
```json
{
  "pid": 1,
  "active_threads": 2,
  "fork": "ok",
  "clone3": "ret=-1 errno=38 (Function not implemented)",
  "new_thread": "ok",
  "minimal_policy": {"ok": false, "error": "sandlock_spawn failed"}
}
```
Key observations:

- `fork()` works fine from the server process
- `clone3` returns `ENOSYS` — it is blocked by Kubernetes' `RuntimeDefault` seccomp profile (see the probe sketch after this list)
- Python's `threading.Thread` still works because glibc falls back from `clone3` to `clone`
- `sandlock_spawn` fails even with the most minimal policy
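For reference, the `clone3` probe used in the diagnostic can be reproduced from Python with ctypes (a sketch, not the actual endpoint code; syscall number 435 assumes x86-64):

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

# clone3 is syscall 435 on x86-64. With NULL args the kernel normally
# returns EINVAL; a seccomp profile that blocks the syscall returns ENOSYS.
SYS_clone3 = 435
ret = libc.syscall(SYS_clone3, None, 0)
print(f"clone3 ret={ret} errno={ctypes.get_errno()}")
# Under RuntimeDefault: ret=-1 errno=38 (Function not implemented)
```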
Reading `crates/sandlock-ffi/src/lib.rs`:

```rust
let rt = match tokio::runtime::Runtime::new() { // = new_multi_thread()
    Ok(rt) => rt,
    Err(_) => return ptr::null_mut(), // → "sandlock_spawn failed"
};
```
`Runtime::new()` calls `new_multi_thread()`, which spawns OS worker threads. Our hypothesis: when called from a multi-threaded parent process (uvicorn has 2 threads — event loop + thread pool), Tokio's worker thread spawning fails. Either `clone3` is blocked and the fallback doesn't work reliably in a multi-threaded context, or glibc's `pthread_atfork` handlers deadlock in the forked child. Python itself warns:
```
DeprecationWarning: This process (pid=1) is multi-threaded,
use of fork() may lead to deadlocks in the child.
```
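The hypothesis reduces to a standalone repro. A sketch, assuming the same container and seccomp profile and the `Policy` keyword arguments from the reproduction above:

```python
import threading
import time
from sandlock import Policy, Sandbox

# A single idle background thread is enough to make the parent
# multi-threaded, mimicking uvicorn's event loop + executor pool.
threading.Thread(target=time.sleep, args=(3600,), daemon=True).start()

policy = Policy(fs_readable=["/usr", "/lib", "/etc"], fs_writable=["/tmp"])
result = Sandbox(policy).run(["python3", "-c", "print('hi')"])
print(result)  # expected: success=False, error='sandlock_spawn failed'
```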
The same issue exists in the current source at `lib.rs` lines ~694, ~744, ~890, ~1042, ~1224, ~1330, ~1628, ~1679, ~1710 and in `handler/run.rs`.
## Workaround
Spawn a fresh single-threaded Python subprocess per sandlock call. The subprocess has no active event loop or thread pool, so Tokio's runtime creation succeeds:
```python
import json
import subprocess
import sys

def _run_sandboxed_sync(cmd, ws, timeout):
    helper = r"""
import sys, json, pathlib
from sandlock import Sandbox, Policy
req = json.loads(sys.stdin.read())
# build policy, call Sandbox(policy).run(), return JSON
"""
    proc = subprocess.run(
        [sys.executable, "-c", helper],
        input=json.dumps({"cmd": cmd, "ws": str(ws), "timeout": timeout}),
        capture_output=True, text=True, timeout=timeout + 5,
    )
    return json.loads(proc.stdout)["output"]
```
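For completeness, the tool handler then calls this helper off the event loop, mirroring the reproduction above (a sketch; the workspace path and 30 s timeout are illustrative):

```python
import asyncio
import pathlib

@mcp.tool()
async def execute_python(code: str) -> str:
    ws = pathlib.Path("/tmp/sessions/default")
    loop = asyncio.get_running_loop()
    # Run the blocking subprocess helper in the default thread pool so
    # the event loop stays responsive.
    return await loop.run_in_executor(
        None, _run_sandboxed_sync, ["python3", "-c", code], ws, 30
    )
```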
This works, but adds ~50ms overhead (Python startup time) and an extra unconfined intermediary process.
## Suggested fix

Replace `Runtime::new()` with a current-thread runtime at every call site in the FFI layer:

```rust
// Before
let rt = match tokio::runtime::Runtime::new() {

// After
let rt = match tokio::runtime::Builder::new_current_thread()
    .enable_all()
    .build() {
```
A current-thread runtime runs entirely on the calling thread — no worker thread spawning, no `clone3`, no fork-safety issues. The async operations sandlock performs (waiting for child process I/O) are I/O-bound, not CPU-parallel, so there is no functional regression from dropping the multi-thread scheduler.
Happy to provide any additional diagnostic information or test a patched build.