Symptom
Roughly 150 Test262 tests in the compiled-mode subset bucket as Timeout despite running to completion in well under 1 second when executed standalone. The regressions are concentrated in:
- test/built-ins/String/prototype/* (~58 tests)
- test/language/expressions/call/* (~23 tests)
- test/built-ins/Array/length/* (~20 tests)
- Smaller chunks in Promise/race, Promise/all, RegExp/prototype, Array/from, Array/of
Each of these tests passes cleanly when run on its own via dotnet SharpTS.Test262.Worker.dll < single_path.txt (≤ 1.4 s end-to-end). During the parallel regen (6 workers via BatchedSubprocessRunner), however, the same tests exceed the 15 s per-test timeout and are bucketed as Timeout.
Root cause
CPU + memory contention between the 6 concurrent worker subprocesses. Each worker:
- Compiles each Test262 file via Reflection.Emit (~50-100 ms),
- Loads the resulting assembly into a collectible ALC (~10-20 ms),
- Invokes $Program.Main which JITs the entire compiled IL (~50-300 ms),
- Unloads + GC.
Under 6-way load:
- ALC dynamic-assembly counts grow into the thousands per worker.
- Reflection caches accumulate despite the GcEveryNTests = 50 sweep.
- After ~2000 tests, individual workers hit ~430 MB working set.
- JIT throughput per core drops as memory bandwidth saturates.
- Per-test latency spikes from ~200 ms to >15 s for the unlucky tests.
What we tried
- Worker recycling (worker exits every 500 tests, parent respawns). Memory stayed bounded (<150 MB) but JIT cold-start on each respawn pushed more tests past the timeout. Net: 1046 timeouts vs the current 692. Reverted.
- N=4 instead of N=6. Each worker lived longer and grew larger, and per-test latency got worse. Killed at 2 h with no completion. Reverted.
- 15 s per-test timeout (up from 5 s). Reduced timeouts from ~3100 to ~700. Currently shipping. Diminishing returns past 15 s.
What's needed for a real fix
Probably one or more of:
- Tier-up JIT control: keep tier-0 fast and avoid the tier-1 promotion that appears to be dragging down JIT throughput across all 6 workers.
- In-process incremental compile — share a single ALC across many tests within a worker so the runtime helpers ($Object, $Runtime, etc.) JIT exactly once.
- Reuse compiled assembly metadata across tests — strip the per-test boilerplate.
- CPU affinity / quota — pin workers to a subset of cores so OS scheduling doesn't thrash.
None of these are quick fixes; each is a multi-day investigation.
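As a cheap first probe of the tier-up idea, the sketch below sets documented .NET runtime environment variables before a regen. It assumes the worker subprocesses inherit the parent's environment, which has not been checked for BatchedSubprocessRunner. These settings do not block tier-1 promotion by themselves; they only turn off PGO instrumentation and keep loop-containing methods on the quick JIT, so any effect on the timeout count is something to measure, not expect.

```shell
# Experiment sketch (assumption: workers inherit these variables from the parent).
export DOTNET_TieredCompilation=1     # keep tiered compilation on so tier-0 stays the fast path
export DOTNET_TieredPGO=0             # turn off profile-guided instrumentation of tier-0 code
export DOTNET_TC_QuickJitForLoops=1   # let methods with loops start on the quick JIT too

# Print the effective settings so they can be recorded next to the timeout count.
env | grep '^DOTNET_T' | sort
```

Rerun the Repro commands in the same shell and compare the resulting timeout count against the current 692.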
Repro
git checkout 84a3b94
dotnet build SharpTS.Test262.Worker/SharpTS.Test262.Worker.csproj
SHARPTS_TEST262_UPDATE_BASELINE=1 dotnet test SharpTS.Test262/SharpTS.Test262.csproj \
--filter FullyQualifiedName~CompiledBaseline
The regen will produce ~150 Pass→Timeout regressions vs running each test standalone.
Workaround
The per-test timeout is tunable via timeoutSeconds in config/subset.json. Bumping it further (20-30 s) recovers some of the affected tests, at the cost of a slower regen on genuinely stuck tests.
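For scripted experiments with different timeouts, a small helper along these lines could edit the value in place. set_timeout_seconds is a hypothetical name, and the sketch assumes only that the file contains a "timeoutSeconds" key followed by an integer; it makes no assumption about where in the JSON that key sits.

```shell
# Hypothetical helper: set the integer value of "timeoutSeconds" in a JSON file.
# Usage: set_timeout_seconds <path-to-json> <seconds>
set_timeout_seconds() {
  file="$1"
  seconds="$2"
  # Reject anything that is not a plain non-negative integer.
  case "$seconds" in
    ''|*[!0-9]*) echo "seconds must be an integer" >&2; return 2 ;;
  esac
  tmp="$(mktemp)" || return 1
  # Replace only the number after the "timeoutSeconds" key; everything else is copied as-is.
  sed -E "s/(\"timeoutSeconds\"[[:space:]]*:[[:space:]]*)[0-9]+/\1${seconds}/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example: set_timeout_seconds config/subset.json 25
```

Writing through a temporary file keeps the original intact if sed fails partway, and copying the contents back (rather than moving the temp file) preserves the original file's permissions.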
Acceptance
The 150 Pass→Timeout regressions disappear from the parallel-regen baseline.