Problem
When tests run with the --parallel flag in CI environments, they fail intermittently because of a race condition in native library downloads.
Root Cause
Multiple subprocess instances simultaneously attempt to download the same native library (e.g., SQLite3 via @db/sqlite), resulting in:
- Error: Could not open library: file too short
- Cause: One process tries to open an incomplete library file while another is still downloading it
Evidence
From CI logs:
Failed to load SQLite3 Dynamic Library
Caused by Error: Could not open library: Could not open library:
/home/runner/.cache/deno/plug/.../libsqlite3.so: file too short
Affected Libraries
Any library using FFI (Foreign Function Interface):
- @db/sqlite
- @db/duckdb
- Other native libraries loaded via @denosaurs/plug
Why This is Probitas-Specific
This issue affects any user running probitas tests in parallel in CI environments. It's not specific to this repository's tests, but a general problem that probitas users will encounter.
Potential Solutions
1. Retry Logic with Exponential Backoff ⭐ Recommended
Automatically retry imports on library race condition errors.
Pros:
- Transparent to users
- Works automatically
- Minimal performance impact
Cons:
- Adds complexity to loader
- May mask other issues
Implementation approach:
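// In @probitas/scenario loader
// isLibraryRaceConditionError and exponentialBackoff are helpers still to be defined.
async function retryImport(url: string, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await import(url);
    } catch (err) {
      // Rethrow unrelated errors immediately, and give up after the last attempt.
      if (!isLibraryRaceConditionError(err) || attempt === maxRetries) {
        throw err;
      }
      // Wait before retrying so the other process can finish its download.
      await exponentialBackoff(attempt);
    }
  }
}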
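The retryImport sketch relies on two helpers that are not defined in this issue. One possible shape for them is shown here as a rough sketch, not a settled design. The only error text taken from the CI log is "file too short"; the pattern list, the backoffDelayMs helper, the delay values, and the jitter are all assumptions for illustration.

```typescript
// Hypothetical helpers for the retryImport sketch. Names other than
// isLibraryRaceConditionError and exponentialBackoff are made up here.

// Message fragments that suggest a partially written native library.
// "file too short" is taken from the CI log; extend the list as needed.
const RACE_ERROR_PATTERNS = ["file too short"];

function isLibraryRaceConditionError(err: unknown): boolean {
  // Walk the `cause` chain (bounded), because the log shows the dlopen
  // failure wrapped inside a "Failed to load ..." error.
  let current: unknown = err;
  for (let depth = 0; depth < 10 && current != null; depth++) {
    const message = current instanceof Error ? current.message : String(current);
    if (RACE_ERROR_PATTERNS.some((pattern) => message.includes(pattern))) {
      return true;
    }
    current = (current as { cause?: unknown }).cause;
  }
  return false;
}

// Upper bound of the delay for a given attempt: 100ms, 200ms, 400ms, ...
// capped at capMs. The numbers are assumptions, not measured values.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 2000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function exponentialBackoff(attempt: number): Promise<void> {
  // Random jitter (50% to 100% of the bound) so parallel workers do not
  // all retry at the same instant.
  const delay = backoffDelayMs(attempt) * (0.5 + Math.random() * 0.5);
  return new Promise((resolve) => setTimeout(resolve, delay));
}
```

Matching on message text is fragile, which is one concrete form of the "may mask other issues" concern: keeping the pattern list short limits what gets retried.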
2. Serial Execution Option
Add a --jobs=1 flag (or an equivalent option) to disable parallel execution.
Pros:
- Simple
- Guaranteed to work
Cons:
- Tests run slower
- Users must opt in
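To make the serial option concrete, here is a minimal sketch of a job-limited runner in which jobs = 1 degenerates to strictly serial execution. It is not probitas's actual runner: runWithJobs and its parameters are hypothetical, and each task stands in for whatever starts one test subprocess.

```typescript
// Hypothetical sketch, not probitas's real scheduler.
// Runs tasks with at most `jobs` in flight; results keep the input order.
async function runWithJobs<T>(
  tasks: Array<() => Promise<T>>,
  jobs: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task to claim

  async function worker(): Promise<void> {
    // Claiming an index is synchronous, so two workers never take the same task.
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // With jobs = 1 there is a single worker, so tasks run one after another.
  const workerCount = Math.max(1, Math.min(jobs, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a single worker, only one process can ever trigger the library download, which is why this option is guaranteed to avoid the race at the cost of run time.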
3. Pre-load Libraries
Document a workaround that lets users pre-cache libraries before running tests.
Pros:
Cons:
- Users must manually configure
- Not automatic
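A pre-load step could also be a small script that users run once before the parallel test run. The sketch below assumes that importing a module is enough to trigger its native library download, which the retry-on-import idea also presumes; some libraries may need an object to be constructed as well. preloadLibraries and the Importer type are hypothetical names.

```typescript
// Hypothetical pre-load helper; not part of probitas.
type Importer = (specifier: string) => Promise<unknown>;

// Imports each specifier once, in order, and returns the ones that failed.
// The importer is injectable so the function can be exercised without
// touching the network.
async function preloadLibraries(
  specifiers: string[],
  importer: Importer = (specifier) => import(specifier),
): Promise<string[]> {
  const failed: string[] = [];
  // Sequential on purpose: the goal is a single process doing each download.
  for (const specifier of specifiers) {
    try {
      await importer(specifier);
    } catch (err) {
      failed.push(specifier);
      console.error(`Failed to pre-load ${specifier}:`, err);
    }
  }
  return failed;
}
```

A user would call it with the specifiers their tests depend on, for example preloadLibraries(["jsr:@db/sqlite@0.12"]), and fail the CI step if the returned list is not empty.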
4. File Locking
Use file locks to coordinate library downloads.
Pros:
- Prevents conflicts at OS level
Cons:
- Complex cross-platform implementation
- May not work with Deno's caching mechanism
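For reference, a lock file created with an exclusive open is one common way to coordinate processes. The sketch below uses the node:fs/promises API, which Deno also exposes through node: specifiers. withFileLock, the lock path, and the timing values are assumptions, and whether such a lock can be placed around the download performed by the library loader is exactly the open question noted above.

```typescript
import { open, rm } from "node:fs/promises";

// Hypothetical helper: runs `fn` while holding an exclusive lock file.
async function withFileLock<T>(
  lockPath: string,
  fn: () => Promise<T>,
  { retryMs = 50, timeoutMs = 30000 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      // "wx" creates the file and fails with EEXIST if it already exists,
      // so only one process can succeed at a time.
      const handle = await open(lockPath, "wx");
      await handle.close();
      break;
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") throw err;
      if (Date.now() > deadline) {
        throw new Error(`Timed out waiting for lock: ${lockPath}`);
      }
      // Another process holds the lock; wait and try again.
      await new Promise((resolve) => setTimeout(resolve, retryMs));
    }
  }
  try {
    return await fn();
  } finally {
    // Release the lock even if fn throws.
    await rm(lockPath, { force: true });
  }
}
```

A process that crashes while holding the lock leaves the file behind, so a real implementation would also need stale-lock detection. That is part of the complexity listed under the cons.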
Workaround (Current)
For now, users can add this to their CI workflow:
- name: Pre-load native libraries
  run: |
    deno eval "import { Database } from 'jsr:@db/sqlite@0.12'; const db = new Database(':memory:'); db.close();"
Discussion
Should probitas handle this automatically, or is documenting the workaround sufficient?
For automatic handling, retry logic seems most appropriate as it:
- Doesn't slow down the common case
- Handles the error gracefully
- Requires no user configuration