Add optional compile-time thread-safety for the forwarding API #179

Merged: ViralBShah merged 5 commits into main from feature-optional-thread-safety on Jun 7, 2026

@ViralBShah

Member

Motivation

The lbt_* mutation API (lbt_forward(), lbt_set_forward(), lbt_set_forward_by_index()) is documented as thread-unsafe: it mutates global forwarding tables with no locking, and the README explicitly warns against loading BLAS libraries from two threads at once. This adds an opt-in process-global lock so that the common case, racing initializers that each configure the library once at startup, is safe.

Design

  • New lbt_lock() / lbt_unlock() helpers:
    • non-Windows: a pthread_mutex_t with a static initializer
    • Windows: a CRITICAL_SECTION initialized in DllMain (DLL_PROCESS_ATTACH)
    • when LBT_THREADSAFE is undefined, both compile to no-ops, so the default build is behaviorally identical to today.
  • Public mutators now wrap unlocked *_impl workers. Internal callers (lbt_forward_impl, the startup constructor) call the workers directly, so there's no re-entrant locking / deadlock.
  • Make.inc gains an LBT_THREADSAFE toggle (default 0); when 1 it adds -DLBT_THREADSAFE and -pthread (non-Windows).
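
As an illustration of the design above, here is a minimal sketch of the lock helpers and the wrapper/worker split. It is not the code from this PR: the `demo_*` names, the toy table, and the simplified two-argument signatures are assumptions made for the example. Only the `LBT_THREADSAFE` macro, the `pthread_mutex_t` static initializer, and the `CRITICAL_SECTION` initialized in `DllMain` come from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global forwarding table. */
#define DEMO_NUM_SYMBOLS 4
static const void *demo_table[DEMO_NUM_SYMBOLS];

#ifdef LBT_THREADSAFE
#  ifdef _WIN32
#    include <windows.h>
static CRITICAL_SECTION demo_cs;
/* Only invoked by the loader when this file is built as a DLL. */
BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved) {
    (void)inst; (void)reserved;
    if (reason == DLL_PROCESS_ATTACH) InitializeCriticalSection(&demo_cs);
    return TRUE;
}
static void demo_lock(void)   { EnterCriticalSection(&demo_cs); }
static void demo_unlock(void) { LeaveCriticalSection(&demo_cs); }
#  else
#    include <pthread.h>
/* Static initializer: no runtime setup step is needed. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static void demo_lock(void) {
    int rc = pthread_mutex_lock(&demo_mutex);
    assert(rc == 0); (void)rc;
}
static void demo_unlock(void) {
    int rc = pthread_mutex_unlock(&demo_mutex);
    assert(rc == 0); (void)rc;
}
#  endif
#else
/* Default build: the helpers compile to no-ops. */
static void demo_lock(void)   {}
static void demo_unlock(void) {}
#endif

/* Unlocked worker: does the actual table mutation. */
static int32_t demo_set_forward_by_index_impl(int32_t idx, const void *addr) {
    if (idx < 0 || idx >= DEMO_NUM_SYMBOLS) return -1;
    demo_table[idx] = addr;
    return 0;
}

/* Public mutator: take the lock, delegate to the worker, release. */
int32_t demo_set_forward_by_index(int32_t idx, const void *addr) {
    demo_lock();
    int32_t ret = demo_set_forward_by_index_impl(idx, addr);
    demo_unlock();
    return ret;
}

/* Bulk worker built from the other worker, never from the public
 * wrapper, so the lock is taken exactly once per public call. */
static int32_t demo_forward_all_impl(const void *addr) {
    int32_t n = 0;
    for (int32_t i = 0; i < DEMO_NUM_SYMBOLS; i++)
        if (demo_set_forward_by_index_impl(i, addr) == 0) n++;
    return n;
}

/* Public bulk mutator; returns the number of entries updated. */
int32_t demo_forward_all(const void *addr) {
    demo_lock();
    int32_t n = demo_forward_all_impl(addr);
    demo_unlock();
    return n;
}
```

Built without `-DLBT_THREADSAFE` the sketch takes no lock at all; built with `-DLBT_THREADSAFE -pthread` on a non-Windows system, every public call acquires the mutex once, because the bulk worker only ever calls other workers.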

Scope

Only the mutators are locked. Read-only accessors (lbt_get_config(), lbt_get_forward()) are intentionally left unlocked, so callers must still avoid racing reads against concurrent reconfiguration. (Happy to extend coverage if reviewers prefer; that's partly why this is a draft.)
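
A caller-side sketch of what this scope means in practice: racing initializers may call a locked mutator concurrently, but reads should only start once all of them have finished. Everything named `demo_*` below is hypothetical and merely stands in for the locked mutators and the unlocked accessors; joining the threads is one possible way for a caller to order reads after configuration, not something this PR prescribes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_NUM_SYMBOLS 8
#define DEMO_NUM_THREADS 4

static const void *demo_table[DEMO_NUM_SYMBOLS];
static int32_t demo_updates;   /* total mutations applied, guarded by the mutex */
static int demo_backend;       /* its address plays the role of a BLAS symbol */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical locked mutator (stand-in for a mutator in a locked build). */
static int32_t demo_set_forward_by_index(int32_t idx, const void *addr) {
    if (idx < 0 || idx >= DEMO_NUM_SYMBOLS) return -1;
    pthread_mutex_lock(&demo_mutex);
    demo_table[idx] = addr;
    demo_updates++;
    pthread_mutex_unlock(&demo_mutex);
    return 0;
}

/* Hypothetical unlocked reader: safe only when no mutator can be running. */
static const void *demo_get_forward(int32_t idx) {
    return demo_table[idx];
}

/* Each initializer configures every entry, racing with the others. */
static void *demo_initializer(void *arg) {
    (void)arg;
    for (int32_t i = 0; i < DEMO_NUM_SYMBOLS; i++)
        demo_set_forward_by_index(i, &demo_backend);
    return NULL;
}

/* Start the racing initializers, wait for all of them, then read.
 * Returns how many entries point at the backend, or -1 if a thread
 * could not be created. */
int32_t demo_configure_then_read(void) {
    pthread_t threads[DEMO_NUM_THREADS];
    int started = 0;
    while (started < DEMO_NUM_THREADS &&
           pthread_create(&threads[started], NULL, demo_initializer, NULL) == 0)
        started++;

    /* pthread_join orders everything the initializers wrote before the
     * unlocked reads below. */
    for (int i = 0; i < started; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    if (started < DEMO_NUM_THREADS) return -1;

    int32_t hits = 0;
    for (int32_t i = 0; i < DEMO_NUM_SYMBOLS; i++)
        if (demo_get_forward(i) == &demo_backend) hits++;
    return hits;
}
```

The mutex keeps concurrent writers from interleaving; it does nothing for a reader that runs while a writer is still active, which is why the reads here happen only after every initializer has been joined.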

Verification

  • Builds cleanly both with and without LBT_THREADSAFE=1.
  • Public symbols remain exported; the *_impl workers stay internal (not in nm -D).
  • A dlopen smoke test calls the locked entry points repeatedly without deadlocking (exercises lock acquire/release).

Draft for review.

ViralBShah and others added 2 commits June 6, 2026 16:12
The `lbt_*` mutation API (`lbt_forward()`, `lbt_set_forward()`,
`lbt_set_forward_by_index()`) is documented as thread-unsafe: it mutates global
forwarding tables with no locking. This adds an opt-in process-global lock,
enabled with `make LBT_THREADSAFE=1`, that serializes those mutators so racing
initializers cannot corrupt the tables.

- New `lbt_lock()`/`lbt_unlock()` helpers: a `pthread_mutex_t` (static
  initializer) elsewhere, a `CRITICAL_SECTION` (initialized in `DllMain`) on
  Windows. When `LBT_THREADSAFE` is undefined they compile to no-ops, so the
  default build is behaviorally identical to before.
- The public mutators now wrap unlocked `*_impl` workers; internal callers
  (`lbt_forward_impl`, the constructor) call the workers directly to avoid
  re-entrant locking.
- `Make.inc` gains the `LBT_THREADSAFE` toggle (default 0), adding `-pthread`
  on non-Windows when enabled.

Scope note: only the mutators are locked; read-only accessors
(`lbt_get_config()`, `lbt_get_forward()`) remain unlocked, so callers must
still avoid racing reads against concurrent reconfiguration.

Verified: builds with and without the flag; public symbols still exported and
the `*_impl` workers stay internal; a dlopen smoke test exercises the lock path
without deadlock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The locking added in this PR was previously never compiled or run by the tests:
`build_libblastrampoline()` always built the default, no-op build.

Make the harness build with the optional internal locking compiled in by
default, so every backend's tests run against a `-DLBT_THREADSAFE` library and
each `lbt_forward()`/`lbt_set_forward()` goes through `lbt_lock()`/`lbt_unlock()`
across the whole existing CI matrix (all OSes/interfaces). The value is read
from the `LBT_THREADSAFE` env var, so `LBT_THREADSAFE=0` still tests the plain
build that ships by default.

Verified locally: the `direct` backend passes (74/74) with the library built
`-DLBT_THREADSAFE -pthread`, both with the env var set and via the new default.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ViralBShah ViralBShah marked this pull request as ready for review June 6, 2026 16:32
ViralBShah and others added 3 commits June 6, 2026 23:30
Mention `make LBT_THREADSAFE=1` in the thread-safety note: it compiles in a
process-global lock around the mutating API (pthread mutex / CRITICAL_SECTION),
is off by default, and guards mutators only (readers and call forwarding stay
lock-free).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ad-safety

# Conflicts:
#	src/libblastrampoline.c
Promote the thread-safety note to a `### Threading` section and trim it to the
essentials: thread-unsafe by default, `make LBT_THREADSAFE=1` adds a lock around
the mutating API, readers/forwarding stay lock-free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ViralBShah ViralBShah merged commit e6104b4 into main Jun 7, 2026
45 checks passed
@ViralBShah ViralBShah deleted the feature-optional-thread-safety branch June 7, 2026 14:03