fix: add buffered DMMF API to bypass V8 string length limit #5757

Merged
jacek-prisma merged 4 commits into prisma:main from chris-tophers:fix/dmmf-v8-string-limit on Mar 24, 2026

@chris-tophers
Contributor

@chris-tophers chris-tophers commented Feb 6, 2026

Summary

Adds three new wasm_bindgen exports to prisma-schema-wasm that allow DMMF JSON to be returned as chunked Uint8Array data instead of a single JS string, bypassing V8's hard string length limit of 0x1fffffe8 characters (~536MB).

  • get_dmmf_buffered(params) -> usize — serializes DMMF to an internal buffer using serde_json::to_vec(), returns total byte count
  • read_dmmf_chunk(offset, length) -> Vec<u8> — returns a chunk as Uint8Array (no V8 string limit)
  • free_dmmf_buffer() — releases the internal buffer

The existing get_dmmf() is unchanged for backward compatibility.
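As a rough illustration of the three-export shape listed above, here is a minimal sketch in plain Rust. It is not the PR's code: the `#[wasm_bindgen]` attributes and the `serde_json::to_vec()` call are left out, `store_dmmf_bytes` is a hypothetical stand-in for `get_dmmf_buffered(params)` that takes already-serialized bytes, and clamping out-of-range reads is an assumption about the bounds handling.

```rust
use std::sync::Mutex;

// Module-level buffer, standing in for the Mutex<Vec<u8>> mentioned in the PR.
static DMMF_BUFFER: Mutex<Vec<u8>> = Mutex::new(Vec::new());

/// Hypothetical stand-in for get_dmmf_buffered(): stores serialized bytes
/// and returns the total byte count. The real export serializes the DMMF itself.
pub fn store_dmmf_bytes(bytes: Vec<u8>) -> usize {
    let mut buf = DMMF_BUFFER.lock().unwrap();
    *buf = bytes;
    buf.len()
}

/// Returns up to `length` bytes starting at `offset`.
/// Requests past the end are clamped (an assumption for this sketch).
pub fn read_dmmf_chunk(offset: usize, length: usize) -> Vec<u8> {
    let buf = DMMF_BUFFER.lock().unwrap();
    let start = offset.min(buf.len());
    let end = start.saturating_add(length).min(buf.len());
    buf[start..end].to_vec()
}

/// Releases the buffer's memory by replacing it with an empty Vec.
pub fn free_dmmf_buffer() {
    let mut buf = DMMF_BUFFER.lock().unwrap();
    *buf = Vec::new();
}
```

A caller would store once, read chunks by offset until the returned byte count is covered, then free. Because the buffer is global, a second store overwrites the first.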

Root Cause

dmmf_json_from_validated_schema() in query-compiler/dmmf/src/lib.rs uses serde_json::to_string(), producing a single Rust String. When wasm-bindgen converts this to a JS string across the WASM FFI boundary, V8 rejects it if the string exceeds 0x1fffffe8 characters. No Node.js flags can change this limit — it's a V8 engine constant.
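For reference, the limit in the error message works out to 536,870,888 characters. The sketch below encodes that check; the constant and function names are hypothetical. It assumes the JSON is ASCII, so its byte count equals its character count. V8 counts UTF-16 code units, so non-ASCII payloads would differ.

```rust
/// V8's maximum string length as quoted in the error message (0x1fffffe8).
pub const V8_MAX_STRING_LENGTH: u64 = 0x1fff_ffe8;

/// Returns true if a payload of `len` characters is too long to be
/// materialized as a single JS string. The error says "longer than",
/// so a payload of exactly the limit is still allowed.
pub fn exceeds_v8_string_limit(len: u64) -> bool {
    len > V8_MAX_STRING_LENGTH
}
```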

Changes

| File | Change |
| --- | --- |
| query-compiler/dmmf/src/lib.rs | Add dmmf_json_bytes_from_validated_schema() using serde_json::to_vec() |
| prisma-fmt/src/get_dmmf.rs | Add get_dmmf_bytes() returning Vec<u8> |
| prisma-fmt/src/lib.rs | Export get_dmmf_bytes() |
| prisma-schema-wasm/src/lib.rs | Add 3 new wasm_bindgen exports with Mutex<Vec<u8>> buffer |

+76 lines, 4 files changed. No existing behavior modified.

Test Results

Tested with a production schema (1,600+ models, 1,100+ enums, 111K lines):

| Test | Result | Details |
| --- | --- | --- |
| Original get_dmmf() | FAILS | Cannot create a string longer than 0x1fffffe8 characters |
| get_dmmf_buffered() + chunked read | PASSES | 571MB DMMF, 35 chunks x 16MB each |
| Streaming JSON parse of chunks | PASSES | Full parse completed in ~21s |
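The chunk count in the table follows from simple ceiling division. The helper below is a hypothetical sketch of how a caller might plan its reads. The exact byte count of the test DMMF is not given here, so the figures in the note after the code assume roughly 571,000,000 bytes read in 16 MiB chunks.

```rust
/// Splits `total` bytes into (offset, length) pairs of at most `chunk_size` bytes.
/// The final pair may be shorter than `chunk_size`.
pub fn chunk_ranges(total: usize, chunk_size: usize) -> Vec<(usize, usize)> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut ranges = Vec::new();
    let mut offset = 0;
    while offset < total {
        // Never read past the end of the buffer.
        let len = chunk_size.min(total - offset);
        ranges.push((offset, len));
        offset += len;
    }
    ranges
}
```

Under those assumptions, `chunk_ranges(571_000_000, 16 * 1024 * 1024)` yields 35 ranges: 34 full chunks and a short final one.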

Limitations

This fix addresses the V8 string length limit but has an upper bound: WASM32 linear memory is capped at ~4GB. For schemas producing DMMF larger than ~4GB, the buffered approach would also fail. An alternative for extreme cases would be using the native schema engine binary path (which streams over stdio with no memory limit), but the buffered API should cover schemas well into the tens of thousands of models.

Companion Change Required

The TypeScript side (getDmmfWasm in packages/internals of prisma/prisma) needs to detect the V8 string limit error and fall back to the buffered API with a streaming JSON parser. Both this PR and the companion TypeScript PR are required for a complete fix. We will submit the companion PR to prisma/prisma as well.

Fixes: prisma/prisma#29111

Summary by CodeRabbit

  • New Features
    • Added new DMMF API supporting chunked data reading from JavaScript applications
    • Implemented buffer-based retrieval mechanism enabling efficient access to schema metadata
    • New functionality includes size checking methods and safe bounds-checked reading operations
    • Improved memory efficiency for large schema operations

@CLAassistant

CLAassistant commented Feb 6, 2026

CLA assistant check
All committers have signed the CLA.

chris-tophers added a commit to chris-tophers/prisma that referenced this pull request Feb 6, 2026
When prisma generate processes schemas that produce DMMF larger than
~536MB, the existing get_dmmf() WASM call fails with V8's hard-coded
string length limit (0x1fffffe8 characters). This adds automatic
fallback to the buffered DMMF API (get_dmmf_buffered + read_dmmf_chunk)
which returns data as chunked Uint8Array, bypassing the V8 string limit.

The fallback is transparent — it only activates when the V8 string limit
error is detected, so there is no behavior change for schemas that work
with the existing API.

Companion to: prisma/prisma-engines#5757
Fixes: prisma#29111
chris-tophers added a commit to chris-tophers/prisma-engines that referenced this pull request Feb 6, 2026
Add a `get-dmmf` subcommand to the prisma-fmt CLI that streams DMMF JSON
directly to stdout via serde_json::to_writer(). This approach has no
memory ceiling — unlike WASM (limited to ~4GB linear memory), the native
binary can stream arbitrarily large DMMF with only 1x peak memory
(the in-memory DMMF struct, no serialized buffer).

Changes:
- dmmf crate: add dmmf_json_to_writer() using serde_json::to_writer()
- prisma-fmt: add get_dmmf_to_writer() that validates + streams
- prisma-fmt: export get_dmmf_to_writer() from lib.rs
- prisma-fmt: add GetDmmf CLI variant, reads stdin params, streams to stdout

Alternative approach to the buffered WASM API in prisma#5757.
Fixes: prisma/prisma#29111

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chris-tophers
Contributor Author

Alternative approach submitted: #5761

I've also submitted a companion binary streaming approach as an alternative to this WASM buffered API. The key difference:

| Aspect | WASM Buffered (this PR) | Binary Streaming (#5761) |
| --- | --- | --- |
| Memory ceiling | ~1.5-2GB (WASM32 limit) | Unlimited |
| Peak memory | 2x DMMF (struct + buffer) | 1x (streams, no buffer) |
| New binary | No | Yes (prisma-fmt) |
| Rust complexity | Medium (Mutex, chunks) | Low (to_writer + CLI cmd) |

The binary approach uses serde_json::to_writer() to stream DMMF JSON directly to stdout from a native prisma-fmt get-dmmf subcommand — no intermediate String or Vec<u8> allocation. The TypeScript side spawns the binary and stream-parses stdout via @streamparser/json.
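To make the "no intermediate allocation" point concrete, here is a small std-only sketch of the same idea: writing JSON piece by piece into any `Write` sink. It is a hand-rolled stand-in for what `serde_json::to_writer()` does for arbitrary types. The function name is hypothetical and the escaping is deliberately minimal (quotes and backslashes only), so treat it as an illustration rather than a serializer.

```rust
use std::io::{self, Write};

/// Writes `names` as a JSON array of strings directly to `out`,
/// one element at a time, so the full document is never held in memory.
pub fn write_json_string_array<W: Write>(mut out: W, names: &[&str]) -> io::Result<()> {
    out.write_all(b"[")?;
    for (i, name) in names.iter().enumerate() {
        if i > 0 {
            out.write_all(b",")?;
        }
        out.write_all(b"\"")?;
        for ch in name.chars() {
            match ch {
                // Minimal escaping only; control characters are not handled.
                '"' => out.write_all(b"\\\"")?,
                '\\' => out.write_all(b"\\\\")?,
                c => write!(out, "{}", c)?,
            }
        }
        out.write_all(b"\"")?;
    }
    out.write_all(b"]")?;
    out.flush()
}
```

Passing `std::io::stdout().lock()` as `out` would stream to stdout. Passing `&mut Vec<u8>` collects the bytes instead, which is convenient for testing.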

Both approaches solve the immediate V8 string limit issue. The Prisma team can choose whichever fits best, or combine both (WASM buffered as primary fallback, binary streaming as secondary for schemas that exceed WASM32's ~4GB limit).

Companion TypeScript PRs:

Comment thread prisma-schema-wasm/src/lib.rs Outdated
@jacek-prisma
Contributor

jacek-prisma commented Feb 9, 2026

Thanks for submitting the PR! I think we can merge this change (along with the prisma part), conditional on my comment getting resolved

@codspeed-hq

codspeed-hq bot commented Feb 9, 2026

Merging this PR will not alter performance

✅ 11 untouched benchmarks
⏩ 11 skipped benchmarks [1]


Comparing chris-tophers:fix/dmmf-v8-string-limit (ee8c8a0) with main (280c870)

Footnotes

  1. 11 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Add get_dmmf_buffered(), read_dmmf_chunk(), and free_dmmf_buffer() to
prisma-schema-wasm. These allow the DMMF JSON to be returned as chunked
Uint8Array data instead of a single JS string, bypassing V8's hard limit
of ~536MB (0x1fffffe8 characters).

For schemas generating DMMF larger than ~536MB, the existing get_dmmf()
throws 'Cannot create a string longer than 0x1fffffe8 characters' because
wasm-bindgen converts the Rust String to a JS string. The new buffered API
keeps the JSON as bytes in WASM linear memory and lets JS read chunks as
Uint8Array (which has no such limit).

Changes:
- query-compiler/dmmf/src/lib.rs: add dmmf_json_bytes_from_validated_schema()
  using serde_json::to_vec() instead of to_string()
- prisma-fmt/src/get_dmmf.rs: add get_dmmf_bytes() returning Vec<u8>
- prisma-fmt/src/lib.rs: export get_dmmf_bytes()
- prisma-schema-wasm/src/lib.rs: add 3 wasm_bindgen exports:
  - get_dmmf_buffered(params) -> byte count
  - read_dmmf_chunk(offset, length) -> Uint8Array
  - free_dmmf_buffer()

Tested with a very large schema producing DMMF over 536MB:
- Original get_dmmf: FAILS (V8 string limit)
- Buffered API: PASSES (chunked streaming successful)

Fixes: prisma/prisma#29111

Address review feedback from @jacek-prisma: replace the `static
DMMF_BUFFER: Mutex<Vec<u8>>` global state with a caller-owned
`DmmfBuffer` struct exported via `#[wasm_bindgen]`.

The new API:
- `get_dmmf_buffered(params)` → returns a `DmmfBuffer` handle
- `buffer.len()` → total byte count
- `buffer.read_chunk(offset, length)` → `Uint8Array` chunk
- `buffer.free()` → release WASM memory (auto-provided by wasm-bindgen)

Benefits:
- No implicit global state — each call returns an independent buffer
- Multiple concurrent buffers work correctly
- `FinalizationRegistry` provides automatic cleanup as safety net
- `Symbol.dispose` enables `using` syntax in modern JS/TS

Tested with Node.js integration test: 7/7 tests pass including
concurrent buffers, use-after-free detection, and OOB bounds checking.
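The handle-based design described in this commit message can be sketched in plain Rust as follows. This is an approximation, not the merged code: the `#[wasm_bindgen]` attributes are omitted, `DmmfBuffer::new` and `read_all` are hypothetical helpers added for the sketch, and clamping out-of-range reads is an assumption about how the bounds checking behaves.

```rust
/// Caller-owned handle around serialized DMMF bytes.
/// The PR exports this type via #[wasm_bindgen]; the attribute is omitted here.
pub struct DmmfBuffer {
    data: Vec<u8>,
}

impl DmmfBuffer {
    /// Hypothetical constructor for this sketch; the PR builds the buffer
    /// inside get_dmmf_buffered(params).
    pub fn new(data: Vec<u8>) -> Self {
        Self { data }
    }

    /// Total byte count of the serialized DMMF.
    pub fn len(&self) -> usize {
        self.data.len()
    }

    pub fn is_empty(&self) -> bool {
        self.data.is_empty()
    }

    /// Returns up to `length` bytes from `offset`.
    /// Out-of-range requests are clamped (an assumption for this sketch).
    pub fn read_chunk(&self, offset: usize, length: usize) -> Vec<u8> {
        let start = offset.min(self.data.len());
        let end = start.saturating_add(length).min(self.data.len());
        self.data[start..end].to_vec()
    }
}

/// Reassembles the buffer by reading `chunk_size` bytes at a time,
/// mirroring the loop a JS caller would run.
pub fn read_all(buffer: &DmmfBuffer, chunk_size: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(buffer.len());
    let mut offset = 0;
    while offset < buffer.len() {
        let chunk = buffer.read_chunk(offset, chunk_size);
        if chunk.is_empty() {
            break; // guards against chunk_size == 0
        }
        offset += chunk.len();
        out.extend_from_slice(&chunk);
    }
    out
}
```

Because each handle owns its bytes, two buffers can be read side by side without interfering. In Rust, dropping the handle frees the memory, which is what the generated `free()` triggers from JS.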
@chris-tophers chris-tophers force-pushed the fix/dmmf-v8-string-limit branch from 8eed767 to c429262 Compare February 21, 2026 02:52
Comment thread prisma-fmt/src/get_dmmf.rs Outdated
@coderabbitai

coderabbitai bot commented Mar 16, 2026

Walkthrough

Added a byte-returning DMMF function in prisma-fmt and introduced a DmmfBuffer struct in prisma-schema-wasm that enables chunked reading of DMMF data from WASM memory, avoiding V8 string length constraints for large schemas.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DMMF Byte Conversion (prisma-fmt/src/lib.rs) | Added get_dmmf_bytes() function that converts the existing DMMF string result into a Vec<u8> for efficient byte-level handling. |
| WASM Buffered DMMF API (prisma-schema-wasm/src/lib.rs) | Introduced DmmfBuffer struct with len(), is_empty(), and read_chunk(offset, length) methods for streaming DMMF reads from WASM memory. Added get_dmmf_buffered() function that delegates to prisma-fmt's byte conversion and exposes the buffer for chunked access by JavaScript, with panic hook registration. |

Possibly related issues

No additional issues to link. The changes directly implement the WASM buffered approach proposed in the linked issue #29111 for handling V8 string length limits on large schema DMMF generation.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: add buffered DMMF API to bypass V8 string length limit' directly and clearly describes the main change: adding a buffered API to resolve V8 string limits. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #29111 objectives: prevents DMMF failures by adding a buffered API with chunked reads to bypass V8's JS string length limit, enabling large schema handling. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the buffered DMMF API implementation: get_dmmf_bytes() in prisma-fmt and the handle-based DmmfBuffer with read_chunk() in prisma-schema-wasm are narrowly scoped. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1


📥 Commits

Reviewing files that changed from the base of the PR and between abbfa42 and 0ff6e01.

📒 Files selected for processing (2)
  • prisma-fmt/src/lib.rs
  • prisma-schema-wasm/src/lib.rs

Comment thread prisma-fmt/src/lib.rs
@jacek-prisma jacek-prisma self-requested a review March 24, 2026 16:40
@jacek-prisma jacek-prisma merged commit 75cbdc1 into prisma:main Mar 24, 2026
98 of 99 checks passed
jacek-prisma pushed a commit to prisma/prisma that referenced this pull request Mar 24, 2026
When prisma generate processes schemas that produce DMMF larger than
~536MB, the existing get_dmmf() WASM call fails with V8's hard-coded
string length limit (0x1fffffe8 characters). This adds automatic
fallback to the buffered DMMF API (get_dmmf_buffered + read_dmmf_chunk)
which returns data as chunked Uint8Array, bypassing the V8 string limit.

The fallback is transparent — it only activates when the V8 string limit
error is detected, so there is no behavior change for schemas that work
with the existing API.

Companion to: prisma/prisma-engines#5757
Fixes: #29111

Development

Successfully merging this pull request may close these issues.

Prisma 7.x WASM DMMF generation fails with "Cannot create a string longer than 0x1fffffe8 characters"

4 participants