Get structured data out of LLM text — reliably, in any language.
Teach the model to wrap each payload between unmistakable markers, then extract it
with one pass that never re-parses content as code. Survives the quotes,
newlines, backticks, and braces that shatter a naive JSON.parse.
🌐 Website & guide → eth-interchained.github.io/sentinel-blocks
<<<JSON>>>
{ "summary": "one object, taken verbatim", "ok": true }
<<<END>>>
<<<FILE src/app.ts>>>
export const greet = (n: string) => `hi ${n}`; // code lives OUTSIDE the JSON
<<<END>>>
You ask a model for JSON. It mostly complies — until the day a value contains a quote, a newline, a stray brace, or a code snippet:
Now your pipeline falls back to a mock, retries, or crashes. The root cause is almost always code or free text stuffed into a JSON string.
Stop fighting the escaping. Tell the model to put each payload between sentinels:
<<<TAG>>>
...payload, taken VERBATIM...
<<<END>>>
and to give code its own named block instead of burying it in JSON:
<<<FILE path/to/file>>>
...file contents, verbatim...
<<<END>>>
Extraction is a single regex (or a tiny scanner). The bytes between the markers are returned untouched — so quotes, braces, and newlines inside them are harmless. For JSON specifically: keep the JSON to data only, and the #1 cause of parse failures disappears.
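The single-regex extraction described above can be sketched in a few lines of Python. This is an illustration only, not the sentinel-blocks source: the function name `extract_tagged_blocks_sketch` is hypothetical, and the sketch assumes each marker sits on its own line and that the payload is non-empty.

```python
import re

# Hypothetical helper for illustration; the published package may differ.
def extract_tagged_blocks_sketch(text, tag):
    """Return (arg, content) pairs for every <<<TAG arg>>> ... <<<END>>> block."""
    pattern = re.compile(
        r"<<<" + re.escape(tag)
        + r"(?:[ \t]+([^>\n]*?))?[ \t]*>>>[ \t]*\r?\n"  # opening marker, optional arg
        r"(.*?)"                                        # payload, captured as-is
        r"\r?\n[ \t]*<<<END>>>",                        # closing marker
        re.DOTALL,
    )
    return [(m.group(1) or "", m.group(2)) for m in pattern.finditer(text)]


reply = (
    "<<<JSON>>>\n"
    '{ "summary": "one object, taken verbatim", "ok": true }\n'
    "<<<END>>>\n"
    "<<<FILE src/app.ts>>>\n"
    "export const greet = (n: string) => `hi ${n}`;\n"
    "<<<END>>>\n"
)

for arg, content in extract_tagged_blocks_sketch(reply, "FILE"):
    print(arg, "->", content)
```

The payload group is never unescaped or evaluated, which is why backticks, quotes, and braces inside it pass through unchanged.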
Distilled from the surgical-edit parser in KeyStone-Lite, generalized into a tiny format with a formal spec and ports in 10 languages.
npm install sentinel-blocks # Node / TypeScript / Bun / Deno
pip install sentinel-blocks # Python 3.8+

Every other language is a single dependency-free file; vendor it:
| Language | Copy this into your project |
|---|---|
| C | implementations/c/sentinel_blocks.{h,c} |
| C++ (header-only) | implementations/cpp/sentinel_blocks.hpp |
| Go | implementations/go/sentinelblocks.go |
| Rust | implementations/rust (cargo add from git) |
| Java | implementations/java/SentinelBlocks.java |
| Ruby | implementations/ruby/sentinel_blocks.rb |
| PHP | implementations/php/sentinel_blocks.php |
| JS (no build) | implementations/javascript/sentinel-blocks.{mjs,cjs} |
TypeScript / JavaScript
import { jsonFromResponse, extractTaggedBlocks } from "sentinel-blocks";
const reply = await llm(prompt); // raw completion text
const meta = jsonFromResponse(reply); // robust: never breaks on code-in-string
for (const { arg, content } of extractTaggedBlocks(reply, "FILE")) {
writeFileSync(arg, content); // arg = path, content = verbatim file
}

Python
from sentinel_blocks import json_from_response, extract_tagged_blocks
meta = json_from_response(reply) # dict; raises only if nothing is parseable
for arg, content in extract_tagged_blocks(reply, "FILE"):
    with open(arg, "w") as f:  # close the file even if the write fails
        f.write(content)

Go, Rust, C, C++, Java, Ruby, PHP
import sb "sentinelblocks"
meta, err := sb.JSONFromResponse(reply, "") // map[string]interface{}
files := sb.ExtractTaggedBlocks(reply, "FILE") // []sb.Block{Arg, Content}

use sentinel_blocks as sb;
let json_text = sb::json_text_from_response(&reply, None); // Option<String> → serde_json
let files = sb::extract_tagged_blocks(&reply, "FILE"); // Vec<Block { arg, content }>

char *json = sb_json_text_from_response(reply, NULL); // feed to cJSON/jansson; sb_free(json)
sb_blocks files = sb_extract_blocks(reply, "FILE"); // sb_blocks_free(&files)

auto json = sentinel::jsonTextFromResponse(reply); // std::optional<std::string>
auto files = sentinel::extractTaggedBlocks(reply, "FILE");

var json = SentinelBlocks.jsonTextFromResponse(reply, null); // Optional<String>
var files = SentinelBlocks.extractTaggedBlocks(reply, "FILE");

meta = SentinelBlocks.json_from_response(reply) # Hash
files = SentinelBlocks.extract_tagged_blocks(reply, "FILE")

$meta = SentinelBlocks\json_from_response($reply); // associative array
$files = SentinelBlocks\extract_tagged_blocks($reply, 'FILE');

Same behavior everywhere; names follow each language's conventions.
| Purpose | TS / JS / C++ / Java | Python / Ruby / PHP / Rust / C |
|---|---|---|
| First block's content | extractBlock | extract_block |
| All blocks' content | extractBlocks | extract_blocks |
| Blocks with their arg | extractTaggedBlocks | extract_tagged_blocks |
| Build a block | wrap / wrapNamed | wrap / wrap_named |
| Strip fences + trailing commas | repairJson | repair_json |
| First balanced {…} | firstJsonObject | first_json_object |
| Robust JSON entry point | jsonFromResponse¹ | json_from_response¹ |
¹ Languages with a stdlib JSON parser (JS, TS, Python, Go, Ruby, PHP) return a
parsed value. Languages without one (C, C++, Rust, Java) expose
jsonTextFromResponse / json_text_from_response returning the cleaned JSON
text to hand to your JSON library. See the spec §4.
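As a rough illustration of the "Build a block" row, the sketch below assembles blocks by hand, which is handy for few-shot examples inside a prompt. The names `wrap_sketch` and `wrap_named_sketch` and their argument order are assumptions made for this example; check the spec for the real signatures.

```python
# Hypothetical builders for illustration; argument order is an assumption.
def wrap_sketch(tag, content):
    """Build an unnamed block: opening marker, verbatim content, <<<END>>>."""
    return f"<<<{tag}>>>\n{content}\n<<<END>>>"


def wrap_named_sketch(tag, arg, content):
    """Build a named block such as <<<FILE path>>>; arg must fit on the marker line."""
    if "\n" in arg or ">" in arg:
        raise ValueError("arg must not contain newlines or '>'")
    return f"<<<{tag} {arg}>>>\n{content}\n<<<END>>>"


few_shot = "\n".join([
    wrap_sketch("JSON", '{ "summary": "adds a greeter", "ok": true }'),
    wrap_named_sketch("FILE", "src/app.ts", "export const greet = (n: string) => `hi ${n}`;"),
])
print(few_shot)
```

Because the content is inserted as-is, the builder needs no escaping step; the only validation is on the arg, which shares a line with the marker.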
The format only pays off if the model emits it. Two ready-to-paste prompts:
- examples/prompts/json-extraction.md: a single JSON object.
- examples/prompts/codegen-files.md: metadata + multiple files.
The golden rules (full version in docs/PROMPTING.md):
- Reply only in sentinel blocks — nothing before the first or after the last.
- JSON blocks carry data only — never code, never markdown fences.
- Code and long text go in their own <<<FILE …>>> blocks, verbatim.
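One way to check the first rule on the receiving side is to measure what the model wrote outside its blocks and retry when anything is left over. The sketch below is hypothetical (`stray_text` is not part of the package) and assumes tags are uppercase identifiers.

```python
import re

# Assumption: tags look like JSON or FILE (uppercase letters, digits, underscores).
BLOCK = re.compile(
    r"<<<(?!END>>>)[A-Z][A-Z0-9_]*[^>\n]*>>>[ \t]*\r?\n"  # opening marker, never END itself
    r".*?"                                               # payload, skipped unread
    r"\r?\n[ \t]*<<<END>>>",                              # closing marker
    re.DOTALL,
)


def stray_text(reply):
    """Return (before, after): text outside the first and last sentinel blocks."""
    blocks = list(BLOCK.finditer(reply))
    if not blocks:
        return (reply.strip(), "")
    before = reply[: blocks[0].start()].strip()
    after = reply[blocks[-1].end():].strip()
    return (before, after)


chatty = "Sure! Here you go:\n<<<JSON>>>\n{}\n<<<END>>>\nHope that helps."
print(stray_text(chatty))
```

A reply that follows the rule yields two empty strings; anything else is a signal that the prompt needs tightening.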
Runnable end-to-end examples: examples/openai-node.mjs,
examples/anthropic_python.py, and an EJS
prompt-templating demo in examples/ejs/.
- Content is never re-parsed as code. Extraction returns the bytes between the markers untouched, so quotes/braces/newlines inside them can't corrupt anything.
- Code leaves the JSON. The most common JSON.parse failure (unescaped code in a string) is designed out, not patched up.
- <<<…>>> is rare in natural text and survives markdown, diffs, and streaming.
- Graceful JSON fallback. jsonFromResponse tries the block, then a light repair, then a balanced-brace slice, and only then fails loudly; it never fabricates.
More detail and failure-mode analysis: docs/WHY-IT-WORKS.md.
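The fallback order in the last bullet (block, light repair, balanced-brace slice, loud failure) can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: all function names are hypothetical, and the repair step is deliberately naive in that it does not skip string literals.

```python
import json
import re

JSON_BLOCK = re.compile(r"<<<JSON>>>[ \t]*\r?\n(.*?)\r?\n[ \t]*<<<END>>>", re.DOTALL)
FENCE_LINE = re.compile(r"^[ \t]*```.*$", re.MULTILINE)
TRAILING_COMMA = re.compile(r",(\s*[}\]])")


def light_repair(text):
    """Drop markdown fence lines and trailing commas (naive: ignores string context)."""
    return TRAILING_COMMA.sub(r"\1", FENCE_LINE.sub("", text)).strip()


def first_balanced_object(text):
    """Return the first balanced {...} slice, ignoring braces inside JSON strings."""
    start = text.find("{")
    if start < 0:
        return None
    depth, in_string, escaped = 0, False, False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start : i + 1]
    return None


def json_from_response_sketch(reply):
    """Try the raw block first, then repaired text, then a brace slice; else raise."""
    match = JSON_BLOCK.search(reply)
    body = match.group(1) if match else reply
    repaired = light_repair(body)
    candidates = [body, repaired]
    sliced = first_balanced_object(repaired)
    if sliced is not None:
        candidates.append(sliced)
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError("no parseable JSON found in reply")  # fail loudly, never invent data
```

Trying the untouched body first means well-formed JSON is never altered by the repair step; the later stages only run when parsing has already failed.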
All ports pass the same 8-invariant conformance suite (spec §7).
| Language | Source | Tests |
|---|---|---|
| TypeScript | src/index.ts | node tests/test_core.ts |
| JavaScript (ESM + CJS) | implementations/javascript | node tests/test_core.mjs · node tests/test_core.cjs |
| Python | python/sentinel_blocks | python3 tests/test_core.py |
| Go | implementations/go | go test ./... |
| Rust | implementations/rust | cargo test |
| C | implementations/c | cc test_sentinel_blocks.c sentinel_blocks.c -o t && ./t |
| C++ | implementations/cpp | c++ -std=c++17 test_sentinel_blocks.cpp -o t && ./t |
| Java | implementations/java | javac *.java && java SentinelBlocksTest |
| Ruby | implementations/ruby | ruby test_sentinel_blocks.rb |
| PHP | implementations/php | php test_sentinel_blocks.php |
sentinel-blocks/
├── SPEC.md formal format specification (v1.0)
├── src/index.ts canonical TypeScript (npm package source)
├── python/sentinel_blocks/ canonical Python (PyPI package source)
├── implementations/ ports: javascript, c, cpp, go, rust, java, ruby, php
├── tests/ shared invariant suites (TS, JS, Python)
├── examples/ prompts, OpenAI/Python runners, EJS templating
└── docs/ PROMPTING.md, WHY-IT-WORKS.md
New language ports are very welcome — porting is mechanical and the conformance
suite tells you when you're done. See CONTRIBUTING.md for the
"add a language in 6 steps" guide.
The technique originates in KeyStone-Lite's surgical-edit parser, generalized here into a documented, multi-language format.
Licensed under GPL-3.0-or-later — see LICENSE. Part of the
Interchained ecosystem.
Curious where the Sentinels come from? There's a bit of lore. 🛡️
Built by Mark Allen Evans Jr. (INTERCHAINED, LLC) with Claude Sonnet 4.6 on Hyperagent.
"Take one idea, turn it into an LP, then an app, then a system, then a platform, then infrastructure that is irreplaceable."
{ "code": "const re = /\{.*\}/; // oops: unescaped braces & slashes" }
// ^ JSON.parse throws: "Expected ',' or '}'…"