GitHub - vectoral-io/lyra: Lyra is a lightweight engine for building precomputed, faceted index bundles from structured data.

A lightweight engine for building precomputed indexes from structured data.

Build an index from your data offline, ship it as JSON, and run fast deterministic queries anywhere — browser, server, edge, or as a tool for an LLM agent.

Not a vector DB. Not a warehouse. A portable, manifest-driven query layer between raw data and whoever needs to filter it.

Install

npm install @vectoral/lyra

Quick start

import { createBundle, LyraBundle } from '@vectoral/lyra';

// 1. Build a bundle (typically in CI / a build step)
const bundle = await createBundle(tickets, {
  datasetId: 'tickets-2025-11-22',
  equal: ['customer', 'priority', 'status'],
  ranges: ['createdAt'],
});

// 2. Persist as JSON (portable, debuggable) — or as a binary container.
const json = JSON.stringify(bundle.toJSON());
const bytes = bundle.serialize('binary');     // Uint8Array, ~50× faster cold-start

// 3. Load and query anywhere
const loaded = LyraBundle.load<Ticket>(JSON.parse(json));
//                  ↑ also accepts a Uint8Array (autodetected v4 binary)

const result = loaded.query({
  equal: { customer: 'Acme Corp', priority: 'high' },
  ranges: { createdAt: { min: Date.now() - 7 * 86400_000 } },
  limit: 50,
});

result.items;  // matching tickets
result.total;  // total matches (ignores pagination)

Serialization

Lyra ships two interoperable formats. Both round-trip identical query results.

Format	Producer	Consumer	When to use
JSON (v3.x)	`bundle.toJSON()` / `bundle.serialize()`	`LyraBundle.load(json)`	Debugging, portability, transport over plain HTTP/JSON pipelines
Binary (v4.x)	`bundle.serialize('binary')` → `Uint8Array`	`LyraBundle.loadBinary(bytes)` or `LyraBundle.load(bytes)` (autodetect)	Production hot path: smaller wire size + drastically faster cold-start

The binary container (v4.1) holds a single header JSON, then aligned blocks for items (columnar — dictionary-encoded strings, raw f64 numbers, packed booleans), facet posting lists (delta + varint), null posting lists, and range columns (zero-copy Float64Array views when alignment permits).

Headline measurements on a 300k-item real-world fixture (deeply-nested record shape — strings, numbers, arrays, and a per-row Record<string, …> step map):

	v3.1 JSON	v4.1 binary
Wire size (gzipped)	49.3 MB	43.6 MB (12% smaller)
Critical-path cold start (post-network main thread)	~887 ms (`JSON.parse`) + 2 ms (`load`)	~18 ms (`loadBinary`)
Speedup	—	~49× faster

JSON.parse dominates the v3.1 cold start because fetch().json() is buffered + synchronous. The v4.1 path skips it entirely via arrayBuffer() → loadBinary. Range columns are zero-copy when buffer alignment permits.

// Browser hot path (assumes API responds with the binary container):
const res = await fetch('/api/bundle');
const bytes = new Uint8Array(await res.arrayBuffer());
const bundle = LyraBundle.loadBinary<Ticket>(bytes);
bundle.query({ equal: { status: 'open' } });

v3.1 JSON readers stay supported indefinitely. See docs/migration-v4.md for the full migration guide and docs/bundle-json-spec.md for the on-the-wire format spec.

Query

All operators AND together. All accept a scalar or array (IN semantics).

bundle.query({
  equal:     { status: 'open', priority: ['high', 'urgent'] },
  notEqual:  { region: 'EU' },
  isNull:    ['archivedAt'],
  isNotNull: ['owner'],
  ranges:    { createdAt: { min: oneWeekAgo, max: now } },
  limit:     50,
  offset:    0,
  includeFacetCounts: true,   // populate result.facets
});

null in equal/notEqual is normalized to isNull/isNotNull. [val, null] matches val OR null.

Aliases

Declare human-readable fields that resolve to canonical IDs. Lookup tables are auto-generated from your data.

const bundle = await createBundle(items, {
  datasetId: 'zones',
  equal: ['zone_id'],
  aliases: { zone_name: 'zone_id' },     // zone_name → zone_id
});

// Query by alias
bundle.query({ equal: { zone_name: 'Zone A' } });

// Enrich results with alias values (opt-in)
const result = bundle.query({
  equal: { zone_id: 'Z-001' },
  enrichAliases: true,                    // or ['zone_name']
});
result.items[0].zone_name;                // ['Zone A']

// Or enrich on demand with batch dedup
const enriched = bundle.enrichItems(result.items, ['zone_name']);

LLM agent tool

import { buildOpenAiTool } from '@vectoral/lyra';

const tool = buildOpenAiTool(bundle.describe(), {
  name: 'queryTickets',
  description: 'Query support tickets',
});

// Pass `tool` to your agent framework; execute queries via bundle.query(args).

The schema is derived from the manifest's capabilities, so it always matches what the bundle actually supports. See examples/agent-tool/.

Facet summaries (for dashboards)

bundle.getFacetSummary('status', { equal: { customer: 'ACME' } });
// { field: 'status', values: [{ value: 'open', count: 12 }, ...] }

Configuration styles

Simple (type inference, auto-meta):

createBundle(items, {
  datasetId: 'tickets',
  equal: ['status', 'priority'],
  ranges: ['createdAt'],
  aliases: { zone_name: 'zone_id' },
});

Explicit (full control):

createBundle(items, {
  datasetId: 'tickets',
  fields: {
    id:        { kind: 'id',    type: 'string' },
    status:    { kind: 'facet', type: 'string' },
    createdAt: { kind: 'range', type: 'date' },
  },
});

Simple config auto-adds remaining primitive fields as meta. Disable with autoMeta: false.

Behavior

Queries are deterministic: same bundle + same query = same result.
Unknown fields → no matches (fail-closed, never throws).
Bad pagination clamped (negative offset → 0, negative limit → 0 items).
null / undefined values are excluded from the facet index and from range results.

Docs

When to use Lyra

Good fit: structured records with a known schema, sub-millisecond filter queries in the browser / edge / agent, medium datasets (thousands to low hundreds of thousands of rows).

Not a fit: full-text search, semantic similarity, transactional writes, datasets that need live updates.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
.config		.config
.cursor/rules		.cursor/rules
.github		.github
.vscode		.vscode
docs		docs
examples		examples
src		src
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
VERSIONING.md		VERSIONING.md
bun.lock		bun.lock
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Install

Quick start

Serialization

Query

Aliases

LLM agent tool

Facet summaries (for dashboards)

Configuration styles

Behavior

Docs

When to use Lyra

License

About

Uh oh!

Releases

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Install

Quick start

Serialization

Query

Aliases

LLM agent tool

Facet summaries (for dashboards)

Configuration styles

Behavior

Docs

When to use Lyra

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Contributors

Uh oh!

Languages