Lyra

A lightweight engine for building precomputed indexes from structured data.


Build an index from your data offline, ship it as JSON, and run fast deterministic queries anywhere — browser, server, edge, or as a tool for an LLM agent.

Not a vector DB. Not a warehouse. A portable, manifest-driven query layer between raw data and whoever needs to filter it.

Install

npm install @vectoral/lyra

Quick start

import { createBundle, LyraBundle } from '@vectoral/lyra';

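// `tickets` is your own array of records and `Ticket` its element type (neither is defined in this snippet).
// 1. Build a bundle (typically in CI / a build step)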
const bundle = await createBundle(tickets, {
  datasetId: 'tickets-2025-11-22',
  equal: ['customer', 'priority', 'status'],
  ranges: ['createdAt'],
});

// 2. Persist as JSON (portable, debuggable) — or as a binary container.
const json = JSON.stringify(bundle.toJSON());
const bytes = bundle.serialize('binary');     // Uint8Array, ~50× faster cold-start

// 3. Load and query anywhere
const loaded = LyraBundle.load<Ticket>(JSON.parse(json));
//                  ↑ also accepts a Uint8Array (autodetected v4 binary)

const result = loaded.query({
  equal: { customer: 'Acme Corp', priority: 'high' },
  ranges: { createdAt: { min: Date.now() - 7 * 86400_000 } },
  limit: 50,
});

result.items;  // matching tickets
result.total;  // total matches (ignores pagination)

Serialization

Lyra ships two interoperable formats. Both round-trip identical query results.

| Format | Producer | Consumer | When to use |
| --- | --- | --- | --- |
| JSON (v3.x) | bundle.toJSON() / bundle.serialize() | LyraBundle.load(json) | Debugging, portability, transport over plain HTTP/JSON pipelines |
| Binary (v4.x) | bundle.serialize('binary') → Uint8Array | LyraBundle.loadBinary(bytes) or LyraBundle.load(bytes) (autodetect) | Production hot path: smaller wire size + drastically faster cold-start |

The binary container (v4.1) holds a single header JSON, then aligned blocks for items (columnar — dictionary-encoded strings, raw f64 numbers, packed booleans), facet posting lists (delta + varint), null posting lists, and range columns (zero-copy Float64Array views when alignment permits).
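As a rough illustration of what "delta + varint" means for a posting list, here is a minimal sketch. It is not Lyra's encoder, and the real byte layout is whatever the format spec defines; the function names `encodePostings` and `decodePostings` are hypothetical. The sketch assumes the ids are non-negative integers sorted in ascending order and uses a LEB128-style varint (7 payload bits per byte).

```typescript
// Illustrative only: hypothetical helpers, not part of the Lyra API.
// Assumes `sortedIds` holds non-negative integers in ascending order.
function encodePostings(sortedIds: number[]): Uint8Array {
  const out: number[] = [];
  let prev = 0;
  for (const id of sortedIds) {
    // Store the gap to the previous id; gaps stay small when ids are dense.
    let delta = id - prev;
    prev = id;
    // Varint: low 7 bits per byte, high bit set while more bytes follow.
    while (delta >= 0x80) {
      out.push((delta & 0x7f) | 0x80);
      delta = Math.floor(delta / 128);
    }
    out.push(delta);
  }
  return Uint8Array.from(out);
}

function decodePostings(bytes: Uint8Array): number[] {
  const ids: number[] = [];
  let prev = 0;  // running sum of deltas
  let value = 0; // delta currently being assembled
  let scale = 1; // 128^k for the k-th byte of the current varint
  bytes.forEach((byte) => {
    value += (byte & 0x7f) * scale;
    if (byte & 0x80) {
      scale *= 128; // continuation bit set: keep reading
    } else {
      prev += value; // delta complete: undo the delta encoding
      ids.push(prev);
      value = 0;
      scale = 1;
    }
  });
  return ids;
}
```

With this scheme a dense list costs roughly one byte per id, which is the general reason formats of this kind pair delta encoding with varints.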

Headline measurements on a 300k-item real-world fixture (deeply-nested record shape — strings, numbers, arrays, and a per-row Record<string, …> step map):

| Metric | v3.1 JSON | v4.1 binary |
| --- | --- | --- |
| Wire size (gzipped) | 49.3 MB | 43.6 MB (12% smaller) |
| Critical-path cold start (post-network main thread) | ~887 ms (JSON.parse) + 2 ms (load) | ~18 ms (loadBinary) |
| Speedup | | ~49× faster |

JSON.parse dominates the v3.1 cold start because fetch().json() buffers the whole response and then parses it synchronously. The v4.1 path skips that step entirely via arrayBuffer() → loadBinary. Range columns are zero-copy when buffer alignment permits.

// Browser hot path (assumes API responds with the binary container):
const res = await fetch('/api/bundle');
const bytes = new Uint8Array(await res.arrayBuffer());
const bundle = LyraBundle.loadBinary<Ticket>(bytes);
bundle.query({ equal: { status: 'open' } });
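The "zero-copy when alignment permits" caveat comes from a general typed-array rule: a Float64Array view must start at a byte offset that is a multiple of 8. The sketch below shows that rule in isolation. `viewF64` is a hypothetical helper written for this example, not a Lyra function, and Lyra's loader may handle the misaligned case differently.

```typescript
// Illustrative only: hypothetical helper, not part of the Lyra API.
// Returns `count` float64 values starting `offset` bytes into `bytes`.
function viewF64(bytes: Uint8Array, offset: number, count: number): Float64Array {
  // The offset that matters is relative to the underlying ArrayBuffer.
  const absolute = bytes.byteOffset + offset;
  if (absolute % Float64Array.BYTES_PER_ELEMENT === 0) {
    // Aligned: the view shares memory with the original buffer (no copy).
    return new Float64Array(bytes.buffer, absolute, count);
  }
  // Misaligned: copy the bytes into a fresh buffer, which starts at offset 0.
  const copy = bytes.slice(offset, offset + count * Float64Array.BYTES_PER_ELEMENT);
  return new Float64Array(copy.buffer, 0, count);
}
```

Both branches return the same values; only the aligned branch avoids the copy, which is why a container format would pad its blocks to 8-byte boundaries.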

v3.1 JSON readers stay supported indefinitely. See docs/migration-v4.md for the full migration guide and docs/bundle-json-spec.md for the on-the-wire format spec.

Query

All operators AND together. All accept a scalar or array (IN semantics).

bundle.query({
  equal:     { status: 'open', priority: ['high', 'urgent'] },
  notEqual:  { region: 'EU' },
  isNull:    ['archivedAt'],
  isNotNull: ['owner'],
  ranges:    { createdAt: { min: oneWeekAgo, max: now } },
  limit:     50,
  offset:    0,
  includeFacetCounts: true,   // populate result.facets
});

null in equal/notEqual is normalized to isNull/isNotNull. [val, null] matches val OR null.
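To make the `equal` semantics concrete, here is a small reference predicate over plain objects: fields AND together, an array means "any of these", and `null` in the wanted values matches a null field. It is a sketch of the behaviour described above, not how Lyra evaluates queries (which works against precomputed indexes). `matchesEqual` is a hypothetical name, and treating `undefined` the same as `null` is an assumption of this example.

```typescript
// Illustrative only: a brute-force reference for the `equal` operator.
type Scalar = string | number | boolean | null;
type EqualFilter = Record<string, Scalar | Scalar[]>;

function matchesEqual(item: Record<string, unknown>, equal: EqualFilter): boolean {
  // Every field must match (AND across fields).
  return Object.entries(equal).every(([field, wanted]) => {
    // A scalar is treated as a one-element list (IN semantics).
    const candidates = Array.isArray(wanted) ? wanted : [wanted];
    // Assumption: a missing field behaves like an explicit null.
    const actual = item[field] ?? null;
    // Any candidate may match (OR within a field); null matches null.
    return candidates.some((candidate) => candidate === actual);
  });
}
```

For example, `{ owner: ['bob', null] }` matches rows owned by `'bob'` as well as rows with no owner, mirroring the `[val, null]` rule.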

Aliases

Declare human-readable fields that resolve to canonical IDs. Lookup tables are auto-generated from your data.

const bundle = await createBundle(items, {
  datasetId: 'zones',
  equal: ['zone_id'],
  aliases: { zone_name: 'zone_id' },     // zone_name → zone_id
});

// Query by alias
bundle.query({ equal: { zone_name: 'Zone A' } });

// Enrich results with alias values (opt-in)
const result = bundle.query({
  equal: { zone_id: 'Z-001' },
  enrichAliases: true,                    // or ['zone_name']
});
result.items[0].zone_name;                // ['Zone A']

// Or enrich on demand with batch dedup
const enriched = bundle.enrichItems(result.items, ['zone_name']);
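Conceptually, an auto-generated lookup table is a pair of maps built in one pass over the data. The sketch below shows one way such a table could be derived. It is an assumption about the general technique, not Lyra's internal structure, and `ZoneRow`, `AliasTable`, `buildAliasTable`, and `resolveAlias` are hypothetical names.

```typescript
// Illustrative only: hypothetical types and helpers, not part of the Lyra API.
// Assumes every row carries both the canonical id and its human-readable name.
interface ZoneRow {
  zone_id: string;
  zone_name: string;
}

interface AliasTable {
  aliasToIds: Map<string, Set<string>>;  // 'Zone A' -> {'Z-001'}
  idToAliases: Map<string, Set<string>>; // 'Z-001'  -> {'Zone A'}
}

function buildAliasTable(rows: ZoneRow[]): AliasTable {
  const aliasToIds = new Map<string, Set<string>>();
  const idToAliases = new Map<string, Set<string>>();
  const add = (map: Map<string, Set<string>>, key: string, value: string): void => {
    const bucket = map.get(key) ?? new Set<string>();
    bucket.add(value); // the Set deduplicates repeated pairs
    map.set(key, bucket);
  };
  for (const row of rows) {
    add(aliasToIds, row.zone_name, row.zone_id);
    add(idToAliases, row.zone_id, row.zone_name);
  }
  return { aliasToIds, idToAliases };
}

// Resolve an alias value to canonical ids; an unknown alias resolves to nothing.
function resolveAlias(table: AliasTable, name: string): string[] {
  const ids = table.aliasToIds.get(name);
  return ids ? Array.from(ids) : [];
}
```

Keeping sets on both sides allows one name to map to several ids and one id to carry several names, which is consistent with enriched values coming back as arrays.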

LLM agent tool

import { buildOpenAiTool } from '@vectoral/lyra';

const tool = buildOpenAiTool(bundle.describe(), {
  name: 'queryTickets',
  description: 'Query support tickets',
});

// Pass `tool` to your agent framework; execute queries via bundle.query(args).

The schema is derived from the manifest's capabilities, so it always matches what the bundle actually supports. See examples/agent-tool/.
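How a tool call reaches `bundle.query` depends on your agent framework. As a minimal sketch, assuming the framework hands you an OpenAI-style tool call whose arguments arrive as a JSON string, the dispatch step might look like the following. `ToolCall`, `QueryFn`, and `runToolCall` are hypothetical names for this example, and the error payloads are an arbitrary choice.

```typescript
// Illustrative only: hypothetical glue code, not part of the Lyra API.
// Assumes the tool call carries a function name and JSON-encoded arguments.
interface ToolCall {
  function: { name: string; arguments: string };
}

type QueryFn = (args: Record<string, unknown>) => { items: unknown[]; total: number };

function runToolCall(call: ToolCall, toolName: string, query: QueryFn): string {
  if (call.function.name !== toolName) {
    return JSON.stringify({ error: `unknown tool: ${call.function.name}` });
  }
  let args: unknown;
  try {
    args = JSON.parse(call.function.arguments);
  } catch {
    // Models occasionally emit malformed JSON; report it instead of throwing.
    return JSON.stringify({ error: 'arguments were not valid JSON' });
  }
  if (typeof args !== 'object' || args === null || Array.isArray(args)) {
    return JSON.stringify({ error: 'arguments must be a JSON object' });
  }
  const result = query(args as Record<string, unknown>);
  // Return a string so it can be sent back to the model as the tool output.
  return JSON.stringify({ total: result.total, items: result.items });
}
```

In practice the `query` argument would be a thin wrapper around `bundle.query`, and the returned string would go back to the model as the tool result.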

Facet summaries (for dashboards)

bundle.getFacetSummary('status', { equal: { customer: 'ACME' } });
// { field: 'status', values: [{ value: 'open', count: 12 }, ...] }
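The returned shape is a per-value count over the rows that pass the filter. A brute-force equivalent over already-filtered rows can be sketched as below. This is only a reference for the output shape, not how Lyra computes it; `summarizeFacet` is a hypothetical name, and both the sort order (count descending, then value) and the skipping of null values are assumptions of this example.

```typescript
// Illustrative only: a brute-force reference for the summary shape.
interface FacetSummary {
  field: string;
  values: { value: string; count: number }[];
}

function summarizeFacet(rows: Record<string, unknown>[], field: string): FacetSummary {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const raw = row[field];
    // Assumption: null / undefined values are not counted.
    if (raw === null || raw === undefined) continue;
    const key = String(raw);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  // Assumption: most frequent first, ties broken by value for determinism.
  const values = Array.from(counts, ([value, count]) => ({ value, count })).sort(
    (a, b) => b.count - a.count || (a.value < b.value ? -1 : a.value > b.value ? 1 : 0),
  );
  return { field, values };
}
```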

Configuration styles

Simple (type inference, auto-meta):

createBundle(items, {
  datasetId: 'tickets',
  equal: ['status', 'priority'],
  ranges: ['createdAt'],
  aliases: { zone_name: 'zone_id' },
});

Explicit (full control):

createBundle(items, {
  datasetId: 'tickets',
  fields: {
    id:        { kind: 'id',    type: 'string' },
    status:    { kind: 'facet', type: 'string' },
    createdAt: { kind: 'range', type: 'date' },
  },
});

Simple config auto-adds remaining primitive fields as meta. Disable with autoMeta: false.

Behavior

  • Queries are deterministic: same bundle + same query = same result.
  • Unknown fields → no matches (fail-closed, never throws).
  • Bad pagination is clamped (negative offset → 0, negative limit → 0 items).
  • null / undefined values are excluded from the facet index and from range results.
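The pagination rule above can be expressed in a few lines. This is a sketch of the described behaviour over an already-matched array, not Lyra's code; `paginate` and `Page` are hypothetical names, and returning all matches when no limit is given is an assumption of this example.

```typescript
// Illustrative only: hypothetical helper, not part of the Lyra API.
interface Page<T> {
  items: T[];
  total: number;
}

function paginate<T>(matches: T[], limit?: number, offset?: number): Page<T> {
  // Negative offset is clamped to 0.
  const start = Math.max(0, Math.trunc(offset ?? 0));
  // Negative limit is clamped to 0 items; a missing limit returns everything.
  const count = limit === undefined ? matches.length : Math.max(0, Math.trunc(limit));
  // `total` counts all matches and ignores pagination.
  return { items: matches.slice(start, start + count), total: matches.length };
}
```

Clamping instead of throwing keeps the call total for any numeric input, which suits callers such as LLM agents that may produce out-of-range values.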

When to use Lyra

Good fit: structured records with a known schema, sub-millisecond filter queries in the browser / edge / agent, medium datasets (thousands to low hundreds of thousands of rows).

Not a fit: full-text search, semantic similarity, transactional writes, datasets that need live updates.

License

MIT
