marvinified/voctar

Voctar

Simple TypeScript library with RAG primitives for embeddings, chunking, storage, and retrieval.

npm version TypeScript Node License: MIT

Features

  • Simple primitives: embed and search
  • Supports multiple vector stores: SQLite, Qdrant, in-memory, or custom store providers
  • Automatic chunking for long documents with multiple strategies (fixed, recursive, sentence, paragraph, semantic)
  • Semantic search with score thresholds and metadata filtering
  • TypeScript-first

Quick Start

yarn add voctar

import { Voctar } from 'voctar';

const vector = new Voctar({
  embedding: {
    type: 'openai',
    apiKey: '<your-api-key>',
  },
  store: {
    type: 'sqlite',
    path: 'data/vector.db',
  },
});

const { documentId } = await vector.embed('documents', 'Very long text...', {
  metadata: { author: 'Alice' },
});

const results = await vector.search('documents', 'Some query');

Primitives API

embed(collection, text, options?)

Embeds a document into a collection.
If the text exceeds the embedding model's input limit, Voctar automatically splits it into chunks and stores a vector for each chunk.

const { documentId, chunkIds } = await vector.embed('documents', longText, {
  documentId: 'doc-1',                 // optional; auto-generated if omitted
  metadata: { source: 'guide' },       // optional user metadata
  chunkSize: 1000,                     // optional
  chunkStrategy: 'recursive',          // fixed | recursive | sentence | paragraph | semantic
  chunkOverlap: 200,                   // optional
  autoChunk: true,                     // optional override
});

Returns:

  • documentId: stable parent id for the document
  • chunkIds: stored ids (single id for unchunked docs, multiple for chunked docs)
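
To make the chunkSize and chunkOverlap options more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only, not Voctar's implementation: the helper name chunkFixed is hypothetical, and it measures sizes in characters, which is an assumption (the library may count tokens or use different boundaries).

```typescript
// Hypothetical helper for illustration; not part of Voctar's API.
// Splits text into windows of `chunkSize` characters, where each window
// repeats the last `chunkOverlap` characters of the previous one.
function chunkFixed(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error('chunkSize must be positive');
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be in the range [0, chunkSize)');
  }

  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text,
    // so no trailing chunk made purely of overlap is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const sample = 'abcdefghij';
const chunks = chunkFixed(sample, 4, 2);
console.log(chunks); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Overlap trades storage for recall: text near a chunk boundary appears in two chunks, so a query that matches it is less likely to miss because of where the split fell.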

search(collection, query, options?)

Retrieves semantically similar text from a collection.

const results = await vector.search('documents', 'how does chunking work', {
  limit: 5,                            // optional; defaults to the store provider's behavior
  scoreThreshold: 0,                   // optional
  filter: { source: 'guide' },         // optional metadata filter
  includeSystem: false,                // optional; include internal metadata when true
});

Each result includes:

  • id
  • text
  • score
  • createdAt
  • metadata (and optional system when includeSystem: true)
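
The way limit, scoreThreshold, and filter interact can be sketched with a small in-memory search. This is an assumption-laden illustration rather than Voctar's internals: the types and functions below (StoredVector, cosineSimilarity, matchesFilter, searchInMemory) are hypothetical, scoring uses cosine similarity, the filter is treated as exact equality on every key, and the threshold is treated as inclusive. Real store providers may differ on each of these points.

```typescript
// Hypothetical types and helpers for illustration; not Voctar's internals.
type Metadata = Record<string, string | number | boolean>;

interface StoredVector {
  id: string;
  text: string;
  vector: number[];
  metadata: Metadata;
}

interface SearchHit {
  id: string;
  text: string;
  score: number;
  metadata: Metadata;
}

interface SearchOptions {
  limit?: number;
  scoreThreshold?: number;
  filter?: Metadata;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: a filter matches when every key equals the stored value.
function matchesFilter(metadata: Metadata, filter: Metadata): boolean {
  return Object.entries(filter).every(([key, value]) => metadata[key] === value);
}

function searchInMemory(
  items: StoredVector[],
  queryVector: number[],
  options: SearchOptions = {},
): SearchHit[] {
  const { limit = 10, scoreThreshold = 0, filter = {} } = options;
  return items
    .filter((item) => matchesFilter(item.metadata, filter))
    .map((item) => ({
      id: item.id,
      text: item.text,
      score: cosineSimilarity(item.vector, queryVector),
      metadata: item.metadata,
    }))
    .filter((hit) => hit.score >= scoreThreshold) // inclusive threshold (assumption)
    .sort((x, y) => y.score - x.score)            // best match first
    .slice(0, limit);
}

const items: StoredVector[] = [
  { id: 'a', text: 'chunking basics', vector: [1, 0], metadata: { source: 'guide' } },
  { id: 'b', text: 'release notes', vector: [0, 1], metadata: { source: 'guide' } },
  { id: 'c', text: 'chunk overlap', vector: [1, 1], metadata: { source: 'guide' } },
  { id: 'd', text: 'blog post', vector: [1, 0.1], metadata: { source: 'blog' } },
];

const hits = searchInMemory(items, [1, 0], {
  limit: 5,
  scoreThreshold: 0.5,
  filter: { source: 'guide' },
});
console.log(hits.map((hit) => hit.id)); // [ 'a', 'c' ]
```

In this sketch, 'd' is excluded by the metadata filter despite its high similarity, and 'b' is excluded by the score threshold, which shows why the two options are useful together.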
