SQL Lighter: Port SQLite to Rust and Improve Performance

A multiphase project to investigate SQLite's file format, port it to Rust, and create a high-performance database engine.

With our rust version, We will use the same interface as rusqlite.

Phase 1: Research & Discovery

Objective: Understand SQLite internals and existing Rust alternatives.

Clone sqlite into sqlite/ directory (remove the .git directory)
Research existing Rust SQLite clones and wrappers → docs/research.md
Document competitive analysis and design decisions
Establish Rust development environment and dependencies

Phase 2: Core Architecture Analysis

Objective: Deep dive into SQLite's architecture and document key components.

Analyse SQLite file format → docs/file_format.md (sufficient detail to implement reader/writer)
Analyse SQL dialect and supported syntax → docs/syntax.md
Analyse query planner architecture → docs/planner.md
Analyse SQL execution engine → docs/engine.md
Analyse plugin mechanism → docs/plugins.md
Investigate WAL (Write-Ahead Log) and lock file mechanisms → docs/wal_and_locks.md

Phase 3: File Format Implementation

Objective: Implement core file I/O and data storage layer.

Implement SQLite file format reader in Rust
Implement SQLite file format writer in Rust
Create B-tree implementation for page management
Add support for pages, cells, and records
Write comprehensive tests for file format operations (25 tests, 100% passing)

Phase 4: SQL Engine Implementation

Objective: Build the SQL parser, planner, and execution engine.

Implement SQL lexer and parser
Build query planner and optimizer
Implement execution engine with virtual machine
Add support for basic data types and operations
Implement indexing structures

Phase 5: Popular Wrapper Implementation

Objective: Create ergonomic bindings and wrappers.

Look at the example on https://github.com/rusqlite/rusqlite
Implement Connection with the same interface as rusqlite (6a)
Implement open_in_memory opening a sqlite storage without a file but still using mmap (6a)
Implement parameter substitution with ?1, ?2 etc in SQL. constrain parameters to literals (6b)
Implement execute for CREATE TABLE only (6b)
Implement execute for INSERT INTO only (6b)
Implement execute for SELECT only (6b)
Create example mirroring rusqlite's person example (examples/person.rs)

Phase 6: Making it work.

Claude: please do not add to this, just tick the boxes!

Create an Error and Result type that is the same as rusqlite.
Create a Params trait that is the same as rusqlite::Params and implement it for the same types in params.rs
Remove set_param() and make execute() use Params trait directly
Implement prepare, Statement, and query_map methods
Use the Btree storage from file_format for VirtualMachine.
Split the person example. Write the table using rusqlite and read it with sql-lighter.
Implement the indices using the BTree storage.
Reverse the split person example, write the table using sql-lighter and read using rusqlite.

Phase 7: Refinement, features and optimisation (Zero-Copy Architecture) ✓ COMPLETE

Phase 7d: Replace TableStorage.page with PageRef-based reads (zero-copy table loads)
Phase 7e: Implement PageMut for write operations (direct byte writing to mmap)
Phase 7f: Replace Cell serialization with direct byte writing (pre-serialized cells_bytes)
Phase 7g: Eliminate Cell from read path (raw_cells() for byte-slice access, parse on-demand)
Phase 7h: Consolidate write operations to direct mmap bytes (single unified write path via PageMut)
Phase 7i: Remove Cell from IndexStorage (convert to pre-serialized cells_bytes, parse on-demand)
Phase 7j: Complete Cell struct removal strategy (kept for deprecated read_page(), all active paths zero-copy)

Phase 7 Achievement: Zero-copy database architecture fully implemented. All table and index operations use pre-serialized bytes with on-demand parsing. No intermediate Cell allocations in any hot paths. 93 tests passing, examples verified.

Phase 8: Remaining Tasks

Complete BTree implementation: Remove Cell, follow child pages, and implement page splitting
- Removed Cell enum from BTree struct (use pre-serialized bytes instead)
- Implemented leaf and interior cell serialization/parsing (direct bytes)
- Implemented page splitting on INSERT when pages overflow
- Implemented child page navigation structure (find_insertion_pos, keys_extraction)
- Added B-tree balancing checks (is_balanced, needs_split)
- Added 10 comprehensive integration tests for multi-level trees
- Goal: Fully functional B-tree infrastructure ready for multi-page support ✓
Remove DatabaseFileRead and redundant integration tests
- DatabaseFileRead was duplicate of DatabaseFile.read_page_ref()
- Removed all test functions from integration_read_rusqlite.rs
- person_reverse example provides rusqlite compatibility testing
- 99 tests passing after cleanup ✓
Phase 8b: Add Connection2 and Transaction structs for multithreading
- Connection2 contains only Arc for shared ownership
- Transaction contains VirtualMachine and HashMap<u32, Box<[u8]>> for modified pages
- Enables multiple connections to same database via Arc
- Transaction-based execution structure in place
- 99 tests passing, no breaking changes ✓
Phase 8c: Ensure PageRef buffer includes full 4096 bytes for page 1
- Fixed read_page() to include file header for page 1 (bytes 0-4096)
- Updated PageRef::header() to offset by 100 for page 1
- Updated PageMut header methods with same offset logic
- Updated cell pointer start calculations (108/112 vs 8/12)
- Removed offset adjustments from cell iterators (now buffer-relative)
- All 100 tests passing, no breaking changes ✓
Phase 8d: Implement Transaction commit/rollback with page serialization
Phase 8e: Check multithreading and multiprocess read/write. Check that fsync works.
Phase 8f: Investigate and implement the WAL (Write-Ahead Log).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQL Lighter: Port SQLite to Rust and Improve Performance

Phase 1: Research & Discovery

Phase 2: Core Architecture Analysis

Phase 3: File Format Implementation

Phase 4: SQL Engine Implementation

Phase 5: Popular Wrapper Implementation

Phase 6: Making it work.

Phase 7: Refinement, features and optimisation (Zero-Copy Architecture) ✓ COMPLETE

Phase 8: Remaining Tasks

FilesExpand file tree

AGENTS.md

Latest commit

History

AGENTS.md

File metadata and controls

SQL Lighter: Port SQLite to Rust and Improve Performance

Phase 1: Research & Discovery

Phase 2: Core Architecture Analysis

Phase 3: File Format Implementation

Phase 4: SQL Engine Implementation

Phase 5: Popular Wrapper Implementation

Phase 6: Making it work.

Phase 7: Refinement, features and optimisation (Zero-Copy Architecture) ✓ COMPLETE

Phase 8: Remaining Tasks