chdb-zig

A Zig binding for chdb - the embedded ClickHouse database engine. This library provides a safe and convenient way to interact with ClickHouse directly from Zig, leveraging the language's memory safety features and type system.

Overview

chdb-zig wraps the C API of chdb, giving you access to a full-featured SQL database that runs in-process without needing to manage a separate server. Whether you need to query Parquet files, create in-memory tables, or perform complex analytical queries, chdb-zig makes it straightforward.

Basic Usage

Here's a simple example that creates a table and queries it:

const std = @import("std");
const chdb_zig = @import("chdb_zig");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .{};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    // Initialize a connection with options
    const options = chdb_zig.ChdbConnectionOptions{
        .UseMultiQuery = true,
        .Path = "my_database.db",
    };

    const conn = try chdb_zig.initConnection(allocator, options);
    defer conn.deinit();

    // Create a table
    try conn.execute(@constCast("CREATE TABLE IF NOT EXISTS test (id Int32, name String) " ++
        "ENGINE = MergeTree() ORDER BY id"));

    try conn.execute(@constCast("INSERT INTO test (id,name) VALUES (1,'Alice'), (2,'Bob')"));
    // Query the database
    var result = try conn.query(@constCast("SELECT * FROM test"));
    // Register cleanup before the success check so the early return below does not leak the result
    defer result.deinit();
    if (!result.isSuccess()) {
        std.debug.print("Query failed: {?s}\n", .{result.getError()});
        return;
    }

    // Iterate through results
    var iter = result.iter(allocator);
    while (iter.nextRow()) |row| {
        std.debug.print("Row: {s}\n", .{row});
    }
}

Working with Remote Data

One of the powerful features of chdb is the ability to query remote data sources directly. Here's an example using Parquet files from a URL:

const query = 
    \\CREATE TABLE IF NOT EXISTS parquet_data ENGINE = MergeTree() 
    \\ORDER BY tuple()
    \\AS SELECT * FROM url('https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_0.parquet');
;

try conn.execute(@constCast(query));

// Now query the data
var result = try conn.query(@constCast(
    "SELECT URL, COUNT(*) FROM parquet_data " ++
    "GROUP BY URL ORDER BY COUNT(*) DESC LIMIT 10"
));
defer result.deinit();

// Access query statistics
std.debug.print("Elapsed time: {d}ms\n", .{result.elapsedTime()});
std.debug.print("Rows read: {d}\n", .{result.rowsRead()});
std.debug.print("Bytes read: {d}\n", .{result.bytesRead()});

Connection Options

The ChdbConnectionOptions struct allows you to configure how the connection behaves:

  • UseMultiQuery - Enable support for multiple queries in a single statement
  • Path - File path for persistent storage (omit for in-memory database)
  • LogLevel - Set logging verbosity (e.g., "debug", "info")
  • CustomArgs - Pass additional command-line arguments to chdb
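
As a small illustration of the options above, the sketch below opens an in-memory database by leaving Path out. It relies only on names that appear in this README (ChdbConnectionOptions, initConnection, execute); the assumption that omitted fields fall back to defaults comes from the "omit for in-memory database" note, and the scratch table name is made up for the example.

```zig
const std = @import("std");
const chdb_zig = @import("chdb_zig");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .{};
    defer _ = gpa.deinit();

    // Path is omitted, which the option list describes as an in-memory database.
    // Assumption: fields that are left out have default values.
    const options = chdb_zig.ChdbConnectionOptions{
        .UseMultiQuery = true,
    };

    const conn = try chdb_zig.initConnection(gpa.allocator(), options);
    defer conn.deinit();

    // "scratch" is a hypothetical table; nothing is written to disk here.
    try conn.execute(@constCast("CREATE TABLE scratch (id Int32) ENGINE = Memory"));
}
```

Data created this way is gone once the connection is closed, which suits tests and one-off analysis.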

API Overview

Executing Queries

  • execute() - Run a query and discard the result
  • query() - Run a query and return results
  • queryStreaming() - Stream large result sets

Working with Results

After executing a query, you get a ChdbResult object with the following methods:

  • iter() - Get an iterator over result rows (NDJSON format)
  • isSuccess() - Check if the query succeeded
  • getError() - Retrieve error message if query failed
  • elapsedTime() - Query execution time
  • rowsRead() - Number of rows processed
  • bytesRead() - Number of bytes read
  • storageRowsRead() - Rows read from disk
  • storageBytesRead() - Bytes read from disk

Iterator Methods

The iterator returned by result.iter() provides several methods for traversing and processing query results. For optimal performance with multiple allocations, use an arena allocator:

var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();

var iter = result.iter(arena.allocator());

Basic Iteration:

  • nextRow() - Get next row as a raw slice (zero-copy, no allocation)
  • nextAs(T) - Get next row parsed as type T (allocates)
  • rowCount() - Total number of rows in result
  • reset() - Reset iterator to beginning
  • rowAt(index) - Get row at specific index (zero-copy)
  • maxMemoryUsage() - Maximum memory needed for results
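
Because rows arrive as NDJSON, the parsing step behind a call like nextAs(T) can be pictured with the standard library alone. The sketch below is not chdb-zig's implementation; parseRow and the User struct are hypothetical names, and the row is a hard-coded line shaped like the `test` table from Basic Usage.

```zig
const std = @import("std");

// Hypothetical row type matching the `test` table (id Int32, name String).
const User = struct {
    id: i32,
    name: []const u8,
};

/// Parses one NDJSON line into `T`. Everything the parsed value needs is
/// allocated from `allocator`, so an arena frees it all in one call.
fn parseRow(comptime T: type, allocator: std.mem.Allocator, line: []const u8) !T {
    return std.json.parseFromSliceLeaky(T, allocator, line, .{
        // Tolerate columns that the struct does not declare.
        .ignore_unknown_fields = true,
        // Copy strings so the value does not borrow from the row buffer.
        .allocate = .alloc_always,
    });
}

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const line = "{\"id\":1,\"name\":\"Alice\"}";
    const user = try parseRow(User, arena.allocator(), line);
    std.debug.print("id={d} name={s}\n", .{ user.id, user.name });
}
```

The allocation in this step is why the typed methods are marked "(allocates)" while nextRow() and rowAt() can hand back slices without copying.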

Batch Operations:

  • takeOwned(count) - Take next N rows (allocates owned slice)
  • takeAsOwned(T, count) - Take next N rows parsed as type T
  • sliceOwned(start, end) - Get rows in range as owned slice
  • sliceAsOwned(T, start, end) - Get rows in range parsed as type T
  • selectOwned(predicate) - Filter rows with predicate function
  • selectAsOwned(T, predicate) - Filter and parse rows with predicate

Arena Allocators:

Methods ending in Owned or using generic type parsing (As) perform allocations. For best performance, pass an arena allocator to iter(). This allows all allocations to be freed in a single call to deinit(), rather than fragmenting memory with individual allocations:

// Good - all allocations freed together
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();

var iter = result.iter(arena.allocator());
const rows = try iter.takeAsOwned(User, 100);
// rows is valid until arena.deinit()

Without an arena, allocations may fragment memory and you'll need to manage cleanup individually.

Memory Management

This library uses Zig's allocator pattern. You should always defer cleanup:

var gpa: std.heap.GeneralPurposeAllocator(.{}) = .{};
const allocator = gpa.allocator();
defer _ = gpa.deinit();

const conn = try chdb_zig.initConnection(allocator, options);
defer conn.deinit();

var result = try conn.query(@constCast(query));
defer result.deinit();
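
The snippet above discards the value returned by gpa.deinit(). During development it can be useful to inspect it instead, since the general purpose allocator reports whether anything was left unfreed. The following sketch uses only the standard library; the buffer stands in for a connection or result.

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.GeneralPurposeAllocator(.{}) = .{};
    defer {
        // deinit() returns std.heap.Check (.ok or .leak); report it rather than discard it.
        if (gpa.deinit() == .leak) {
            std.debug.print("leak detected\n", .{});
        }
    }
    const allocator = gpa.allocator();

    // Stand-in for a connection or result: release it before the allocator is torn down.
    // Defers run in reverse order, so this free happens before gpa.deinit().
    const buffer = try allocator.alloc(u8, 64);
    defer allocator.free(buffer);
}
```

Declaring each defer right after the matching acquisition keeps the teardown order correct: results first, then the connection, then the allocator.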

Installation

Add as a Dependency

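Add chdb-zig to your build.zig.zon. The hash shown below is a placeholder, not a valid package hash; running zig fetch --save with the URL writes the entry with the correct hash for you: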

.dependencies = .{
    .chdb_zig = .{
        .url = "https://github.com/s0und0fs1lence/chdb-zig/archive/refs/tags/0.0.4.tar.gz",
        .hash = "12200c7a3c6b8e9f1d2a3b4c5d6e7f8g9h0i1j2k3l4m5n6o7p8q9r0s1t2u3v4w5x6y7z8",
    },
},

Configure in build.zig

In your build.zig, add the dependency to your executable:

const chdb_dep = b.dependency("chdb_zig", .{
    .target = target,
    .optimize = optimize,
});

// Get the module from the dependency
const chdb_module = chdb_dep.module("chdb_zig");
chdb_module.link_libc = true;

// Add the module to your executable's imports
exe.root_module.addImport("chdb_zig", chdb_module);

Now you can import and use chdb-zig in your code:

const chdb_zig = @import("chdb_zig");
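
The build.zig fragment above refers to b, target, optimize, and exe without defining them. For context, a complete build script might look like the sketch below. It assumes Zig 0.14 or later (where addExecutable accepts a root_module), and the executable name and source path are hypothetical.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Hypothetical application; adjust the name and path to your project.
    const exe = b.addExecutable(.{
        .name = "my_app",
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/main.zig"),
            .target = target,
            .optimize = optimize,
        }),
    });

    // Same steps as the fragment above: fetch the dependency, take its module,
    // link libc for the C API, and expose it as an import.
    const chdb_dep = b.dependency("chdb_zig", .{
        .target = target,
        .optimize = optimize,
    });
    const chdb_module = chdb_dep.module("chdb_zig");
    chdb_module.link_libc = true;
    exe.root_module.addImport("chdb_zig", chdb_module);

    b.installArtifact(exe);
}
```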

Contributing

Contributions are welcome. Feel free to open issues or submit pull requests.

License

Licensed under the Apache License, Version 2.0. See the LICENSE file for details.
