Skip to content

Nen-Co/nen-core

Repository files navigation

Nen Core

Zig Version License Build Status

High-performance, zero-allocation core library for the Nen ecosystem

Nen Core is a foundational library that provides optimized primitives, utilities, and data structures for building high-performance applications in Zig. Built with Data-Oriented Design (DOD) principles and optimized for ReleaseFast builds, it delivers exceptional performance across all critical paths.

πŸš€ Performance Highlights

Operation Performance Optimization
SIMD Operations 9,330M ops/sec ReleaseFast + Inline
Stack Allocators 125M allocations/sec Zero heap allocations
Fixed Allocators 937M operations/sec Zero-allocation patterns
Memory Pools 77M operations/sec High-performance block reuse
Fast Math 1,511M ops/sec Inline approximations
RNG Operations 370M ops/sec Inline generation

✨ Key Features

πŸ”₯ High-Performance Computing

  • SIMD-optimized operations for vector mathematics
  • Fast math approximations using bit manipulation
  • Zero-allocation patterns for maximum efficiency
  • ReleaseFast optimizations with 2-28x performance improvements

πŸ’Ύ Advanced Memory Management

  • Stack-backed allocators for ultra-fast temporary allocations
  • Fixed-size allocators for compile-time known sizes
  • Memory pools with high-performance block reuse
  • Batch allocators with automatic heap fallback

πŸ“Š Data-Oriented Design (DOD)

  • Cache-friendly memory layouts (Struct of Arrays)
  • Batch processing for maximum throughput
  • SIMD-optimized vector operations
  • Zero-copy data structures

🎯 TigerBeetle-Style Batching

  • High-performance batch processing with zero-allocation
  • Atomic batch commits for data consistency
  • Client-side automatic batching to reduce overhead
  • Pre-allocated message buffers for predictable performance

πŸ”§ Unified Ecosystem Foundation

  • Consolidated data types across all Nen projects
  • Unified error handling and version management
  • Common constants and configuration
  • Zero dependencies (except Zig toolchain)

πŸ“¦ Installation

Prerequisites

  • Zig 0.15.1 or later
  • No external dependencies required

Add to Your Project

Add this to your build.zig:

const nen_core = b.dependency("nen-core", .{
    .target = target,
    .optimize = optimize,
});

exe.addModule("nen-core", nen_core.module("nen-core"));

Or add as a submodule:

git submodule add https://github.com/your-org/nen-core.git

πŸš€ Quick Start

Basic Usage

const std = @import("std");
const nen_core = @import("nen-core");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // High-performance stack allocator
    var stack_arena = try nen_core.StackArena.init(allocator, 1024 * 1024);
    defer stack_arena.deinit();

    // Allocate temporary data
    const data = try stack_arena.alloc(u8, 1000);
    
    // SIMD-optimized vector operations
    const a = [_]f32{ 1.0, 2.0, 3.0, 4.0 };
    const b = [_]f32{ 2.0, 3.0, 4.0, 5.0 };
    var result = [_]f32{ 0.0, 0.0, 0.0, 0.0 };
    
    nen_core.SIMDOperations.addVectors(&a, &b, &result);
    
    // Fast math approximations
    const fast_exp = nen_core.FastMath.fastExp(2.0);
    const fast_ln = nen_core.FastMath.fastLn(2.0);
    
    // High-performance RNG
    var rng = nen_core.XorShift32.init(42);
    const random_value = rng.next();
    
    std.debug.print("Result: {any}\n", .{result});
    std.debug.print("Fast exp(2): {d}\n", .{fast_exp});
    std.debug.print("Random: {}\n", .{random_value});
}

TigerBeetle-Style Batching

const std = @import("std");
const nen_core = @import("nen-core");

pub fn batchExample() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Create batch processor
    var processor = try nen_core.BatchProcessor.init(allocator);
    defer processor.deinit();

    // Add operations to batch
    try processor.addOperation(.data_write, "Hello, World!");
    try processor.addOperation(.data_read, "user:123");
    try processor.addOperation(.data_delete, "temp:456");

    // Execute batch atomically
    const result = try processor.executeBatch();
    
    if (result.err) |err| {
        std.debug.print("Batch failed: {}\n", .{err});
    } else {
        std.debug.print("Batch executed: {} operations\n", .{result.processed});
    }
}

Advanced Memory Management

const std = @import("std");
const nen_core = @import("nen-core");

pub fn memoryExample() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Fixed-size stack allocator (zero-allocation)
    var fixed_alloc = nen_core.FixedStackAllocator(64 * 1024){};
    
    // Allocate different types
    const u8_data = try fixed_alloc.alloc(u8, 100);
    const f32_data = try fixed_alloc.alloc(f32, 50);
    const u64_data = try fixed_alloc.alloc(u64, 25);
    
    // Use the data...
    u8_data[0] = 42;
    f32_data[0] = 3.14;
    u64_data[0] = 12345;
    
    // Reset for reuse (ultra-fast)
    fixed_alloc.reset();
    
    // Memory pool for high-frequency allocations
    var pool = try nen_core.StackMemoryPool.init(allocator, 256, 1000);
    defer pool.deinit();
    
    // Allocate blocks from pool
    const block1 = try pool.allocBlock();
    const block2 = try pool.allocBlock();
    
    // Use blocks...
    block1[0] = 0xFF;
    block2[0] = 0xAA;
    
    // Free blocks back to pool
    pool.freeBlock(block1);
    pool.freeBlock(block2);
}

πŸ“š API Reference

Core Modules

Memory Management

  • StackArena - Ultra-fast stack-backed allocations
  • FixedStackAllocator(comptime size) - Zero-allocation for known sizes
  • BatchAllocator - Stack-first with heap fallback
  • StackMemoryPool - High-performance block allocation
  • StringAllocator - Zero-copy string management

SIMD Operations

  • SIMDOperations.addScalar() - Add scalar to vector
  • SIMDOperations.multiplyScalar() - Multiply vector by scalar
  • SIMDOperations.addVectors() - Element-wise vector addition
  • SIMDOperations.dotProduct() - Vector dot product
  • SIMDOperations.sum() - Sum all elements

Fast Math

  • FastMath.fastExp() - Fast exponential approximation
  • FastMath.fastLn() - Fast natural logarithm
  • FastMath.fastSqrt() - Fast square root
  • FastMath.fastSin() - Fast sine approximation
  • FastMath.fastCos() - Fast cosine approximation

Random Number Generation

  • XorShift32 - High-performance 32-bit RNG
  • SplitMix64 - High-quality 64-bit RNG
  • PCG32 - Permuted Congruential Generator

Batching System

  • BatchProcessor - TigerBeetle-style batch processing
  • BatchAPI - High-level batch interface
  • ClientBatcher - Automatic client-side batching

Data Types

  • DataType - Unified data type definitions
  • Shape - Tensor shape management
  • Backend - Compute backend types
  • QuantizationType - Quantization schemes

Performance Optimization

Build Configuration

# Maximum performance (production)
zig build -Doptimize=ReleaseFast

# Balanced performance and safety
zig build -Doptimize=ReleaseSafe

# Development with debug info
zig build -Doptimize=Debug

Inline Functions

All hot-path functions are strategically inlined for maximum performance:

  • SIMD operations in tight loops
  • Fast math approximations
  • Memory allocation/deallocation
  • RNG operations
  • Metrics recording

πŸ§ͺ Examples

Run Examples

# Numerical computing demo
zig build run-numerical

# Memory management demo
zig build run-memory

# Unified data types demo
zig build run-unified

# TigerBeetle batching demo
zig build run-batching

# Inline performance demo
zig build run-inline

# Release performance demo
zig build run-release

Performance Testing

# Run all tests
zig build test

# Run with ReleaseFast optimization
zig build -Doptimize=ReleaseFast test

# Run performance benchmarks
zig build -Doptimize=ReleaseFast run-inline
zig build -Doptimize=ReleaseFast run-release

πŸ—οΈ Architecture

Design Principles

  1. Data-Oriented Design (DOD)

    • Cache-friendly memory layouts
    • Batch processing for maximum throughput
    • Zero-allocation patterns where possible
  2. Zero Dependencies

    • Only depends on Zig standard library
    • No external C libraries or dependencies
    • Self-contained and portable
  3. Performance First

    • Strategic inlining of hot-path functions
    • ReleaseFast optimizations
    • Advanced allocator patterns
  4. Ecosystem Foundation

    • Consolidates common functionality
    • Eliminates code duplication
    • Provides unified interfaces

Module Structure

src/
β”œβ”€β”€ lib.zig                 # Main library file
β”œβ”€β”€ memory.zig              # Memory management
β”œβ”€β”€ simd.zig                # SIMD operations
β”œβ”€β”€ math.zig                # Fast math functions
β”œβ”€β”€ rng.zig                 # Random number generation
β”œβ”€β”€ batching.zig            # TigerBeetle-style batching
β”œβ”€β”€ advanced_allocators.zig # Advanced allocator patterns
β”œβ”€β”€ data_types.zig          # Unified data types
β”œβ”€β”€ version.zig             # Version management
β”œβ”€β”€ unified_errors.zig      # Error handling
β”œβ”€β”€ unified_constants.zig   # Common constants
β”œβ”€β”€ assertions.zig          # Assertion utilities
β”œβ”€β”€ constants.zig           # DOD constants
β”œβ”€β”€ errors.zig              # Error definitions
β”œβ”€β”€ layouts.zig             # Memory layouts
└── metrics.zig             # Performance metrics

πŸš€ Performance Benchmarks

ReleaseFast vs Debug Performance

Operation Debug ReleaseFast Speedup
SIMD Operations 335M ops/sec 9,330M ops/sec 28x
Fast Math 204M ops/sec 1,511M ops/sec 7.4x
RNG Operations 126M ops/sec 370M ops/sec 2.9x
Memory Management 50M ops/sec 490M ops/sec 9.8x

Allocator Performance Comparison

Allocator Performance Use Case
Stack Arena 125M allocations/sec Temporary data
Fixed Stack 937M operations/sec Known sizes
Batch Allocator 21M allocations/sec Mixed patterns
Memory Pool 77M operations/sec Block reuse
Heap Allocator 66M allocations/sec General purpose

🀝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

# Clone the repository
git clone https://github.com/your-org/nen-core.git
cd nen-core

# Run tests
zig build test

# Run examples
zig build run-numerical
zig build run-memory
zig build run-batching

# Run performance benchmarks
zig build -Doptimize=ReleaseFast run-inline

Code Style

  • Follow Zig's snake_case naming conventions
  • Use descriptive variable names (no _ for unused variables)
  • Keep functions under 70 lines
  • Add assertions for function arguments and return values
  • Use zig fmt for formatting
  • Maintain 100-column line limit

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • TigerBeetle - Inspiration for high-performance batching patterns
  • Zig Community - For the excellent language and ecosystem
  • Data-Oriented Design - For performance optimization principles

πŸ“ž Support


Built with ❀️ for high-performance computing in Zig

About

High-performance, zero-allocation core library for the Nen ecosystem. Built with Data-Oriented Design principles and optimized for ReleaseFast builds.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors