Nen Core

High-performance, zero-allocation core library for the Nen ecosystem

Nen Core is a foundational library that provides optimized primitives, utilities, and data structures for building high-performance applications in Zig. Built with Data-Oriented Design (DOD) principles and optimized for ReleaseFast builds, it delivers exceptional performance across all critical paths.

🚀 Performance Highlights

Operation	Performance	Optimization
SIMD Operations	9,330M ops/sec	ReleaseFast + Inline
Stack Allocators	125M allocations/sec	Zero heap allocations
Fixed Allocators	937M operations/sec	Zero-allocation patterns
Memory Pools	77M operations/sec	High-performance block reuse
Fast Math	1,511M ops/sec	Inline approximations
RNG Operations	370M ops/sec	Inline generation

✨ Key Features

🔥 High-Performance Computing

SIMD-optimized operations for vector mathematics
Fast math approximations using bit manipulation
Zero-allocation patterns for maximum efficiency
ReleaseFast optimizations with 2-28x performance improvements

💾 Advanced Memory Management

Stack-backed allocators for ultra-fast temporary allocations
Fixed-size allocators for compile-time known sizes
Memory pools with high-performance block reuse
Batch allocators with automatic heap fallback

📊 Data-Oriented Design (DOD)

Cache-friendly memory layouts (Struct of Arrays)
Batch processing for maximum throughput
SIMD-optimized vector operations
Zero-copy data structures

🎯 TigerBeetle-Style Batching

High-performance batch processing with zero-allocation
Atomic batch commits for data consistency
Client-side automatic batching to reduce overhead
Pre-allocated message buffers for predictable performance

🔧 Unified Ecosystem Foundation

Consolidated data types across all Nen projects
Unified error handling and version management
Common constants and configuration
Zero dependencies (except Zig toolchain)

📦 Installation

Prerequisites

Zig 0.15.1 or later
No external dependencies required

Add to Your Project

Add this to your build.zig:

const nen_core = b.dependency("nen-core", .{
    .target = target,
    .optimize = optimize,
});

exe.addModule("nen-core", nen_core.module("nen-core"));

Or add as a submodule:

git submodule add https://github.com/your-org/nen-core.git

🚀 Quick Start

Basic Usage

const std = @import("std");
const nen_core = @import("nen-core");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // High-performance stack allocator
    var stack_arena = try nen_core.StackArena.init(allocator, 1024 * 1024);
    defer stack_arena.deinit();

    // Allocate temporary data
    const data = try stack_arena.alloc(u8, 1000);
    
    // SIMD-optimized vector operations
    const a = [_]f32{ 1.0, 2.0, 3.0, 4.0 };
    const b = [_]f32{ 2.0, 3.0, 4.0, 5.0 };
    var result = [_]f32{ 0.0, 0.0, 0.0, 0.0 };
    
    nen_core.SIMDOperations.addVectors(&a, &b, &result);
    
    // Fast math approximations
    const fast_exp = nen_core.FastMath.fastExp(2.0);
    const fast_ln = nen_core.FastMath.fastLn(2.0);
    
    // High-performance RNG
    var rng = nen_core.XorShift32.init(42);
    const random_value = rng.next();
    
    std.debug.print("Result: {any}\n", .{result});
    std.debug.print("Fast exp(2): {d}\n", .{fast_exp});
    std.debug.print("Random: {}\n", .{random_value});
}

TigerBeetle-Style Batching

const std = @import("std");
const nen_core = @import("nen-core");

pub fn batchExample() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Create batch processor
    var processor = try nen_core.BatchProcessor.init(allocator);
    defer processor.deinit();

    // Add operations to batch
    try processor.addOperation(.data_write, "Hello, World!");
    try processor.addOperation(.data_read, "user:123");
    try processor.addOperation(.data_delete, "temp:456");

    // Execute batch atomically
    const result = try processor.executeBatch();
    
    if (result.err) |err| {
        std.debug.print("Batch failed: {}\n", .{err});
    } else {
        std.debug.print("Batch executed: {} operations\n", .{result.processed});
    }
}

Advanced Memory Management

const std = @import("std");
const nen_core = @import("nen-core");

pub fn memoryExample() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Fixed-size stack allocator (zero-allocation)
    var fixed_alloc = nen_core.FixedStackAllocator(64 * 1024){};
    
    // Allocate different types
    const u8_data = try fixed_alloc.alloc(u8, 100);
    const f32_data = try fixed_alloc.alloc(f32, 50);
    const u64_data = try fixed_alloc.alloc(u64, 25);
    
    // Use the data...
    u8_data[0] = 42;
    f32_data[0] = 3.14;
    u64_data[0] = 12345;
    
    // Reset for reuse (ultra-fast)
    fixed_alloc.reset();
    
    // Memory pool for high-frequency allocations
    var pool = try nen_core.StackMemoryPool.init(allocator, 256, 1000);
    defer pool.deinit();
    
    // Allocate blocks from pool
    const block1 = try pool.allocBlock();
    const block2 = try pool.allocBlock();
    
    // Use blocks...
    block1[0] = 0xFF;
    block2[0] = 0xAA;
    
    // Free blocks back to pool
    pool.freeBlock(block1);
    pool.freeBlock(block2);
}

📚 API Reference

Core Modules

Memory Management

StackArena - Ultra-fast stack-backed allocations
FixedStackAllocator(comptime size) - Zero-allocation for known sizes
BatchAllocator - Stack-first with heap fallback
StackMemoryPool - High-performance block allocation
StringAllocator - Zero-copy string management

SIMD Operations

SIMDOperations.addScalar() - Add scalar to vector
SIMDOperations.multiplyScalar() - Multiply vector by scalar
SIMDOperations.addVectors() - Element-wise vector addition
SIMDOperations.dotProduct() - Vector dot product
SIMDOperations.sum() - Sum all elements

Fast Math

FastMath.fastExp() - Fast exponential approximation
FastMath.fastLn() - Fast natural logarithm
FastMath.fastSqrt() - Fast square root
FastMath.fastSin() - Fast sine approximation
FastMath.fastCos() - Fast cosine approximation

Random Number Generation

XorShift32 - High-performance 32-bit RNG
SplitMix64 - High-quality 64-bit RNG
PCG32 - Permuted Congruential Generator

Batching System

BatchProcessor - TigerBeetle-style batch processing
BatchAPI - High-level batch interface
ClientBatcher - Automatic client-side batching

Data Types

DataType - Unified data type definitions
Shape - Tensor shape management
Backend - Compute backend types
QuantizationType - Quantization schemes

Performance Optimization

Build Configuration

# Maximum performance (production)
zig build -Doptimize=ReleaseFast

# Balanced performance and safety
zig build -Doptimize=ReleaseSafe

# Development with debug info
zig build -Doptimize=Debug

Inline Functions

All hot-path functions are strategically inlined for maximum performance:

SIMD operations in tight loops
Fast math approximations
Memory allocation/deallocation
RNG operations
Metrics recording

🧪 Examples

Run Examples

# Numerical computing demo
zig build run-numerical

# Memory management demo
zig build run-memory

# Unified data types demo
zig build run-unified

# TigerBeetle batching demo
zig build run-batching

# Inline performance demo
zig build run-inline

# Release performance demo
zig build run-release

Performance Testing

# Run all tests
zig build test

# Run with ReleaseFast optimization
zig build -Doptimize=ReleaseFast test

# Run performance benchmarks
zig build -Doptimize=ReleaseFast run-inline
zig build -Doptimize=ReleaseFast run-release

🏗️ Architecture

Design Principles

Data-Oriented Design (DOD)
- Cache-friendly memory layouts
- Batch processing for maximum throughput
- Zero-allocation patterns where possible
Zero Dependencies
- Only depends on Zig standard library
- No external C libraries or dependencies
- Self-contained and portable
Performance First
- Strategic inlining of hot-path functions
- ReleaseFast optimizations
- Advanced allocator patterns
Ecosystem Foundation
- Consolidates common functionality
- Eliminates code duplication
- Provides unified interfaces

Module Structure

src/
├── lib.zig                 # Main library file
├── memory.zig              # Memory management
├── simd.zig                # SIMD operations
├── math.zig                # Fast math functions
├── rng.zig                 # Random number generation
├── batching.zig            # TigerBeetle-style batching
├── advanced_allocators.zig # Advanced allocator patterns
├── data_types.zig          # Unified data types
├── version.zig             # Version management
├── unified_errors.zig      # Error handling
├── unified_constants.zig   # Common constants
├── assertions.zig          # Assertion utilities
├── constants.zig           # DOD constants
├── errors.zig              # Error definitions
├── layouts.zig             # Memory layouts
└── metrics.zig             # Performance metrics

🚀 Performance Benchmarks

ReleaseFast vs Debug Performance

Operation	Debug	ReleaseFast	Speedup
SIMD Operations	335M ops/sec	9,330M ops/sec	28x
Fast Math	204M ops/sec	1,511M ops/sec	7.4x
RNG Operations	126M ops/sec	370M ops/sec	2.9x
Memory Management	50M ops/sec	490M ops/sec	9.8x

Allocator Performance Comparison

Allocator	Performance	Use Case
Stack Arena	125M allocations/sec	Temporary data
Fixed Stack	937M operations/sec	Known sizes
Batch Allocator	21M allocations/sec	Mixed patterns
Memory Pool	77M operations/sec	Block reuse
Heap Allocator	66M allocations/sec	General purpose

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

# Clone the repository
git clone https://github.com/your-org/nen-core.git
cd nen-core

# Run tests
zig build test

# Run examples
zig build run-numerical
zig build run-memory
zig build run-batching

# Run performance benchmarks
zig build -Doptimize=ReleaseFast run-inline

Code Style

Follow Zig's snake_case naming conventions
Use descriptive variable names (no _ for unused variables)
Keep functions under 70 lines
Add assertions for function arguments and return values
Use zig fmt for formatting
Maintain 100-column line limit

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

TigerBeetle - Inspiration for high-performance batching patterns
Zig Community - For the excellent language and ecosystem
Data-Oriented Design - For performance optimization principles

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: API Docs

Built with ❤️ for high-performance computing in Zig

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
examples		examples
scripts		scripts
src		src
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
INLINE_OPTIMIZATION_STRATEGY.md		INLINE_OPTIMIZATION_STRATEGY.md
LICENSE		LICENSE
PERFORMANCE_OPTIMIZATION_SUMMARY.md		PERFORMANCE_OPTIMIZATION_SUMMARY.md
README.md		README.md
build.zig		build.zig

Folders and files

Latest commit

History

Repository files navigation

Nen Core

🚀 Performance Highlights

✨ Key Features

🔥 High-Performance Computing

💾 Advanced Memory Management

📊 Data-Oriented Design (DOD)

🎯 TigerBeetle-Style Batching

🔧 Unified Ecosystem Foundation

📦 Installation

Prerequisites

Add to Your Project

🚀 Quick Start

Basic Usage

TigerBeetle-Style Batching

Advanced Memory Management

📚 API Reference

Core Modules

Memory Management

SIMD Operations

Fast Math

Random Number Generation

Batching System

Data Types

Performance Optimization

Build Configuration

Inline Functions

🧪 Examples

Run Examples

Performance Testing

🏗️ Architecture

Design Principles

Module Structure

🚀 Performance Benchmarks

ReleaseFast vs Debug Performance

Allocator Performance Comparison

🤝 Contributing

Development Setup

Code Style

📄 License

🙏 Acknowledgments

📞 Support

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages