Building open-source tools for ML infrastructure, distributed systems, quantitative finance, and AI safety.

Focus areas: LLM serving & optimization · GPU compute · consensus protocols · market microstructure · AI safety & interpretability

| Project | Description |
|---|---|
| llm-inference-benchmark | Benchmark vLLM vs TGI vs TensorRT-LLM with realistic bursty traffic patterns |
| attention-kernel-cuda | Custom CUDA Flash Attention kernel for non-standard head dimensions |
| triton-inference-kernels | Fused Triton kernels for attention + FFN with autotuning for LLM serving |
| distributed-rlhf-trainer | Minimal distributed RLHF training loop with Ray + DeepSpeed |
| llm-eval-suite | Evaluation framework with LLM-as-judge and custom rubrics |
| token-streaming-proxy | High-performance SSE proxy for LLM APIs with backpressure handling |
| model-quantization-lab | GPTQ/AWQ/GGML quantization comparison with quality metrics |
| prompt-cache-engine | KV-cache sharing for prompt prefix deduplication |

| Project | Description |
|---|---|
| gpu-memory-profiler | Visual GPU memory profiler with leak detection for PyTorch |
| distributed-kv-store | Raft-consensus KV store optimized for ML metadata |
| container-gpu-scheduler | K8s operator for GPU-aware batch scheduling with bin-packing |
| data-pipeline-monitor | Real-time ML pipeline observability with statistical drift detection |
| zero-copy-tensor-ipc | Zero-copy tensor sharing across processes via shared memory |
| fault-tolerant-training | Elastic checkpoint/recovery for distributed training jobs |
| ml-feature-store | Point-in-time correct feature store with DuckDB + Arrow |

| Project | Description |
|---|---|
| orderbook-simulator | High-fidelity limit order book with microstructure modeling |
| alpha-signal-framework | Walk-forward backtesting with lookahead bias prevention |
| real-time-risk-engine | Portfolio VaR engine with Monte Carlo simulation and Greeks |
| market-data-lakehouse | High-throughput market data ingestion into columnar storage |
| llm-financial-agent | Multi-agent financial analysis with hallucination detection |
| low-latency-matching-engine | Price-time priority matching engine using cache-line alignment and a seqlock |
| polymarket-hft | Prediction market HFT strategy with sub-second order execution |

| Project | Description |
|---|---|
| adversarial-prompt-suite | Red-teaming harness for systematic jailbreak & prompt injection testing |
| bpe-tokenizer | BPE tokenizer from scratch with vocabulary analysis and merge tracking |
| jax-transformer-impl | Transformer implementation in JAX/Flax with XLA-optimized training loop |
