Crabka is a Rust reimplementation of Apache Kafka. It speaks the Kafka wire protocol, stores records in Kafka-compatible log segments, runs metadata on KRaft, and is validated against the official JVM clients and command-line tooling.
Crabka is built for people who want Kafka-compatible streaming infrastructure without the JVM runtime: memory-safe Rust, async I/O, no ZooKeeper, no GC pauses, and a workspace that also includes native Rust clients, Schema Registry, a gRPC / Connect-RPC gateway, a Kubernetes operator, a partition rebalancer, and cross-cluster replication.
- Project Status
- Why Crabka
- Features
- Feature Compatibility
- Installation
- Quick Start
- Documentation
- Architecture
- Workspace Packages
- Development
- Performance
- Roadmap
- Contributing
- Security
- License
- Acknowledgements
Crabka is beta, pre-1.0 software. The workspace version is currently
0.3.7.
The Kafka-facing surface is broad: wire protocol, log storage, replication, KRaft metadata, authorization, quotas, tiered storage, transactions, consumer groups, share groups, Schema Registry, Kubernetes operation, and Rust clients are all implemented to meaningful depth and tested against JVM Kafka behavior.
The important caveat: Crabka is still greenfield infrastructure. It has no production users and does not yet promise on-disk compatibility across versions. Use it for evaluation, development, interoperability testing, and non-critical workloads while the project hardens.
- Kafka wire compatibility. Protocol codecs are generated from Apache Kafka message schemas and checked byte-for-byte against kafka-clients.
- Works with JVM tooling. The acceptance suite drives tools such as kafka-topics.sh, kafka-configs.sh, kafka-acls.sh, kafka-consumer-groups.sh, kafka-leader-election.sh, and kafka-reassign-partitions.sh against a live Crabka broker.
- Rust runtime. Crabka uses tokio, forbids unsafe code across the workspace, and avoids JVM heap tuning and GC behavior.
- KRaft-native. Metadata is stored in a native KRaft quorum; ZooKeeper mode is deliberately out of scope.
- Operations included. The repository ships a Kubernetes operator, Prometheus metrics, OTLP tracing, Helm charts, OCI images, and a Cruise-Control-style rebalancer.
- Rust ecosystem first-class. Native producer, consumer, admin, streams, schema-serde, gateway, connector, and replication crates live in the same workspace.
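
To make the byte-for-byte claim in the first bullet concrete, here is a minimal sketch of how a single Kafka request is framed on the wire. It assumes the publicly documented Kafka protocol layout for an ApiVersions v0 request (API key 18, request header v1, empty body). The function name `encode_api_versions_v0` is hypothetical and is not taken from any Crabka crate; Crabka's generated codecs remain the authoritative implementation.

```rust
/// Encode an ApiVersions v0 request frame.
/// Layout (all integers big-endian), per the Kafka protocol guide:
///   int32 size | int16 api_key | int16 api_version |
///   int32 correlation_id | int16 client_id length | client_id bytes
/// ApiVersions v0 has an empty request body.
/// NOTE: hypothetical helper for illustration, not a Crabka API.
fn encode_api_versions_v0(correlation_id: i32, client_id: &str) -> Vec<u8> {
    assert!(client_id.len() <= i16::MAX as usize, "client_id too long");

    let mut payload = Vec::new();
    payload.extend_from_slice(&18i16.to_be_bytes()); // api_key: ApiVersions
    payload.extend_from_slice(&0i16.to_be_bytes()); // api_version: 0
    payload.extend_from_slice(&correlation_id.to_be_bytes());
    payload.extend_from_slice(&(client_id.len() as i16).to_be_bytes());
    payload.extend_from_slice(client_id.as_bytes());

    // The size prefix counts every byte that follows it.
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as i32).to_be_bytes());
    frame.extend_from_slice(&payload);
    frame
}

fn main() {
    let frame = encode_api_versions_v0(7, "probe");
    println!("{} bytes: {:02x?}", frame.len(), frame);
}
```

A byte-level comparison against the JVM client amounts to checking that both sides produce identical frames for the same logical request.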
- Kafka wire protocol targeting the Apache Kafka 4.x schema surface.
- Byte-compatible record batches, log segments, indexes, transaction indexes, compaction, retention, JBOD, and tiered storage.
- KRaft metadata quorum, snapshots, dynamic voters, controller-only and broker-only process roles, replication, ISR maintenance, leader election, and partition reassignment.
- Idempotent and transactional producers, read-committed fetches, consumer groups, next-generation consumer protocol support, share groups, and Streams group task assignment.
- TLS/mTLS, SASL/PLAIN, SASL/SCRAM-SHA-256/512, SASL/OAUTHBEARER, SASL/GSSAPI/Kerberos, delegation tokens, ACLs, quotas, and an OPA authorizer bridge.
- Native Rust producer, consumer, admin, and KIP-1071 Streams clients.
- Confluent Schema Registry-compatible REST service.
- gRPC / Connect-RPC and HTTP gateway for Kafka topics.
- Connector framework SPI plus a Postgres logical-decoding source connector.
- Cross-cluster geo-replication service with MirrorMaker-2-compatible records.
- Kubernetes operator with Strimzi-style cluster resources.
- Helm charts for the operator, Schema Registry, and rebalancer.
- Multi-arch OCI images for broker, operator, Schema Registry, and benchmark driver.
- Prometheus metrics and OTLP distributed tracing.
- Benchmark harness for Crabka-vs-Strimzi comparisons.
Crabka's compatibility target is Kafka's wire, storage, and operational semantics. JVM implementation internals, ZooKeeper mode, and ZooKeeper-to-KRaft migration are not goals.
| Area | Status |
|---|---|
| Wire protocol and API version negotiation | Implemented |
| Kafka-compatible record batches, compression, and log segments | Implemented |
| KRaft metadata quorum and controller records | Implemented |
| Replication, ISR maintenance, leader election, and reassignment | Implemented |
| Idempotent and transactional produce / consume | Implemented |
| Classic and next-generation consumer groups | Implemented |
| Share groups / queues | Implemented |
| Tiered storage | Implemented, with segment-data JVM interop still being validated |
| TLS, SASL, delegation tokens, ACLs, and quotas | Implemented |
| Schema Registry-compatible REST service | Implemented |
| Kubernetes operator | Implemented, with some external listener surfaces still maturing |
| Rust Streams client | Partial versus the full JVM Kafka Streams library |
| Kafka Connect-equivalent runtime | Partial; connector SPI exists and continues to evolve |
| ZooKeeper mode and ZK-to-KRaft migration | Out of scope |
For the detailed per-KIP breakdown, see docs/KIP_MATRIX.md.
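
The "Kafka-compatible record batches" row refers to the on-disk and on-wire batch format that Apache Kafka documents for magic value 2. The sketch below parses the fixed 61-byte batch header under that documented layout. The `BatchHeader` type and its methods are hypothetical names chosen for illustration, not Crabka's types, and the sketch does not verify the CRC-32C checksum or decode individual records.

```rust
/// Copy `N` bytes starting at `at` into a fixed-size array.
fn bytes_at<const N: usize>(buf: &[u8], at: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&buf[at..at + N]);
    out
}

/// Selected fields of a Kafka record batch header (magic 2).
/// NOTE: hypothetical type for illustration, not a Crabka API.
#[allow(dead_code)]
#[derive(Debug)]
struct BatchHeader {
    base_offset: i64,       // bytes 0..8
    batch_length: i32,      // bytes 8..12
    magic: i8,              // byte 16
    crc: u32,               // bytes 17..21 (CRC-32C, not checked here)
    attributes: i16,        // bytes 21..23
    last_offset_delta: i32, // bytes 23..27
    producer_id: i64,       // bytes 43..51
    record_count: i32,      // bytes 57..61
}

impl BatchHeader {
    const LEN: usize = 61;

    /// Returns None if the buffer is too short or the magic byte is not 2.
    fn parse(buf: &[u8]) -> Option<Self> {
        if buf.len() < Self::LEN || buf[16] != 2 {
            return None;
        }
        Some(Self {
            base_offset: i64::from_be_bytes(bytes_at(buf, 0)),
            batch_length: i32::from_be_bytes(bytes_at(buf, 8)),
            magic: buf[16] as i8,
            crc: u32::from_be_bytes(bytes_at(buf, 17)),
            attributes: i16::from_be_bytes(bytes_at(buf, 21)),
            last_offset_delta: i32::from_be_bytes(bytes_at(buf, 23)),
            producer_id: i64::from_be_bytes(bytes_at(buf, 43)),
            record_count: i32::from_be_bytes(bytes_at(buf, 57)),
        })
    }

    /// Low three attribute bits: 0 none, 1 gzip, 2 snappy, 3 lz4, 4 zstd.
    fn compression_codec(&self) -> i16 {
        self.attributes & 0x07
    }

    /// Attribute bit 4 marks a transactional batch.
    fn is_transactional(&self) -> bool {
        self.attributes & 0x10 != 0
    }
}

fn main() {
    // Build a synthetic header: base offset 42, zstd + transactional, 3 records.
    let mut buf = [0u8; BatchHeader::LEN];
    buf[0..8].copy_from_slice(&42i64.to_be_bytes());
    buf[16] = 2;
    buf[21..23].copy_from_slice(&0x0014i16.to_be_bytes());
    buf[57..61].copy_from_slice(&3i32.to_be_bytes());

    let header = BatchHeader::parse(&buf).expect("valid header");
    println!(
        "{:?} codec={} transactional={}",
        header,
        header.compression_codec(),
        header.is_transactional()
    );
}
```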
Crabka is a Rust workspace. The pinned toolchain lives in rust-toolchain.toml.
git clone https://github.com/robot-head/crabka.git
cd crabka
cargo build --workspace

To install the local broker and CLI binaries:
cargo install --path crates/cli
cargo install --path crates/broker

Crabka publishes its Rust crates independently. For example:
cargo add crabka-client-producer
cargo add crabka-client-consumer
cargo add crabka-client-admin

Release images are published to both GHCR and Docker Hub:
docker pull ghcr.io/robot-head/crabka-broker:latest
docker pull robothead/crabka-broker:latest

Image build, signing, SBOM, and attestation details are in packaging/README.md.
helm repo add crabka https://robot-head.github.io/crabka/charts
helm repo update
helm search repo crabka

Chart usage and provenance verification are documented in charts/README.md.
Start a single local broker from the source tree:
export CRABKA_CLUSTER_ID=00000000-0000-0000-0000-000000000001
rm -rf target/crabka-data
cargo run -p crabka-cli --bin crabka -- format \
--log-dir target/crabka-data \
--cluster-id "$CRABKA_CLUSTER_ID" \
--standalone \
--node-id 1 \
--controller-listener 127.0.0.1:9093
cargo run -p crabka-broker --bin crabka-broker -- \
--log-dir target/crabka-data \
--cluster-id "$CRABKA_CLUSTER_ID" \
--broker-id 1 \
--listen-addr 127.0.0.1:9092

In another shell, use normal Kafka tooling against the broker:
kafka-topics.sh \
--bootstrap-server 127.0.0.1:9092 \
--create \
--topic demo \
--partitions 1 \
--replication-factor 1
kafka-console-producer.sh \
--bootstrap-server 127.0.0.1:9092 \
--topic demo
kafka-console-consumer.sh \
--bootstrap-server 127.0.0.1:9092 \
--topic demo \
--from-beginning

crabka format is a one-time initialization step for an empty log directory. To start over locally, stop the broker and remove target/crabka-data.
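
Before pointing Kafka tooling at the broker, it can help to confirm that the listener from the steps above is accepting TCP connections. The following is a minimal sketch using only the Rust standard library. The `is_listening` helper is a hypothetical name, and the address is the one used in this quick start. A successful connect only shows that the port is open, not that the broker has finished starting.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// NOTE: hypothetical helper for illustration; it checks reachability only
/// and does not speak the Kafka protocol.
fn is_listening(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false, // not a literal ip:port address
    }
}

fn main() {
    // Listener address from the quick-start broker command.
    let addr = "127.0.0.1:9092";
    let up = is_listening(addr, Duration::from_secs(1));
    println!("{} accepting connections: {}", addr, up);
}
```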
- docs.rs package documentation
- KIP implementation matrix
- Contributing guide
- Known issues
- Container image docs
- Helm chart docs
- Benchmark harness
- Project website
Crabka is organized as a Cargo workspace. The main runtime path is:
flowchart LR
clients[Kafka and Crabka clients] --> broker[crabka-broker]
broker --> log[Kafka-compatible log]
broker --> kraft[KRaft metadata quorum]
broker --> remote[Tiered storage]
broker --> metrics[Prometheus / OTLP]
operator[crabka-operator] --> broker
registry[crabka-schema-registry] --> broker
gateway[crabka-grpc-gateway] --> broker
rebalancer[crabka-rebalancer] --> broker
replicator[crabka-replicator] --> broker
| Layer | Key crates |
|---|---|
| Protocol and records | crabka-protocol, crabka-protocol-codegen, crabka-compression, crabka-records-legacy |
| Storage and metadata | crabka-log, crabka-metadata, crabka-raft, crabka-kraft-core, crabka-voters |
| Broker runtime | crabka-broker, crabka-authz, crabka-security, crabka-telemetry |
| Clients | crabka-client-core, crabka-client-producer, crabka-client-consumer, crabka-client-admin, crabka-client-streams |
| Services | crabka-schema-registry, crabka-grpc-gateway, crabka-replicator |
| Connect | crabka-connect, crabka-connect-derive, crabka-connect-postgres, crabka-schema-serde |
| Operations | crabka-cli, crabka-operator, crabka-rebalancer, crabka-bench-driver, crabka-docgen |
| Package | Purpose |
|---|---|
| crabka-broker | Kafka-compatible broker runtime |
| crabka-cli | Operator CLI, installed as crabka |
| crabka-client-admin | Admin client |
| crabka-client-consumer | Subscribe-style consumer client |
| crabka-client-core | Connection management and request dispatch |
| crabka-client-producer | Idempotent and transactional producer client |
| crabka-client-streams | KIP-1071 Streams client and runtime |
| crabka-schema-registry | Confluent Schema Registry-compatible service |
| crabka-grpc-gateway | gRPC / Connect-RPC and HTTP gateway |
| crabka-operator | Kubernetes operator |
| crabka-rebalancer | Cruise-Control-style partition rebalancer |
| crabka-replicator | Cross-cluster geo-replication service |
| crabka-connect | Connector framework SPI |
| crabka-connect-postgres | Postgres logical-decoding source connector |
| crabka-schema-serde | Confluent-compatible schema serdes |
| crabka-protocol | Kafka wire-protocol codec |
| crabka-log | Kafka-compatible log segment reader/writer |
| crabka-raft | KRaft metadata quorum |
| crabka-remote-storage | KIP-405 tiered-storage SPI |
| crabka-remote-storage-topic | Topic-backed remote-log metadata manager |
| crabka-security | TLS, SASL, SCRAM, OAuth, and Kerberos utilities |
| crabka-authz | Kafka ACL authorization evaluator |
- Rust toolchain from rust-toolchain.toml
- JDK 17 for JVM differential tests
- Docker or a compatible container runtime for integration tests that use Kafka containers
cargo build --workspace
cargo fmt --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

Run JVM-backed differential and acceptance tests:
(cd tools/oracle && ./gradlew installDist)
cargo test --workspace -- --include-ignored

Regenerate protocol code after editing Kafka schemas:
./tools/regenerate.sh
git diff crates/protocol/generated

More contributor workflow details are in CONTRIBUTING.md.
The benchmark harness compares Crabka and Strimzi-managed Apache Kafka under the same Kubernetes resources and the same Kafka wire-protocol load driver.
Highlights from the current benchmark report:
- Crabka uses a low-hundreds-of-MiB working set where comparable Strimzi brokers are JVM-heap dominated.
- Small-record and acks=all workloads are competitive with, or faster than, the comparable Kafka setup.
- Fetch responses use zero-copy sendfile(2) where supported; Linux kTLS keeps encrypted fetches on the zero-copy path.
Read the full report: Crabka vs Strimzi on Kubernetes.
Near-term work focuses on production hardening and compatibility depth:
- More JVM interop coverage for edge-case protocol and storage behavior.
- Continued Kubernetes operator maturity.
- More complete Connect runtime and connector surfaces.
- Better documentation for deployment, security, and operations.
- Compatibility and upgrade testing as the project approaches 1.0.
Detailed implementation status lives in docs/KIP_MATRIX.md and the design notes under docs/superpowers/specs.
Contributions are welcome. To get started:
- Read CONTRIBUTING.md.
- Open an issue for substantial design or compatibility changes.
- Keep Kafka wire and behavior compatibility as the primary constraint.
- Run cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, and the relevant tests before opening a pull request.
Conventional commits are used by release-plz for automated versioning and
changelog generation.
Crabka includes authentication, authorization, TLS, mTLS, delegation-token, and OPA integration work, but the project is still beta infrastructure. Do not use it as the sole security boundary for critical production systems yet.
If you believe you have found a security vulnerability, please avoid posting exploit details in a public issue. Use GitHub private vulnerability reporting if it is enabled for the repository, or contact the maintainers privately through the repository owner.
Crabka is licensed under the Apache License, Version 2.0. See LICENSE and NOTICE.
Crabka is a derivative, compatibility-focused implementation of Apache Kafka protocols, record formats, and operational semantics. The project depends on the Apache Kafka schema corpus and JVM client/tool behavior as its compatibility oracle.
