Replication streaming compression (per-replica): review vs #3531 base #18

Open
roshkhatri wants to merge 13 commits into streaming-compression-rio-pr from replication-streaming-compression-pr
@roshkhatri

Owner

Isolates the per-replica replication-stream compression work on top of the streaming-compression-rio (valkey-io#3531) base.

Review-only (both branches in this fork) — shows just the replication-compression diff, separated from the valkey-io#3531 base.

roshkhatri and others added 13 commits May 28, 2026 00:52
Adds replication wire compression on top of valkey-io#3531 with lz4 as the first
supported codec for the incremental replication stream. The replication
stream from primary to replica is wrapped in a VKCS envelope (using
STREAM_KIND_REPL) and compressed as a single long-lived frame at the
per-replica buffer layer. Default behavior is unchanged with
'replcompression no'; existing replicas without the new capability stay
uncompressed.

Negotiation is per-replica via the existing PSYNC handshake; a new
REPLICA_CAPA_COMPRESSION capability lets each side opt in independently.
Compression runs inline on the IO thread that owns the replica's write
job; no dedicated compression thread, no IPC, no reordering. Optional
sticky thread affinity (lazy ownership + event-driven rebalance) keeps
the long-lived LZ4 frame state on a single IO thread for cache locality.
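The per-replica opt-in described above can be sketched roughly as follows. Only `REPLICA_CAPA_COMPRESSION` and its value come from this commit message; the struct, its fields, and the function name are hypothetical and are not taken from the PR's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value as listed under "Internal constants" in this commit message. */
#define REPLICA_CAPA_COMPRESSION (1 << 4)

/* Hypothetical per-replica negotiation state (not the PR's actual struct). */
typedef struct {
    uint32_t replica_capa; /* capability bits the replica announced during the handshake */
    bool compress_stream;  /* negotiated outcome for this replica link */
} replLinkSketch;

/* Compress only when the primary has replcompression enabled AND this
 * replica advertised the capability. Any other combination stays plaintext,
 * which keeps older replicas working unchanged. */
static bool negotiateReplCompression(replLinkSketch *link, bool primary_replcompression) {
    link->compress_stream =
        primary_replcompression && (link->replica_capa & REPLICA_CAPA_COMPRESSION) != 0;
    return link->compress_stream;
}
```

Because the decision is stored per link, one primary could serve a compressed stream to one replica and a plaintext stream to another at the same time.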

Configs:
  replcompression                   bool, default no
  repl-compression-thread-affinity  bool, default yes
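
One way to picture what repl-compression-thread-affinity toggles is the sketch below. It models only the "lazy ownership" half (the first IO thread to pick up a replica's write job keeps it) and leaves out rebalancing. Every name here is hypothetical; the PR's real bookkeeping may differ.

```c
/* Hypothetical affinity state for one replica. */
typedef struct {
    int owner_tid;                 /* -1 until some IO thread first handles the job */
    unsigned long thread_switches; /* how often the handling thread changed */
} replAffinitySketch;

/* Returns the IO thread that should run the next write job.
 * With affinity on, the first thread to touch the replica stays the owner.
 * With affinity off, the candidate thread is used and switches are counted. */
static int pickIoThread(replAffinitySketch *a, int candidate_tid, int affinity_enabled) {
    if (affinity_enabled && a->owner_tid >= 0) return a->owner_tid;
    if (a->owner_tid >= 0 && a->owner_tid != candidate_tid) a->thread_switches++;
    a->owner_tid = candidate_tid;
    return candidate_tid;
}
```

The switch counter in this sketch plays the role that debug_thread_switches plays in the INFO output.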

Internal constants:
  REPLICA_CAPA_COMPRESSION         (1 << 4)
  REPL_COMPRESSION_ALGO            ALGO_LZ4
  REPL_COMPRESSION_LEVEL           0  (LZ4 fast mode)
  REPL_COMPRESSION_BATCH_LIMIT     1 MB raw input per dispatch
  REPL_STREAM_DECODER_OUTPUT_MAX   256 MB
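
To illustrate the batch limit, a minimal sketch of capping raw input per dispatch is shown here. It assumes "1 MB" means 1024 * 1024 bytes, which the commit message does not spell out, and the macro and function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: 1 MB is taken as 1024 * 1024 bytes. */
#define REPL_COMPRESSION_BATCH_LIMIT_SKETCH (1024 * 1024)

/* How many raw bytes to hand to the compressor in the next dispatch. */
static size_t replNextBatchLen(size_t pending_raw) {
    return pending_raw < REPL_COMPRESSION_BATCH_LIMIT_SKETCH
               ? pending_raw
               : (size_t)REPL_COMPRESSION_BATCH_LIMIT_SKETCH;
}

/* How many dispatches are needed to drain pending_raw bytes (ceiling division). */
static size_t replDispatchCount(size_t pending_raw) {
    return (pending_raw + REPL_COMPRESSION_BATCH_LIMIT_SKETCH - 1) /
           REPL_COMPRESSION_BATCH_LIMIT_SKETCH;
}
```

Capping the input per dispatch bounds how long a single write job can spend compressing on an IO thread.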

INFO replication per-replica fields:
  compression=lz4, compressed_bytes, uncompressed_bytes,
  compression_ratio, compression_errors, compression_cpu_usec,
  debug_compression_pending_drains, debug_thread_switches

INFO replication server-level (replica side):
  repl_decompression_errors, repl_decompression_cpu_usec,
  repl_decompressed_bytes_total, repl_apply_cpu_usec,
  repl_apply_batches

CI adds a test-replication-compression job that runs the
replication-tagged integration tests with replcompression=yes to
exercise compression across the broader replication test surface.

Tests: 18 streamReader push-mode unit tests + 3 replCompression unit
tests + 27 integration tests.

Performance (BlockMesh tweets, 3M keys x ~315 byte JSON values, 1,073
MB uncompressed per replica, 30 clients, pipeline 50, 2 cross-region
replicas):
  LZ4 level 0 (default): 0.48 ratio, 52% bandwidth saved, 2.5s
                         compression CPU per replica, <1% throughput
                         overhead vs uncompressed baseline.
Affinity ON vs OFF: throughput unchanged (118.6K vs 118.1K keys/s) but
thread switches drop from ~800K to ~30 per replica.
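
The figures above suggest that compression_ratio is compressed bytes divided by uncompressed bytes, since a 0.48 ratio lines up with 52% bandwidth saved. A small sketch of that arithmetic, with hypothetical function names:

```c
/* Ratio as implied by the numbers above: compressed / uncompressed.
 * Returns 0 when nothing has been sent yet, to avoid dividing by zero. */
static double compressionRatio(unsigned long long compressed_bytes,
                               unsigned long long uncompressed_bytes) {
    if (uncompressed_bytes == 0) return 0.0;
    return (double)compressed_bytes / (double)uncompressed_bytes;
}

/* Percentage of wire bandwidth saved for a given ratio. */
static double bandwidthSavedPercent(double ratio) {
    return (1.0 - ratio) * 100.0;
}
```

Applied to the run above, a 0.48 ratio on 1,073 MB of raw stream works out to approximately 515 MB on the wire per replica.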

ZSTD support follows in valkey-io#3798.

Related to valkey-io#3531.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
# Conflicts:
#	src/io_threads.c
#	src/networking.c
#	src/replication.c
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
The compression CI job (--config replcompression yes) ran replication-buffer.tcl
and the dual-channel buffer-memory tests, which assert exact replication
buffer/backlog memory and byte volumes. Compression legitimately changes those
(per-replica codec buffers add ~1MB scratch; fewer wire bytes let replicas keep
up), so the assertions fail under compression even though replication is
correct. Drop the repl-compression tag from replication-buffer.tcl and the two
dual-channel blocks holding the memory tests; they still run uncompressed in the
regular job. Functional dual-channel coverage stays in the compression job.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…io#3897)

## Summary

Fixes valkey-io#3008

> lets add assert checking that the object has a key in
> dbUntrackKeyWithVolatileItems and dbTrackKeyWithVolatileItems to be able
> to get a more explicit error in these cases

This is addressed in this PR.

Signed-off-by: ydsakshi <ydsakshi023@gmail.com>
clusterNode.shard_id is a fixed-size char[CLUSTER_NAMELEN] buffer
that is not guaranteed to be NUL-terminated, so it must be printed
with %.40s.
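
A minimal illustration of the precision-bounded format follows. It assumes CLUSTER_NAMELEN is 40, as the `%.40s` in the message implies, and uses the equivalent `%.*s` form; the helper name and the "shard " prefix are hypothetical.

```c
#include <stdio.h>

/* Assumption: matches the 40 in "%.40s" above. */
#define CLUSTER_NAMELEN 40

/* Formats a shard id that may lack a terminating NUL. The precision caps
 * how many bytes snprintf reads from shard_id, so it never runs past the
 * end of the fixed-size buffer looking for a terminator. */
static int formatShardId(char *out, size_t outlen, const char shard_id[CLUSTER_NAMELEN]) {
    return snprintf(out, outlen, "shard %.*s", CLUSTER_NAMELEN, shard_id);
}
```

A plain `%s` on the same buffer would keep reading until it hit a NUL byte somewhere in adjacent memory.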

This was introduced in valkey-io#2510.

Signed-off-by: Binbin <binloveplay1314@qq.com>
…valkey-io#3941)

valkey-io#2449 made the failover delay relative to cluster-node-timeout:
delay = min(cluster-node-timeout / 30, 500). Any cluster-node-timeout
below 30, including the legal minimum 0, collapses the delay to zero,
and `x % 0` is undefined behaviour.
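
The hazard can be reproduced with the formula above. The guard shown here is only one possible way to avoid the zero divisor; it is not necessarily the fix this commit applies, and both function names are hypothetical.

```c
/* delay = min(cluster-node-timeout / 30, 500), as stated above.
 * Integer division makes every timeout below 30 yield 0. */
static long long failoverDelayMs(long long cluster_node_timeout) {
    long long d = cluster_node_timeout / 30;
    return d < 500 ? d : 500;
}

/* Returns rnd reduced into [0, delay). When delay is 0 the modulo is
 * skipped entirely, because rnd % 0 is undefined behaviour in C. */
static long long boundedJitterMs(long long delay, long long rnd) {
    if (delay <= 0) return 0;
    return rnd % delay;
}
```

For example, a cluster-node-timeout of 29 gives a delay of 0, which is exactly the case where an unguarded modulo would be undefined.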

Signed-off-by: Binbin <binloveplay1314@qq.com>
…thakaggarwal97/valkey into replication-streaming-compression-pr

# Conflicts:
#	src/compression_stream.c
#	src/compression_stream.h
#	src/config.c
#	src/rdb.c
#	src/server.h
#	src/unit/test_compression.cpp
#	tests/integration/rdb-compression.tcl
#	valkey.conf
…tate and plaintext passthrough

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…val (fix macOS -Werror)

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
3 participants