Backport bug fixes for 9.0.2 by zuiderkwast · Pull Request #3129 · valkey-io/valkey

zuiderkwast · 2026-01-29T15:25:35Z

Includes release notes for these commits and for the bug fixes backported earlier in #3111 and #3132.

codecov · 2026-01-29T15:44:27Z

Codecov Report

❌ Patch coverage is 69.23077% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.61%. Comparing base (d26a103) to head (529dad6).
⚠️ Report is 12 commits behind head on 9.0.

Files with missing lines	Patch %	Lines
src/module.c	0.00%	7 Missing ⚠️
src/networking.c	78.26%	5 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##              9.0    #3129   +/-   ##
=======================================
  Coverage   72.61%   72.61%           
=======================================
  Files         128      128           
  Lines       69972    69989   +17     
=======================================
+ Hits        50810    50826   +16     
- Misses      19162    19163    +1

Files with missing lines	Coverage Δ
src/db.c	`93.18% <100.00%> (ø)`
src/expire.c	`97.83% <100.00%> (+0.01%)`	⬆️
src/object.c	`81.94% <100.00%> (ø)`
src/server.c	`88.43% <100.00%> (-0.11%)`	⬇️
src/t_stream.c	`93.38% <100.00%> (ø)`
src/networking.c	`88.39% <78.26%> (-0.07%)`	⬇️
src/module.c	`9.75% <0.00%> (-0.01%)`	⬇️

... and 13 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

GitHub has deprecated older macOS runners, and macos-13 is no longer supported. 1. The latest version of cross-platform-actions/action does allow running on ubuntu-latest (Linux runner) and does not strictly require macOS. 2. Previously, cross-platform-actions/action@v0.22.0 used runs-on: macos-13. I checked the latest version of cross-platform-actions, and the official examples now use runs-on: ubuntu. I think we can switch from macOS to Ubuntu. --------- Signed-off-by: Vitah Lin <vitahlin@gmail.com>

…ey-io#2945) Previously, only changes to argv[0] were considered when deciding whether to re-prepare the command and recompute the cluster slot. This caused issues where changes to the argument count (argc) would not trigger the necessary prepare in case additional keys were injected to the command This fix adds a check to ensure that changes to argc are properly considered. Signed-off-by: Patrik Hermansson <phermansson@gmail.com>

…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>

…-io#2983) There is a crash in freeReplicationBacklog: ``` Discarding previously cached primary state. ASSERTION FAILED 'listLength(server.replicas) == 0' is not true freeReplicationBacklog ``` The reason is that during dual channel operation, the RDB channel is protected. In the chained replica case, `disconnectReplicas` is called to disconnect all replica clients, but since the RDB channel is protected, `freeClient` does not actually free the replica client. Later, we encounter an assertion failure in `freeReplicationBacklog`. ``` void replicationAttachToNewPrimary(void) { /* Replica starts to apply data from new primary, we must discard the cached * primary structure. */ serverAssert(server.primary == NULL); replicationDiscardCachedPrimary(); /* Cancel any in progress imports (we will now use the primary's) */ clusterCleanSlotImportsOnFullSync(); disconnectReplicas(); /* Force our replicas to resync with us as well. */ freeReplicationBacklog(); /* Don't allow our chained replicas to PSYNC. */ } ``` Dual channel replication was introduced in valkey-io#60. Signed-off-by: Binbin <binloveplay1314@qq.com>

…n and data corruption (valkey-io#3004) When loading AOF in cluster mode, keys inside a MULTI/EXEC block could be inserted into wrong hash slots, causing key duplication and data corruption. The root cause was the slot caching optimization in getKeySlot(). This optimization reuses a cached slot value to avoid recalculating the hash for every key operation. However, when replaying AOF, a transaction may contain commands affecting keys in different slots. The cached slot from a previous command (e.g., SET k1) would incorrectly be used for subsequent commands in the transaction (e.g., SET k0), causing k0 to be stored in k1's slot. The existing code already skipped this optimization for replicated clients (commands from primary) using isReplicatedClient(). This change extends that to also skip for AOF clients by using mustObeyClient() instead, which covers both replicated clients and the AOF client. Fixes valkey-io#2995, introduced in valkey-io#1949. Signed-off-by: aditya.teltia <teltia.aditya22@gmail.com>

…erhead (valkey-io#3005) The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually overflowed negative value), as reported in valkey-io#2994. ## Double-Counted database memory When server starts, the global variable `server.initial_memory_usage` is used to record a memory baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database memory, since databases are created in initServer. In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes initial database memory. And then all extra memory usage of databases are added to mem_total. The initial database memory are therefore counted TWICE. This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys, the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`. ## Missed Empty Databases In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation: ```c if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue; ``` However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(), including `hashtable_size_index`, which can be larger than 128 KiB. On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase (for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion. Signed-off-by: Ace Breakpoint <chemistudio@gmail.com> Signed-off-by: bpint <chemistudio@gmail.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>

Makes our tests possible to run with TCL 9. The latest Fedora now has TCL 9.0 and it's working now, including the TCL TLS package. (This wasn't working earlier due to some packaging errors for TCL packages in Fedora, which have been fixed now.) This PR also removes the custom compilation of TCL 8 used in our Daily jobs and uses the system default TCL version instead. The TCL version depends on the OS. For the latest Fedora, you get 9.0, for macOS you get 8.5 and for most other OSes you get 8.6. The checks for TCL 8.7 are removed, because 8.7 doesn't exist. It was never released. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

…copy-avoidance (valkey-io#3046) In releaseBufReferences(), clusterSlotStatsAddNetworkBytesOutForSlot() is called in the IO thread after writing the reply to the client, and (since valkey-io#2652) it's also done in the normal code path in the main thread (afterCommand() -> clusterSlotStatsAddNetworkBytesOutForUserClient()). This means the output bytes are recorded twice in the cluster slot stats. Fixes valkey-io#3043. Bug was introduced in valkey-io#2652 (9.0.1). Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

When using XREAD STREAMS <stream> + on an empty stream created with MKSTREAM, valkey returns an error instead of nil. This happens because is missing a check on the stream length. The fix adds the length check on the condition. Fixes valkey-io#2728 Signed-off-by: diego-ciciani01 <diego.ciciani@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

…alkey-io#3086) Currently after valkey-io#2652 in copy avoidance path we unconditionally track `c->net_output_bytes_curr_cmd` even when `commandlog-request-larger-than -1`. This PR provides ability to skip that accounting in copy avoidance path based on config value. If `commandlog-request-larger-than -1` then performance is same as before valkey-io#2652. Also added tracking of `c->net_output_bytes_curr_cmd` in IO thread if it is not tracked in main thread based on decision made in main thread. Read Performance (write performance is the same): ``` With this change: Summary: throughput summary: 2191732.75 requests per second latency summary (msec): avg min p50 p95 p99 max 1.720 0.072 1.743 1.919 2.647 23.983 Unstable: Summary: throughput summary: 1658197.25 requests per second latency summary (msec): avg min p50 p95 p99 max 2.299 0.120 2.343 2.503 3.319 4.791 Config: commandlog-request-larger-than -1 ^^ without this performance is just like on unstable branch databases 1 save "" appendonly no rdbcompression no activedefrag no maxclients 1000 io-threads 9 protected-mode no hz 10 maxmemory 2gb ``` --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>

This change amends valkey-io#3086 in which `commandlog-request-larger-than` was mixed up with `commandlog-reply-larger-than` while checking for config values and in tests. --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

zuiderkwast requested a review from ranshid January 29, 2026 15:25

github-actions Bot assigned zuiderkwast Jan 29, 2026

zuiderkwast requested a review from a team January 29, 2026 15:25

ranshid reviewed Jan 29, 2026

View reviewed changes

Comment thread 00-RELEASENOTES Outdated

Comment thread 00-RELEASENOTES Outdated

Comment thread 00-RELEASENOTES Outdated

Comment thread 00-RELEASENOTES Outdated

Comment thread 00-RELEASENOTES Outdated

enjoy-binbin approved these changes Jan 30, 2026

View reviewed changes

vitahlin and others added 11 commits January 30, 2026 12:25

zuiderkwast force-pushed the backport-9-0-2 branch 3 times, most recently from 85f43fa to eeba9e3 Compare February 3, 2026 10:57

Release notes and version file for 9.0.2

529dad6

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>

zuiderkwast force-pushed the backport-9-0-2 branch from eeba9e3 to 529dad6 Compare February 3, 2026 10:58

zuiderkwast merged commit 1ac4cfe into valkey-io:9.0 Feb 3, 2026
63 of 65 checks passed

zuiderkwast deleted the backport-9-0-2 branch February 3, 2026 13:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Backport bug fixes for 9.0.2#3129

Backport bug fixes for 9.0.2#3129
zuiderkwast merged 12 commits into
valkey-io:9.0from
zuiderkwast:backport-9-0-2

zuiderkwast commented Jan 29, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Jan 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

Conversation

zuiderkwast commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

zuiderkwast commented Jan 29, 2026 •

edited

Loading

codecov Bot commented Jan 29, 2026 •

edited

Loading