Backport bug fixes for 9.0.2#3129
Merged
Merged
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 9.0 #3129 +/- ##
=======================================
Coverage 72.61% 72.61%
=======================================
Files 128 128
Lines 69972 69989 +17
=======================================
+ Hits 50810 50826 +16
- Misses 19162 19163 +1
🚀 New features to boost your workflow:
|
ranshid
reviewed
Jan 29, 2026
enjoy-binbin
approved these changes
Jan 30, 2026
GitHub has deprecated older macOS runners, and macos-13 is no longer supported. 1. The latest version of cross-platform-actions/action does allow running on ubuntu-latest (Linux runner) and does not strictly require macOS. 2. Previously, cross-platform-actions/action@v0.22.0 used runs-on: macos-13. I checked the latest version of cross-platform-actions, and the official examples now use runs-on: ubuntu. I think we can switch from macOS to Ubuntu. --------- Signed-off-by: Vitah Lin <vitahlin@gmail.com>
…ey-io#2945) Previously, only changes to argv[0] were considered when deciding whether to re-prepare the command and recompute the cluster slot. This caused issues where changes to the argument count (argc) would not trigger the necessary prepare in case additional keys were injected to the command This fix adds a check to ensure that changes to argc are properly considered. Signed-off-by: Patrik Hermansson <phermansson@gmail.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
…-io#2983) There is a crash in freeReplicationBacklog: ``` Discarding previously cached primary state. ASSERTION FAILED 'listLength(server.replicas) == 0' is not true freeReplicationBacklog ``` The reason is that during dual channel operation, the RDB channel is protected. In the chained replica case, `disconnectReplicas` is called to disconnect all replica clients, but since the RDB channel is protected, `freeClient` does not actually free the replica client. Later, we encounter an assertion failure in `freeReplicationBacklog`. ``` void replicationAttachToNewPrimary(void) { /* Replica starts to apply data from new primary, we must discard the cached * primary structure. */ serverAssert(server.primary == NULL); replicationDiscardCachedPrimary(); /* Cancel any in progress imports (we will now use the primary's) */ clusterCleanSlotImportsOnFullSync(); disconnectReplicas(); /* Force our replicas to resync with us as well. */ freeReplicationBacklog(); /* Don't allow our chained replicas to PSYNC. */ } ``` Dual channel replication was introduced in valkey-io#60. Signed-off-by: Binbin <binloveplay1314@qq.com>
…n and data corruption (valkey-io#3004) When loading AOF in cluster mode, keys inside a MULTI/EXEC block could be inserted into wrong hash slots, causing key duplication and data corruption. The root cause was the slot caching optimization in getKeySlot(). This optimization reuses a cached slot value to avoid recalculating the hash for every key operation. However, when replaying AOF, a transaction may contain commands affecting keys in different slots. The cached slot from a previous command (e.g., SET k1) would incorrectly be used for subsequent commands in the transaction (e.g., SET k0), causing k0 to be stored in k1's slot. The existing code already skipped this optimization for replicated clients (commands from primary) using isReplicatedClient(). This change extends that to also skip for AOF clients by using mustObeyClient() instead, which covers both replicated clients and the AOF client. Fixes valkey-io#2995, introduced in valkey-io#1949. Signed-off-by: aditya.teltia <teltia.aditya22@gmail.com>
…erhead (valkey-io#3005) The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually overflowed negative value), as reported in valkey-io#2994. ## Double-Counted database memory When server starts, the global variable `server.initial_memory_usage` is used to record a memory baseline in InitServerLast. This `server.initial_memory_usage` has clearly included initial database memory, since databases are created in initServer. In function getMemoryOverheadData, the `mem_total` is firstly assigned the baseline, which includes initial database memory. And then all extra memory usage of databases are added to mem_total. The initial database memory are therefore counted TWICE. This eventually caused wrongly larger `used_memory_overhead`. For a database with only a couple of keys, the `used_memory_overhead` is easily larger than `used_memory` and causes an overflowed `used_memory_dataset`. ## Missed Empty Databases In function getMemoryOverheadData(), kvstores without any allocated hashtable are ignored from calculation: ```c if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue; ``` However, even the kvstore has no allocated hashtable, there are still some memory allocated by kvstoreCreate(), including `hashtable_size_index`, which can be larger than 128 KiB. On the contrary, this caused wrongly smaller `used_memory_overhead` for an empty database. When we insert only ONE key to the database, the database is suddenly taken into account, and `used_memory_overhead` will increase (for `used_memory_dataset` decrease) by more than 128 KiB due to the single key insertion. Signed-off-by: Ace Breakpoint <chemistudio@gmail.com> Signed-off-by: bpint <chemistudio@gmail.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Makes our tests possible to run with TCL 9. The latest Fedora now has TCL 9.0 and it's working now, including the TCL TLS package. (This wasn't working earlier due to some packaging errors for TCL packages in Fedora, which have been fixed now.) This PR also removes the custom compilation of TCL 8 used in our Daily jobs and uses the system default TCL version instead. The TCL version depends on the OS. For the latest Fedora, you get 9.0, for macOS you get 8.5 and for most other OSes you get 8.6. The checks for TCL 8.7 are removed, because 8.7 doesn't exist. It was never released. --------- Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
…copy-avoidance (valkey-io#3046) In releaseBufReferences(), clusterSlotStatsAddNetworkBytesOutForSlot() is called in the IO thread after writing the reply to the client, and (since valkey-io#2652) it's also done in the normal code path in the main thread (afterCommand() -> clusterSlotStatsAddNetworkBytesOutForUserClient()). This means the output bytes are recorded twice in the cluster slot stats. Fixes valkey-io#3043. Bug was introduced in valkey-io#2652 (9.0.1). Signed-off-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
When using XREAD STREAMS <stream> + on an empty stream created with MKSTREAM, valkey returns an error instead of nil. This happens because is missing a check on the stream length. The fix adds the length check on the condition. Fixes valkey-io#2728 Signed-off-by: diego-ciciani01 <diego.ciciani@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
…alkey-io#3086) Currently after valkey-io#2652 in copy avoidance path we unconditionally track `c->net_output_bytes_curr_cmd` even when `commandlog-request-larger-than -1`. This PR provides ability to skip that accounting in copy avoidance path based on config value. If `commandlog-request-larger-than -1` then performance is same as before valkey-io#2652. Also added tracking of `c->net_output_bytes_curr_cmd` in IO thread if it is not tracked in main thread based on decision made in main thread. Read Performance (write performance is the same): ``` With this change: Summary: throughput summary: 2191732.75 requests per second latency summary (msec): avg min p50 p95 p99 max 1.720 0.072 1.743 1.919 2.647 23.983 Unstable: Summary: throughput summary: 1658197.25 requests per second latency summary (msec): avg min p50 p95 p99 max 2.299 0.120 2.343 2.503 3.319 4.791 Config: commandlog-request-larger-than -1 ^^ without this performance is just like on unstable branch databases 1 save "" appendonly no rdbcompression no activedefrag no maxclients 1000 io-threads 9 protected-mode no hz 10 maxmemory 2gb ``` --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
This change amends valkey-io#3086 in which `commandlog-request-larger-than` was mixed up with `commandlog-reply-larger-than` while checking for config values and in tests. --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
85f43fa to
eeba9e3
Compare
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
eeba9e3 to
529dad6
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Includes release notes for these commits and for the bug fixes backported earlier in #3111 and #3132.