Add on-demand parameterized benchmark workflow for PRs#3433
Conversation
75691eb to
6fe33f5
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3433 +/- ##
============================================
- Coverage 76.77% 76.70% -0.07%
============================================
Files 162 162
Lines 80706 80706
============================================
- Hits 61962 61909 -53
- Misses 18744 18797 +53 🚀 New features to boost your workflow:
|
|
Will this work for set and zset commands like sadd and zadd? Were you able to test this with other data types? |
rainsupreme
left a comment
There was a problem hiding this comment.
tbh I'm not a fan of having so much code embedded in the workflow instead of separate script files, but perhaps this is fine for a MVP.
Yes it will work, we have tests for this on valkey-perf-benchmark repo
These are the scripts to get the merge commit hash from the PR as this WF does not run with the PR context. I dont think we will be re-using these. If we do, we can make them into utils script later |
hpatro
left a comment
There was a problem hiding this comment.
Few questions:
- What's the queue it goes into ? Does it hit any limit (max queue size) ?
- Could we also print out the other configs which a user might have overriden?
- Is this workflow limited to maintainers? How will other maintainers discover this capability?
| description: "Comma-separated commands to benchmark (e.g. SET,GET,HSET,LPUSH)" | ||
| required: false | ||
| type: string | ||
| default: "SET,GET" |
There was a problem hiding this comment.
Is this same as -t of valkey-benchmark ? I would prefer flexibility in the parameters to the valkey-benchmark underneath. Could I just feed the parameters of valkey-benchmark as is ? I would be able to control the data size / keyspace length / pipeline / etc ?
@rainsupreme would like to hear your thoughts on this.
There was a problem hiding this comment.
This is kept limited to the -t option of valkey-benchmarks. Righ now I have added only commands, cluster_mode, pipelines, io_threads, and data_size as parameters, but we can add more.
To keep the benchmark running for limited time keyspace length is contant at 3M and we are benchmarking for a constant time.
I will add a parameter to add custom configs to valkey server, if that what we are looking for.
There was a problem hiding this comment.
If it's equivalent to -t then we should perhaps call it "predefined commands" in the description, not just commands.
If we can give custom commands to valkey-benchmark, for example valkey-benchmark SET __rand_1st__ __rand_2nd__ ';' GET __rand_1st__ (set a key and then get the same key afterwards) it would be very powerful.
There was a problem hiding this comment.
Hey @zuiderkwast , it would need changes in the valkey-perf-benchmark package. Lets get this merged, I will add that feature and open another PR for that change
sarthakaggarwal97
left a comment
There was a problem hiding this comment.
Are there limits to data sizes and other parameters? I want to confirm if excessive load can be generated by increasing the datasize to an extremely high value. Ideally we should upper / lower bound the accepted values
52e2f56 to
69b1fe8
Compare
|
Addressed the review feedback in 69b1fe8: Input validation bounds (@sarthakaggarwal97):
The workflow now fails early with a clear error if any value is out of bounds. Display overridden configs in PR comment (@hpatro):
Only non-default overrides are displayed. server_cpu_range safety cap: |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new GitHub Actions workflow for on-demand PR benchmarking that resolves a PR, sets up a build environment on an ARM64 runner, generates parameterized benchmark configurations, executes comparative benchmarks, and reports results via artifact uploads and PR comments. ChangesOn-Demand PR Benchmark Workflow
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/benchmark-on-demand.yml:
- Around line 40-43: The workflow currently grants pull-requests: write and
issues: write at the top-level permissions; restrict them by leaving only
minimal workflow-level permission (e.g., contents: read) and move job-specific
permissions into the relevant jobs: add issues: write (and pull-requests:
read/write if needed) to the "benchmark" job so it can call
github.rest.issues.createComment, and add pull-requests: read to the
"resolve-pr" job so it can call github.rest.pulls.get; ensure job names
"benchmark" and "resolve-pr" are used to locate where to add these scoped
permissions.
- Around line 145-149: The workflow inlines github.event.inputs directly into
Bash/JS variables (COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES, CLUSTER_MODE and
the JS vars commands/cluster/dataSize/ioThreads/pipelines/issue_number), which
is vulnerable to quote-injection; instead, move all user inputs into the job env
using toJson(...) (e.g. env: COMMANDS: ${{ toJson(github.event.inputs.commands)
}}), then parse them safely inside the Bash step (with jq or shell parsing from
the environment) and in github-script/Node by JSON.parse(process.env.COMMANDS)
or equivalent; update the steps that reference
COMMANDS/DATA_SIZE/IO_THREADS/PIPELINES/CLUSTER_MODE and the github-script
inputs to read from the parsed env variables rather than interpolating
github.event.inputs directly.
- Around line 71-73: The workflow assumes pr.data.merge_commit_sha is always
present but GitHub can return null for open PRs; add a guard after retrieving
pr.data: if merge_commit_sha is null/undefined, call core.setFailed with a clear
message (e.g. "merge_commit_sha unavailable; cannot checkout PR merge commit")
and exit/return to fail early instead of proceeding to setOutput and later using
the merge_commit_sha for the checkout; keep setting head_sha and base_ref as
before only after confirming merge_commit_sha exists so the checkout ref at line
85 won't later fail unexpectedly.
- Line 78: The workflow's timeout-minutes is set to 7200 which is far higher
than the benchmark's expected runtime (duration: 180s, warmup: 30s, and
invocation uses --runs 3); change the timeout-minutes value to a much tighter
bound (e.g., 60) to prevent stuck jobs from locking self-hosted runners—update
the timeout-minutes key in the workflow file accordingly while keeping the
existing benchmark duration/warmup and --runs settings intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 336de460-5fe2-4db1-b2ca-ca445d13e669
📒 Files selected for processing (1)
.github/workflows/benchmark-on-demand.yml
zuiderkwast
left a comment
There was a problem hiding this comment.
It's a good start! I suppose we can merge it now and extend it later.
5f75834 to
a382095
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
deps/linenoise/linenoise.c (1)
970-997:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEnsure CSI buffer is always NUL-terminated and only consume bytes when
read()returns 1In
deps/linenoise/linenoise.c(CSI parsing block around lines 970-997):
- The loop condition
read(l.ifd, &additionalChar, 1) != -1still enters onread()==0(EOF) and appends an undefinedadditionalChar.seqBufferis only NUL-terminated when a final byte is detected; if the loop exits via length limit/EOF/error, the subsequentstrcmp(seqBuffer, ...)can read past the valid bytes.🛠️ Proposed fix
- char seqBuffer[SEQ_BUFFER_MAX_LENGTH]; + char seqBuffer[SEQ_BUFFER_MAX_LENGTH] = {0}; int i = 0; seqBuffer[i++] = seq[1]; /* If first param is digit or ';', read more until we see a final in @~ */ char additionalChar; - while (i < SEQ_BUFFER_MAX_LENGTH-1 && read(l.ifd, &additionalChar, 1) != -1) { + while (i < SEQ_BUFFER_MAX_LENGTH - 1 && + read(l.ifd, &additionalChar, 1) == 1) { seqBuffer[i++] = additionalChar; if (additionalChar >= '@' && additionalChar <= '~') { /* CSI final byte */ - seqBuffer[i] = '\0'; break; } } + seqBuffer[i] = '\0';🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deps/linenoise/linenoise.c` around lines 970 - 997, The CSI parsing loop appends bytes even when read() returns 0 or -1 and only NUL-terminates seqBuffer when a CSI final is seen, which can lead to unterminated reads in the subsequent strcmp; change the loop that reads into seqBuffer to only append additionalChar when read(l.ifd, &additionalChar, 1) == 1, stop on non-1 return or when buffer limit is reached, and after the loop always NUL-terminate seqBuffer (ensure i < SEQ_BUFFER_MAX_LENGTH then set seqBuffer[i] = '\0'); keep the existing comparisons (strcmp(seqBuffer, "1;5D"), "1;3D", "1;5C", "1;3C", "3~") and call linenoiseEditMoveWordLeft/Right or linenoiseEditDelete as before.
♻️ Duplicate comments (3)
.github/workflows/benchmark-on-demand.yml (3)
71-73:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail early when
merge_commit_shais missing.Open PRs can temporarily return no merge commit SHA; without a guard, checkout fails later with a less actionable error.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 71 - 73, Check for the presence of pr.data.merge_commit_sha before calling core.setOutput('merge_commit_sha', ...) and fail fast if it's missing: if pr.data.merge_commit_sha is falsy, log a clear error and exit (or set a workflow failure) so the later checkout step does not run with a missing merge SHA; update the block that currently calls core.setOutput('merge_commit_sha', pr.data.merge_commit_sha), core.setOutput('head_sha', pr.data.head.sha), core.setOutput('base_ref', pr.data.base.ref) to validate pr.data.merge_commit_sha first and short-circuit with an explicit error when absent.
40-43:⚠️ Potential issue | 🟠 Major | ⚡ Quick winScope write permissions per job instead of workflow-wide.
pull-requests: writeandissues: writeare broader than needed at top level; keep minimal global permissions and grant write only to the commenting job.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 40 - 43, The workflow currently sets broad top-level permissions in the permissions: block (pull-requests: write and issues: write); instead scope these by leaving only minimal global permissions (e.g., contents: read) at the top-level and move the pull-requests: write and issues: write entries into the specific job that needs them (the commenting job), by adding a permissions: subsection under that job and granting pull-requests: write and issues: write there so only that job has write access.
145-149:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid direct template interpolation of dispatch inputs in scripts.
Inlining
${{ github.event.inputs.* }}into Bash/JS is injection-prone on self-hosted runners. Pass inputs throughenvwithtoJson(...)and parse fromprocess.env/shell variables.Also applies to: 266-267, 275-277, 288-288
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 145 - 149, The workflow currently inlines github.event.inputs directly into shell variables (COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES, CLUSTER_MODE), which is injection-prone on self-hosted runners; instead set these inputs into environment variables using toJson(...) in the workflow env block and then read/parse them from process.env or standard shell variables inside the script; update all occurrences that inline inputs (the shown block and the other places referenced) to use the env approach and parse JSON/strings at runtime to safely populate COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES and CLUSTER_MODE before use.
🧹 Nitpick comments (4)
src/latency.c (1)
41-46: ⚡ Quick winAdd a rationale comment for the new destructor helper.
The new function is clear, but it should include a brief why-focused function comment per repository C guidelines.
♻️ Proposed change
+/* Free a latency-events dict entry where key, value, and entry are all heap-owned. + * This keeps ownership localized to the dict and avoids leaking on dictRelease/delete. */ static void dictEntryDestructorHeapKeyValue(void *entry) { dictEntry *de = entry; zfree(dictGetKey(de)); zfree(dictGetVal(de)); zfree(de); }As per coding guidelines "Document why code exists, not just what it does; document all functions in C code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/latency.c` around lines 41 - 46, Add a why-focused function comment above dictEntryDestructorHeapKeyValue explaining the purpose and rationale: that this helper is used to free dictEntry instances where both key and value were heap-allocated by this module (and the dictEntry itself was malloc'd), so it safely releases key, value, and the entry in one place and avoids duplication of free logic; follow the repository C comment convention (brief rationale, usage context, and any invariants/assumptions about ownership) and place it immediately before the dictEntryDestructorHeapKeyValue definition.src/valkey-cli.c (1)
391-408: ⚡ Quick winDocument the new entry-destructor ownership rules.
These helpers encode different key/value ownership contracts, so they need brief
/* ... */comments explaining why each one frees the key, the value, or both. That wiring is subtle enough to be easy to misapply later.As per coding guidelines "Document why code exists, not just what it does; document all functions in C code".
Also applies to: 9223-9227
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/valkey-cli.c` around lines 391 - 408, Add brief block comments above the three destructor helpers describing the ownership/ownership-transfer contract they implement: explain that dictEntryDestructorSdsKeyNoVal (dictEntryDestructorSdsKeyNoVal) frees only the SDS key because values are owned/managed elsewhere, dictEntryDestructorSdsVal (dictEntryDestructorSdsVal) frees only the SDS value because keys are shared or static, and dictEntryDestructorSdsKeyListVal (dictEntryDestructorSdsKeyListVal) frees the SDS key and releases the list value via listRelease because ownership of both was transferred to the dict; include why this choice was made (who owns the memory before/after insertion) and add the same comment pattern for the equivalent functions at the other occurrence (lines referenced in review 9223-9227).src/fuzzer_command_generator.c (1)
280-282: 💤 Low valueDead code:
configDictHashFunctionis no longer used.This function is orphaned after
configDictTypewas updated to usesdsHashinstead. Consider removing it.♻️ Suggested removal
-/* Dictionary type implementation for config entries */ -uint64_t configDictHashFunction(const void *key) { - return dictGenHashFunction(key, strlen(key)); -} -🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/fuzzer_command_generator.c` around lines 280 - 282, Remove the unused helper function configDictHashFunction since configDictType now uses sdsHash; delete the function definition uint64_t configDictHashFunction(const void *key) and any forward declaration or prototype for configDictHashFunction to avoid unused symbol warnings, and run a quick grep for "configDictHashFunction" to ensure no remaining references.src/server.c (1)
390-392: ⚡ Quick winMake these new helpers
static.These compare/hash/destructor helpers appear to be used only inside
src/server.c, so exporting them widens the linkage surface for no benefit. Please mark themstatic(andconst-qualify the seed input if you touch the signature).As per coding guidelines, "Use static keyword for file-local functions in C code".
Also applies to: 424-433, 533-590
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/server.c` around lines 390 - 392, Make the file-local helper functions static and tighten any parameter qualifiers: change dictSdsKeyCaseCompare to have internal linkage by adding the static keyword (i.e., static int dictSdsKeyCaseCompare(...)) and, if you modify a hash helper signature to accept a seed, mark the seed parameter const; apply the same static conversion to the other compare/hash/destructor helpers in this translation unit (the other helper functions in the same block referenced in the review) so they are file-local and use const where appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/benchmark-on-demand.yml:
- Around line 294-307: The cleanup step currently uses a broad pkill -f valkey
which can terminate unrelated Valkey processes on shared runners; modify the
workflow to only kill processes started by this job by capturing and storing the
PID(s) when launching valkey (e.g., echo $! > valkey.pid or append to
valkey.pids in the step that starts valkey) and then in the cleanup step read
and kill those PIDs (kill <pid> for each, checking existence) and remove the pid
file(s); alternatively, if you must match by name, scope the pattern to the
workspace path or the exact command line used (e.g., match the working directory
or unique args) instead of using pkill -f valkey, and retain the existing safe
exit-code handling and rm -rf cleanup for files like comparison.md and valkey*.
In `@src/dict.h`:
- Around line 183-187: dictFreeUnlinkedEntry currently calls
type->entryDestructor(de) unconditionally which can dereference a NULL function
pointer; modify dictFreeUnlinkedEntry to retrieve hashtableType *type =
hashtableGetType(d) and only invoke type->entryDestructor(de) if type and
type->entryDestructor are non-NULL (i.e., guard the call with if (type &&
type->entryDestructor) ) so types that do not provide an entryDestructor are
handled safely.
In `@src/server.h`:
- Around line 947-949: Update the comment for the server.dbnum field to reflect
that it represents the configured/maximum number of selectable databases (the
configured DB count), not a count of currently initialized DBs; mention that
code such as watchForKey(), selectDb(), and
initServer()/server.config_databases(_cluster) rely on dbnum as the
configured/max DB index bound and that clients index watched_keys_by_db by
c->db->id which is validated against server.dbnum. Ensure the comment near the
hashtable **watched_keys_by_db** declaration and the server.dbnum declaration
clarifies this semantics to avoid confusion about bounds safety.
In `@src/valkey-cli.c`:
- Around line 379-380: dictSdsHash currently calls sdslen(key) with a void
pointer; cast key to const_sds like dictSdsKeyCompare does (e.g., uint64_t
dictSdsHash(const void *key) { return dictGenHashFunction((const_sds)key,
sdslen((const_sds)key)); }) to satisfy sdslen's expected type and avoid implicit
UB. Also add short doc comments above the destructor functions
dictEntryDestructorSdsKeyNoVal, dictEntryDestructorSdsVal,
dictEntryDestructorSdsKeyListVal, and dictEntryDestructorTypeinfoVal that state
the ownership semantics and exactly which parts they free (e.g., whether they
free the key, the value, both, or a list of sds values, and whether callers must
free before removal) so callers know the freeing contract.
In `@tests/unit/multi.tcl`:
- Around line 391-418: The test changes the shared client r's selected DB (uses
"r select 0" and "r select 1") without restoring it, which can leak DB selection
into other tests; update the test to save and restore r's DB (or simply ensure
you call "r select 0" at the end) so the shared client is returned to its
original DB selection after the test finishes, and ensure any temporary
selection on rd0/rd1 is also cleaned up (references: the r client and rd0/rd1
watch/multi/exec sequence).
---
Outside diff comments:
In `@deps/linenoise/linenoise.c`:
- Around line 970-997: The CSI parsing loop appends bytes even when read()
returns 0 or -1 and only NUL-terminates seqBuffer when a CSI final is seen,
which can lead to unterminated reads in the subsequent strcmp; change the loop
that reads into seqBuffer to only append additionalChar when read(l.ifd,
&additionalChar, 1) == 1, stop on non-1 return or when buffer limit is reached,
and after the loop always NUL-terminate seqBuffer (ensure i <
SEQ_BUFFER_MAX_LENGTH then set seqBuffer[i] = '\0'); keep the existing
comparisons (strcmp(seqBuffer, "1;5D"), "1;3D", "1;5C", "1;3C", "3~") and call
linenoiseEditMoveWordLeft/Right or linenoiseEditDelete as before.
---
Duplicate comments:
In @.github/workflows/benchmark-on-demand.yml:
- Around line 71-73: Check for the presence of pr.data.merge_commit_sha before
calling core.setOutput('merge_commit_sha', ...) and fail fast if it's missing:
if pr.data.merge_commit_sha is falsy, log a clear error and exit (or set a
workflow failure) so the later checkout step does not run with a missing merge
SHA; update the block that currently calls core.setOutput('merge_commit_sha',
pr.data.merge_commit_sha), core.setOutput('head_sha', pr.data.head.sha),
core.setOutput('base_ref', pr.data.base.ref) to validate
pr.data.merge_commit_sha first and short-circuit with an explicit error when
absent.
- Around line 40-43: The workflow currently sets broad top-level permissions in
the permissions: block (pull-requests: write and issues: write); instead scope
these by leaving only minimal global permissions (e.g., contents: read) at the
top-level and move the pull-requests: write and issues: write entries into the
specific job that needs them (the commenting job), by adding a permissions:
subsection under that job and granting pull-requests: write and issues: write
there so only that job has write access.
- Around line 145-149: The workflow currently inlines github.event.inputs
directly into shell variables (COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES,
CLUSTER_MODE), which is injection-prone on self-hosted runners; instead set
these inputs into environment variables using toJson(...) in the workflow env
block and then read/parse them from process.env or standard shell variables
inside the script; update all occurrences that inline inputs (the shown block
and the other places referenced) to use the env approach and parse JSON/strings
at runtime to safely populate COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES and
CLUSTER_MODE before use.
---
Nitpick comments:
In `@src/fuzzer_command_generator.c`:
- Around line 280-282: Remove the unused helper function configDictHashFunction
since configDictType now uses sdsHash; delete the function definition uint64_t
configDictHashFunction(const void *key) and any forward declaration or prototype
for configDictHashFunction to avoid unused symbol warnings, and run a quick grep
for "configDictHashFunction" to ensure no remaining references.
In `@src/latency.c`:
- Around line 41-46: Add a why-focused function comment above
dictEntryDestructorHeapKeyValue explaining the purpose and rationale: that this
helper is used to free dictEntry instances where both key and value were
heap-allocated by this module (and the dictEntry itself was malloc'd), so it
safely releases key, value, and the entry in one place and avoids duplication of
free logic; follow the repository C comment convention (brief rationale, usage
context, and any invariants/assumptions about ownership) and place it
immediately before the dictEntryDestructorHeapKeyValue definition.
In `@src/server.c`:
- Around line 390-392: Make the file-local helper functions static and tighten
any parameter qualifiers: change dictSdsKeyCaseCompare to have internal linkage
by adding the static keyword (i.e., static int dictSdsKeyCaseCompare(...)) and,
if you modify a hash helper signature to accept a seed, mark the seed parameter
const; apply the same static conversion to the other compare/hash/destructor
helpers in this translation unit (the other helper functions in the same block
referenced in the review) so they are file-local and use const where
appropriate.
In `@src/valkey-cli.c`:
- Around line 391-408: Add brief block comments above the three destructor
helpers describing the ownership/ownership-transfer contract they implement:
explain that dictEntryDestructorSdsKeyNoVal (dictEntryDestructorSdsKeyNoVal)
frees only the SDS key because values are owned/managed elsewhere,
dictEntryDestructorSdsVal (dictEntryDestructorSdsVal) frees only the SDS value
because keys are shared or static, and dictEntryDestructorSdsKeyListVal
(dictEntryDestructorSdsKeyListVal) frees the SDS key and releases the list value
via listRelease because ownership of both was transferred to the dict; include
why this choice was made (who owns the memory before/after insertion) and add
the same comment pattern for the equivalent functions at the other occurrence
(lines referenced in review 9223-9227).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 7e5a6058-f40f-4083-8f81-f9ae13ddc6d5
📒 Files selected for processing (79)
.github/workflows/benchmark-on-demand.ymlMAINTAINERS.mdcmake/Modules/SourceFiles.cmakedeps/CMakeLists.txtdeps/Makefiledeps/libvalkey/CMakeLists.txtdeps/libvalkey/Makefiledeps/libvalkey/src/async.cdeps/libvalkey/src/cluster.cdeps/linenoise/Makefiledeps/linenoise/linenoise.csrc/Makefilesrc/cluster_legacy.csrc/commands.defsrc/commands.hsrc/commands/brpoplpush.jsonsrc/commands/bzpopmax.jsonsrc/commands/client-import-source.jsonsrc/commands/cluster-flushslot.jsonsrc/commands/cluster-getslotmigrations.jsonsrc/commands/cluster-replicate.jsonsrc/commands/commandlog-help.jsonsrc/commands/decrby.jsonsrc/commands/flushdb.jsonsrc/commands/hexpire.jsonsrc/commands/hexpireat.jsonsrc/commands/hexpiretime.jsonsrc/commands/hgetex.jsonsrc/commands/hpexpire.jsonsrc/commands/hpexpireat.jsonsrc/commands/hpexpiretime.jsonsrc/commands/hsetex.jsonsrc/commands/incrbyfloat.jsonsrc/commands/lolwut.jsonsrc/commands/ltrim.jsonsrc/commands/setnx.jsonsrc/commands/slowlog-help.jsonsrc/commands/spublish.jsonsrc/commands/srandmember.jsonsrc/commands/xautoclaim.jsonsrc/commands/xclaim.jsonsrc/commands/xlen.jsonsrc/config.csrc/crc16.csrc/defrag.csrc/dict.csrc/dict.hsrc/eval.csrc/expire.csrc/expire.hsrc/functions.csrc/fuzzer_command_generator.csrc/hashtable.csrc/hashtable.hsrc/latency.csrc/module.csrc/multi.csrc/object.csrc/rdb.csrc/scripting_engine.csrc/sentinel.csrc/server.csrc/server.hsrc/t_zset.csrc/unit/test_dict.cppsrc/unit/test_hashtable.cppsrc/unit/test_kvstore.cppsrc/unit/test_object.cppsrc/valkey-benchmark.csrc/valkey-cli.ctests/integration/corrupt-dump-fuzzer.tcltests/integration/dual-channel-replication.tcltests/unit/cluster/failover.tcltests/unit/cluster/failover2.tcltests/unit/cluster/faster-failover.tcltests/unit/cluster/replica-migration.tcltests/unit/maxmemory.tcltests/unit/memefficiency.tcltests/unit/multi.tcl
💤 Files with no reviewable changes (3)
- src/unit/test_dict.cpp
- src/dict.c
- src/expire.h
✅ Files skipped from review due to trivial changes (35)
- src/commands/commandlog-help.json
- src/commands/hgetex.json
- src/commands/cluster-flushslot.json
- src/commands/lolwut.json
- src/commands/hpexpiretime.json
- src/commands/client-import-source.json
- src/commands/slowlog-help.json
- src/commands/hpexpireat.json
- src/commands/flushdb.json
- src/commands/brpoplpush.json
- src/commands/decrby.json
- src/commands/hexpireat.json
- src/commands/cluster-getslotmigrations.json
- MAINTAINERS.md
- src/commands/bzpopmax.json
- src/commands/hexpire.json
- src/commands/xautoclaim.json
- src/t_zset.c
- src/commands/xlen.json
- deps/libvalkey/src/cluster.c
- deps/linenoise/Makefile
- src/hashtable.h
- src/commands/spublish.json
- src/commands/hsetex.json
- deps/libvalkey/src/async.c
- src/commands/hexpiretime.json
- deps/Makefile
- src/commands/setnx.json
- src/commands/hpexpire.json
- src/commands/srandmember.json
- src/commands/xclaim.json
- src/commands/ltrim.json
- src/commands/cluster-replicate.json
- src/commands/incrbyfloat.json
- src/commands.def
There was a problem hiding this comment.
♻️ Duplicate comments (3)
.github/workflows/benchmark-on-demand.yml (3)
163-167:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid direct
${{ github.event.inputs.* }}interpolation inside Bash/JS scripts.Lines 163-167 and Lines 284-306 inline workflow inputs directly into executable script text. Even with maintainer-triggered dispatch, this is still a template-injection risk on self-hosted runners. Use
env+toJson(...)and parse from environment in Bash/Node.Safer pattern (condensed)
- name: Generate benchmark config + env: + INPUT_COMMANDS: ${{ toJson(github.event.inputs.commands) }} + INPUT_DATA_SIZE: ${{ toJson(github.event.inputs.data_size) }} + INPUT_IO_THREADS: ${{ toJson(github.event.inputs.io_threads) }} + INPUT_PIPELINES: ${{ toJson(github.event.inputs.pipelines) }} + INPUT_CLUSTER_MODE: ${{ toJson(github.event.inputs.cluster_mode) }} run: | - COMMANDS='${{ github.event.inputs.commands }}' + COMMANDS="$(jq -r . <<<"$INPUT_COMMANDS")" ... - name: Comment PR with results + env: + INPUT_PR_NUMBER: ${{ toJson(github.event.inputs.pr_number) }} + INPUT_COMMANDS: ${{ toJson(github.event.inputs.commands) }} + INPUT_CLUSTER_MODE: ${{ toJson(github.event.inputs.cluster_mode) }} + INPUT_DATA_SIZE: ${{ toJson(github.event.inputs.data_size) }} + INPUT_IO_THREADS: ${{ toJson(github.event.inputs.io_threads) }} + INPUT_PIPELINES: ${{ toJson(github.event.inputs.pipelines) }} with: script: | - const commands = '${{ github.event.inputs.commands }}'; + const commands = JSON.parse(process.env.INPUT_COMMANDS); ... - issue_number: ${{ github.event.inputs.pr_number }}, + issue_number: Number(JSON.parse(process.env.INPUT_PR_NUMBER)),Also applies to: 284-306
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 163 - 167, Replace direct interpolation of workflow inputs into script text (the variables COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES, CLUSTER_MODE) with a safe environment-based pattern: set an env output using toJson(...) in the workflow YAML and then read/parse that JSON from the script (Bash or Node) instead of embedding ${ { github.event.inputs.* } } inline; update both the block that defines COMMANDS/DATA_SIZE/IO_THREADS/PIPELINES/CLUSTER_MODE and the later script region (lines around 284-306) to parse the same env JSON and assign local variables, ensuring no direct template injection into executable script content.
319-326:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not use broad
pkill -f valkeyon a shared self-hosted runner.Line 319 can terminate unrelated Valkey processes from other workflows on the same host. Prefer killing only PIDs started by this run (or, at minimum, scope matching to this workspace path).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 319 - 326, The pkill -f valkey call is too broad and may kill unrelated Valkey processes; instead modify the workflow script to track and kill only the PID(s) started by this run (or scope the match to the current workspace). When launching valkey, capture its PID (e.g., store $! into a file or variable) and later kill those PIDs (verify they belong to this workspace before killing), replacing the pkill -f valkey line and using the existing exit_code handling; alternatively, constrain any pattern used with pgrep/pkill to include the workspace path (e.g., match "valkey.*$GITHUB_WORKSPACE") so only processes from this run are targeted.
93-96:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove unnecessary
pull-requests: writefrom thebenchmarkjob permissions.The
benchmarkjob only posts the results to the PR viagithub.rest.issues.createComment, so it doesn’t needpull-requestswrite scope (that’s used elsewhere forgithub.rest.pulls.getwith read access).Suggested minimal change
permissions: contents: read - pull-requests: write issues: write🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/benchmark-on-demand.yml around lines 93 - 96, The benchmark job's permissions block grants unnecessary pull-requests: write; remove the "pull-requests: write" entry from the permissions section (leaving contents: read and issues: write) so that the job only has the minimal scopes needed for github.rest.issues.createComment; update the permissions object in the workflow where the keys permissions, contents, pull-requests, and issues are declared to delete the pull-requests line.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In @.github/workflows/benchmark-on-demand.yml:
- Around line 163-167: Replace direct interpolation of workflow inputs into
script text (the variables COMMANDS, DATA_SIZE, IO_THREADS, PIPELINES,
CLUSTER_MODE) with a safe environment-based pattern: set an env output using
toJson(...) in the workflow YAML and then read/parse that JSON from the script
(Bash or Node) instead of embedding ${ { github.event.inputs.* } } inline;
update both the block that defines
COMMANDS/DATA_SIZE/IO_THREADS/PIPELINES/CLUSTER_MODE and the later script region
(lines around 284-306) to parse the same env JSON and assign local variables,
ensuring no direct template injection into executable script content.
- Around line 319-326: The pkill -f valkey call is too broad and may kill
unrelated Valkey processes; instead modify the workflow script to track and kill
only the PID(s) started by this run (or scope the match to the current
workspace). When launching valkey, capture its PID (e.g., store $! into a file
or variable) and later kill those PIDs (verify they belong to this workspace
before killing), replacing the pkill -f valkey line and using the existing
exit_code handling; alternatively, constrain any pattern used with pgrep/pkill
to include the workspace path (e.g., match "valkey.*$GITHUB_WORKSPACE") so only
processes from this run are targeted.
- Around line 93-96: The benchmark job's permissions block grants unnecessary
pull-requests: write; remove the "pull-requests: write" entry from the
permissions section (leaving contents: read and issues: write) so that the job
only has the minimal scopes needed for github.rest.issues.createComment; update
the permissions object in the workflow where the keys permissions, contents,
pull-requests, and issues are declared to delete the pull-requests line.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 71b31b4f-47da-4b3e-89f8-fcb9d646ecc7
📒 Files selected for processing (1)
.github/workflows/benchmark-on-demand.yml
Adds a workflow_dispatch-triggered workflow that runs comparative benchmarks on any open PR with configurable inputs (commands, data sizes, io-threads, pipelines, cluster mode). Posts results back to the PR as a comment. Implements: valkey-io/valkey-perf-benchmark#52 Notes: - Permissions are scoped per-job (defense-in-depth): only the job that posts the comment has pull-requests:write + issues:write. - Input validation enforces sane bounds for data_size (1-1MB), io_threads (1-20), pipelines (1-1000). - timeout-minutes set to 420 (7h, ~1.3x expected runtime). - Fails fast with a clear message if the PR has no mergeable commit (conflict with base, or merge not yet computed). Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
3e926b4 to
f4e1240
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/benchmark-on-demand.yml:
- Around line 129-138: The workflow's "Install dependencies" step is missing the
jq package used later by the "Generate benchmark config" step; update the
install command in the Install dependencies step to include jq (add jq to the
sudo dnf install list) so the subsequent uses of jq in the Generate benchmark
config step (lines referenced in the review) will run without error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d43e2b2b-5ad2-4593-8238-a99ed62e61ec
📒 Files selected for processing (1)
.github/workflows/benchmark-on-demand.yml
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Direct interpolation of ${{ github.event.inputs.* }} into bash and
JavaScript bodies meant a quote/escape character in an input could
break script quoting on the self-hosted runner. The risk is small
(workflow_dispatch is admin-only) but the pattern is fragile.
Move the inputs through job-level `env:` mappings instead, so the
runner passes them via environment variables rather than shell
substitution. The Bash step reads via "$INPUT_*", and the
github-script step reads via process.env.INPUT_*.
Addresses CodeRabbit review on PR valkey-io#3433.
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
PR valkey-io#3753 moved benchmark configs out of valkey/.github/benchmark_configs/ into valkey-perf-benchmark/configs/. Update the config path references in the Generate benchmark config and Run benchmarks steps to match. This aligns with how benchmark-on-label.yml references the config today. Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>

Description
Add a new
workflow_dispatchtriggered workflow (benchmark-on-demand.yml) that enables admin-triggered, parameterized benchmarking on any open PR.Implements: valkey-io/valkey-perf-benchmark#52
Problem
run-benchmark/run-cluster-benchmarklabels) only run a fixed SET/GET workloadSolution
A new workflow with configurable inputs:
pr_numbercommandsSET,GET,HSET,LPUSH)SET,GETdata_size16,96)16,96(from config)io_threads1,9)1,9(from config)pipelines1,10)1,10(from config)cluster_modefalseOther defaults from config:
duration=180,warmup=30,clients=1600,benchmark-threads=90,keyspacelen=3000000,runs=3.When
io_threadsis overridden,server_cpu_rangeis automatically adjusted to match.Multiple workflow runs can be queued for the same PR with different configs (no concurrency group).
How it works
resolve-prjob fetches the PR merge commit SHA, head SHA, and base refbenchmarkjob builds latestvalkey-benchmarkfrom unstable, overrides the ARM benchmark config with user-provided parameters, and runsbenchmark.pycomparing the PR commit against its base branchTesting