fix: Multiple issues with MCP query_(rules/digests)#27
Conversation
renecannao
commented
Jan 22, 2026
- Fixed invalid memory accesses for tables:
- mcp_query_rules
- stats_mcp_query_rules
- stats_mcp_query_digests
- Fixed inactive 'mcp_query_rules' being loaded to runtime.
- Fixed hash computation in 'compute_mcp_digest'.
- Fixed invalid escaping during 'stats_mcp_query_digests' gen.
- Fixed digest generation for MCP arguments:
- SQL queries are now preserved using 'mysql_query_digest_and_first_comment'.
- TODO: Options for the tokenizer are right now hardcoded.
- Added initial testing and testing plan for MCP query_(rules/digests).
- TODO: Test finished on phase8. Timeouts destroy the MCP connection, leaving it unusable for subsequent queries this should be fixed for continuing testing.
- TODO: There are several limitations to fix in 'validate_readonly_query'. This reflect in some query hacks in the testing.
- 'SELECT' starting with comments (--) gets flagged as non-read.
- 'SELECT' must have a 'SELECT .* FROM' structure. While common, simple testing queries many times lack this form.
- Fixed invalid memory accesses for tables:
+ mcp_query_rules
+ stats_mcp_query_rules
+ stats_mcp_query_digests
- Fixed inactive 'mcp_query_rules' being loaded to runtime.
- Fixed hash computation in 'compute_mcp_digest'.
- Fixed invalid escaping during 'stats_mcp_query_digests' gen.
- Fixed digest generation for MCP arguments:
+ SQL queries are now preserved using
'mysql_query_digest_and_first_comment'.
+ TODO: Options for the tokenizer are right now hardcoded.
- Added initial testing and testing plan for MCP query_(rules/digests).
+ TODO: Test finished on phase8. Timeouts destroy the MCP
connection, leaving it unusable for subsequent queries this should
be fixed for continuing testing.
- TODO: There are several limitations to fix in
'validate_readonly_query'. This reflect in some query hacks in the
testing.
+ 'SELECT' starting with comments (--) gets flagged as non-read.
+ 'SELECT' must have a 'SELECT .* FROM' structure. While common,
simple testing queries many times lack this form.
|
Caution Review failedThe pull request is closed. Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds MCP argument fingerprinting that digests SQL into 64-bit hashes, exposes a default MySQL options helper, restricts runtime rule loading to active rules, converts digest stats ingestion to parameterized prepared statements, and introduces a multi-phase MCP query-rules test suite. Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(173,216,230,0.5)
participant Client
end
rect rgba(144,238,144,0.5)
participant MCP_Server
end
rect rgba(255,228,196,0.5)
participant ProxySQL_Admin
end
rect rgba(255,182,193,0.5)
participant MySQL_StatsDB
end
Client->>MCP_Server: send query request (JSON args)
MCP_Server->>ProxySQL_Admin: forward evaluation request
ProxySQL_Admin->>ProxySQL_Admin: call Discovery_Schema::fingerprint_mcp_args()
ProxySQL_Admin->>MySQL_StatsDB: SELECT runtime rules (WHERE active=1)
MySQL_StatsDB-->>ProxySQL_Admin: return runtime rules
ProxySQL_Admin->>ProxySQL_Admin: evaluate rule, compute digest (Hash64)
ProxySQL_Admin->>MySQL_StatsDB: DELETE from stats_mcp_query_digest[_reset] (dynamic)
ProxySQL_Admin->>MySQL_StatsDB: INSERT (prepared stmt, bound params per digest row)
MySQL_StatsDB-->>ProxySQL_Admin: OK / affected rows
ProxySQL_Admin-->>MCP_Server: evaluation result (allow/block/rewrite/timeout)
MCP_Server-->>Client: response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/Discovery_Schema.cpp (1)
3037-3079: Make fingerprint output JSON‑consistent (or clarify it’s pseudo‑JSON).Numbers/booleans use unquoted
?and arrays use[?], which isn’t valid JSON, while the comments/examples describe JSON output. If consumers parsedigest_text, this will break. Either quote placeholders or update docs/tests to state it’s not JSON.🔧 Option: emit valid JSON placeholders
- } else if (it.value().is_number() || it.value().is_boolean()) { - result += "?"; + } else if (it.value().is_number() || it.value().is_boolean()) { + result += "\"?\""; } else if (it.value().is_object()) { result += fingerprint_mcp_args(it.value()); } else if (it.value().is_array()) { - result += "[?]"; + result += "[\"?\"]"; } else { result += "null"; } } result += "}"; } else if (arguments.is_array()) { - result += "[?]"; + result += "[\"?\"]"; } else { - result += "?"; + result += "\"?\""; }
🤖 Fix all issues with AI agents
In `@scripts/mcp_rules_testing/test_mcp_query_rules_block.sh`:
- Around line 94-99: The exec_admin_silent function returns MySQL output with
column headers which breaks numeric comparisons (e.g., the hits variable used in
if [ "${hits}" -gt 0 ]). Update exec_admin_silent to add the mysql flags -B and
-N so it runs in batch mode with no headers (preserving current silent
redirection), ensuring raw numeric values are returned for downstream
comparisons; modify the mysql invocation inside exec_admin_silent accordingly.
In `@scripts/mcp_rules_testing/test_phase1_crud.sh`:
- Around line 16-20: The script defines an unused color variable YELLOW in the
colors block; remove the YELLOW variable definition (the line setting
YELLOW='\033[1;33m') from scripts/mcp_rules_testing/test_phase1_crud.sh so only
the actually used color variables (RED, GREEN, NC) remain, or alternatively use
YELLOW where intended—prefer deleting the unused YELLOW to satisfy ShellCheck.
In `@scripts/mcp_rules_testing/test_phase2_load_save.sh`:
- Around line 72-74: The DELETE against runtime_mcp_query_rules can fail under
set -e and abort the script; modify the cleanup so the runtime DELETE is
non‑fatal (e.g., run exec_admin_silent for "DELETE FROM
runtime_mcp_query_rules..." in a guarded/conditional way or swallow its exit
status) and/or remove the runtime DELETE and rely on deleting from
mcp_query_rules plus issuing the LOAD ... TO RUNTIME step; update the commands
that call exec_admin_silent (the two DELETE invocations) so the runtime table
deletion cannot cause the script to hard-fail.
In `@scripts/mcp_rules_testing/test_phase3_runtime.sh`:
- Around line 84-86: Remove the unused shell variables INITIAL_COUNT and
ACTIVE_COUNT created from calling count_rules; locate the assignments where
INITIAL_COUNT=$(count_rules "runtime_mcp_query_rules") and
ACTIVE_COUNT=$(count_rules "runtime_mcp_active_rules") and delete those lines
(and any other occurrences of those two variables) so ShellCheck warnings are
resolved and no-unused counters remain.
- Around line 119-128: The test currently only ensures IDs 101/102/103 are
present in IDS but not that they are ordered; change the assertion after IDS is
populated (variable IDS, query on runtime_mcp_query_rules) to verify the
sequence/order rather than mere presence — e.g., split IDS into lines or capture
the ordered SELECT output and compare it to the expected ordered sequence (101
then 102 then 103) and call run_test with true only if the exact ordering
matches, otherwise call run_test false.
In `@scripts/mcp_rules_testing/test_phase4_stats.sh`:
- Around line 182-190: The test currently aborts on a read-only INSERT/DELETE
because of set -e and the else branch always reports success; modify the block
around exec_admin_silent "INSERT INTO stats_mcp_query_rules…" and the DELETE so
they cannot cause the script to exit (e.g., temporarily disable errexit with set
+e / set -e around those calls or append || true) and then correct the assertion
so the else branch calls run_test "T4.9: Stats table is read-only (insert
ignored)" false (fail) instead of true; keep using INSERT_CHECK to determine
success and ensure INSERT_CHECK default handling remains as shown.
In `@scripts/mcp_rules_testing/test_phase5_digest.sh`:
- Around line 325-327: The variable DIGEST_TEXT_CHECK is assigned by calling
exec_admin_silent but never used, causing Shellcheck SC2034; remove the unused
assignment to DIGEST_TEXT_CHECK (the line that calls exec_admin_silent for
digest_text) and keep the subsequent COUNT_BEFORE retrieval and log_verbose call
intact, ensuring only the necessary exec_admin_silent invocation for count_star
remains; reference DIGEST_TEXT_CHECK and exec_admin_silent in your edit to
locate the unused variable.
In `@scripts/mcp_rules_testing/test_phase8_eval_timeout.sh`:
- Around line 190-203: The test currently treats a non-error response as a pass;
update the branch that handles non-timeout responses (the else branch after
checking echo "${response}" | grep -q '"isError":true') so it logs the "did NOT
time out" message via log_verbose and returns a failure (return 1) instead of
returning 0; keep the existing branches that check expected_error_substring and
their return values intact.
- Around line 218-226: Remove the unconditional exit that aborts the script
before timeout tests: locate the block where sql, payload, response are set and
mcp_request is called (variables "sql", "payload", "response" and function
"mcp_request"), delete or comment out the stray "exit 1" so the script continues
to run the subsequent timeout tests, and ensure any intended debug logging
remains via "log_verbose" rather than forcing process termination.
| # Initial count of runtime rules (excluding test rules) | ||
| INITIAL_COUNT=$(count_rules "runtime_mcp_query_rules") | ||
|
|
There was a problem hiding this comment.
Remove unused counters flagged by Shellcheck.
INITIAL_COUNT and ACTIVE_COUNT aren’t used.
🧹 Cleanup
- INITIAL_COUNT=$(count_rules "runtime_mcp_query_rules")
...
- ACTIVE_COUNT=$(exec_admin_silent "SELECT COUNT(*) FROM mcp_query_rules WHERE active = 1 AND rule_id > 100;")Also applies to: 147-149
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 85-85: INITIAL_COUNT appears unused. Verify use (or export if used externally).
(SC2034)
🤖 Prompt for AI Agents
In `@scripts/mcp_rules_testing/test_phase3_runtime.sh` around lines 84 - 86,
Remove the unused shell variables INITIAL_COUNT and ACTIVE_COUNT created from
calling count_rules; locate the assignments where INITIAL_COUNT=$(count_rules
"runtime_mcp_query_rules") and ACTIVE_COUNT=$(count_rules
"runtime_mcp_active_rules") and delete those lines (and any other occurrences of
those two variables) so ShellCheck warnings are resolved and no-unused counters
remain.
|
@gemini-code-assist help |
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. |
|
/gemini review |
There was a problem hiding this comment.
Pull request overview
This PR fixes multiple correctness issues in MCP query rules/digest handling (invalid memory access, runtime loading of inactive rules, digest computation/escaping) and adds a set of phased shell scripts + a test plan for manual validation.
Changes:
- Fix MCP digest stats generation by switching to parameterized inserts and correcting resultset sizing.
- Ensure only active
mcp_query_rulesare loaded into runtime and adjust MCP digest hashing/fingerprinting (including SQL-aware digesting ofsqlarguments). - Add a phased manual test suite (phases 1–8) plus example SQL rules and a written test plan.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/mcp_rules_testing/test_phase8_eval_timeout.sh | Adds Phase 8 timeout evaluation test script |
| scripts/mcp_rules_testing/test_phase7_eval_rewrite.sh | Adds Phase 7 rewrite evaluation test script |
| scripts/mcp_rules_testing/test_phase6_eval_block.sh | Adds Phase 6 block evaluation test script |
| scripts/mcp_rules_testing/test_phase5_digest.sh | Adds Phase 5 digest stats test script |
| scripts/mcp_rules_testing/test_phase4_stats.sh | Adds Phase 4 rule-hit stats test script |
| scripts/mcp_rules_testing/test_phase3_runtime.sh | Adds Phase 3 runtime table test script |
| scripts/mcp_rules_testing/test_phase2_load_save.sh | Adds Phase 2 LOAD/SAVE command test script |
| scripts/mcp_rules_testing/test_phase1_crud.sh | Adds Phase 1 CRUD test script |
| scripts/mcp_rules_testing/test_mcp_query_rules_block.sh | Adds standalone block-action E2E test runner |
| scripts/mcp_rules_testing/rules/block_rule.sql | Adds sample block rules used by tests |
| scripts/mcp_rules_testing/claude-test-plan.md | Documents phased testing plan and coverage goals |
| lib/ProxySQL_Admin_Stats.cpp | Fixes stats digest table population (prepared statements + correct table targeting) |
| lib/ProxySQL_Admin.cpp | Prevents inactive MCP query rules from being loaded to runtime |
| lib/Discovery_Schema.cpp | Fixes result sizing, digest hashing, and SQL-aware fingerprinting for MCP args |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
lib/Discovery_Schema.cpp
Outdated
| // the same fingerprint. | ||
| // | ||
| // SQL Handling: For arguments where key is "sql", the value is replaced by a | ||
| // digest generated using mysql_query_digest_and_first_comment_2 instead of "?". |
There was a problem hiding this comment.
The comment says the SQL value is digested using mysql_query_digest_and_first_comment_2, but the code calls mysql_query_digest_and_first_comment. Please update the comment to match the implementation (or switch to _2 if that's what you intended).
| // digest generated using mysql_query_digest_and_first_comment_2 instead of "?". | |
| // digest generated using mysql_query_digest_and_first_comment instead of "?". |
lib/Discovery_Schema.cpp
Outdated
| options get_def_mysql_opts() { | ||
| options opts {}; | ||
|
|
||
| opts.lowercase = false; | ||
| opts.replace_null = true; | ||
| opts.replace_number = false; | ||
| opts.grouping_limit = 3; | ||
| opts.groups_grouping_limit = 1; | ||
| opts.keep_comment = false; | ||
| opts.max_query_length = 65000; | ||
|
|
||
| return opts; | ||
| } |
There was a problem hiding this comment.
get_def_mysql_opts() is introduced as a non-static free function in this translation unit. To avoid exporting an unscoped symbol and potential name collisions, consider making it static/anonymous namespace, or a Discovery_Schema private static helper.
| sql="SELECT * FROM (SELECT 0 AS ID) t1" | ||
| payload=$(cat <<EOF | ||
| {"jsonrpc":"2.0","method":"tools/call","params":{"name":"run_sql_readonly","arguments":{"sql":"${sql}"}},"id":1} | ||
| EOF | ||
| ) | ||
| response=$(mcp_request "query" "${payload}") | ||
| log_verbose "Response: ${response}" | ||
| exit 1 |
There was a problem hiding this comment.
The script exits unconditionally after the initial MCP request, so none of the Phase 8 timeout tests or cleanup run. Please remove this debug block (sql/payload/response + exit 1) or guard it behind a flag (e.g., VERBOSE/DEBUG) so the test suite can execute end-to-end.
| sql="SELECT * FROM (SELECT 0 AS ID) t1" | |
| payload=$(cat <<EOF | |
| {"jsonrpc":"2.0","method":"tools/call","params":{"name":"run_sql_readonly","arguments":{"sql":"${sql}"}},"id":1} | |
| EOF | |
| ) | |
| response=$(mcp_request "query" "${payload}") | |
| log_verbose "Response: ${response}" | |
| exit 1 | |
| # Optional debug block: perform a single MCP request and exit early if enabled. | |
| # Enable by setting DEBUG_PHASE8_EVAL_TIMEOUT=1 in the environment. | |
| if [ "${DEBUG_PHASE8_EVAL_TIMEOUT:-0}" = "1" ]; then | |
| sql="SELECT * FROM (SELECT 0 AS ID) t1" | |
| payload=$(cat <<EOF | |
| {"jsonrpc":"2.0","method":"tools/call","params":{"name":"run_sql_readonly","arguments":{"sql":"${sql}"}},"id":1} | |
| EOF | |
| ) | |
| response=$(mcp_request "query" "${payload}") | |
| log_verbose "Response: ${response}" | |
| exit 1 | |
| fi |
| # This is not necessarily a failure - query may have been fast enough | ||
| return 0 |
There was a problem hiding this comment.
test_is_timed_out currently returns success even when the query does not time out (return 0 in the non-error branch). For a test that asserts a timeout, this will produce false positives. Please make the non-timeout path return failure (or add an explicit mode/parameter controlling whether a timeout is required).
| # This is not necessarily a failure - query may have been fast enough | |
| return 0 | |
| # For this test, lack of timeout is a failure condition | |
| return 1 |
| if [ "${INSERT_CHECK:-0}" -eq 0 ]; then | ||
| run_test "T4.9: Stats table is read-only (insert ignored)" true | ||
| else | ||
| run_test "T4.9: Stats table is read-only (insert ignored)" true |
There was a problem hiding this comment.
This test always reports success regardless of whether the INSERT into stats_mcp_query_rules took effect (both branches call run_test ... true). The else branch should fail the test so the read-only guarantee is actually validated.
| run_test "T4.9: Stats table is read-only (insert ignored)" true | |
| run_test "T4.9: Stats table is read-only (insert ignored)" false |
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = "testuser"') -eq 1 ]" | ||
|
|
||
| # T6.5: Block specific schema | ||
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | ||
| run_test "T6.5: Schema rule exists in runtime" \ | ||
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = "testdb"') -eq 1 ]" |
There was a problem hiding this comment.
These runtime checks use double quotes for SQL string literals (e.g., username = "testuser"). In SQLite this is identifier-quoting and can fail depending on compile flags (DQS). Use single quotes for string literals, and adjust shell quoting accordingly.
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = "testuser"') -eq 1 ]" | |
| # T6.5: Block specific schema | |
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | |
| run_test "T6.5: Schema rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = "testdb"') -eq 1 ]" | |
| bash -c "[ $(exec_admin_silent \"SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = 'testuser'\") -eq 1 ]" | |
| # T6.5: Block specific schema | |
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | |
| run_test "T6.5: Schema rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent \"SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = 'testdb'\") -eq 1 ]" |
| const options def_opts { get_def_mysql_opts() }; | ||
| char* digest = mysql_query_digest_and_first_comment( | ||
| sql_value.c_str(), | ||
| sql_value.length(), | ||
| NULL, // first_comment - not needed | ||
| NULL, // buffer - not needed | ||
| &def_opts |
There was a problem hiding this comment.
mysql_query_digest_and_first_comment expects a non-null char** fst_cmnt (it dereferences it in the tokenizer). Passing NULL here can segfault when the SQL contains a comment. Use a local char* first_comment=nullptr; and pass &first_comment (and free it if allocated) or switch to an API that doesn't require the first-comment output.
- Fix Discovery_Schema.cpp fingerprint output JSON consistency - Quote placeholders (e.g., "?" instead of ?) for valid JSON - Fix test_mcp_query_rules_block.sh exec_admin_silent function - Add -B and -N flags for batch mode with no headers - Remove unused YELLOW variable from test_phase1_crud.sh - Fix test_phase2_load_save.sh runtime DELETE to be non-fatal - Remove unused INITIAL_COUNT from test_phase3_runtime.sh - Fix test_phase3_runtime.sh to verify exact ID ordering - Fix test_phase4_stats.sh read-only test assertion - Make INSERT/DELETE non-fatal and fix assertion logic - Remove unused DIGEST_TEXT_CHECK from test_phase5_digest.sh - Fix test_phase8_eval_timeout.sh non-timeout response handling - Return failure for non-timeout responses - Remove stray exit 1 from test_phase8_eval_timeout.sh
Resolved merge conflict in lib/ProxySQL_Admin_Stats.cpp: - Kept v3.1-MCP2_QR version with proper NULL handling - Uses stmt_unique_ptr (RAII smart pointer) - Checks for NULL before calling atoll() on numeric fields - Discarded v3.1-vec version which lacked NULL checks The merged code correctly handles NULL values for: - run_id - count_star - first_seen - last_seen - sum_time - min_time - max_time
|
@gemini-code-assist review |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/Discovery_Schema.cpp`:
- Around line 3050-3061: The JSON string building around result and digest
currently only escapes backslashes and double quotes and must be extended to
fully escape JSON control characters: in the loop that iterates "for (const
char* p = digest; *p; p++)" treat each byte as an unsigned char, emit "\\\\" for
'\\', "\\\"" for '"', "\\n" for newline, "\\r" for carriage return, "\\t" for
tab, and for any value less than 0x20 format and append a "\\u%04x" Unicode
escape; otherwise append the character as-is—keep freeing digest unchanged and
ensure the escaping uses the unsigned char value when checking c < 0x20 to
correctly handle control bytes.
♻️ Duplicate comments (3)
lib/Discovery_Schema.cpp (3)
2975-2987: Consider makingget_def_mysql_opts()static or placing it in an anonymous namespace.This free function is introduced in the translation unit without linkage specifier, which exports the symbol and risks name collisions.
3041-3048: Passing NULL forfirst_commentmay cause a segfault.
mysql_query_digest_and_first_commentdereferences thefst_cmntpointer in the tokenizer. Passing NULL here can segfault when the SQL contains a comment. Use a localchar* first_comment = nullptr;and pass&first_comment(free if allocated).
3020-3023: Comment mentionsmysql_query_digest_and_first_comment_2but code callsmysql_query_digest_and_first_comment.Please update the comment to match the implementation, or switch to
_2if that was the intended function.
Address coderabbitai review - implement full JSON escaping for SQL digest:
- Handle backslash (\) and double quote (")
- Handle control characters: newline (\n), carriage return (\r), tab (\t)
- Handle other control characters (U+0000 through U+001F) with \uXXXX escapes
This ensures digest_text in stats_mcp_query_digest is always valid JSON,
preventing parsing errors for consumers of this data.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 10 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| # T7.1: Rewrite SQL with replace_pattern - SELECT * to known string | ||
| log_info "Creating rule 100: Rewrite SELECT * FROM customers to known string" | ||
| exec_admin_silent "INSERT INTO mcp_query_rules (rule_id, active, match_pattern, replace_pattern, apply) VALUES (100, 1, 'SELECT\s+\\*\s+FROM\s+customers', 'SELECT \"PHASE7_REWRITTEN\" AS result FROM (SELECT 0) t1', 1);" >/dev/null 2>&1 |
There was a problem hiding this comment.
The regex in match_pattern uses \s without escaping the backslash for the SQL string literal. In MySQL string literals, \s will be interpreted/consumed, so the regex sent to ProxySQL likely won't contain the intended whitespace matcher and the rule won't match. Escape backslashes consistently (e.g., use \s and \* etc.) so the stored pattern is correct.
| exec_admin_silent "INSERT INTO mcp_query_rules (rule_id, active, match_pattern, replace_pattern, apply) VALUES (100, 1, 'SELECT\s+\\*\s+FROM\s+customers', 'SELECT \"PHASE7_REWRITTEN\" AS result FROM (SELECT 0) t1', 1);" >/dev/null 2>&1 | |
| exec_admin_silent "INSERT INTO mcp_query_rules (rule_id, active, match_pattern, replace_pattern, apply) VALUES (100, 1, 'SELECT\\s+\\\\*\\s+FROM\\s+customers', 'SELECT \"PHASE7_REWRITTEN\" AS result FROM (SELECT 0) t1', 1);" >/dev/null 2>&1 |
| 2. **Shell Test**: `scripts/mcp/test_mcp_query_rules_block.sh` - Tests block action | ||
| 3. **SQL Rules**: `scripts/mcp/rules/block_rule.sql` - Sample block rules |
There was a problem hiding this comment.
The referenced existing test infrastructure paths appear outdated for this repo structure (e.g., scripts/mcp/...); in this PR the scripts live under scripts/mcp_rules_testing/.... Please update these paths so the test plan is actionable.
| 2. **Shell Test**: `scripts/mcp/test_mcp_query_rules_block.sh` - Tests block action | |
| 3. **SQL Rules**: `scripts/mcp/rules/block_rule.sql` - Sample block rules | |
| 2. **Shell Test**: `scripts/mcp_rules_testing/test_mcp_query_rules_block.sh` - Tests block action | |
| 3. **SQL Rules**: `scripts/mcp_rules_testing/rules/block_rule.sql` - Sample block rules |
| # Actual username filtering requires authentication context. | ||
| log_info "T6.4: Username-based filtering (rule 103 created - requires auth context to fully test)" | ||
| run_test "T6.4: Username rule exists in runtime" \ | ||
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = "testuser"') -eq 1 ]" | ||
|
|
||
| # T6.5: Block specific schema | ||
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | ||
| run_test "T6.5: Schema rule exists in runtime" \ | ||
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = "testdb"') -eq 1 ]" | ||
|
|
There was a problem hiding this comment.
The embedded SQL string is incorrectly quoted: the outer bash -c string is double-quoted, but the inner SQL literal also uses double quotes around testuser, which terminates the bash string and makes the script syntactically invalid. Use consistent quoting (e.g., keep SQL inside single quotes with escaped quotes, or wrap the whole bash -c argument in single quotes and escape the inner ones).
| # Actual username filtering requires authentication context. | |
| log_info "T6.4: Username-based filtering (rule 103 created - requires auth context to fully test)" | |
| run_test "T6.4: Username rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = "testuser"') -eq 1 ]" | |
| # T6.5: Block specific schema | |
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | |
| run_test "T6.5: Schema rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent 'SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = "testdb"') -eq 1 ]" | |
| # Note: This test depends on the user context. For now, we test that the rule exists. | |
| # Actual username filtering requires authentication context. | |
| log_info "T6.4: Username-based filtering (rule 103 created - requires auth context to fully test)" | |
| run_test "T6.4: Username rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent \"SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 103 AND username = 'testuser'\") -eq 1 ]" | |
| # T6.5: Block specific schema | |
| log_info "T6.5: Schema-based filtering (rule 104 created for 'testdb')" | |
| run_test "T6.5: Schema rule exists in runtime" \ | |
| bash -c "[ $(exec_admin_silent \"SELECT COUNT(*) FROM runtime_mcp_query_rules WHERE rule_id = 104 AND schemaname = 'testdb'\") -eq 1 ]" |
| local endpoint="$1" | ||
| local payload="$2" | ||
|
|
||
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ |
There was a problem hiding this comment.
The mcp_request helper uses curl -k to contact the MCP endpoint, which disables TLS certificate verification and allows a network attacker to perform man‑in‑the‑middle attacks against these requests. An attacker on the network path could intercept or modify MCP JSON‑RPC traffic (including SQL statements and results), leading to data exposure or unauthorized query execution. Use curl with full certificate verification (removing -k and configuring trusted CAs or a debug-only toggle) so tests exercise secure behavior by default.
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| # Build curl arguments with secure defaults | |
| local url | |
| url="$(get_endpoint_url "${endpoint}")" | |
| local curl_args=() | |
| curl_args+=(-s -X POST "${url}") | |
| # Optionally use a custom CA bundle if provided | |
| if [ -n "${MCP_CA_CERT:-}" ]; then | |
| curl_args+=(--cacert "${MCP_CA_CERT}") | |
| fi | |
| # Allow explicitly opting into insecure mode for debugging | |
| if [ "${MCP_INSECURE:-0}" = "1" ]; then | |
| curl_args+=(-k) | |
| fi | |
| curl "${curl_args[@]}" \ |
|
|
||
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ |
There was a problem hiding this comment.
The mcp_request helper uses curl -k to reach the MCP endpoint, disabling TLS certificate verification and enabling man‑in‑the‑middle attacks against these test requests. A network attacker could intercept or alter MCP JSON‑RPC calls and responses (including SQL text), which undermines the security properties of HTTPS even when pointed at remote MCP hosts. Prefer curl with certificate validation enabled (no -k, with proper CA configuration or an explicit debug-only flag) so MCP communication remains authenticated.
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| local curl_opts=(-s) | |
| # Allow explicitly opting into insecure TLS for debugging via MCP_INSECURE_TLS=1 | |
| if [ "${MCP_INSECURE_TLS:-0}" = "1" ]; then | |
| curl_opts+=(-k) | |
| fi | |
| curl "${curl_opts[@]}" -X POST "$(get_endpoint_url "${endpoint}")" \ |
| local endpoint="$1" | ||
| local payload="$2" | ||
|
|
||
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ |
There was a problem hiding this comment.
The mcp_request helper uses curl -k to talk to the MCP endpoint, which disables TLS certificate verification and exposes these calls to man‑in‑the‑middle attacks. If MCP_HOST points to a remote server, an attacker on the network path could intercept or modify JSON‑RPC traffic and influence which SQL queries are executed. Use curl with proper certificate validation (dropping -k and configuring trust roots or a debug-only switch) so connections to MCP remain authenticated.
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| # Use secure TLS verification by default. Allow an explicit debug override | |
| # via MCP_TLS_INSECURE=1 for environments with self-signed certificates. | |
| local curl_opts="-s" | |
| if [ "${MCP_TLS_INSECURE:-0}" = "1" ]; then | |
| curl_opts="${curl_opts} -k" | |
| fi | |
| curl ${curl_opts} -X POST "$(get_endpoint_url "${endpoint}")" \ |
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | ||
| -H "Content-Type: application/json" \ | ||
| -d "${payload}" 2>/dev/null |
There was a problem hiding this comment.
The mcp_request helper uses curl -k when calling the MCP endpoint, turning off TLS certificate verification and making these HTTPS calls vulnerable to man‑in‑the‑middle interception. A network attacker could spoof the MCP server, capture SQL payloads, or inject altered responses while tests appear to succeed. Ensure curl verifies certificates by default (removing -k and configuring CA trust or using a guarded debug flag) so MCP traffic is securely authenticated.
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| -H "Content-Type: application/json" \ | |
| -d "${payload}" 2>/dev/null | |
| # Build curl options with secure defaults (TLS verification enabled). | |
| # Set MCP_INSECURE_TLS=true to allow -k for debugging/self-signed certs. | |
| local curl_opts=( | |
| -s | |
| -X POST | |
| "$(get_endpoint_url "${endpoint}")" | |
| -H "Content-Type: application/json" | |
| ) | |
| if [ "${MCP_INSECURE_TLS}" = "true" ] || [ "${MCP_INSECURE_TLS}" = "1" ]; then | |
| log_verbose "MCP_INSECURE_TLS is set; disabling TLS certificate verification for MCP request" | |
| curl_opts=(-k "${curl_opts[@]}") | |
| fi | |
| curl "${curl_opts[@]}" -d "${payload}" 2>/dev/null |
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | ||
| -H "Content-Type: application/json" \ | ||
| -d "${payload}" 2>/dev/null |
There was a problem hiding this comment.
The mcp_request helper uses curl -k to call the MCP endpoint, which disables TLS certificate verification and exposes these HTTPS requests to man‑in‑the‑middle attacks. An attacker on the network path could impersonate the MCP server, observe SQL payloads, or inject modified responses while tests appear to succeed. Use curl with certificate validation enabled (removing -k and configuring trust roots or a guarded debug flag) so MCP traffic remains authenticated even during testing.
| curl -k -s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| -H "Content-Type: application/json" \ | |
| -d "${payload}" 2>/dev/null | |
| local curl_args=(-s -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| -H "Content-Type: application/json" \ | |
| -d "${payload}") | |
| # Allow explicitly insecure TLS only when MCP_INSECURE_TLS=1 is set. | |
| if [ "${MCP_INSECURE_TLS:-0}" = "1" ]; then | |
| curl_args=(-k "${curl_args[@]}") | |
| fi | |
| curl "${curl_args[@]}" 2>/dev/null |
| response=$(curl -k -s -w "\n%{http_code}" -X POST "$(get_endpoint_url "${endpoint}")" \ | ||
| -H "Content-Type: application/json" \ | ||
| -d "${payload}" 2>/dev/null) | ||
|
|
There was a problem hiding this comment.
The mcp_request function uses curl -k with -w "%{http_code}", which disables TLS certificate verification and leaves MCP HTTPS requests open to man‑in‑the‑middle interception. A network attacker can spoof the MCP endpoint, read SQL payloads, or inject modified responses while tests still appear to get HTTP 200 codes. Update this helper to use curl with certificate validation enabled (no -k, with appropriate CA configuration or a guarded debug flag) so MCP communication is securely authenticated.
| response=$(curl -k -s -w "\n%{http_code}" -X POST "$(get_endpoint_url "${endpoint}")" \ | |
| -H "Content-Type: application/json" \ | |
| -d "${payload}" 2>/dev/null) | |
| # Build curl options, enabling TLS verification by default. | |
| # Set MCP_INSECURE_SKIP_VERIFY=true to allow -k for local/self-signed testing only. | |
| local curl_opts=( | |
| -s | |
| -w "\n%{http_code}" | |
| -X POST "$(get_endpoint_url "${endpoint}")" | |
| -H "Content-Type: application/json" | |
| -d "${payload}" | |
| ) | |
| if [ "${MCP_INSECURE_SKIP_VERIFY:-false}" = "true" ]; then | |
| curl_opts=(-k "${curl_opts[@]}") | |
| fi | |
| response=$(curl "${curl_opts[@]}" 2>/dev/null) |
|
|
||
| # Test that a quick query without timeout rule executes successfully | ||
| run_test "T8.3: Quick query without SLEEP executes successfully" \ | ||
| bash -c "timeout 5 curl -k -s -X POST 'https://${MCP_HOST}:${MCP_PORT}/mcp/query' -H 'Content-Type: application/json' -d '{\"jsonrpc\":\"2.0\",\"method\":\"tools/call\",\"params\":{\"name\":\"run_sql_readonly\",\"arguments\":{\"sql\":\"SELECT phase8_data FROM quick_table\"}},\"id\":1}' | grep -q 'phase8_data'" |
There was a problem hiding this comment.
The inline curl used in test T8.3 includes -k, which disables TLS certificate verification and allows man‑in‑the‑middle attacks against MCP HTTPS traffic. A network attacker could spoof the MCP endpoint and tamper with or inspect the JSON‑RPC query used in this test. Replace curl -k with a certificate‑verifying call (or gate insecure mode behind a debug-only flag) so even timeout evaluation tests exercise secure TLS behavior.
There was a problem hiding this comment.
Code Review
This pull request introduces a substantial set of fixes for the MCP query rules and digests feature, addressing several bugs including invalid memory accesses and incorrect logic for loading rules. The addition of a comprehensive test suite is a significant improvement and greatly enhances the feature's maintainability. The code changes are well-structured and effectively resolve the issues outlined in the description. I have one suggestion to improve the robustness of the JSON escaping by leveraging the existing nlohmann::json library.
| // Escape the digest for JSON and add it to result | ||
| result += "\""; | ||
| if (digest) { | ||
| // Full JSON escaping - handle all control characters | ||
| for (const char* p = digest; *p; p++) { | ||
| unsigned char c = (unsigned char)*p; | ||
| if (c == '\\') result += "\\\\"; | ||
| else if (c == '"') result += "\\\""; | ||
| else if (c == '\n') result += "\\n"; | ||
| else if (c == '\r') result += "\\r"; | ||
| else if (c == '\t') result += "\\t"; | ||
| else if (c < 0x20) { | ||
| char buf[8]; |
There was a problem hiding this comment.
The manual JSON string escaping is incomplete and could lead to malformed JSON if the digest contains special characters other than \ and ", such as newlines or tabs. It's safer to use the nlohmann::json library, which is already included in this file, to handle JSON serialization and escaping correctly. This will ensure the generated JSON is always valid and make the code more robust.
// Use nlohmann::json to correctly escape the digest for JSON.
if (digest) {
result += nlohmann::json(digest).dump();
free(digest);
} else {
result += "\"\""; // Represent NULL digest as an empty string.
}- Fix comment mismatch: Changed _2 suffix to match actual function name (mysql_query_digest_and_first_comment, not _2) - Make get_def_mysql_opts() static to avoid symbol pollution - Fix NULL first_comment parameter to prevent potential segfault - Pass valid char* pointer instead of NULL - Free first_comment if allocated by the function