fix: sync getBulk correctly decodes mixed hashed and plain keys #195

Open
bihaoxwork wants to merge 2 commits into master from sync-getbulk-mixed-keys

bihaoxwork commented May 28, 2026

Summary

Test

  • Existing EVCacheTestDI async bulk get tests pass
  • Re-enabled sync getBulk test path passes with mixed hashed/unhashed keys
  • No regressions for plain-key-only and hashed-key-only bulk requests

Refactor getBulkData to split hashed vs plain keys and route each set
through the appropriate transcoder path, fixing incorrect decoding when
a single bulk request contains both hashed and unhashed keys. Re-enable
the previously disabled getBulk test path in EVCacheTestDI.

Co-authored-by: Bihao Xu <bxu@netflix.com>
bihaoxwork requested review from AustinWheel, akhaku, Copilot and shy-1234 and removed request for akhaku and Copilot May 31, 2026 19:08
bihaoxwork requested a review from srrangarajan June 2, 2026 22:39
@shy-1234
Contributor

shy-1234 commented Jun 3, 2026

The PR description seems misleading. The mixed-key-aware asyncGetBulk entry point was introduced in #181, with a follow-up in #184. Please update the description when you get a chance so that the right context is linked.

Comment thread evcache-client/test/com/netflix/evcache/test/EVCacheTestDI.java Outdated
Comment thread evcache-client/test/com/netflix/evcache/test/EVCacheTestDI.java Outdated
Comment thread evcache-core/src/main/java/com/netflix/evcache/pool/EVCacheClient.java Outdated
@bihaoxwork
Author

Code Review

Summary

Refactors the synchronous getBulk path so mixed hashed/plain key batches decode each key with the correct transcoder, adds chunking support for that path, and re-enables the previously-disabled mixed-key test. The core fix is correct and well-covered; findings below are non-blocking quality/observability items.

Approval Recommendation

Comment — no blocking issues. The mixed-key decode, the transcoder-fallback port from #184, and the chunked+hashed read path were all independently verified correct. Consider the null-guard suggestion and the test/observability considerations at your discretion.

Issues Found

Suggestions

  • On the chunking path, getBulk(plainKeys, hashedKeys, …) returns assembleChunks(...) directly. assembleChunks swallows internal errors and returns null (its catch falls through to return null), which bypasses getBulk's own outer catch that returns Collections.emptyMap(). getBulkData then calls buildKeyValueResult(objMap, …) with no null check, and buildKeyValueResult dereferences objMap.size(), causing an NPE. When throwException=true && hasZF=false, the caller sees a confusing NullPointerException instead of the original exception (the pre-refactor plain path returned emptyMap() and never threw an NPE). Suggest null-guarding before buildKeyValueResult, or normalizing assembleChunks's null return to emptyMap().
    (confidence: 82)
    evcache-core/.../EVCacheImpl.java#L1994-L1995 · EVCacheClient.java#L1056-L1057
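The second option in the suggestion above (normalizing a null return to an empty map) could look roughly like the sketch below. `BulkResultGuard`, `orEmpty`, and `assembleChunksStub` are hypothetical names used only for illustration; they are not EVCache APIs, and the stub merely imitates a call that swallows its error and returns null.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of EVCache: turns a nullable bulk result into a non-null map.
final class BulkResultGuard {
    private BulkResultGuard() {}

    // Returns the given map, or an immutable empty map when the argument is null.
    static <K, V> Map<K, V> orEmpty(Map<K, V> maybeNull) {
        return maybeNull != null ? maybeNull : Collections.<K, V>emptyMap();
    }

    // Stand-in for a chunk-assembly call whose catch block falls through to "return null".
    static Map<String, Object> assembleChunksStub(boolean fail) {
        if (fail) {
            return null; // imitates the swallowed-error case described in the finding
        }
        return Collections.<String, Object>singletonMap("k1", "v1");
    }
}
```

With a guard like this applied at the call site, a later `size()` call on the result cannot throw; whether an empty map or a rethrown original exception is the better behavior when throwException=true is a design decision for the author.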

Considerations

  • decodeForKey silently returns null when a hashed key's envelope decode does not yield an EVCacheValue (the instanceof check fails) — no debug log, no metric. The non-chunked path (EVCacheMemcachedClient.asyncGetBulk) logs this exact case ("did not yield an instance of EVCacheValue, this could be due to collision"). Adding an equivalent debug log (and/or reusing the KEY_HASH_COLLISION metric) would restore diagnostic parity on the chunked path.
    (confidence: 73)
    EVCacheClient.java#L584-L591
  • The debug log "fetching bulk data with set of keys containing hashed key(s)" now fires unconditionally for every bulk get, including all-plain-key batches (which the refactor routes through this same path). The async counterpart branches the message on whether hashedKeysMap is empty; the sync path lost that branch and will mislead anyone reading debug output.
    (confidence: 73)
    EVCacheImpl.java#L1991-L1992
  • doChunkingTests(true) (hashing + chunking) never asserts that chunking actually happened — the getAllChunks(...) structural check is guarded by if (!hashingEnabled). The round-trip get/getBulk assertions do exercise the chunked+hashed decode end-to-end, but if a value were silently stored unchunked (wrong branch taken), the test would still pass via the non-chunked branch in assembleChunks. Consider an explicit getAllChunks(hashKey) assertion in the hashed branch (using the actual hash key string) to close the gap.
    (confidence: 72)
    EVCacheTestDI.java#L376-L383
  • The comment "This exercises the mixed-key, chunk-aware getBulk" is inaccurate: with app-level hash.key=true, every key in doChunkingTests is hashed, so buildKeyMap produces an empty plainKeys set — this is an all-hashed (not mixed) chunking scenario. Minor comment wording.
    (confidence: 70)
    EVCacheTestDI.java#L353-L354
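For the debug-log consideration above, a minimal sketch of branching the message on whether any hashed keys are present. The class and method names are hypothetical, and the plain-keys-only wording is an assumption; only the "containing hashed key(s)" text comes from the review.

```java
import java.util.Collection;

// Hypothetical helper, not part of EVCache: picks a debug message based on the key mix.
final class BulkGetLogMessages {
    private BulkGetLogMessages() {}

    static String describe(Collection<String> plainKeys, Collection<String> hashedKeys) {
        if (hashedKeys.isEmpty()) {
            // Assumed wording for the all-plain case; the real async message may differ.
            return "fetching bulk data with plain keys only (" + plainKeys.size() + " key(s))";
        }
        return "fetching bulk data with set of keys containing hashed key(s) ("
                + plainKeys.size() + " plain, " + hashedKeys.size() + " hashed)";
    }
}
```

A call such as `log.debug(BulkGetLogMessages.describe(plainKeys, hashedKeys))` would then keep all-plain batches from being reported as containing hashed keys.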

Pre-existing Issues

Exists on unchanged code (introduced in #181 for the async path), now also reachable from the sync path. Not reachable through normal flow since memcached only returns requested keys — listed for awareness.

  • buildKeyValueResult does getOrDefault(wireKey, hashedMap.get(wireKey)); if a returned wire key is in neither map, evcKey is null and retMap.put(null, val) silently inserts a null key.
    EVCacheImpl.java#L1962-L1965
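A hedged sketch of how the null-key insertion could be avoided. It simplifies the real types (plain `String` keys stand in for whatever key objects the two maps actually hold), and `WireKeyLookupSketch` and `buildResult` are hypothetical names, not the actual buildKeyValueResult signature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of the wire-key lookup, not the EVCache implementation.
final class WireKeyLookupSketch {
    private WireKeyLookupSketch() {}

    static <V> Map<String, V> buildResult(Map<String, V> wireValues,
                                          Map<String, String> plainMap,
                                          Map<String, String> hashedMap) {
        Map<String, V> result = new HashMap<>();
        for (Map.Entry<String, V> entry : wireValues.entrySet()) {
            String wireKey = entry.getKey();
            // Same lookup shape as in the finding: plain map first, hashed map as the default.
            String appKey = plainMap.getOrDefault(wireKey, hashedMap.get(wireKey));
            if (appKey == null) {
                continue; // wire key in neither map: skip rather than put(null, value)
            }
            result.put(appKey, entry.getValue());
        }
        return result;
    }
}
```

Skipping is only one option; logging the unexpected wire key before skipping would make the case visible if it ever occurs.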

Questions for the Author

None — the chunking-routing design and the chunked+hashed read path were verified correct (chunks are stored under hashKey_00, reassembled, then 2-step decoded), and the open human-review questions were already addressed.

Notes

  • Verified correct: (1) the mixed-key split decode (plain → 1-step, hashed → 2-step envelope decode with collision check); (2) the transcoder-fallback ternary correctly ported from the async path (#184, "fix: async bulk get transcoder fallback for plain keys in mixed-key requests"); (3) the chunked+hashed read path (write stores the EVCacheValue envelope chunked under hashKey_NN; read reassembles and decodes via the envelope then value transcoder). The user-visible result is unchanged because the final decanonicalization loop normalizes absent-key and null-value to the same key → null.
  • The refreshEVCache change that closes the previous lifecycleManager before rebuilding is a good catch — it stops the per-refresh DI-container leak.
  • Already acknowledged by the human reviewer and not re-flagged here: unused variable (L359), assert-inside-loop (L461), and the PR-description attribution to #181 ("Fix: asyncBulkGet decoding of hashed and unhashed keys together in one request") and #184 ("fix: async bulk get transcoder fallback for plain keys in mixed-key requests").
  • Two candidate findings were investigated and dropped after verification: a claimed RuntimeException-from-decodeForKey deadlock in the non-chunked path (false — decodeForKey is only called from the chunked assembleChunks), and a chunked-vs-non-chunked null-value divergence (false at the API level — normalized away downstream).
  • Agent 7 (Codex) ran but exhausted its turn budget during code exploration, surfacing only a trailing-whitespace nit (assumed CI/style-handled). Internal Netflix context sources (Jira/Slack/Manuals) were not applicable to this public OSS PR.
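The mixed-key split decode described in the first note (plain keys decoded in one step, hashed keys decoded in two steps with a collision check) can be illustrated with the simplified sketch below. `Envelope` is a hypothetical stand-in for the EVCacheValue envelope, and comparing the stored original key against the requested key is an assumption about how the collision check works, not a description of the actual code.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: simplified stand-ins, not the EVCache transcoder classes.
final class MixedKeyDecodeSketch {
    private MixedKeyDecodeSketch() {}

    // Hypothetical envelope: carries the original (unhashed) key alongside the payload.
    static final class Envelope {
        final String originalKey;
        final byte[] payload;

        Envelope(String originalKey, byte[] payload) {
            this.originalKey = originalKey;
            this.payload = payload;
        }
    }

    // Plain key: a single decode step straight from bytes to the value.
    static String decodePlain(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Hashed key: step 1 yields an envelope, step 2 decodes the payload inside it.
    static String decodeHashed(Object firstStepResult, String requestedKey) {
        if (!(firstStepResult instanceof Envelope)) {
            return null; // first step did not yield an envelope
        }
        Envelope envelope = (Envelope) firstStepResult;
        if (!envelope.originalKey.equals(requestedKey)) {
            return null; // assumed collision check: stored key differs from requested key
        }
        return decodePlain(envelope.payload);
    }
}
```

Both rejection branches return null silently here, which is the same spot where the review's consideration about a debug log or collision metric would apply.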

Generated with pr-review-plugin (Claude + Codex)
React with a thumbs up if useful, thumbs down otherwise.
