
fix(pflash): adaptive anchor_radius eliminates 64K NIAH cliff #357

Merged
davide221 merged 3 commits into Luce-Org:main from dusterbloom:split/03-anchor-radius-cliff
Jun 10, 2026

@dusterbloom

Collaborator

Re-carved from #274 (commit 1d0baa2), with a DRY refactor + the unit test the original lacked.

At >=32K tokens of context, the needle text straddles multiple 32-token chunks, and the fixed anchor_radius=2 window (~160 tokens) drops the back half of the needle, which shows up as truncated or hallucinated retrieval. This PR scales the window with n_chunks: <32K → (radius 2, hits 8); 32K to <64K → (4, 16); >=64K → (8, 32). Both values are overridable via PFLASH_COMPRESS_ANCHOR_RADIUS / PFLASH_COMPRESS_MAX_ANCHOR_HITS (the legacy DFLASH_COMPRESS_* names are still accepted).
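
The window arithmetic behind the ~160-token figure can be sketched as follows. This is an illustration, not code from the PR: it assumes a hit on chunk c keeps the symmetric range [c - radius, c + radius], which is consistent with radius 2 and 32-token chunks giving 160 tokens. The names `window_tokens` and `tail_survives` are hypothetical.

```cpp
// Sketch only. Assumption: an anchor hit on chunk c keeps chunks
// [c - radius, c + radius], i.e. 2 * radius + 1 chunks in total.
constexpr int kChunkSize = 32;  // chunk size quoted in the PR

// Tokens kept around a single anchor hit.
constexpr int window_tokens(int anchor_radius, int chunk_size = kChunkSize) {
    return (2 * anchor_radius + 1) * chunk_size;
}

// True if a needle that continues `chunks_past_hit` chunks beyond the hit
// chunk still lies entirely inside the kept window.
constexpr bool tail_survives(int anchor_radius, int chunks_past_hit) {
    return chunks_past_hit <= anchor_radius;
}
```

Under that assumption the three tiers keep 160, 288, and 544 tokens per hit, so a needle whose tail runs three chunks past the hit is cut at radius 2 but kept at radius 4.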

Refactor: both call sites now use a pure resolve_anchor_params() helper (server/src/qwen3/anchor_params.h). This removes the duplicated block and puts the tier/override logic under unit test (28 checks: tier boundaries, override precedence, -1 sentinels). Sub-32K behavior is byte-identical to the previous hardcoded defaults. The source commit was validated at 49K (needle correctly retrieved; truncated before the fix).

4 files, +122/-4.

Tiers by n_chunks (chunk_size=32), as {anchor_radius, max_anchor_hits}: <1024 -> {2,8}; <2048 -> {4,16}; >=2048 -> {8,32}.
DRY: extract resolve_anchor_params() pure helper; both call sites use it.
Env precedence: PFLASH_COMPRESS_ANCHOR_RADIUS > legacy DFLASH_ > tier.
Unit test: 28 checks covering tier boundaries, env overrides, sentinels.
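
A minimal sketch of how such a helper could look is given below. The real signature in anchor_params.h is not shown in this thread, so the `AnchorParams` struct, the parameter list, `env_int_or_unset`, and the reading of -1 as "no override" are all assumptions made for illustration. Only the tier table and the precedence order come from the description above.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical result type; the actual helper may return something else.
struct AnchorParams {
    int anchor_radius;
    int max_anchor_hits;
};

constexpr int kUnset = -1;  // assumed meaning of the -1 sentinel: no override

// Pure tier/override resolution: no environment access, so it is easy to test.
inline AnchorParams resolve_anchor_params(int n_chunks,
                                          int radius_override = kUnset,
                                          int hits_override = kUnset) {
    AnchorParams p{8, 32};                       // >= 2048 chunks (>= 64K tokens)
    if (n_chunks < 1024)      p = {2, 8};        // < 32K tokens
    else if (n_chunks < 2048) p = {4, 16};       // 32K to < 64K tokens
    if (radius_override >= 0) p.anchor_radius = radius_override;
    if (hits_override >= 0)   p.max_anchor_hits = hits_override;
    return p;
}

// Reads the primary variable first, then the legacy one. Returns kUnset when
// neither holds a non-negative integer. Falling through to the legacy name
// when the primary value is malformed is a choice made for this sketch.
inline int env_int_or_unset(const char* primary, const char* legacy) {
    const char* names[] = {primary, legacy};
    for (const char* name : names) {
        const char* value = std::getenv(name);
        if (value == nullptr || *value == '\0') continue;
        char* end = nullptr;
        const long parsed = std::strtol(value, &end, 10);
        if (end != value && *end == '\0' && parsed >= 0 && parsed <= INT_MAX) {
            return static_cast<int>(parsed);
        }
    }
    return kUnset;
}
```

A call site would read each override once, passing PFLASH_COMPRESS_ANCHOR_RADIUS and its legacy DFLASH_COMPRESS_* counterpart to `env_int_or_unset`, and hand the results to `resolve_anchor_params(n_chunks, radius, hits)`. Keeping the environment lookup outside the resolver is what lets the tier boundaries (1023 vs 1024, 2047 vs 2048) be checked without touching process state.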

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor


1 issue found across 4 files


Comment thread server/src/qwen3/qwen3_drafter.cpp
easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 9, 2026
easel pushed a commit to easel/lucebox-hub that referenced this pull request Jun 10, 2026
Both anchor-scan loops now use vector<int>(max_anchor_hits) instead of
int[8]; write guard and iteration bound follow max_anchor_hits throughout.
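
The change described in that commit message can be illustrated with a small sketch. The actual loops in qwen3_drafter.cpp are not shown here, so `collect_anchor_hits` and its score-threshold criterion are hypothetical. The sketch only shows the buffer being sized by max_anchor_hits, with the write guard tied to the same value rather than to a fixed 8.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical anchor scan: records the indices of chunks whose score reaches
// `threshold`, keeping at most `max_anchor_hits` of them.
inline std::vector<int> collect_anchor_hits(const std::vector<float>& chunk_scores,
                                            float threshold,
                                            int max_anchor_hits) {
    // Buffer sized from max_anchor_hits instead of a fixed int[8].
    std::vector<int> hits(max_anchor_hits > 0
                              ? static_cast<std::size_t>(max_anchor_hits)
                              : 0);
    std::size_t n_hits = 0;
    for (std::size_t c = 0; c < chunk_scores.size(); ++c) {
        if (n_hits >= hits.size()) break;  // write guard follows max_anchor_hits
        if (chunk_scores[c] >= threshold) {
            hits[n_hits++] = static_cast<int>(c);
        }
    }
    hits.resize(n_hits);  // later loops iterate over exactly the recorded hits
    return hits;
}
```

With a fixed eight-slot array, the 16- and 32-hit tiers would either be silently capped at 8 or write past the end of the buffer, depending on how the guard was written. Sizing the buffer and the guard from the same value avoids both outcomes.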
@davide221

Contributor

I can't reproduce the cliff. Can you give some more instructions for reproducing it?

@davide221 davide221 merged commit 745878d into Luce-Org:main Jun 10, 2026
1 check passed

2 participants