Parent Epic
#44
Bug
When a paragraph's text is completely rewritten (no shared tokens), the content alignment layer's Jaccard similarity drops below 0.3 and the paragraphs are NOT matched. Instead of an in-place surgical replace, the reconciler emits a full deleteContentRange for the old paragraph followed by insertText for the new one.
Root Cause
Content alignment in content_align.py uses token-based Jaccard similarity to decide if two paragraphs are "matchable". When similarity < 0.3, the paragraphs are treated as unrelated — one is deleted, the other inserted.
Impact
- Any paragraph style, formatting, or properties attached to the original paragraph are lost
- The operation is a delete+reinsert pattern (which the user wants to avoid)
- Works correctly when paragraphs share >= 30% of tokens
Example
- Base:
"The quick brown fox jumps\n" → 5 tokens
- Desired:
"A lazy dog sleeps quietly here\n" → 6 tokens, 0 overlap
- Result: delete entire paragraph + insert new one (not surgical)
Suggested Fix
Consider always matching paragraphs at the same position (positional fallback) even when Jaccard is low, especially when the document has a 1:1 paragraph correspondence. Alternatively, lower the threshold or use a secondary matching heuristic.
xfail Tests (1)
TestCompleteTextRewrite::test_completely_different_text_is_surgical
Parent Epic
#44
Bug
When a paragraph's text is completely rewritten (no shared tokens), the content alignment layer's Jaccard similarity drops below 0.3 and the paragraphs are NOT matched. Instead of an in-place surgical replace, the reconciler emits a full
deleteContentRangefor the old paragraph followed byinsertTextfor the new one.Root Cause
Content alignment in
content_align.pyuses token-based Jaccard similarity to decide if two paragraphs are "matchable". When similarity < 0.3, the paragraphs are treated as unrelated — one is deleted, the other inserted.Impact
Example
"The quick brown fox jumps\n"→ 5 tokens"A lazy dog sleeps quietly here\n"→ 6 tokens, 0 overlapSuggested Fix
Consider always matching paragraphs at the same position (positional fallback) even when Jaccard is low, especially when the document has a 1:1 paragraph correspondence. Alternatively, lower the threshold or use a secondary matching heuristic.
xfail Tests (1)
TestCompleteTextRewrite::test_completely_different_text_is_surgical