@LiangDai-Mars

Purpose

This pull request fixes an issue in the incremental scan where concurrent writes to the same key could return an incorrect merged result. Records in the merge set can carry non-deterministic sequence numbers, which makes the merge order unstable. Because the incremental scan relies heavily on a stable merge order, this could lead to incorrect final results. The fix ensures the correct result by re-sorting the data in the getResult stage of the split diff read process.
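
To illustrate why the merge order matters, here is a minimal, self-contained sketch. It is not the code in this PR: `ResortBeforeMerge`, `KeyValue`, `resort`, and `merge` are hypothetical names, and the "last write wins" merge is an assumed stand-in for the real merge function. It only shows that sorting by key and then by sequence number before merging makes the result independent of the order in which records arrive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResortBeforeMerge {

    /** Hypothetical record shape: a key, a sequence number, and a value. */
    public static final class KeyValue {
        final String key;
        final long sequenceNumber;
        final String value;

        public KeyValue(String key, long sequenceNumber, String value) {
            this.key = key;
            this.sequenceNumber = sequenceNumber;
            this.value = value;
        }
    }

    /** Returns a copy sorted by key, then by ascending sequence number. */
    public static List<KeyValue> resort(List<KeyValue> records) {
        List<KeyValue> sorted = new ArrayList<>(records);
        sorted.sort(
                Comparator.comparing((KeyValue kv) -> kv.key)
                        .thenComparingLong(kv -> kv.sequenceNumber));
        return sorted;
    }

    /** Assumed "last write wins" merge: the highest sequence number per key survives. */
    public static Map<String, String> merge(List<KeyValue> records) {
        Map<String, String> result = new LinkedHashMap<>();
        // Re-sort first so that later puts always carry higher sequence numbers.
        for (KeyValue kv : resort(records)) {
            result.put(kv.key, kv.value);
        }
        return result;
    }

    public static void main(String[] args) {
        // The record with the higher sequence number arrives first.
        List<KeyValue> arrival = Arrays.asList(
                new KeyValue("k1", 7L, "newest"),
                new KeyValue("k1", 3L, "older"));
        System.out.println(merge(arrival)); // prints {k1=newest}
    }
}
```

Without the re-sort step, the same loop would keep whichever record happened to arrive last, so the result would depend on arrival order rather than on sequence numbers.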

Tests

A new integration test, testIncrementScanModeWithInsertOverwrite, has been added to BatchFileStoreITCase.java. It simulates multiple INSERT OVERWRITE operations on the same key with different sequence numbers, then verifies that the diff mode of the incremental scan correctly identifies the changes, confirming that the fix is effective.

API and Format

No API or format changes. This change only involves an internal logic correction and does not affect any external APIs, data storage formats, or configuration files.

Documentation

This change is an internal bug fix and does not require updates to user-facing documentation.

…write of a key with non-deterministic sequence number.
@LiangDai-Mars
Author

@JingsongLi Please help review this MR. Thx!
cc @Aitozi
