Contributor

@steFaiz steFaiz commented Jan 13, 2026

Purpose

This PR adds a check to DataEvolutionMergeInto in Spark that prevents users from updating global-indexed fields.
Without this check, indexed scans would return wrong results.
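As a rough illustration of the kind of check described here, the sketch below intersects the columns a MERGE INTO would update with the columns covered by a global index and fails before anything is written. The class and method names (IndexedColumnUpdateCheck, conflictingColumns, validate) are hypothetical and are not taken from this PR's code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not the PR's actual implementation. */
final class IndexedColumnUpdateCheck {

    /** Returns the updated columns that are also covered by a global index, in update order. */
    static Set<String> conflictingColumns(List<String> updatedColumns, Set<String> indexedColumns) {
        Set<String> conflicts = new LinkedHashSet<>(updatedColumns);
        conflicts.retainAll(indexedColumns);
        return conflicts;
    }

    /** Fails fast if any column targeted by the update is global-indexed. */
    static void validate(List<String> updatedColumns, Set<String> indexedColumns) {
        Set<String> conflicts = conflictingColumns(updatedColumns, indexedColumns);
        if (!conflicts.isEmpty()) {
            throw new UnsupportedOperationException(
                    "MERGE INTO must not update global-indexed columns: " + conflicts);
        }
    }
}
```

Running the check before the write starts means a rejected statement leaves both the data and the index untouched.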

Linked issue: none

Tests

Please see org.apache.paimon.spark.sql.RowTrackingTestBase

API and Format

None

Documentation

Will be added ASAP

Contributor Author

steFaiz commented Jan 13, 2026

For now I simply compare the full set of indexed fields against the set of updated fields. We could also push a Map<Partition, List<IndexedColumns>> (or something similar) down to the DataEvolutionPaimonWriter so that the check runs at the partition level.
But I'm not sure whether it's worth it. @JingsongLi, what do you think?
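To make the partition-level alternative concrete, here is a minimal sketch. It assumes partitions can be identified by a string key and that the writer knows which partitions an update touches; PartitionLevelIndexCheck and conflictsByPartition are hypothetical names, not part of DataEvolutionPaimonWriter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a partition-level check; not Paimon's actual writer API. */
final class PartitionLevelIndexCheck {

    /**
     * For each touched partition, lists the updated columns that have an index in that
     * partition. Partitions without any conflict are omitted from the result.
     */
    static Map<String, List<String>> conflictsByPartition(
            Map<String, List<String>> indexedColumnsByPartition,
            Set<String> touchedPartitions,
            Set<String> updatedColumns) {
        Map<String, List<String>> conflicts = new TreeMap<>();
        for (String partition : touchedPartitions) {
            List<String> hit = new ArrayList<>();
            // A partition missing from the map is treated as having no indexed columns.
            for (String column : indexedColumnsByPartition.getOrDefault(partition, List.of())) {
                if (updatedColumns.contains(column)) {
                    hit.add(column);
                }
            }
            if (!hit.isEmpty()) {
                conflicts.put(partition, hit);
            }
        }
        return conflicts;
    }
}
```

Under these assumptions, an update to an indexed column would only be rejected when it touches a partition that actually carries that index, which is less restrictive than the table-level comparison but requires passing the extra map to the writer.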

Contributor

@JingsongLi JingsongLi left a comment

That's a very good question. I think we can create an option to describe what to do when data is updated. For example:

  1. Disallow updates to indexed columns.
  2. Apply the update and remove the index from the updated column.
  3. Apply the update without touching the index; the user is responsible for rebuilding it.
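One way the three behaviors listed above could be modeled is an enum-valued option. The sketch below is only an assumption about how such an option might look: the enum name, its values (THROW_ERROR, DROP_INDEX, IGNORE), and the indexesToDrop method are hypothetical and do not correspond to an existing Paimon option.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;

/** Hypothetical option values for illustration; not an existing Paimon option. */
enum IndexedColumnUpdateAction {
    THROW_ERROR, // 1. reject updates to indexed columns
    DROP_INDEX,  // 2. apply the update and remove the index on the updated columns
    IGNORE;      // 3. apply the update, leave the index alone; the user rebuilds it

    /** Parses values such as "drop-index" or "DROP_INDEX". */
    static IndexedColumnUpdateAction fromString(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }

    /** Returns the columns whose index should be dropped, or throws if updates are disallowed. */
    Set<String> indexesToDrop(Set<String> conflictingColumns) {
        if (conflictingColumns.isEmpty()) {
            return Collections.emptySet();
        }
        switch (this) {
            case THROW_ERROR:
                throw new UnsupportedOperationException(
                        "Updating indexed columns is not allowed: " + conflictingColumns);
            case DROP_INDEX:
                return conflictingColumns;
            default:
                return Collections.emptySet();
        }
    }
}
```

Keeping the decision in one place would let the merge path ask a single question (which indexes, if any, must go) regardless of which behavior is configured.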

@steFaiz steFaiz changed the title from "[spark] calling merge into on DE table should not update indexed columns" to "[wip][spark] calling merge into on DE table should not update indexed columns" Jan 13, 2026
@steFaiz steFaiz marked this pull request as draft January 13, 2026 13:08
Contributor Author

steFaiz commented Jan 13, 2026

> That's a very good question. I think we can create an option to describe what to do when data is updated. For example:

@JingsongLi Thanks for your insightful advice! I've converted this PR to a draft and will work on this. Once I have a basic design, I will create an issue and reopen this PR.
