cdc: update storage sink #21221
base: master
[APPROVALNOTIFIER] This PR is NOT APPROVED.
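| When processing files, a file may be read before it has been fully written, which causes some of its data to be missed. To avoid this, read the Index file first to find out which files are ready to be processed. The consumption logic is: | ||
| - Read the meta/CDC.index file in the directory to get the name of the file that has currently finished being written. | ||
| - Process, in order, the DML events in files whose sequence number is less than or equal to that of this file. |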
In the new architecture this is "less than or equal to"; in the old architecture it should be "less than".
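The distinction raised in this comment can be sketched as follows. This is only an illustration, not TiCDC code: the `CDC<number>.<extension>` file-name pattern, the helper names `file_seq` and `consumable_files`, and the `inclusive` flag are assumptions made for the example. `inclusive=True` models the "less than or equal to" rule described for the new architecture, and `inclusive=False` models the "less than" rule described for the old one.

```python
import re

# Assumed data-file naming pattern (hypothetical for this sketch):
# "CDC" + a decimal sequence number + "." + an extension.
_DATA_FILE = re.compile(r"^CDC(\d+)\.\w+$")


def file_seq(name: str) -> int:
    """Extract the numeric sequence from a data-file name."""
    match = _DATA_FILE.match(name.strip())
    if match is None:
        raise ValueError(f"not a data file name: {name!r}")
    return int(match.group(1))


def consumable_files(names, index_content, inclusive=True):
    """Return the data files that are safe to consume, in sequence order.

    names:         file names found in the data directory
    index_content: text read from meta/CDC.index (a single file name)
    inclusive:     True  -> sequence <= index (new architecture, per the review)
                   False -> sequence <  index (old architecture, per the review)
    """
    limit = file_seq(index_content)
    picked = []
    for name in names:
        if _DATA_FILE.match(name) is None:
            continue  # skip anything that is not a data file
        seq = file_seq(name)
        if seq < limit or (inclusive and seq == limit):
            picked.append(name)
    return sorted(picked, key=file_seq)
```

For example, with the index pointing at `CDC000003.json`, the inclusive rule returns files 1 through 3, while the exclusive rule stops at file 2.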
ticdc/ticdc-architecture.md
Outdated
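| The new architecture supports enabling **table-level task splitting** for MySQL Sink and Storage Sink. You can enable this feature by setting `scheduler.enable-table-across-nodes = true` in the changefeed configuration. | ||
| Once enabled, when a table that **has exactly one primary key or non-null unique key** meets any of the following conditions, TiCDC automatically splits it and distributes it across multiple nodes for parallel replication, improving replication efficiency and resource utilization: | ||
| For MySQL Sink, when a table that **has exactly one primary key or non-null unique key** meets any of the following conditions, TiCDC automatically splits it and distributes it across multiple nodes for parallel replication, improving replication efficiency and resource utilization: |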
Add highlighting to call this out.
@3AceShowHand: adding LGTM is restricted to approvers and reviewers in OWNERS files.
@wk989898: The following test failed, say
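| The new architecture supports enabling **table-level task splitting** for MySQL Sink and Storage Sink. You can enable this feature by setting `scheduler.enable-table-across-nodes = true` in the changefeed configuration. | ||
| Once enabled, when a table that **has exactly one primary key or non-null unique key** meets any of the following conditions, TiCDC automatically splits it and distributes it across multiple nodes for parallel replication, improving replication efficiency and resource utilization: | ||
| For **MySQL Sink**, when a table that **has exactly one primary key or non-null unique key** meets any of the following conditions, TiCDC automatically splits it and distributes it across multiple nodes for parallel replication, improving replication efficiency and resource utilization: |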
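I think we need to check with the boss about how much detail to give on this split condition. Currently, enable-splittable-check defaults to false, but for the MySQL sink enable-splittable-check defaults to true. We only verify "has exactly one primary key or non-null unique key" when enable-splittable-check is true. For MySQL, we can also force splitting via force-split (though force split probably does not need to be exposed to users).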
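To make the behavior described in this comment concrete, here is a minimal sketch of the key check as the reviewer states it. It is not TiCDC's implementation: the `TableKeys` type, the function names, and the sink-type strings are hypothetical, and reading "exactly one primary key or non-null unique key" as "the two counts sum to one" is an assumption of this example. `force-split` is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableKeys:
    """Hypothetical summary of a table's keys, for illustration only."""
    primary_keys: int          # 0 or 1
    not_null_unique_keys: int  # number of unique keys with all columns NOT NULL


def default_splittable_check(sink_type: str) -> bool:
    # Per the review comment: the check defaults to true for the MySQL sink
    # and to false otherwise.
    return sink_type == "mysql"


def passes_key_check(table: TableKeys, sink_type: str,
                     enable_splittable_check: Optional[bool] = None) -> bool:
    """Return True if the table is not ruled out by the key check.

    When the check is disabled, every table passes. When it is enabled,
    the table must have exactly one primary key or non-null unique key
    (assumed here to mean the two counts sum to one).
    """
    if enable_splittable_check is None:
        enable_splittable_check = default_splittable_check(sink_type)
    if not enable_splittable_check:
        return True
    return table.primary_keys + table.not_null_unique_keys == 1
```

Under these assumptions, a table with a primary key plus an extra non-null unique key fails the check for the MySQL sink by default, but passes for a storage sink unless the check is explicitly turned on.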
First-time contributors' checklist
What is changed, added or deleted? (Required)
Which TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions (in Chinese).
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?