Skip to content

cdc,tests: integration test processor_resolved_ts_fallback case is flaky #384

@pingyu

Description

@pingyu

Bug Report

1. Describe the bug

rawkv_data failed, "[src/storage/txn/scheduler.rs:1938]: failed to key guard: Error(Other(\"[src/storage/mod.rs:3090]: Fail to get ts: TSO batch(504) used up\"))".

Full logs (see https://do.pingcap.net/jenkins/blue/organizations/jenkins/tikv%2Fmigration%2Fpull_integration_test/detail/pull_integration_test/14/pipeline/311):

run put data
[2024/01/03 15:39:56.050 +08:00] [INFO] [pd_service_discovery.go:632] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379/] [old-leader=]
[2024/01/03 15:39:56.050 +08:00] [INFO] [pd_service_discovery.go:197] ["[pd] init cluster id"] [cluster-id=7319773494435382658]
[2024/01/03 15:39:56.050 +08:00] [INFO] [client.go:600] ["[pd] changing service mode"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/01/03 15:39:56.051 +08:00] [INFO] [tso_client.go:230] ["[tso] switch dc tso global allocator serving address"] [dc-location=global] [new-address=http://127.0.0.1:2379/]
[2024/01/03 15:39:56.051 +08:00] [INFO] [tso_dispatcher.go:318] ["[tso] tso dispatcher created"] [dc-location=global]
[2024/01/03 15:39:56.051 +08:00] [INFO] [client.go:648] ["[pd] service mode changed"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/01/03 15:39:56.056 +08:00] [INFO] [pd_service_discovery.go:632] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379/] [old-leader=]
[2024/01/03 15:39:56.056 +08:00] [INFO] [pd_service_discovery.go:197] ["[pd] init cluster id"] [cluster-id=7319773494435382658]
[2024/01/03 15:39:56.056 +08:00] [INFO] [client.go:600] ["[pd] changing service mode"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/01/03 15:39:56.056 +08:00] [INFO] [tso_client.go:230] ["[tso] switch dc tso global allocator serving address"] [dc-location=global] [new-address=http://127.0.0.1:2379/]
[2024/01/03 15:39:56.057 +08:00] [INFO] [tso_dispatcher.go:318] ["[tso] tso dispatcher created"] [dc-location=global]
[2024/01/03 15:39:56.057 +08:00] [INFO] [client.go:648] ["[pd] service mode changed"] [old-mode=UNKNOWN_SVC_MODE] [new-mode=PD_SVC_MODE]
[2024/01/03 15:39:56.061 +08:00] [INFO] [region_cache.go:2682] ["change store resolve state"] [store=5] [addr=127.0.0.1:20162] [from=unresolved] [to=resolved] [liveness-state=reachable]
[2024/01/03 15:39:56.109 +08:00] [INFO] [region_cache.go:2682] ["change store resolve state"] [store=5] [addr=127.0.0.1:20162] [from=unresolved] [to=resolved] [liveness-state=reachable]
[2024/01/03 15:39:57.019 +08:00] [ERROR] [gen_data.go:210] ["failed to put data"] [error="\"[src/storage/txn/scheduler.rs:1938]: failed to key guard: Error(Other(\\\"[src/storage/mod.rs:3090]: Fail to get ts: TSO batch(504) used up\\\"))\""] [errorVerbose="\"[src/storage/txn/scheduler.rs:1938]: failed to key guard: Error(Other(\\\"[src/storage/mod.rs:3090]: Fail to get ts: TSO batch(504) used up\\\"))\"\ngithub.com/tikv/client-go/v2/rawkv.(*Client).PutWithTTL\n\tgithub.com/tikv/client-go/v2@v2.0.8-0.20231116065855-46811b6ac353/rawkv/rawkv.go:354\ngithub.com/tikv/client-go/v2/rawkv.(*Client).Put\n\tgithub.com/tikv/client-go/v2@v2.0.8-0.20231116065855-46811b6ac353/rawkv/rawkv.go:398\nmain.runPutCmd.func3\n\tgithub.com/tikv/migration/cdc/tests/utils/rawkv_data/gen_data.go:193\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\truntime/asm_amd64.s:1650"] [stack="main.runPutCmd\n\tgithub.com/tikv/migration/cdc/tests/utils/rawkv_data/gen_data.go:210\nmain.main.NewPutCommand.func1\n\tgithub.com/tikv/migration/cdc/tests/utils/rawkv_data/gen_data.go:65\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.7.0/command.go:940\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.7.0/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.7.0/command.go:992\nmain.main\n\tgithub.com/tikv/migration/cdc/tests/utils/rawkv_data/main.go:136\nruntime.main\n\truntime/proc.go:267"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:186] ["exit tso requests cancel loop"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [pd_service_discovery.go:248] ["[pd] exit member loop due to context canceled"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:405] ["[tso] stop fetching the pending tso requests due to context canceled"] [dc-location=global]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:339] ["[tso] exit tso dispatcher"] [dc-location=global]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:240] ["exit tso dispatcher loop"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [resource_manager_client.go:271] ["[resource manager] exit resource token dispatcher"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:134] ["closing tso client"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:139] ["close tso client"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:150] ["tso client is closed"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [pd_service_discovery.go:294] ["[pd] close pd service discovery client"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:186] ["exit tso requests cancel loop"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [resource_manager_client.go:271] ["[resource manager] exit resource token dispatcher"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:240] ["exit tso dispatcher loop"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:405] ["[tso] stop fetching the pending tso requests due to context canceled"] [dc-location=global]
[2024/01/03 15:39:57.020 +08:00] [INFO] [pd_service_discovery.go:248] ["[pd] exit member loop due to context canceled"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_dispatcher.go:339] ["[tso] exit tso dispatcher"] [dc-location=global]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:134] ["closing tso client"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:139] ["close tso client"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [tso_client.go:150] ["tso client is closed"]
[2024/01/03 15:39:57.020 +08:00] [INFO] [pd_service_discovery.go:294] ["[pd] close pd service discovery client"]
Error: "[src/storage/txn/scheduler.rs:1938]: failed to key guard: Error(Other(\"[src/storage/mod.rs:3090]: Fail to get ts: TSO batch(504) used up\"))"
rawkv_data failed, "[src/storage/txn/scheduler.rs:1938]: failed to key guard: Error(Other(\"[src/storage/mod.rs:3090]: Fail to get ts: TSO batch(504) used up\"))".

2. Minimal reproduce step (Required)

Run integration tests in CI.

3. What did you see instead (Required)

Test failed.

4. What did you expect to see? (Required)

Test succeed.

5. What is your migration tool and TiKV version? (Required)

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions