[Kafka Consumer Issue] Data inconsistency seen after CDC scale #9694

Open
fubinzh opened this issue Sep 7, 2023 · 4 comments
Labels
affects-7.5, affects-8.1, area/ticdc (Issues or PRs related to TiCDC), found/automation (Bugs found by automation cases), kafka, severity/moderate, type/bug (The issue is confirmed as a bug)

Comments


fubinzh commented Sep 7, 2023

What did you do?

  1. Deploy a TiDB cluster with 1 TiCDC node, create a kafka changefeed, and run the kafka consumer to consume the kafka messages and write them to MySQL:
cdc cli changefeed create "--server=127.0.0.1:8301" "--sink-uri=kafka://downstream-kafka.cdc-testbed-tps-2280524-1-333:9092/cdc-event-open-protocol-cdc-scale?max-message-bytes=1048576&protocol=open-protocol&replication-factor=3" "--changefeed-id=cdc-scale-open-protocol-changefeed"
  2. Run sysbench prepare:
sysbench --db-driver=mysql --mysql-host=`nslookup upstream-tidb.cdc-testbed-tps-2280524-1-333 | awk -F: '{print $2}' | awk 'NR==5' | sed s/[[:space:]]//g`  --mysql-port=4000 --mysql-user=root --mysql-db=workload --tables=32 --table-size=100000 --create_secondary=off --debug=true --threads=32 --mysql-ignore-errors=2013,1213,1105,1205,8022,8027,8028,9004,9007,1062 oltp_write_only prepare
  3. Run the sysbench workload and, at the same time, scale TiCDC from 2 to 6 nodes:
sysbench --db-driver=mysql --mysql-host=`nslookup upstream-tidb.cdc-testbed-tps-2280524-1-333 | awk -F: '{print $2}' | awk 'NR==5' | sed s/[[:space:]]//g`  --mysql-port=4000 --mysql-user=root --mysql-db=workload --tables=32 --table-size=100000 --create_secondary=off --time=1200 --debug=true --threads=32 --mysql-ignore-errors=2013,1213,1105,1205,8022,8027,8028,9004,9007,1062 oltp_write_only run
  4. Send the finishmark, and do the data consistency check once MySQL receives the finishmark (a sketch of the wait step follows this list).
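
For step 4, the diff check can only start after the finishmark has actually been replicated to the downstream. A minimal Go sketch of that wait loop, assuming the finishmark is a marker table named `finishmark` in the `workload` database and using a placeholder downstream DSN (the real test harness may implement this differently):

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// DSN is a placeholder; the real downstream address comes from the test environment.
	db, err := sql.Open("mysql", "root:@tcp(127.0.0.1:3306)/workload")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	for {
		var name string
		// The marker table shows up downstream only after the kafka consumer has
		// applied every event that was written before the finishmark.
		err := db.QueryRow("SHOW TABLES LIKE 'finishmark'").Scan(&name)
		if err == nil {
			log.Println("finishmark replicated; safe to start the consistency check")
			return
		}
		if err != sql.ErrNoRows {
			log.Printf("downstream not ready yet: %v", err)
		}
		time.Sleep(5 * time.Second)
	}
}
```

Once the loop returns, a tool such as sync-diff-inspector can be run against the upstream TiDB and the downstream MySQL to compare the row contents.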

What did you expect to see?

Data should be consistent.

What did you see instead?

Data inconsistency was seen.


Versions of the cluster

cdc version:
[root@upstream-ticdc-0 /]# /cdc version
Release Version: v7.4.0-alpha
Git Commit Hash: 254cc2b
Git Branch: heads/refs/tags/v7.4.0-alpha
UTC Build Time: 2023-09-06 11:36:11
Go Version: go version go1.21.0 linux/amd64
Failpoint Build: false

@fubinzh fubinzh added area/ticdc Issues or PRs related to TiCDC. type/bug The issue is confirmed as a bug. labels Sep 7, 2023

fubinzh commented Sep 7, 2023

/found automation

@ti-chi-bot ti-chi-bot bot added the found/automation Bugs found by automation cases label Sep 7, 2023

fubinzh commented Sep 8, 2023

/severity major

sdojjy (Member) commented Sep 10, 2023

Based on the sync-diff summary:

[sync-diff summary screenshot]

I found the message below in the kafka topic; it shows that TiCDC already sent the update event to Kafka:

{
    "u": {
        "c": {
            "t": 254,
            "f": 0,
            "v": "55368250724-96461947335-24187764707-65260444679-46692396102-21811308953-36638923458-18656561470-57423451092-43285125722"
        },
        "id": {
            "t": 3,
            "h": true,
            "f": 11,
            "v": 67533
        },
        "k": {
            "t": 3,
            "f": 1,
            "v": 52118
        },
        "pad": {
            "t": 254,
            "f": 0,
            "v": "71861985700-07222871824-88378454986-92661605151-75207053250"
        }
    },
    "p": {
        "c": {
            "t": 254,
            "f": 0,
            "v": "53348281694-21978480135-81348173179-73925401350-41399101720-17376868646-87723030020-19163581079-21416984997-48227990150"
        },
        "id": {
            "t": 3,
            "h": true,
            "f": 11,
            "v": 67533
        },
        "k": {
            "t": 3,
            "f": 1,
            "v": 49961
        },
        "pad": {
            "t": 254,
            "f": 0,
            "v": "34753459357-89875488434-09948998701-49662349889-89845145068"
        }
    }
}

So it's not a TiCDC-side issue; we need to fix the kafka-consumer, which seems to have missed an update event.
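
For readers unfamiliar with the open protocol: in the value above, `u` carries the post-update column values and `p` the pre-update values, and each column is encoded as `t` (MySQL type code), `h` (whether it is the handle key), `f` (flags) and `v` (value). A minimal decode sketch of this event, with illustrative struct names that are not the actual kafka-consumer types:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// column mirrors a single column entry in the open-protocol value:
// t = MySQL type code (3 = long/int, 254 = char), h = handle key, f = flags, v = value.
type column struct {
	Type   int             `json:"t"`
	Handle bool            `json:"h"`
	Flag   uint64          `json:"f"`
	Value  json.RawMessage `json:"v"`
}

// rowChanged holds the post-update image ("u") and the pre-update image ("p").
type rowChanged struct {
	Update   map[string]column `json:"u"`
	PreImage map[string]column `json:"p"`
}

func main() {
	// Trimmed version of the event quoted above (columns c and pad omitted).
	payload := []byte(`{
		"u": {"id": {"t": 3, "h": true, "f": 11, "v": 67533},
		      "k":  {"t": 3, "f": 1, "v": 52118}},
		"p": {"id": {"t": 3, "h": true, "f": 11, "v": 67533},
		      "k":  {"t": 3, "f": 1, "v": 49961}}}`)

	var row rowChanged
	if err := json.Unmarshal(payload, &row); err != nil {
		log.Fatal(err)
	}
	// The consumer has to turn this into an UPDATE on MySQL:
	// old row (p): id=67533, k=49961  ->  new row (u): id=67533, k=52118.
	fmt.Printf("handle id=%s: k %s -> %s\n",
		row.Update["id"].Value, row.PreImage["k"].Value, row.Update["k"].Value)
}
```

If this row change was dropped or applied out of order on the consumer side, the downstream would still hold the old value of `k`, which matches the inconsistency reported by sync-diff.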


fubinzh commented Sep 10, 2023

/remove-severity major
/severity moderate
