@tiancaiamao
Contributor

@tiancaiamao tiancaiamao commented Nov 19, 2025

What problem does this PR solve?

Issue Number: close #64008

Problem Summary:

I want to find out whether any code paths call the client-go API without passing a trace ID.
Every caller should pass the trace ID from TiDB to TiKV to achieve distributed tracing.

What changed and how does it work?

I added a flight recorder dump trigger to client-go: if the caller does not pass a trace ID, the call stack is printed.
We can then find all such trace events using this command:

cat config.json
{
  "enabled_categories": ["-", "general"],
  "dump_trigger": {
    "type": "suspicious_event",
    "suspicious_event": {
      "type": "dev_debug",
      "dev_debug": {
        "type": "execute_internal_trace_missing"
      }
    }
  }
}

curl -X POST -d @config.json http://127.0.0.1:10080/debug/traceevent

The CheckFlightRecorderDumpTrigger API currently cannot be used by client-go.
The old function signature is

func CheckFlightRecorderDumpTrigger(ctx context.Context, triggerCanonicalName string, check func(*DumpTriggerConfig) bool)

and it is changed to

func CheckFlightRecorderDumpTrigger(ctx context.Context, triggerCanonicalName string, val any)

Previously, we passed *DumpTriggerConfig to the callee, and the callee compared the config with its own state.
But *DumpTriggerConfig is defined in tidb, not in client-go, so client-go cannot use this API.
Now the callee provides its state, and the dump trigger config is applied to that state.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Together with #64341, this lets us find internal sessions that do not pass a trace ID to client-go.

For example, it finds this stack:

github.com/tikv/client-go/v2/internal/locate.(*RegionRequestSender).SendReq
	/Users/genius/project/client-go/internal/locate/region_request.go:464
github.com/tikv/client-go/v2/txnkv/transaction.(*prewrite1BatchReqHandler).sendReqAndCheck
	/Users/genius/project/client-go/txnkv/transaction/prewrite.go:389
github.com/tikv/client-go/v2/txnkv/transaction.actionPrewrite.handleSingleBatch
	/Users/genius/project/client-go/txnkv/transaction/prewrite.go:247
github.com/tikv/client-go/v2/txnkv/transaction.(*twoPhaseCommitter).doActionOnBatches
	/Users/genius/project/client-go/txnkv/transaction/2pc.go:1099
github.com/tikv/client-go/v2/txnkv/transaction.(*twoPhaseCommitter).doActionOnGroupMutations
	/Users/genius/project/client-go/txnkv/transaction/2pc.go:1059
github.com/tikv/client-go/v2/txnkv/transaction.(*twoPhaseCommitter).doActionOnMutations
	/Users/genius/project/client-go/txnkv/transaction/2pc.go:821
github.com/tikv/client-go/v2/txnkv/transaction.(*twoPhaseCommitter).prewriteMutations
	/Users/genius/project/client-go/txnkv/transaction/prewrite.go:307
github.com/tikv/client-go/v2/txnkv/transaction.(*twoPhaseCommitter).execute
	/Users/genius/project/client-go/txnkv/transaction/2pc.go:1809
github.com/tikv/client-go/v2/txnkv/transaction.(*KVTxn).Commit
	/Users/genius/project/client-go/txnkv/transaction/txn.go:853
github.com/pingcap/tidb/pkg/store/driver/txn.(*tikvTxn).Commit
	/Users/genius/project/tidb/pkg/store/driver/txn/txn_driver.go:118
github.com/pingcap/tidb/pkg/session.(*LazyTxn).Commit
	/Users/genius/project/tidb/pkg/session/txn.go:438
github.com/pingcap/tidb/pkg/session.(*session).commitTxnWithTemporaryData
	/Users/genius/project/tidb/pkg/session/session.go:703
github.com/pingcap/tidb/pkg/session.(*session).doCommit
	/Users/genius/project/tidb/pkg/session/session.go:583
github.com/pingcap/tidb/pkg/session.(*session).doCommitWithRetry
	/Users/genius/project/tidb/pkg/session/session.go:856
github.com/pingcap/tidb/pkg/session.(*session).CommitTxn
	/Users/genius/project/tidb/pkg/session/session.go:986
github.com/pingcap/tidb/pkg/session.autoCommitAfterStmt
	/Users/genius/project/tidb/pkg/session/tidb.go:250
github.com/pingcap/tidb/pkg/session.finishStmt
	/Users/genius/project/tidb/pkg/session/tidb.go:212
github.com/pingcap/tidb/pkg/session.runStmt
	/Users/genius/project/tidb/pkg/session/session.go:2961
github.com/pingcap/tidb/pkg/session.(*session).executeStmtImpl
	/Users/genius/project/tidb/pkg/session/session.go:2691
github.com/pingcap/tidb/pkg/session.(*session).ExecuteStmt
	/Users/genius/project/tidb/pkg/session/session.go:2397
github.com/pingcap/tidb/pkg/session.(*session).executeInternalImpl
	/Users/genius/project/tidb/pkg/session/session.go:1887
github.com/pingcap/tidb/pkg/session.(*session).ExecuteInternal
	/Users/genius/project/tidb/pkg/session/session.go:1865
github.com/pingcap/tidb/pkg/store/gcworker.(*GCWorker).saveValueToSysTable
	/Users/genius/project/tidb/pkg/store/gcworker/gc_worker.go:1653
github.com/pingcap/tidb/pkg/store/gcworker.(*GCWorker).saveTime
	/Users/genius/project/tidb/pkg/store/gcworker/gc_worker.go:1564
github.com/pingcap/tidb/pkg/store/gcworker.(*GCWorker).checkLeader
	/Users/genius/project/tidb/pkg/store/gcworker/gc_worker.go:1510
github.com/pingcap/tidb/pkg/store/gcworker.(*GCWorker).tick
	/Users/genius/project/tidb/pkg/store/gcworker/gc_worker.go:283
github.com/pingcap/tidb/pkg/store/gcworker.(*GCWorker).start
	/Users/genius/project/tidb/pkg/store/gcworker/gc_worker.go:226

It turns out to be caused by this code: https://github.com/tikv/client-go/blob/1264c12785957baf4ba745368bf0b853520de711/txnkv/transaction/prewrite.go#L171-L183

  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Nov 19, 2025
@ti-chi-bot

ti-chi-bot bot commented Nov 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cfzjywxk for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 19, 2025
@tiprow

tiprow bot commented Nov 19, 2025

Hi @tiancaiamao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hawkingrei
Member

/retest
/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Nov 20, 2025
@codecov

codecov bot commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 80 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.7966%. Comparing base (be3d5b4) to head (adc8e58).

⚠️ Current head adc8e58 differs from pull request most recent head 466b2ea

Please upload reports for the commit 466b2ea to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #64569        +/-   ##
================================================
+ Coverage   74.6905%   74.7966%   +0.1060%     
================================================
  Files          1888       1889         +1     
  Lines        515097     516091       +994     
================================================
+ Hits         384729     386019      +1290     
+ Misses       106545     106308       -237     
+ Partials      23823      23764        -59     
Flag Coverage Δ
unit 72.5318% <33.3333%> (+0.2841%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.8700% <ø> (+0.1132%) ⬆️
parser ∅ <ø> (∅)
br 62.0751% <ø> (-1.1087%) ⬇️

@tiprow

tiprow bot commented Nov 21, 2025

@tiancaiamao: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow 466b2ea link true /test fast_test_tiprow

Full PR test history. Your PR dashboard.


@ti-chi-bot

ti-chi-bot bot commented Nov 21, 2025

@tiancaiamao: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-unit-test-next-gen 466b2ea link true /test pull-unit-test-next-gen
idc-jenkins-ci-tidb/unit-test 466b2ea link true /test unit-test
pull-integration-realcluster-test-next-gen 466b2ea link true /test pull-integration-realcluster-test-next-gen
idc-jenkins-ci-tidb/check_dev_2 466b2ea link true /test check-dev2
idc-jenkins-ci-tidb/build 466b2ea link true /test build
idc-jenkins-ci-tidb/check_dev 466b2ea link true /test check-dev
pull-build-next-gen 466b2ea link true /test pull-build-next-gen
pull-mysql-client-test-next-gen 466b2ea link true /test pull-mysql-client-test-next-gen
pull-integration-ddl-test 466b2ea link true /test pull-integration-ddl-test
pull-integration-e2e-test 466b2ea link true /test pull-integration-e2e-test
pull-mysql-client-test 466b2ea link true /test pull-mysql-client-test
idc-jenkins-ci-tidb/mysql-test 466b2ea link true /test mysql-test

Full PR test history. Your PR dashboard.



Labels

ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Diagnosability enhacement for TiDB X

2 participants