ddl: fix a data race on localRowCntListener Written() #54484

Merged 2 commits on Jul 26, 2024

@River2000i (Contributor) commented Jul 8, 2024

What problem does this PR solve?

Issue Number: close #54373

Problem Summary:

What changed and how does it work?

Tasks created by HandleTask() can call localRowCntListener.Written() concurrently, so curPhysicalRowCnt is read and written by several goroutines at once, which is the reported data race.
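
To illustrate the kind of fix discussed for this race, here is a minimal sketch of a counter updated with atomic.AddInt64 from several goroutines. The names rowCounter, written, and countConcurrently are hypothetical stand-ins and not TiDB code; only the atomic.AddInt64 call mirrors the line shown in the review diff.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// rowCounter is a hypothetical stand-in for the listener's counter field.
type rowCounter struct {
	cur int64
}

// written adds rowCnt atomically and returns the new running total.
func (c *rowCounter) written(rowCnt int) int64 {
	return atomic.AddInt64(&c.cur, int64(rowCnt))
}

// countConcurrently starts `workers` goroutines; each calls written
// `calls` times with rowsPerCall rows, then the final total is returned.
func countConcurrently(workers, calls, rowsPerCall int) int64 {
	var c rowCounter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < calls; j++ {
				c.written(rowsPerCall)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&c.cur)
}

func main() {
	// 8 workers * 1000 calls * 3 rows per call
	fmt.Println(countConcurrently(8, 1000, 3)) // prints 24000
}
```

With a plain `c.cur += int64(rowCnt)` in place of the atomic add, running this under `go run -race` would be expected to report a data race on the counter.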

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 8, 2024

ti-chi-bot bot commented Jul 8, 2024

Hi @River2000i. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tiprow bot commented Jul 8, 2024

Hi @River2000i. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@lance6716 (Contributor) commented

/ok-to-test

@ti-chi-bot ti-chi-bot bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Jul 8, 2024

codecov bot commented Jul 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.5274%. Comparing base (7a0611f) to head (38e4538).
Report is 161 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #54484         +/-   ##
=================================================
- Coverage   72.9116%   56.5274%   -16.3842%     
=================================================
  Files          1542       1674        +132     
  Lines        435991     622710     +186719     
=================================================
+ Hits         317888     352002      +34114     
- Misses        98569     247050     +148481     
- Partials      19534      23658       +4124     
Flag Coverage Δ
integration 37.1147% <ø> (?)
unit 71.7648% <ø> (-0.1685%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9656% <ø> (ø)
parser ∅ <ø> (∅)
br 52.0298% <ø> (+5.9184%) ⬆️

@@ -752,8 +752,8 @@ type localRowCntListener struct {
 }

 func (s *localRowCntListener) Written(rowCnt int) {
-	s.curPhysicalRowCnt += int64(rowCnt)
-	s.reorgCtx.setRowCount(s.prevPhysicalRowCnt + s.curPhysicalRowCnt)
+	newCurPhysicalRowCnt := atomic.AddInt64(&s.curPhysicalRowCnt, int64(rowCnt))
Contributor

The root cause is concurrent calls to Written. Although this change makes s.curPhysicalRowCnt concurrency-safe, the setRowCount call below it can still run out of order, so a call holding a smaller newCurPhysicalRowCnt may overwrite the larger value already set by another concurrent call.

Maybe add a lock? /cc @tangenta
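
A minimal sketch of the lock-based approach suggested here, under the assumption that the counter update and the setRowCount call are guarded by one mutex. The types reorgProgress and lockedListener and the helper runLocked are hypothetical stand-ins, not the code merged in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// reorgProgress is a hypothetical stand-in for the reorg context's row count.
type reorgProgress struct {
	rowCount int64
}

func (p *reorgProgress) setRowCount(n int64) { p.rowCount = n }

// lockedListener is a hypothetical listener that holds one mutex across
// both the counter update and the published row count.
type lockedListener struct {
	mu       sync.Mutex
	prev     int64 // rows counted before this listener started
	cur      int64 // rows counted by this listener
	progress *reorgProgress
}

// Written updates cur and publishes prev+cur while holding the lock, so a
// call with a smaller total cannot overwrite a larger one.
func (l *lockedListener) Written(rowCnt int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.cur += int64(rowCnt)
	l.progress.setRowCount(l.prev + l.cur)
}

// runLocked calls Written concurrently and returns the published row count.
func runLocked(workers, calls, rowsPerCall int, prev int64) int64 {
	p := &reorgProgress{}
	l := &lockedListener{prev: prev, progress: p}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < calls; j++ {
				l.Written(rowsPerCall)
			}
		}()
	}
	wg.Wait() // all writers have finished, so reading p.rowCount is safe
	return p.rowCount
}

func main() {
	// 100 previous rows + 4 workers * 500 calls * 2 rows per call
	fmt.Println(runLocked(4, 500, 2, 100)) // prints 4100
}
```

Because the last goroutine to take the lock publishes the full total, the published value always ends at prev plus the sum of all rows written. With an atomic add followed by an unguarded setRowCount, a stale smaller total could be published last.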

Contributor Author

Agreed. Actually, I am a little confused about the usage of curPhysicalRowCnt: it is never initialized and has no other references, so it just equals rowCnt.

Maybe it can be simplified to s.reorgCtx.setRowCount(s.prevPhysicalRowCnt + int64(rowCnt))?

@ti-chi-bot ti-chi-bot bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 18, 2024
@River2000i (Contributor Author) commented

@lance6716 @tangenta PTAL~

tiprow bot commented Jul 18, 2024

@River2000i: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name          Commit    Details   Required   Rerun command
fast_test_tiprow   38e4538   link      true       /test fast_test_tiprow

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jul 19, 2024
@Benjamin2037 (Collaborator) left a comment

LGTM

ti-chi-bot bot commented Jul 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Benjamin2037, lance6716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added approved lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jul 26, 2024
ti-chi-bot bot commented Jul 26, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-07-19 05:55:25.068638546 +0000 UTC m=+592547.059580020: ☑️ agreed by lance6716.
  • 2024-07-26 08:22:22.72763936 +0000 UTC m=+1206164.718580830: ☑️ agreed by Benjamin2037.

@ti-chi-bot ti-chi-bot bot merged commit e366584 into pingcap:master Jul 26, 2024
20 of 23 checks passed
@River2000i River2000i deleted the fix-54373 branch July 28, 2024 13:28
morgo added a commit to morgo/tidb that referenced this pull request Jul 28, 2024
* upstream/master: (93 commits)
  disjoinset: add generic impl (pingcap#54917)
  planner: derive index filters for mv index paths (pingcap#54877)
  br: cli refactor backup error handling logic (pingcap#54697)
  expression: fix infinity loop in `timestampadd` (pingcap#54916)
  planner: import more expand test. (pingcap#54962)
  planner: use code-gen to generate CloneForPlanCache method for some operators (pingcap#54957)
  test: fix flaky test TestFailSchemaSyncer (pingcap#54958)
  planner: move logical show into logicalop pkg. (pingcap#54928)
  privilege: Remove TestAbnormalMySQLTable (pingcap#54925)
  resource_control: support unlimited keyword when setting the resource group (pingcap#54704)
  ddl: fix a data race on localRowCntListener Written() (pingcap#54484)
  lightning: fix SET SESSION on connection pool (pingcap#54927)
  planner: move logical mem-table to logicalop pkg. (pingcap#54903)
  table: Add `CachedTableSupport` and `TemporaryTableSupport` for `MutateContext` (pingcap#54900)
  executor: fix index_hash_join hang when context canceled (pingcap#54855)
  statistics: add metrics for unneeded analyze table (pingcap#54822)
  *: refine pipelined dml benchmarks (pingcap#54844)
  ddl: assign table IDs for jobs submitted to queue (pingcap#54880)
  *: avoid using Tables field of model.DBInfo, use API instead (pingcap#52302)
  planner: remove useless check (pingcap#54907)
  ...
hawkingrei pushed a commit to hawkingrei/tidb that referenced this pull request Aug 1, 2024
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

DATA RACE at the ddl.(*localRowCntListener).Written()
3 participants