*(ticdc): optimize memory usage of RowChangedEvent #10483

Merged
8 commits merged into pingcap:master from reduce-cdc-mem-usage
Feb 20, 2024

Conversation

lidezhu
Collaborator

@lidezhu lidezhu commented Jan 16, 2024

What problem does this PR solve?

Issue Number: close #10386

What is changed and how it works?

  1. Add a more memory-efficient struct, ColumnData, to represent column data; a column's schema information can now only be fetched from TableInfo.
  2. Replace []*Column in RowChangedEvent with []*ColumnData.
  3. Add the methods RowChangedEvent.GetColumns and RowChangedEvent.GetPreColumns to get the Column representation of the columns.
  4. RowChangedEvent.Columns and RowChangedEvent.PreColumns should no longer contain nil entries, so this PR removes some code in the sink module that handled nil values.
  5. Maintain a correct TableInfo in all kinds of tests.
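
The idea behind items 1 to 3 can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the field names, the `ColumnInfo` type, and the map-based `TableInfo` below are simplified assumptions made for illustration only. The point is that each row stores just a column ID and a value, while the name and type live once in a shared `TableInfo` and are joined back on demand by `GetColumns`.

```go
package main

import "fmt"

// ColumnInfo is a hypothetical stand-in for per-column schema metadata.
type ColumnInfo struct {
	Name string
	Type string
}

// TableInfo is a hypothetical, simplified schema holder. One instance is
// shared by every row of the same table, so schema is stored only once.
type TableInfo struct {
	columns map[int64]ColumnInfo
}

// Column is the "wide" representation: schema and value together.
type Column struct {
	Name  string
	Type  string
	Value any
}

// ColumnData is the compact per-row representation: only an ID and a value.
// The fields here are assumed for this sketch.
type ColumnData struct {
	ColumnID int64
	Value    any
}

// RowChangedEvent holds compact column data plus a pointer to shared schema.
type RowChangedEvent struct {
	TableInfo *TableInfo
	Columns   []*ColumnData
}

// GetColumns rebuilds the wide Column view by looking up each column's
// schema in TableInfo. It assumes Columns contains no nil entries.
func (r *RowChangedEvent) GetColumns() []*Column {
	cols := make([]*Column, 0, len(r.Columns))
	for _, cd := range r.Columns {
		info := r.TableInfo.columns[cd.ColumnID]
		cols = append(cols, &Column{Name: info.Name, Type: info.Type, Value: cd.Value})
	}
	return cols
}

func main() {
	ti := &TableInfo{columns: map[int64]ColumnInfo{
		1: {Name: "id", Type: "bigint"},
		2: {Name: "name", Type: "varchar"},
	}}
	row := &RowChangedEvent{
		TableInfo: ti,
		Columns: []*ColumnData{
			{ColumnID: 1, Value: int64(100)},
			{ColumnID: 2, Value: "alice"},
		},
	}
	for _, c := range row.GetColumns() {
		fmt.Printf("%s (%s) = %v\n", c.Name, c.Type, c.Value)
	}
}
```

In this sketch the saving comes from not repeating the name and type strings in every row; the wide `Column` values are only materialized when a caller asks for them.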

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)
  1. Disable TiDB GC.
  2. Load 10M rows of a wide table.
  3. Create a changefeed with memory-quota set to 8GB.
  4. Compare the peak memory usage:
    Before optimization, peak memory usage: 26.5GB.
    After optimization, peak memory usage: 10.2GB.
    Peak memory usage decreased by about 60%.
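
As a quick check of the "about 60%" figure, the relative reduction can be computed from the two peak values reported above. The helper name `reductionPercent` is made up for this sketch.

```go
package main

import "fmt"

// reductionPercent returns how much smaller `after` is than `before`,
// expressed as a percentage of `before`.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Peak memory usage in GB, taken from the manual test results above.
	fmt.Printf("%.1f%%\n", reductionPercent(26.5, 10.2)) // prints "61.5%"
}
```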

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 16, 2024
@lidezhu lidezhu force-pushed the reduce-cdc-mem-usage branch 2 times, most recently from 1cf2afc to d0c061e Compare January 24, 2024 09:37
@lidezhu lidezhu changed the title [DNM] Optimize memory usage of RowChangedEvent Optimize memory usage of RowChangedEvent Jan 25, 2024
@asddongmen asddongmen self-assigned this Jan 25, 2024
@lidezhu lidezhu force-pushed the reduce-cdc-mem-usage branch 3 times, most recently from b187bf7 to ebd4e2f Compare January 27, 2024 08:14
@asddongmen asddongmen removed their assignment Jan 28, 2024
@asddongmen asddongmen self-requested a review January 28, 2024 05:02

codecov bot commented Jan 28, 2024

Codecov Report

Merging #10483 (fb658b9) into master (2eadc08) will increase coverage by 0.0427%.
Report is 1 commits behind head on master.
The diff coverage is 62.4595%.

Additional details and impacted files
Components Coverage Δ
cdc 62.0021% <62.4595%> (-0.1076%) ⬇️
dm 51.5055% <ø> (+0.3023%) ⬆️
engine 63.4494% <ø> (ø)
Flag Coverage Δ
unit 57.6760% <62.4595%> (+0.0427%) ⬆️


@@               Coverage Diff                @@
##             master     #10483        +/-   ##
================================================
+ Coverage   57.6333%   57.6760%   +0.0427%     
================================================
  Files           849        849                
  Lines        126085     127025       +940     
================================================
+ Hits          72667      73263       +596     
- Misses        47988      48298       +310     
- Partials       5430       5464        +34     

@lidezhu
Collaborator Author

lidezhu commented Jan 29, 2024

/test cdc-integration-mysql-test

@lidezhu
Collaborator Author

lidezhu commented Jan 29, 2024

/test verify

@lidezhu
Collaborator Author

lidezhu commented Jan 29, 2024

/test cdc-integration-storage-test

@lidezhu
Collaborator Author

lidezhu commented Jan 29, 2024

/test verify

@lidezhu
Collaborator Author

lidezhu commented Jan 30, 2024

/test cdc-integration-storage-test

@lidezhu
Collaborator Author

lidezhu commented Jan 30, 2024

/test cdc-integration-mysql-test

@lidezhu lidezhu changed the title Optimize memory usage of RowChangedEvent *(ticdc): optimize memory usage of RowChangedEvent Feb 1, 2024
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Feb 6, 2024
@asddongmen
Contributor

Excellent work! Respect!

@ti-chi-bot ti-chi-bot bot added the lgtm label Feb 20, 2024
Contributor

ti-chi-bot bot commented Feb 20, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: asddongmen, sdojjy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Feb 20, 2024
Contributor

ti-chi-bot bot commented Feb 20, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-02-06 08:58:21.910152165 +0000 UTC m=+260827.476922039: ☑️ agreed by asddongmen.
  • 2024-02-20 05:14:30.12287006 +0000 UTC m=+334158.870493170: ☑️ agreed by sdojjy.

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/retest

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/retest required

Contributor

ti-chi-bot bot commented Feb 20, 2024

@lidezhu: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test cdc-integration-kafka-test
  • /test cdc-integration-mysql-test
  • /test cdc-integration-pulsar-test
  • /test cdc-integration-storage-test
  • /test dm-compatibility-test
  • /test dm-integration-test
  • /test engine-integration-test
  • /test verify
  • /test wip-pull-build
  • /test wip-pull-check
  • /test wip-pull-unit-test-cdc
  • /test wip-pull-unit-test-dm
  • /test wip-pull-unit-test-engine

Use /test all to run the following jobs that were automatically triggered:

  • pingcap/tiflow/ghpr_verify
  • pingcap/tiflow/pull_cdc_integration_kafka_test
  • pingcap/tiflow/pull_cdc_integration_pulsar_test
  • pingcap/tiflow/pull_cdc_integration_storage_test
  • pingcap/tiflow/pull_cdc_integration_test
  • pingcap/tiflow/pull_dm_compatibility_test
  • pingcap/tiflow/pull_dm_integration_test
  • pingcap/tiflow/pull_engine_integration_test

In response to this:

/retest required

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/test cdc-integration-mysql-test

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/test cdc-integration-storage-test

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/test verify

@lidezhu
Collaborator Author

lidezhu commented Feb 20, 2024

/retest

@ti-chi-bot ti-chi-bot bot merged commit a8c7563 into pingcap:master Feb 20, 2024
28 checks passed
@lidezhu lidezhu deleted the reduce-cdc-mem-usage branch February 20, 2024 11:56
GMHDBJD added a commit to 3AceShowHand/tiflow that referenced this pull request Feb 21, 2024
GMHDBJD added a commit to 3AceShowHand/tiflow that referenced this pull request Feb 21, 2024
Labels
approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Optimize memory usage of RowChangedEvent
3 participants