lightning: remote backend #58789

Open
wants to merge 2 commits into
base: master

Conversation

zeminzhou
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: zeminzhou <zhouzemin@pingcap.com>
ti-chi-bot bot added labels on Jan 8, 2025: do-not-merge/needs-linked-issue, do-not-merge/needs-tests-checked, release-note-none (denotes a PR that doesn't merit a release note), size/XXL (denotes a PR that changes 1000+ lines, ignoring generated files)

ti-chi-bot bot commented Jan 8, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lance6716 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


tiprow bot commented Jan 8, 2025

Hi @zeminzhou. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: zeminzhou <zhouzemin@pingcap.com>

ti-chi-bot bot commented Jan 8, 2025

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.


Notice: To remove the do-not-merge/needs-tests-checked label, please finish the tests, then check the finished items in the description.

For example:

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

📖 For more info, you can check the "Contribute Code" section in the development guide.


codecov bot commented Jan 8, 2025

Codecov Report

Attention: Patch coverage is 3.70370% with 1040 lines in your changes missing coverage. Please review.

Project coverage is 73.0927%. Comparing base (c199ddf) to head (022f728).
Report is 6 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #58789        +/-   ##
================================================
+ Coverage   73.0885%   73.0927%   +0.0041%     
================================================
  Files          1676       1710        +34     
  Lines        463643     472765      +9122     
================================================
+ Hits         338870     345557      +6687     
- Misses       103924     105630      +1706     
- Partials      20849      21578       +729     
Flag Coverage Δ
integration 49.8207% <3.7037%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.6910% <ø> (ø)
parser ∅ <ø> (∅)
br 45.7330% <ø> (ø)


ti-chi-bot bot commented Jan 8, 2025

@zeminzhou: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
idc-jenkins-ci-tidb/check_dev | 022f728 | link | true | /test check-dev
idc-jenkins-ci-tidb/unit-test | 022f728 | link | true | /test unit-test
idc-jenkins-ci-tidb/build | 022f728 | link | true | /test build
pull-lightning-integration-test | 022f728 | link | true | /test pull-lightning-integration-test

Full PR test history. Your PR dashboard.


@ystaticy
Contributor

ystaticy commented Jan 8, 2025

Please add unit tests (UT) for the remote backend.

}

func (c *chunkSender) putChunk(ctx context.Context, data []byte) (uint64, error) {
data0 := make([]byte, len(data))
Contributor

Please give this variable (data0) a meaningful name that reflects its purpose.
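
One possible reading of the request, sketched in Go. The excerpt shows only the allocation, so treating data0 as a defensive copy of the caller's buffer is an assumption, and both the helper copyChunk and the name chunkCopy are hypothetical rather than taken from the PR.

```go
package main

import "fmt"

// copyChunk returns an independent copy of data, so the caller may reuse
// its buffer afterwards. Hypothetical helper, not part of the PR.
func copyChunk(data []byte) []byte {
	// chunkCopy says what the slice holds, unlike a name such as data0.
	chunkCopy := make([]byte, len(data))
	copy(chunkCopy, data)
	return chunkCopy
}

func main() {
	buf := []byte("abc")
	chunkCopy := copyChunk(buf)
	buf[0] = 'x' // mutating the original does not affect the copy
	fmt.Println(string(chunkCopy)) // prints "abc"
}
```

If the slice serves another purpose in putChunk, the same idea applies: the name should state that purpose.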

updateFlushedChunkDuration = 10 * time.Second
)

func genLoadDataTaskID(keyspaceID uint32, taskID int64, cfg *backend.EngineConfig) string {
Contributor

Do these all look like util functions? Could they be merged into a single file, such as util.go?

)

func genLoadDataTaskID(keyspaceID uint32, taskID int64, cfg *backend.EngineConfig) string {
return fmt.Sprintf("%d-%d-%d-%d", keyspaceID, taskID, cfg.TableInfo.ID, cfg.Remote.EngineID)
Contributor

The format string "%d-%d-%d-%d" should be defined as a const.

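A minimal sketch of the named-constant suggestion for the task ID format. The constant name loadDataTaskIDFormat is hypothetical, and the function here takes the table and engine IDs directly (with assumed integer types) instead of *backend.EngineConfig so that the example stays self-contained.

```go
package main

import "fmt"

// loadDataTaskIDFormat is a hypothetical name for the format string in the diff.
// Fields, in order: keyspace ID, task ID, table ID, engine ID.
const loadDataTaskIDFormat = "%d-%d-%d-%d"

// genLoadDataTaskID mirrors the function under review, with simplified
// parameters (the parameter types for tableID and engineID are assumptions).
func genLoadDataTaskID(keyspaceID uint32, taskID, tableID int64, engineID int32) string {
	return fmt.Sprintf(loadDataTaskIDFormat, keyspaceID, taskID, tableID, engineID)
}

func main() {
	fmt.Println(genLoadDataTaskID(1, 42, 100, 0)) // prints "1-42-100-0"
}
```
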
return io.ReadAll(resp.Body)
}

func parseLDWUrl(resp *http.Response, enableTLS bool) string {
Contributor

We need to explain what LDW stands for, or avoid the abbreviation. Have we described LDW in the code comments? What exactly is the role of load_data_worker? As written, this might confuse other developers.

}

func parseLDWUrl(resp *http.Response, enableTLS bool) string {
base := strings.TrimSuffix(resp.Header.Get("Location"), "/load_data")
Contributor

This string ("/load_data") can be defined as a constant.
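
A small sketch of what that could look like. The constant name loadDataPath and the helper trimLoadDataPath are hypothetical, the helper takes the Location header value as a plain string rather than an *http.Response, and the address used in main is an invented example value.

```go
package main

import (
	"fmt"
	"strings"
)

// loadDataPath is a hypothetical name for the path suffix used in the diff.
const loadDataPath = "/load_data"

// trimLoadDataPath strips the load_data suffix from a Location header value,
// leaving the worker's base URL. Hypothetical helper for illustration.
func trimLoadDataPath(location string) string {
	return strings.TrimSuffix(location, loadDataPath)
}

func main() {
	// The address is an invented example value.
	fmt.Println(trimLoadDataPath("http://worker.example:8080/load_data"))
}
```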


nextChunkID := result.HandledChunkID + 1
for nextChunkID <= expectChunkID {
url := fmt.Sprintf("%s/load_data?cluster_id=%d&task_id=%s&writer_id=%d&chunk_id=%d",
Contributor

These URL strings can be defined as constants.
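
One way this might be done, keeping the format string from the diff unchanged. The names chunkURLFormat and buildChunkURL are hypothetical, and the parameter types are assumptions, since the excerpt does not show the variable declarations.

```go
package main

import "fmt"

// chunkURLFormat is a hypothetical name for the URL format string in the diff.
// Fields, in order: base URL, cluster ID, task ID, writer ID, chunk ID.
const chunkURLFormat = "%s/load_data?cluster_id=%d&task_id=%s&writer_id=%d&chunk_id=%d"

// buildChunkURL formats the URL for one chunk. Parameter types are assumed.
func buildChunkURL(base string, clusterID uint64, taskID string, writerID, chunkID uint64) string {
	return fmt.Sprintf(chunkURLFormat, base, clusterID, taskID, writerID, chunkID)
}

func main() {
	// All argument values are invented for the example.
	fmt.Println(buildChunkURL("http://worker.example", 1, "1-42-100-0", 2, 3))
}
```

A variant built on net/url's url.Values would also escape the query values, at the cost of a different parameter order in the resulting string.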

@@ -158,7 +161,7 @@ func (d *DBStore) adjust(
s *Security,
tlsObj *common.TLS,
) error {
- if i.Backend == BackendLocal {
+ if i.Backend == BackendLocal || i.Backend == BackendRemote {
Contributor

Can this condition be encapsulated in a helper such as isPhysicalBackend?
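
A minimal sketch of such a helper. The constant values are illustrative: "local" and "tidb" follow Lightning's existing backend names, while "remote" is an assumption about what this PR defines, and the signature of isPhysicalBackend is a guess.

```go
package main

import "fmt"

// Backend names used for illustration; the value of BackendRemote is assumed.
const (
	BackendTiDB   = "tidb"
	BackendLocal  = "local"
	BackendRemote = "remote"
)

// isPhysicalBackend reports whether the backend imports data physically,
// which in this sketch means the local or the remote backend.
func isPhysicalBackend(backend string) bool {
	return backend == BackendLocal || backend == BackendRemote
}

func main() {
	fmt.Println(isPhysicalBackend(BackendRemote), isPhysicalBackend(BackendTiDB)) // prints "true false"
}
```

With a helper like this, the check in adjust would reduce to isPhysicalBackend(i.Backend), and any other place that tests for both backends stays consistent.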

@@ -65,6 +65,7 @@ type GlobalMydumper struct {
type GlobalImporter struct {
Backend string `toml:"backend" json:"backend"`
SortedKVDir string `toml:"sorted-kv-dir" json:"sorted-kv-dir"`
RemoteAddr string `toml:"remote-addr" json:"remote-addr"`
Contributor

I'm not sure if remote-addr is an appropriate name, or if we could use a clearer name, such as remote-load-data-worker-addr?

@@ -1832,3 +1859,44 @@ func (tr *TableImporter) preDeduplicate(
)
return err
}

func estimateEngineDataSize(tblMeta *mydump.MDTableMeta, tblInfo *checkpoints.TidbTableInfo, isIndexEngine bool, logger log.Logger) int64 {
Contributor

Are these functions only used by the remote backend? If they are solely related to remote logic, it would be better to place them in the remote directory and the corresponding files for easier management.

Labels
  • do-not-merge/needs-linked-issue
  • do-not-merge/needs-tests-checked
  • release-note-none: Denotes a PR that doesn't merit a release note.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

2 participants