@joechenrh joechenrh commented Nov 29, 2025

What problem does this PR solve?

Issue Number: close #64770

Problem Summary:

What changed and how does it work?

Sample only 1024 files per compression type (this limit could be made configurable) and use the harmonic mean of their compression ratios as the average compression ratio.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: Ruihao Chen <joechenrh@gmail.com>
@ti-chi-bot bot added the do-not-merge/needs-linked-issue, do-not-merge/needs-tests-checked, release-note-none, and size/L labels Nov 29, 2025
@joechenrh added the skip-issue-check label and removed the size/L label Nov 29, 2025
tiprow bot commented Nov 29, 2025

Hi @joechenrh. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot bot added the size/L label Nov 29, 2025
@joechenrh changed the title from "importer: only same part of the files to get compression ratio" to "importer: only sample part of the files to get compression ratio" Nov 29, 2025
ti-chi-bot bot commented Nov 29, 2025

@joechenrh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
idc-jenkins-ci-tidb/check_dev | eed4f06 | link | true | /test check-dev

Full PR test history. Your PR dashboard.

@joechenrh joechenrh marked this pull request as draft November 29, 2025 06:29
@ti-chi-bot bot added the do-not-merge/work-in-progress label Nov 29, 2025
ti-chi-bot bot commented Nov 29, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gmhdbjd for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Development

Successfully merging this pull request may close these issues.

importinto: scanning large amount of compressed files is slow