statistics: use another way to merge topn #47765
base: master
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@               Coverage Diff                @@
##             master     #47765      +/-    ##
================================================
+ Coverage   73.0683%   73.6702%   +0.6018%
================================================
  Files          1687       1719       +32
  Lines        466567     476392     +9825
================================================
+ Hits         340913     350959    +10046
+ Misses       104711     103715      -996
- Partials      20943      21718      +775
Some parts of the code, such as the in-place updates of the heap, are also optimized to reduce memory usage.
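To illustrate what an in-place heap update can look like, here is a minimal sketch built on Go's standard `container/heap`. It is not the code from this PR: the `item` and `minHeap` types and the `keepTopN` function are hypothetical names chosen for the example. The idea shown is that when the heap is full and a larger item arrives, the root slot is overwritten and `heap.Fix` restores the ordering, instead of a pop followed by a push.

```go
package main

import (
	"container/heap"
	"fmt"
)

// item is a hypothetical (value, count) pair standing in for a TopN entry.
type item struct {
	value string
	count uint64
}

// minHeap orders items by ascending count, so the root is the smallest one.
type minHeap []item

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].count < h[j].count }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	it := old[n-1]
	*h = old[:n-1]
	return it
}

// keepTopN keeps the n items with the largest counts (hypothetical helper).
// Once the heap is full, a better candidate overwrites the root in place and
// heap.Fix re-establishes the heap property, so the backing slice never grows.
func keepTopN(items []item, n int) []item {
	h := make(minHeap, 0, n)
	for _, it := range items {
		switch {
		case len(h) < n:
			heap.Push(&h, it)
		case n > 0 && it.count > h[0].count:
			h[0] = it       // overwrite the current minimum in place
			heap.Fix(&h, 0) // sift it down to its correct position
		}
	}
	return h
}

func main() {
	in := []item{{"a", 5}, {"b", 1}, {"c", 9}, {"d", 3}}
	fmt.Println(keepTopN(in, 2))
}
```

The slice is allocated once with capacity `n`, which is the memory-related point of the pattern; whether the PR uses `heap.Fix` or a hand-written sift is not stated in the discussion.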
/retest
	globalTopN.Sort()
	return &globalTopN, remainedTopNs, hists, nil
}
Mark as deprecated
/retest
This was just a quick run through the code; I'm still trying to understand its business logic. I'll look at it a couple more times soon.
Looks good to me, but please move checkTheCurAndMoveForward out as a wrapper and add more comments. Thank you for working on this. Approving in advance.
		affectedHist = append(affectedHist, int(histPos))
	}
	// Hacking skip.
	if uint32(len(finalTopNs)) >= n {
I am not sure which one we prefer: len(finalTopNs) vs. finalTopNs.Len(). But I guess it doesn't matter.
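For context on why the choice is mostly stylistic, here is a small sketch assuming `finalTopNs` is a slice-backed type whose `Len` method simply returns the slice length. The `topNList` type and `reachedLimit` helper are hypothetical and only mirror the shape of the condition in the diff.

```go
package main

import "fmt"

// topNMeta and topNList are hypothetical stand-ins for the PR's types.
type topNMeta struct{ count uint64 }

type topNList []topNMeta

// Len is assumed to be a thin wrapper over the built-in len.
func (t topNList) Len() int { return len(t) }

// reachedLimit mirrors the shape of the check `uint32(len(finalTopNs)) >= n`.
func reachedLimit(t topNList, n uint32) bool {
	return uint32(len(t)) >= n
}

func main() {
	l := topNList{{1}, {2}, {3}}
	fmt.Println(len(l) == l.Len(), reachedLimit(l, 3))
}
```

Under that assumption both spellings return the same value; the built-in `len` avoids a method call in form only, since such a small method is normally inlined.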
/test all
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Rustin170506
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
What problem does this PR solve?
Issue Number: ref #50761
Problem Summary:
What is changed and how it works?
The items in both the TopN and the histogram are ordered, and this PR uses that property to speed up the merge.
It also applies a heuristic cutoff:
The average count per distinct value in a histogram is estimated as notNullCount / ndv_in_hist.
The cutoff value is the sum of the occurrences in the affected TopNs plus the average count per distinct value of each affected histogram.
This makes the merge faster and reduces CPU usage.
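As a rough illustration of the two ideas above (a single ordered pass over the per-partition TopNs, and an estimate built from notNullCount / ndv_in_hist), here is a minimal Go sketch. It is not the PR's implementation. The types `entry`, `partition`, `merged` and the function `mergeSorted` are hypothetical, values are plain strings, and the sketch assumes that each TopN is sorted by value without duplicates and that a value missing from a partition's TopN may still be present in that partition's histogram.

```go
package main

import "fmt"

// entry is a hypothetical TopN item: a value and its occurrence count.
type entry struct {
	value string
	count uint64
}

// partition is a hypothetical per-partition summary (an assumption of this sketch).
type partition struct {
	topN         []entry // sorted ascending by value, no duplicates
	notNullCount uint64  // non-null rows covered by the histogram
	ndv          uint64  // distinct values in the histogram
}

// avgPerNDV computes notNullCount / ndv_in_hist, guarding against ndv == 0.
func (p partition) avgPerNDV() float64 {
	if p.ndv == 0 {
		return 0
	}
	return float64(p.notNullCount) / float64(p.ndv)
}

// merged holds the exact TopN sum for a value and a heuristic estimate.
type merged struct {
	value    string
	topNSum  uint64
	estimate float64
}

// mergeSorted walks all TopNs with one cursor per partition. Because every
// list is ordered, each entry is visited exactly once.
func mergeSorted(parts []partition) []merged {
	cursors := make([]int, len(parts))
	var out []merged
	for {
		found := false
		var minVal string
		for i, p := range parts { // pick the smallest value under any cursor
			if cursors[i] >= len(p.topN) {
				continue
			}
			if v := p.topN[cursors[i]].value; !found || v < minVal {
				minVal, found = v, true
			}
		}
		if !found {
			return out
		}
		m := merged{value: minVal}
		for i, p := range parts {
			if cursors[i] < len(p.topN) && p.topN[cursors[i]].value == minVal {
				m.topNSum += p.topN[cursors[i]].count
				cursors[i]++
			} else {
				// "Affected" histogram: add its average count per distinct value.
				m.estimate += p.avgPerNDV()
			}
		}
		m.estimate += float64(m.topNSum)
		out = append(out, m)
	}
}

func main() {
	parts := []partition{
		{topN: []entry{{"a", 10}, {"c", 4}}, notNullCount: 20, ndv: 10},
		{topN: []entry{{"a", 6}, {"b", 8}}, notNullCount: 30, ndv: 10},
	}
	for _, m := range mergeSorted(parts) {
		fmt.Println(m.value, m.topNSum, m.estimate)
	}
}
```

How the PR actually compares this quantity against the current candidates, and which histograms it counts as affected, is not spelled out in the description, so the sketch only computes the numbers.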
CPU usage: previous implementation vs. this pull request.
Memory usage also stays within an acceptable range.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.