-
Notifications
You must be signed in to change notification settings - Fork 670
REFACTOR-#1763: Move logic of merge into the query compiler
#1764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REFACTOR-#1763: Move logic of merge into the query compiler
#1764
Conversation
f8aef29 to
94efa85
Compare
Codecov Report
@@ Coverage Diff @@
## master #1764 +/- ##
==========================================
+ Coverage 84.23% 84.25% +0.02%
==========================================
Files 77 77
Lines 8056 8062 +6
==========================================
+ Hits 6786 6793 +7
+ Misses 1270 1269 -1
Continue to review full report at Codecov.
|
94efa85 to
9905b60
Compare
66c2469 to
2dc1142
Compare
|
@devin-petersohn , just a friendly reminder to review. |
|
@YarShev As described, you made 3 independent changes in one PR. Could you split this one into several? |
|
@anmyachev , I don't think that these changes are so big and confused in order to split them into several PRs. If you have any questions on changes, I can explain. |
Indeed, there are few changes, but nevertheless, it is not worth mixing fundamental tasks such as refactoring and bug fixing. So there is minimum 2 PRs. |
|
The changes which are in the PR are dependent on each other. My description is just an explanation for improving of review. I can create an issue for each unit which I mentioned. But let's imagine, how many issues are needed for such changes like these #1638. So, I think it is just a wasting of time. |
This will be a challenge for #1638, but it will not be a challenge for your PR. |
|
#1638 is a corner case. Still, I believe that can be resolved without dividing into several PRs. |
merge into the query compilermerge into the query compiler
devin-petersohn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @YarShev, a couple of minor nit formatting changes.
merge into the query compilermerge into the query compiler
merge into the query compilermerge into the query compiler
7b70694 to
75946ab
Compare
merge into the query compilermerge into the query compiler
merge into the query compilermerge into the query compiler
75946ab to
835cf3f
Compare
6c283a7 to
708d50f
Compare
|
@devin-petersohn , just a friendly reminder to review. |
|
@devin-petersohn , I have increased the size of commit message header because I think, |
gshimansky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a small question, but my main concern is that there is no new test to verify that #1771 is fixed. We may have more regressions if we don't add tests for every bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sort if sort else sort is always False. Is it what is meant here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we don't want to sort cells within execution of function on partitions in order to avoid additional overhead. The logic regarding sorting is below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then should this line be replaced with kwargs["sort"] = False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it makes sense, done. Thanks!
|
@gshimansky , Regarding to the fix for #1771 . There is a test for this. Here when |
I didn't quite understand how #1771 bug is tested. Right now on master branch all tests pass while the bug exists. We need a test that would fail on current master branch to verify that this PR fixes this bug. If you say that existing tests are enough to test for this bug and require just a small change, can you add this change to this PR? |
|
@gshimansky , These changes are in this PR. It is the first one and the second one. The bug was that: if we call |
Yes I saw the bug fix in this PR. I didn't see any new test in PR. None of test files are modified in this PR. There is a general rule that when you fix a bug you add a test for it. This test should fail without this fix and pass with it. |
commitlint.config.js
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please open a new issue and new PR for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, okay.
into the query compiler Signed-off-by: Igoshev, Yaroslav <yaroslav.igoshev@intel.com>
708d50f to
19c5109
Compare
|
@gshimansky , I added an appropriate test for #1771 . |
|
@YarShev After CI passes I will merge this. Thanks! |
into the query compiler Signed-off-by: Igoshev, Yaroslav <yaroslav.igoshev@intel.com>
What do these changes do?
These changes move out the logic of
mergeandsort_indexfrom API layer into the query compiler, fix incorrect computing ofrow_lengthsandcolumn_widthsas well.flake8 modinblack --check modingit commit -smergeinto the query compiler #1763, merge throws exception on MODIN_ENGINE=python #1771