-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Port test in select.rs to sqllogic #6158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tests that cannot be ported (yet):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much @aprimadi -- this looks great. I reviewed all the changes carefully
Cause I like easy task
😆 -- thank you -- this is really helpful overall and I think makes it easier, over the long term, to keep up with DataFusion and improve its test coverage
You are inspiring me to try and paralleize this work more -- I'll try and write up some tickets that can be split up for others who want to get involved with DataFusion with a useful but low overhead tasks (and it will be nice to keep track of how much we have to go)
false | ||
false | ||
|
||
# explain select between |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for testing the whole plan
@@ -168,7 +168,7 @@ async fn context_for_test_file(relative_path: &Path) -> SessionContext { | |||
let ctx = SessionContext::with_config(config); | |||
|
|||
match relative_path.file_name().unwrap().to_str().unwrap() { | |||
"aggregate.slt" | "select.slt" => { | |||
"aggregate.slt" => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
No worries @alamb. This keeps me engaged while I'm learning through the more difficult bits of the project. |
It is much appreciated. While you have a newcomer's eyes, I would be interested in the parts you found the most difficult to pick up (really I am interested in how to make it less difficult). |
The external sort come to mind and also the k-way sort preserving merge. K-way sort preserving merge because there is a clever bit of the code that uses an array representation of a binary tree (to implement losers tree). Like for example this code:
i.e. the root of the tree is an element at index 1 in the vector and the first child of the root is an element at index 2 in the vector and so on... I guess I'm just hesitant to work on larger tasks due to:
Currently I prefer to work on issue that I can finish on one weekend and yeah currently #4495 is my default go to issue if I don't find issues that are interesting or that I'm sure I can finish on one weekend 😆. |
Yes, I agree this is quite a nice piece of code that is non trivial. I think the parts of DataFusion that are the most performance critical are also the parts that are the hardest to understand. It would be great to improve the understandability. @aprimadi I wonder if you might be willing to take a shot at adding some comments (perhaps copy/paste/modify your comments in #6158 (comment)) to help other future readers Also, I will try and be better about triaging tickets as they come in and marking the relevant ones as "good first issues" -- https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22 It is a little lacking now |
Which issue does this PR close?
More work toward #4495
Rationale for this change
Cause I like easy task
What changes are included in this PR?
Just porting some tests to sqllogic
Are these changes tested?
Yes
Are there any user-facing changes?
No