-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract datafusion-physical-plan
to its own crate
#7432
Conversation
9b41002
to
605d8bb
Compare
use tempfile::TempDir; | ||
|
||
#[tokio::test] | ||
async fn simple_predicate() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predicate evaluation in filters is well covered by sqllogictests and since this test used CsvExec (which is in datasource) I simply removed the test instead of trying to rewrite it
@@ -502,88 +500,14 @@ mod tests { | |||
binary(l, op, r, input_schema).unwrap() | |||
} | |||
|
|||
#[tokio::test] | |||
async fn project_first_column() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basic projections are well covered by end to end / sqllogic tests -- this test used csvexec
which isn't int he physical plan crate, so I removed this test rather than trying to port it
@@ -509,143 +442,6 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
#[tokio::test] | |||
async fn window_function_with_udaf() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is already covered in the end to end tests https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/user_defined/user_defined_window_functions.rs
} | ||
|
||
#[tokio::test] | ||
async fn window_function() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant with the coverage in https://github.com/apache/arrow-datafusion/blob/main/datafusion/sqllogictest/test_files/window.slt
let schema = csv.schema(); | ||
|
||
let sort_exec = Arc::new(SortExec::new( | ||
vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of this is simply copy/paste and relied on the CsvExec unnecessarily. I updated the tests to only depend on generated data
c9aace4
to
cd059d8
Compare
cd059d8
to
c47ae0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably fine, however, I think we probably shouldn't make fields of *Exec public, this is a very strong API commitment and quite hard to later evolve/retract. I additionally think some of the duplication of test utilities could be reduced by adding them to datafusion-common, perhaps with #[doc(hidden)]
.
@@ -242,45 +242,6 @@ pub fn with_new_children_if_necessary( | |||
/// Return a [wrapper](DisplayableExecutionPlan) around an | |||
/// [`ExecutionPlan`] which can be displayed in various easier to | |||
/// understand ways. | |||
/// | |||
/// ``` | |||
/// use datafusion::prelude::*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this has been removed because the frontend is not a dependency of physical-plan. Perhaps we could manually construct the physical plan? Don't feel strongly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that is correct. I don't think this API needs an end to end example (even though I think I initially wrote it)
.unwrap() | ||
/// A mock execution plan that simply returns the provided statistics | ||
#[derive(Debug, Clone)] | ||
pub struct StatisticsExec { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find where this was moved from. Possibly this is a duplicate? This seems generally useful for testing, and therefore not the end of the world were it be made public. We could always #[doc(hidden)]
'ify it if that is a concern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure -- there are definitely some areas that would be nice to clean up with respect to testing
I have made both these changes -- I think the PR is substantially nicer because of it. Thank you again for the review |
Clippy failures are due to #7572 |
This PR has been open for quite and while -- I am merging it in and I am happy to make any additional changes as follow on PRs. |
FYI @jimexist I finally did it! |
Which issue does this PR close?
Closes #1754
Rationale for this change
DataFusion is getting bigger and so to keep the code more manageable we need to divide it into smaller crates
What changes are included in this PR?
datafusion-physical-plan
crateNote I did copy several functions / testing structs from the core crate to the datafusion-physical-plan crate.
Are these changes tested?
Yes
Are there any user-facing changes?
I don't think so -- everything should be backwards compatible