pipe column orderings into pruning predicate creation #15821
base: main
let pruning_predicate = PruningPredicate::try_new(
    Arc::clone(predicate),
    self.schema(),
    vec![ColumnOrdering::Unknown; self.schema().fields().len()],
)?;
We could add a new signature to avoid API churn, but I wanted to make it explicit for now to see all of the callsites
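A rough sketch of what such an additional signature could look like, assuming a simplified stand-in type. `PruningPredicateSketch`, `try_new_with_orderings`, the `TypeDefined` variant, and the string predicate are all hypothetical names for illustration and are not the DataFusion API; only `ColumnOrdering::Unknown` appears in the diff above. The existing constructor keeps its shape and defaults every column to `Unknown`, so current callers would not need to change:

```rust
// Hypothetical stand-in for the ordering enum; only `Unknown` is taken from the diff.
#[derive(Debug, Clone, PartialEq)]
enum ColumnOrdering {
    Unknown,
    TypeDefined, // assumed variant, for illustration only
}

// Simplified stand-in for a pruning predicate; not the real DataFusion type.
#[derive(Debug)]
struct PruningPredicateSketch {
    predicate: String,
    orderings: Vec<ColumnOrdering>,
}

impl PruningPredicateSketch {
    // New, explicit entry point: the caller supplies one ordering per schema field.
    fn try_new_with_orderings(
        predicate: &str,
        num_fields: usize,
        orderings: Vec<ColumnOrdering>,
    ) -> Result<Self, String> {
        if orderings.len() != num_fields {
            return Err(format!(
                "expected {} column orderings, got {}",
                num_fields,
                orderings.len()
            ));
        }
        Ok(Self {
            predicate: predicate.to_string(),
            orderings,
        })
    }

    // Existing-style entry point: unchanged for callers, defaults to `Unknown`.
    fn try_new(predicate: &str, num_fields: usize) -> Result<Self, String> {
        Self::try_new_with_orderings(
            predicate,
            num_fields,
            vec![ColumnOrdering::Unknown; num_fields],
        )
    }
}
```

The trade-off is the one noted in the comment: a defaulting wrapper avoids churn, but it also hides the callsites that never pass real ordering information.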
3e15afe to 8c2ceb1
@adriangb please check out pydantic#28
@@ -1566,6 +1599,50 @@ fn build_predicate_expression(
        return expr;
    }

    // Special handling for floats. Because current Parquet statistics do not allow NaN, and
This block is why we need to get the column ordering info passed down. Here we know which column is being pruned and with which operation. For floats we can disallow pruning because the Parquet stats are incomplete. We can also skip pruning if the stats are not valid because an ordering is not defined for the type.
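The decision described here can be sketched as a small gate, under stated assumptions. `ColumnType`, `can_prune_with_min_max`, and the `TypeDefined` variant are hypothetical names for illustration; the actual check in this PR lives inside `build_predicate_expression` and may look quite different:

```rust
// Hypothetical stand-in for the ordering enum; only `Unknown` is taken from the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnOrdering {
    Unknown,
    TypeDefined, // assumed variant, for illustration only
}

// Hypothetical, heavily simplified column types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

// Returns true when min/max statistics may be used to prune on this column.
fn can_prune_with_min_max(ty: ColumnType, ordering: ColumnOrdering) -> bool {
    match (ty, ordering) {
        // No ordering defined for the type: the stats cannot be trusted.
        (_, ColumnOrdering::Unknown) => false,
        // Floats: stats are incomplete with respect to NaN, so skip pruning.
        (ColumnType::Float64, _) => false,
        // Otherwise min/max comparisons are meaningful.
        _ => true,
    }
}
```

In this sketch an integer column with a defined ordering stays prunable, while any float column, or any column whose ordering is `Unknown`, is never pruned.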
NaN is present #15812