-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API, Core: Update inclusive metrics evaluator for extract and transforms #12311
API, Core: Update inclusive metrics evaluator for extract and transforms #12311
Conversation
|
||
@ParameterizedTest | ||
@FieldSource("MISSING_STATS_EXPRESSIONS") | ||
public void testMissingStats(Expression expr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bounds are in metadata, they are trusted. Bounds should be missing in cases where there are variant values with incompatible types.
return current; | ||
} | ||
|
||
public static ByteBuffer serializeBounds(Map<String, VariantValue> bounds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This serialization is temporary and will be replaced with a well-defined variant serialization.
0c80b18
to
09fcf7b
Compare
12b7f7c
to
b1b068d
Compare
b1b068d
to
8faa5f1
Compare
This needed changes to |
8faa5f1
to
bcdea90
Compare
This now relies on #12374 to move the serialized variant classes into API so that |
37a0c04
to
d1a260c
Compare
Thanks for reviewing, @danielcweeks! |
This updates
InclusiveMetricsEvaluator
that uses column stats to skip data files during scan planning.The evaluator was implementing the older
BoundExpressionVisitor
interface that only supportedBoundReference
and not otherBoundTerm
instances likeBoundTransform
. After #12304,BoundExtract
also needs to be supported. This PR includes #12304 and will be rebased when it is merged.Filtering works for transformed values when the transform is order preserving. If it is not order preserving (like
bucket
) the bounds cannot be used.Most of the changes to
BoundExpressionVisitor
are to produce the lower and upper bounds values that are tested:BoundReference
, deserialize the bound from the correctlowerBounds
orupperBounds
map (moved toparseLowerBound
andparseLowerBound
BoundTransform
, deserialize the bound and transform it if the transform is order-preserving (intransformLowerBound
andtransformUpperBound
)Extract
, deserialize the bound as a map from field name toVariantValue
, then convert the value to the internal representation (inextractLowerBound
andextractUpperBound
)This adds new test suites for the new
BoundTerm
cases that are supported:TestInclusiveMetricsEvaluatorWithExtract
tests variant casesTestInclusiveMetricsEvaluatorWithTransforms
tests transform cases