Avoid handling JSON_ARRAY as multi value JSON during transformation #14738
Please fix the tests.
OK, looks like I got a few things wrong about the extractor; fixing it.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #14738 +/- ##
============================================
+ Coverage 61.75% 63.90% +2.15%
- Complexity 207 1607 +1400
============================================
Files 2436 2703 +267
Lines 133233 150731 +17498
Branches 20636 23290 +2654
============================================
+ Hits 82274 96329 +14055
- Misses 44911 47183 +2272
- Partials 6048 7219 +1171
…pache#14738)
* Avoid handling JSON_ARRAY as multi value JSON during transformation
* Revert "Avoid handling JSON_ARRAY as multi value JSON during transformation"
  This reverts commit 289b433.
* handle empty JSON array during transformation
* cosmetic
The transform pipeline fails with an ArrayIndexOutOfBoundsException when it encounters a JSON column whose value is an empty JSON array, because JSON values are not standardised (empty array -> null): https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java#L97

Earlier, an empty array for the JSON data type was extracted as a string. It is now extracted as an Object[] due to the change at https://github.com/apache/pinot/pull/14547/files#diff-7ac5349f9d75e27a62a063dbf81db3ed30c8de052b4ffa7719187e4babaa60baR66, which leads to isMultiValue returning true for an empty JSON array. convertMultiValue returns an Object[], while convertSingleValue returns a String: https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java#L39

This PR updates the transform logic to handle the empty array for the JSON data type.
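
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the actual DataTypeTransformer code: the class EmptyJsonArraySketch, the enum DestType, and the methods inferSourceUnguarded and inferSourceGuarded are hypothetical names chosen for illustration. The sketch assumes the source type of a multi-value is inferred from its first element, which is what would throw on an empty Object[].

```java
public class EmptyJsonArraySketch {
  // Hypothetical stand-in for the destination data type (not Pinot's PinotDataType).
  public enum DestType { JSON, STRING }

  // Unguarded variant: infers the source type from the first element,
  // so an empty Object[] triggers ArrayIndexOutOfBoundsException.
  public static String inferSourceUnguarded(Object value) {
    if (value instanceof Object[]) {
      Object[] values = (Object[]) value;
      return values[0].getClass().getSimpleName() + "_ARRAY";
    }
    return value.getClass().getSimpleName();
  }

  // Guarded variant: an empty array headed for a JSON column is treated as JSON.
  // For other destinations this sketch mimics standardisation (empty array -> null).
  public static String inferSourceGuarded(Object value, DestType dest) {
    if (value instanceof Object[]) {
      Object[] values = (Object[]) value;
      if (values.length == 0) {
        return dest == DestType.JSON ? "JSON" : null;
      }
      return values[0].getClass().getSimpleName() + "_ARRAY";
    }
    return value.getClass().getSimpleName();
  }

  public static void main(String[] args) {
    Object emptyJsonArray = new Object[0];
    try {
      inferSourceUnguarded(emptyJsonArray);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("unguarded path threw: " + e.getClass().getSimpleName());
    }
    System.out.println("guarded path: " + inferSourceGuarded(emptyJsonArray, DestType.JSON));
  }
}
```

The guard checks the array length before touching the first element, so the empty JSON array never reaches the element-based type inference.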