Add support for gracefully handling the errors while transformations#9377
Add support for gracefully handling the errors while transformations#9377KKcorps merged 3 commits intoapache:masterfrom
Conversation
e1fe791 to
2574c9f
Compare
Codecov Report
@@ Coverage Diff @@
## master #9377 +/- ##
============================================
- Coverage 69.76% 68.40% -1.37%
+ Complexity 4787 4705 -82
============================================
Files 1885 1885
Lines 100293 100464 +171
Branches 15256 15292 +36
============================================
- Hits 69966 68718 -1248
- Misses 25381 26793 +1412
- Partials 4946 4953 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
| flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
| for (String collection : _fieldsToUnnest) { | ||
| unnestCollection(record, collection); | ||
| try { |
There was a problem hiding this comment.
cc: @Jackie-Jiang I remember you mentioned that try-catch will impact the performance? shall do if-else check first?
Codewise I think this is simpler.
if (_continueOnError) {
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet()));
for (String collection : _fieldsToUnnest) {
unnestCollection(record, collection);
}
} else {
try {
flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet()));
for (String collection : _fieldsToUnnest) {
unnestCollection(record, collection);
}
} catch (Exception e) {
record.putValue(GenericRow.INCOMPLETE_RECORD_KEY, true);
LOGGER.debug("Caught exception while transforming complex types for record: {}", record.toString(), e);
}
}
There was a problem hiding this comment.
I can do that. It just looked like code was being duplicated and hence didn't do it
There was a problem hiding this comment.
@Jackie-Jiang AFAIK, compiler doesn't optimize the code inside the try block, but other than that there's no performance overhead for just using a try block. Let me know if you have experienced otherwise.
There was a problem hiding this comment.
The current way should be fine. I don't remember seeing extra overhead on try-catch
There was a problem hiding this comment.
Thanks @Jackie-Jiang for confirming! @KKcorps can you also update the change in this PR(https://github.com/apache/pinot/pull/9376/files) to make the code consistent?
...-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/FilterTransformer.java
Outdated
Show resolved
Hide resolved
sajjad-moradi
left a comment
There was a problem hiding this comment.
This PR only handles segment generation part. We need to take care of the realtime ingestion as well.
...java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java
Show resolved
Hide resolved
| flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
| for (String collection : _fieldsToUnnest) { | ||
| unnestCollection(record, collection); | ||
| try { |
There was a problem hiding this comment.
@Jackie-Jiang AFAIK, compiler doesn't optimize the code inside the try block, but other than that there's no performance overhead for just using a try block. Let me know if you have experienced otherwise.
| flattenMap(record, new ArrayList<>(record.getFieldToValueMap().keySet())); | ||
| for (String collection : _fieldsToUnnest) { | ||
| unnestCollection(record, collection); | ||
| try { |
There was a problem hiding this comment.
The current way should be fine. I don't remember seeing extra overhead on try-catch
...al/src/main/java/org/apache/pinot/segment/local/recordtransformer/ExpressionTransformer.java
Show resolved
Hide resolved
f7b3844 to
ebe6dad
Compare
Agreed. The reason I haven't tackled it in this PR is because it is not clear what user can do to fix missing data in case we ignore it in realtime-ingestion. |
IMO it should be up to users. If they set |
…pache#9377) * Add support for gracefully handling the errors while transformations * Maintain consistent checks across transformers * Add incompleteRowCount metric to realtime servers Co-authored-by: Kartik Khare <kharekartik@Kartiks-MacBook-Pro.local>
Add support for honouring
continueOnErrorduringAlso, adds support for keeping track of such records using
incompleteRecordCount