-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Record column name for exceptions while writing frames in RowBasedFrameWriter #16130
Record column name for exceptions while writing frames in RowBasedFrameWriter #16130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
- I have added a few comments about the relevance of the new error class.
- Can you please add a test case where the exception gets generated?
- Also, what does it mean for MSQ, should MSQ retry if it encounters such exceptions or not? It seems like retrying shouldn't help here, but per my understanding, MSQ will still treat errors like these as
UnknownFaults,
and will retry. We should update that logic in MSQ (check how we do it forInvalidNullByte
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt that we need this class anywhere. Can't we use a generic RuntimeException
with the proper error message? Or better yet, use DruidException,
so that we can analyze the message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make a MSQ fault out of this. That way we donot have to adjust any retry logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting FrameFieldWriterException
now to an MSQFault
type
@@ -53,17 +50,16 @@ private InvalidNullByteException( | |||
@Nullable final Integer position | |||
) | |||
{ | |||
super(StringUtils.format( | |||
super(column, StringUtils.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels weird that we have this extra layer of indirection, to what will essentially call the RuntimeException
. If we don't extract the relevant fields (.getColumn()
) from the FrameFieldWriterException
anywhere, or use check for it's instanceof, then I suppose we can get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting to MSQFault now for which the specific exception class is required. InvalidNullByteException
now extends RuntimeException
directly.
processing/src/main/java/org/apache/druid/frame/write/RowBasedFrameWriter.java
Fixed
Show fixed
Hide fixed
Tried multiple routes incl insert via external inline tables and insert via joins, but an error isn't generated. Therefore not adding the test case.
The conversion to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment, overall the changes LGTM
FrameFieldWriterException.builder() | ||
.source(ffre.getSource()) | ||
.rowNumber(ffre.getRowNumber()) | ||
.column(ffre.getColumn()) | ||
.errorMsg(ffre.getErrorMsg()) | ||
.build(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the normal constructor since we are using all the parameters in a single place. Invalid null byte exception has the builder because it populates different information at different places up the call stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the ask is to remove the builder from FrameFieldWriterException
. But we do build it progressively in ScanQueryFrameProcessor
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add this class to MSQFaultsSerdeTest
for completeness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've also added the description of the fault in reference.md
and removed the term frame
from the fault code and log message since that is user-facing.
@JsonTypeName(FrameFieldWriterFault.CODE) | ||
public class FrameFieldWriterFault extends BaseMSQFault | ||
{ | ||
static final String CODE = "FieldWriterError"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't match the class name, can you please update the class as well? Also, wrt the naming convention of other faults, we don't have "Error" in the suffix. I think something like "InvalidFieldFault" might be more suitable, and the name could be "InvalidField". I am open to other naming conventions as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went with your suggestion. We do have UnknownFault
->UnknownError
though, but InvalidField
sounded fine.
@@ -49,7 +51,7 @@ public FrameFieldWriterFault( | |||
super( | |||
CODE, | |||
StringUtils.format( | |||
"Error[%s] while writing frame for source[%s], rowNumber[%d] column[%s]", | |||
"Error[%s] while writing a field for source[%s], rowNumber[%d], column[%s].", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Duplicate error message as FrameFieldWriterException
, can extract the format string out. It will make the updates easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now propagate the RE's message directly to the Fault class.
.source(ffre.getSource()) | ||
.rowNumber(ffre.getRowNumber()) | ||
.column(ffre.getColumn()) | ||
.errorMsg(ffre.getErrorMsg()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add the problematic field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field value isn't available at the place the exception starts -- which is RowBasedFrameWriter#writeDataUsingFieldWriters
@@ -299,11 +300,15 @@ | |||
.column(signature.getColumnName(i)) | |||
.build(); | |||
} | |||
catch (ParseException pe) { | |||
throw Throwables.propagate(pe); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note
Throwables.propagate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Failures are code coverage checks - but there isn't a known dataset that will trigger this. Since the changes are isolated, they shouldn't affect the current code.
Current Runtime Exceptions generated while writing frames only include the exception itself without including the name of the column they were encountered in, for e.g.
java.lang.RuntimeException: java.lang.ClassCastException: class java.lang.Long cannot be cast to class java.lang.String (java.lang.Long and java.lang.String are in module java.base of loader 'bootstrap')
Start including additional details for easier debugging.
New Exception:
[InvalidFieldError](https://druid.apache.org/docs/latest/multi-stage-query/reference.html#error_InvalidFieldError): Error[class java.lang.Long cannot be cast to class java.lang.String (java.lang.Long and java.lang.String are in module java.base of loader 'bootstrap')] while writing field for source[external input source: LocalInputSource{baseDir="null", filter=null, files=[foo]}], rowNumber[1], column[arrayString].
Tested the generated exception by raising it explicitly in a custom build.