Initialization failure handling control #757
Comments
I've got a better idea. Instead of a boolean property we could have a
The user could then implement their own logic if they want to (inspired by a discussion with @yruslan). Things for consideration:
When lineage is initialized in codeless mode, i.e. by installing the jar on the Spark driver using init scripts (Spark environment: Databricks), and initialization fails because the lineage server is unavailable or for some other reason, the jobs do not carry on as described above. Instead, the driver becomes unresponsive and Spark commands get cancelled automatically without any error. To work around this, I have used the console dispatcher as a fallback, so that jobs proceed as usual when the HTTP server fails. The current behavior is that when the server is unavailable, the code simply throws an exception to the running environment and the driver becomes unresponsive. Please take this into account when implementing this feature.
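The fallback workaround described in this comment can be sketched roughly as follows. This is a language-neutral illustration in Python, not the agent's actual dispatcher code: `with_fallback`, `ConsoleDispatcher`, and `http_dispatcher_unavailable` are hypothetical names made up for the example.

```python
from typing import Callable, List

# Hypothetical type: a dispatcher is anything that accepts a lineage event.
Dispatcher = Callable[[dict], None]


def http_dispatcher_unavailable(event: dict) -> None:
    """Stand-in for an HTTP dispatcher whose server cannot be reached."""
    raise ConnectionError("lineage server is not reachable")


class ConsoleDispatcher:
    """Stand-in for a console dispatcher: prints events and remembers them."""

    def __init__(self) -> None:
        self.printed: List[str] = []

    def __call__(self, event: dict) -> None:
        line = f"lineage event: {event}"
        self.printed.append(line)
        print(line)


def with_fallback(primary: Dispatcher, fallback: Dispatcher) -> Dispatcher:
    """Return a dispatcher that tries `primary` and uses `fallback` on any error."""

    def dispatch(event: dict) -> None:
        try:
            primary(event)
        except Exception as err:
            # The failure is reported but not propagated, so the job keeps running.
            print(f"primary dispatcher failed ({err}); using fallback")
            fallback(event)

    return dispatch


console = ConsoleDispatcher()
dispatch = with_fallback(http_dispatcher_unavailable, console)
dispatch({"job": "example"})
```

The point of the wrapper is that a failure of the primary dispatcher never reaches the caller, which is the property the workaround relies on to keep the driver responsive.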
Add a configuration property to control how the agent should behave on initialization failures.
Currently, when an error occurs during the initialization phase of the agent (e.g. misconfiguration, a failed handshake with the server, etc.), the error is logged and the Spark job carries on.
This behavior was chosen so as not to affect the Spark job or interrupt potentially higher-priority (from the operational perspective) processes. But sometimes, when lineage is mandatory for the user, they might prefer the Spark job to fail explicitly instead of silently continuing without lineage tracking.
This behavior can be controlled by the config property:
spline.onInitFailure = LOG | BREAK
The default value would be LOG, which corresponds to the current behavior. The BREAK mode would simply propagate the error to the Spark process, causing the Spark job to fail.
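A minimal sketch of the proposed LOG | BREAK semantics is given below, assuming the property value arrives as a plain string. It is written in Python purely for illustration and is not the agent's implementation: `OnInitFailure`, `parse_mode`, and `init_agent` are hypothetical names.

```python
import logging
from enum import Enum
from typing import Callable, Optional


class OnInitFailure(Enum):
    LOG = "LOG"      # log the error and let the job continue (current behavior)
    BREAK = "BREAK"  # propagate the error so the job fails


def parse_mode(value: Optional[str]) -> OnInitFailure:
    """Parse the property value; a missing or empty value falls back to LOG."""
    if value is None or value.strip() == "":
        return OnInitFailure.LOG
    # Raises ValueError for anything other than LOG or BREAK.
    return OnInitFailure(value.strip().upper())


def init_agent(init: Callable[[], None], mode: OnInitFailure) -> bool:
    """Run `init`; return True on success and False if it failed in LOG mode."""
    try:
        init()
        return True
    except Exception:
        if mode is OnInitFailure.BREAK:
            raise  # the caller (the job) sees the original error
        logging.getLogger("lineage").exception(
            "agent initialization failed; continuing without lineage tracking"
        )
        return False


def failing_init() -> None:
    """Stand-in for an initialization step that cannot complete."""
    raise RuntimeError("handshake with the lineage server failed")


# LOG mode: the failure is logged and the caller carries on.
lineage_enabled = init_agent(failing_init, parse_mode(None))
```

With `parse_mode("BREAK")` the same call would re-raise the `RuntimeError` instead of returning `False`, which corresponds to the job failing explicitly.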