-
Notifications
You must be signed in to change notification settings - Fork 16.4k
[FEAT] databricks repair run with reason match and appropriate new settings #41412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT] databricks repair run with reason match and appropriate new settings #41412
Conversation
b1e0418 to
4ad4885
Compare
4ad4885 to
a21faf3
Compare
98efbd6 to
3869e83
Compare
3869e83 to
67fb037
Compare
shahar1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Lee-W
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no huge issue, just a few nitpicks
|
@Lee-W require changes done, also added assert_called_with check for |
|
@Lee-W should we have one more configurable like user want to repair only in case if |
I though we already have |
currently the flow is if |
Yep, it sounds good. Is there anything missed? |
|
@Lee-W I'm thinking for an option not to trigger repair if reason matches nothing in the dict |
I don't think we need a new option. If the reason is provided and nothing is matched, then we should not repair. This should be the default behavior. If the user wants to repair at all cases, than the reason dict should not be passed |
|
thanks @Lee-W , making the changes, Yes, I also agree with same default behaviour. |
|
@Lee-W changes are done, can you please review |
|
LGTM 👍 |
|
@Lee-W currently we are checking in root state_message only, should we also check |
Could you provide an example to illustrate this? Thanks! |
|
For example, sometime in case of custom exception, when user want to retry with another job config, in root state_message we get only below one, and main exception in insider array |
|
@Lee-W any suggestions? |
|
It looks like the error is part of the |
|
ya but not always, sometime root |
|
if that's the case, it makes sense to have one |
|
cool, will create a new pr for the same. Thanks |
[FEAT] databricks repair run with reason match and appropriate new settings in
DatabricksRunNowOperatorThis is useful in scenario like where we get databricks exception like
AWS_INSUFFICIENT_INSTANCE_CAPACITY_FAILURE (CLIENT_ERROR).,AWS_MAX_SPOT_INSTANCE_COUNT_EXCEEDED_FAILUREwhere user want to repair run with let's say different zone of region or let's say with 100% ondemand^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.