-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Make update config behavior optional in GlueJobOperator #30162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make update config behavior optional in GlueJobOperator #30162
Conversation
| return self.get_conn().start_job_run(JobName=job_name, Arguments=script_arguments, **run_kwargs) | ||
| if self.update_config: | ||
| self.create_or_update_glue_job() | ||
| return self.get_conn().start_job_run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small non-blocking nitpick, if you'd be so kind. We are trying to encourage using the self.conn cached property over the self.get_conn() getter method. If you are so inclined, could you make that change here and in the relevant test below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ferruzzi, sure I can change it and also other occurences in the file 👍
| if self.update_config: | ||
| self.create_or_update_glue_job() | ||
| return self.get_conn().start_job_run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fast turnaround!
I don't think this quite matches the behaviour before your config update changes. Previously the code would check for an existing job with that name and then start it OR it would create the job if it didn't find an existing one, and then start that one. In this PR it seems if update_config is False (the default) we go straight to starting the job without first creating it if it doesn't exit (but I may be missing something).
I think you want something like:
try:
if self.update_config:
job_name = self.create_or_update_glue_job()
else:
job_name = self.get_or_create_glue_job()
return glue_client.start_job_run(JobName=job_name, Arguments=script_arguments, **run_kwargs)
except Exception as general_error:
self.log.error("Failed to run aws glue job, error: %s", general_error)
raiseThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @o-nikolas, oh nice catch indeed. this is exactly what the code was doing before, I am gonna change it
Closes: #29960