enable_elastic_disk property incorrectly mapped when making a request to Databricks #25232
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
Maybe you can provide a PR fixing that? There are Databricks people here (for example @alexott) who can review and double-check it. Shall we assign it to you, @aru-trackunit?
Basically - things get implemented here when someone implements them. Airflow is created by > 2100 people (most of them, like you, users), so if you want to make sure a problem is fixed in a timely manner, the best way is to make a PR - otherwise it will have to wait for someone to pick it up and implement it.
I marked it as "good first issue" but that's as much as I can do. Same with "earlier version" - if there is someone who commits to cherry-picking the fix and preparing an earlier version of the provider, they are free to make a PR, but there must be someone who will take care of it. The most certain way is to just roll your sleeves up and do it. See https://github.com/apache/airflow#release-process-for-providers
Hi, I will provide a fix for that issue.
Cool! |
@potiuk Could you add me to the contributors list? I can't push my local dev branch. |
@jgr-trackunit you need to create your own fork and open a PR from it.
Yep. See https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst for all details about contribution (including the need to create a fork). |
Apache Airflow version
2.2.2
What happened
When using apache-airflow-providers-databricks version 2.2.0, I am sending a request to Databricks to submit a job (https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsCreate) via the api/2.0/jobs/runs/submit endpoint.
Databricks expects a boolean for the enable_elastic_disk property, while airflow-databricks-provider sends a string, and as a result the property is not set on the Databricks side. I also made the same request to Databricks from Postman and the property was set to true, which means the problem does not lie on the Databricks side. I tried to track down the problem and it appears to be this line: before the line executes, enable_elastic_disk is True of type boolean, but afterwards it becomes the string 'True', which Databricks does not parse.
airflow/airflow/providers/databricks/operators/databricks.py
Line 381 in 1cb16d5
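The mechanism can be illustrated with a minimal sketch (not the exact Airflow source) of the deep string coercion the operator applies to the JSON payload before submitting it. Because `bool` is a subclass of `int` in Python, the numeric branch also matches `True`/`False` and stringifies them; a possible fix is to return booleans unchanged:

```python
def deep_string_coerce(content):
    """Sketch of the buggy coercion: every int/float becomes a string."""
    if isinstance(content, str):
        return content
    elif isinstance(content, (int, float)):
        # Bug: bool is a subclass of int, so True -> 'True' here.
        return str(content)
    elif isinstance(content, (list, tuple)):
        return [deep_string_coerce(e) for e in content]
    elif isinstance(content, dict):
        return {k: deep_string_coerce(v) for k, v in content.items()}
    return content

def deep_string_coerce_fixed(content):
    """Sketch of a fix: pass booleans through untouched."""
    if isinstance(content, bool):
        # Check bool BEFORE int/float, since isinstance(True, int) is True.
        return content
    elif isinstance(content, str):
        return content
    elif isinstance(content, (int, float)):
        return str(content)
    elif isinstance(content, (list, tuple)):
        return [deep_string_coerce_fixed(e) for e in content]
    elif isinstance(content, dict):
        return {k: deep_string_coerce_fixed(v) for k, v in content.items()}
    return content
```

With the buggy version, `{'enable_elastic_disk': True}` becomes `{'enable_elastic_disk': 'True'}` and serializes as a JSON string instead of a JSON boolean; the fixed version leaves the boolean intact while still stringifying genuine numbers.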
What you think should happen instead
After setting the enable_elastic_disk property it should be propagated to Databricks, but it's not.
How to reproduce
Try to run:
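The original DAG snippet is not preserved in this copy of the issue. As an illustration, a cluster spec like the following (field names follow the Databricks jobs/runs/submit API; the spark_version and node_type_id values are placeholders) triggers the bug when passed through the operator's coercion, whereas plain JSON serialization keeps the boolean intact:

```python
import json

# Hypothetical new_cluster spec as it would be handed to
# DatabricksSubmitRunOperator; values are placeholders.
new_cluster = {
    "spark_version": "9.1.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 1,
    "enable_elastic_disk": True,
}

# With correct serialization the boolean survives as a JSON true.
payload = json.dumps({"new_cluster": new_cluster})
print(payload)
```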
Make sure an Airflow connection named databricks is set, and check whether Databricks has the property set. After executing, we can verify that the property was set on the Databricks side by using the endpoint:
https://DATABRICKS_HOST/api/2.1/jobs/runs/get?run_id=123
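For example, the verification call can be sketched as below; DATABRICKS_HOST and the run id are placeholders, and the Authorization header assumes a personal access token:

```shell
# Placeholders: substitute your workspace host and a real run id.
DATABRICKS_HOST="example.cloud.databricks.com"
RUN_ID=123
URL="https://${DATABRICKS_HOST}/api/2.1/jobs/runs/get?run_id=${RUN_ID}"
echo "$URL"
# Uncomment with real credentials to inspect the submitted cluster spec:
# curl -s -H "Authorization: Bearer $DATABRICKS_TOKEN" "$URL"
```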
Operating System
MWAA
Versions of Apache Airflow Providers
apache-airflow-providers-databricks version 2.2.0
Deployment
MWAA
Deployment details
No response
Anything else
That's a permanent and reproducible problem. It would be great if this fix could also be backported to earlier provider versions, for example 2.2.1, because I am not sure when AWS will decide to upgrade to the latest Airflow code, and I am also not sure whether installing higher versions of the Databricks provider on Airflow 2.2.2 will cause issues.
Are you willing to submit PR?
Code of Conduct