Databricks create/reset then run-now #29733

@kyle-winkelman

Description

Allow an Airflow DAG to define a Databricks job with the api/2.1/jobs/create (or api/2.1/jobs/reset) endpoint and then run that same job with the api/2.1/jobs/run-now endpoint. This would provide capabilities similar to the DatabricksSubmitRun operator, but the api/2.1/jobs/create endpoint supports additional parameters that api/2.1/jobs/runs/submit does not (e.g. job_clusters, email_notifications, etc.).

Use case/motivation

Create and run a Databricks job entirely within the Airflow DAG. Currently, the DatabricksSubmitRun operator uses the api/2.1/jobs/runs/submit endpoint, which does not support all job features and creates runs that are not tied to a job in the Databricks UI. Also, the DatabricksRunNow operator requires the job to be defined either directly in the Databricks UI or through a separate CI/CD pipeline, which means any change must be made in multiple places.
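A minimal sketch of what the proposed create/reset-then-run-now flow could look like, expressed as plain request-payload builders for the three endpoints involved. The helper names and parameters below are illustrative assumptions for this proposal, not existing Airflow provider APIs; the payload fields (job_clusters, email_notifications, new_settings, job_id) follow the Databricks Jobs 2.1 API:

```python
# Hypothetical sketch: define the job spec once in the DAG, create (or
# reset) it via api/2.1/jobs/create / api/2.1/jobs/reset, then trigger it
# via api/2.1/jobs/run-now. Helper names here are assumptions.

JOBS_CREATE_ENDPOINT = "api/2.1/jobs/create"
JOBS_RESET_ENDPOINT = "api/2.1/jobs/reset"
JOBS_RUN_NOW_ENDPOINT = "api/2.1/jobs/run-now"


def build_create_payload(name, tasks, job_clusters=None, email_notifications=None):
    """Body for api/2.1/jobs/create, including fields that
    api/2.1/jobs/runs/submit does not accept (job_clusters,
    email_notifications)."""
    payload = {"name": name, "tasks": tasks}
    if job_clusters:
        payload["job_clusters"] = job_clusters
    if email_notifications:
        payload["email_notifications"] = email_notifications
    return payload


def build_reset_payload(job_id, new_settings):
    """Body for api/2.1/jobs/reset, which replaces an existing job's
    settings in full with new_settings."""
    return {"job_id": job_id, "new_settings": new_settings}


def build_run_now_payload(job_id, notebook_params=None):
    """Body for api/2.1/jobs/run-now, triggering a run of the job that
    was just created or reset, so the run stays tied to a job in the
    Databricks UI."""
    payload = {"job_id": job_id}
    if notebook_params:
        payload["notebook_params"] = notebook_params
    return payload
```

In a DAG, a create/reset step built this way would run upstream of a run-now step, so the job definition and its trigger live in the same place.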

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
