
[databricks] Refactor how Databricks workflows repair / repair all is implemented #40587

@tatiana

Description

Apache Airflow Provider(s)

databricks

Versions of Apache Airflow Providers

6.7.0

Apache Airflow version

2.9

Operating System

all

Explain the improvement

To expose the "repair" and "repair all" tasks, version 6.7.0 of the Databricks provider relies on the soon-to-be-deprecated Airflow 2.x plugin mechanism; this support was introduced in #40153. Repair is one of the most used features of the original https://github.com/astronomer/astro-provider-databricks, whose migration we are completing with this PR: I've interacted with at least five Astronomer customers who use this feature, and that project receives over 115k monthly downloads on PyPI.

Using plugins is suboptimal, but as of Airflow 2.9, core Airflow doesn't offer a better way to implement this kind of feature, as discussed in this thread:
#40153 (comment)
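For context, below is a minimal sketch of the Airflow 2.x plugin mechanism being discussed: a provider can only surface this kind of extra UI behaviour by shipping an AirflowPlugin that registers Flask AppBuilder views and/or operator extra links. The class names, endpoint, and link URL are illustrative assumptions, not the Databricks provider's actual implementation.

```python
from flask_appbuilder import BaseView, expose

from airflow.models import BaseOperatorLink
from airflow.plugins_manager import AirflowPlugin


class WorkflowRepairView(BaseView):
    """Hypothetical Flask AppBuilder view that would trigger a repair run."""

    default_view = "repair"

    @expose("/repair")
    def repair(self):
        # A real implementation would call the Databricks Jobs "repair run" API
        # for the failed tasks and then redirect back to the grid view; this is
        # just a placeholder response.
        return "repair triggered (placeholder)"


class RepairAllLink(BaseOperatorLink):
    """Hypothetical extra link rendered next to the operator in the Airflow UI."""

    name = "Repair All Failed Tasks"

    def get_link(self, operator, *, ti_key):
        # Point the button at the plugin view registered below.
        return f"/workflowrepairview/repair?dag_id={ti_key.dag_id}&run_id={ti_key.run_id}"


class WorkflowRepairPlugin(AirflowPlugin):
    """Registers the view and the link through the Airflow 2.x plugin manager."""

    name = "workflow_repair_plugin"
    appbuilder_views = [{"name": "Repair", "category": "Browse", "view": WorkflowRepairView()}]
    operator_extra_links = [RepairAllLink()]
```

Because the view and link live in a plugin rather than in a dedicated provider extension point, this UI code sits outside anything core Airflow can evolve or deprecate cleanly, which is why a first-class mechanism is wanted for Airflow 3.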

As part of Airflow 3.x, we want to find a better way for providers to implement this type of feature. @potiuk is going to help us log this in the Airflow 3 roadmap (he has already added it to https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3+Workstreams#Airflow3Workstreams-Othercandidates.1), and Astronomer commits to migrating the repair and job links to the Airflow 3.x strategy. I have also aligned with @cmarteepants on this topic, and she agrees that we'll reimplement this once an alternative approach is available.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
