Description
Each provider should have its own independent sub-project in our monorepo.
They should be fully "standalone", so that you can develop them completely independently from Airflow core and so that all of their dependencies are stored in their own pyproject.toml. The uv workspace feature should be used to bind all the provider sub-projects together, so that you can continue development with airflow, task-sdk, providers and all the other sub-projects of the Airflow monorepo in a single editable environment.
This move was attempted earlier in #28292 and #28291, but we did not have a good "workspace" solution, and the Airflow 2 namespace approach prevented us from making it a good environment for provider development. With Airflow 3 and the uv workspace feature (which was added largely with our input, so that Airflow's provider structure could benefit from it), it is now entirely possible.
This means that dependencies should be moved from provider.yaml files to pyproject.toml.
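As an illustration of that move, here is a minimal sketch that turns already-parsed provider.yaml content into a `[project]` fragment for pyproject.toml. The `package-name` and `dependencies` keys, the example package name and the helper `render_dependencies` are assumptions made for this example, not the actual conversion tooling. Strings are quoted with `json.dumps`, which is a simplification that produces valid TOML basic strings for typical requirement specifiers.

```python
import json


def render_dependencies(name: str, dependencies: list[str]) -> str:
    """Render a minimal [project] table listing the given dependencies."""
    lines = ["[project]", f"name = {json.dumps(name)}", "dependencies = ["]
    for dep in sorted(dependencies):
        # One requirement per line keeps later diffs small.
        lines.append(f"    {json.dumps(dep)},")
    lines.append("]")
    return "\n".join(lines) + "\n"


# Hypothetical provider.yaml content, as it might look after YAML parsing.
provider_yaml = {
    "package-name": "apache-airflow-providers-example",
    "dependencies": ["requests>=2.31", "apache-airflow>=2.9.0"],
}

fragment = render_dependencies(
    provider_yaml["package-name"], provider_yaml["dependencies"]
)
print(fragment)
```

Sorting the requirements is a design choice for this sketch only: it makes the generated file stable regardless of the order in provider.yaml.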
The ideal setup is to have this kind of structure (details to be worked out):
providers/
    providers-amazon/
        src
        tests-integration
        tests-system
        tests
        docs
        pyproject.toml
    ...
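To make the workspace binding more concrete, the sketch below scans a `providers/` directory for sub-projects that contain a pyproject.toml and renders a `[tool.uv.workspace]` table listing them as members. The helper names, the directory names in the demo and the idea of generating the table at all are assumptions for illustration; uv also accepts glob patterns in `members`, so an explicit list may not be needed.

```python
import tempfile
from pathlib import Path


def find_workspace_members(repo_root: Path, providers_dir: str = "providers") -> list[str]:
    """Return repo-relative paths of provider sub-projects with a pyproject.toml."""
    base = repo_root / providers_dir
    return [
        pyproject.parent.relative_to(repo_root).as_posix()
        for pyproject in sorted(base.glob("*/pyproject.toml"))
    ]


def render_workspace_table(members: list[str]) -> str:
    """Render the [tool.uv.workspace] table for the root pyproject.toml."""
    quoted = ", ".join(f'"{member}"' for member in members)
    return f"[tool.uv.workspace]\nmembers = [{quoted}]\n"


# Demo on a throwaway directory tree that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("providers-amazon", "providers-google"):
        project_dir = root / "providers" / name
        project_dir.mkdir(parents=True)
        (project_dir / "pyproject.toml").write_text("[project]\n")
    # A directory without pyproject.toml is not treated as a sub-project.
    (root / "providers" / "not-a-project").mkdir()

    members = find_workspace_members(root)
    workspace_table = render_workspace_table(members)

print(workspace_table)
```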
Some important properties of the solution:
- Airflow "core" projects should not rely on providers being installed
- It should be possible to install all Airflow core packages and providers and to synchronize/resolve them with `uv sync`
- It should be possible to install a provider in `--editable` mode, treating it as a separate project from the workspace
- It should be possible to install a provider with a GitHub URL
- Docs, all kinds of tests, images etc. should all work independently (though thanks to the monorepo, we can keep the code to run those in `breeze` as we do currently). Some of the current `doc` code will likely need to be moved to breeze as well for that
- We should be able to apply pyproject.toml changes to all providers automatically (this might be semi-automated or done with pre-commit). Quite often we make "global" changes that affect all providers, and currently this is done by modifying breeze and the templates for the dynamically generated pyproject.toml file
- We need to keep the reproducibility of provider packages intact, which likely means that they should still be generated with breeze, with all the "extra" stuff such as making sure we have a controlled package build environment
- We will need to change how packages are built in CI in the `docker container` environment. Currently we use `flit` as the build backend, and this comes from the generated `pyproject.toml` that is placed inside breeze. If we keep pyproject.toml files in the repository, incoming PRs from forks might change the build backend and thus inject arbitrary code into our build process
The most likely way to implement it is to:
- manually convert one or a few representative (but not the biggest) providers first as a POC, and make a few releases with them while updating our breeze automation to work in both cases. That will allow us to iron out teething problems
- develop automation for converting the providers, similar to "Add script to move providers to a new directory structure" #28291
- test whether the `uv workspace` feature is usable at the scale of 100+ projects bound together (and work with the `uv` team to fix it if not)
- apply the automation to all our providers, rather quickly but incrementally, while letting all contributors with in-progress work know about the changes upfront and explaining what needs to be done