Team Based Configuration (AIP-67) #45016

Draft
wants to merge 3 commits into
base: main

o-nikolas
Contributor

PLEASE READ DESCRIPTION

Context: In multi-team Airflow, teams will have their own configuration files for the components that run on team-only hosts (workers, DAG parsers, etc.). However, some team components, namely executors, must live alongside one another on the scheduler host, and some Airflow configuration affecting teams must also be accessible to the scheduler.

This commit delivers the capability for airflow conf to load and allow access to multiple team configurations in addition to the main/global configuration we have today.

A new config, `core.multi_team_configurations` (the name is not set in stone), is added, in which teams and their associated configuration files are specified. E.g.:

path/to/team_a/config:team_a,different/path/team_b/configuration:team_b

Airflow conf, during initialization, loads each of these configurations and makes them accessible by team id, e.g.: `conf.get("core", "executor", team_id="team_a")`

Within those team configurations, teams can specify the executors they would like to use and the associated configuration for those executors. Since each team configuration is loaded and stored separately, this allows multiple instances of the same executor to be configured.

The Base executor has been updated with a config shim to allow easier access to team-based executor configurations, and the AWS ECS executor has been updated to use it as a proof of concept. To minimize the size and scope of this commit, other executors will be updated to be "multi team compliant" at a later time.
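One way to picture the config shim is an executor that remembers its team and passes that team id on every config lookup. The sketch below is illustrative only: `FakeConf`, `TeamAwareExecutor`, and `conf_get` are hypothetical names rather than the interface added in this PR, and the section, key, and cluster values are placeholders.

```python
from __future__ import annotations


class FakeConf:
    """Tiny in-memory stand-in for a conf object that accepts team_id."""

    def __init__(self, global_values: dict, team_values: dict) -> None:
        self._global = global_values
        self._teams = team_values

    def get(self, section: str, key: str, team_id: str | None = None) -> str:
        source = self._global if team_id is None else self._teams[team_id]
        return source[section][key]


class TeamAwareExecutor:
    """Hypothetical base class: binds a team id to every config lookup."""

    def __init__(self, conf: FakeConf, team_id: str | None = None) -> None:
        self._conf = conf
        self.team_id = team_id

    def conf_get(self, section: str, key: str) -> str:
        # The shim: subclasses call conf_get and never pass team_id themselves.
        return self._conf.get(section, key, team_id=self.team_id)


class EcsLikeExecutor(TeamAwareExecutor):
    """Illustrative subclass that reads its settings through the shim."""

    def cluster(self) -> str:
        return self.conf_get("aws_ecs_executor", "cluster")


conf = FakeConf(
    global_values={"aws_ecs_executor": {"cluster": "shared-cluster"}},
    team_values={
        "team_a": {"aws_ecs_executor": {"cluster": "team-a-cluster"}},
        "team_b": {"aws_ecs_executor": {"cluster": "team-b-cluster"}},
    },
)

# Two instances of the same executor class, each with its own configuration.
executor_a = EcsLikeExecutor(conf, team_id="team_a")
executor_b = EcsLikeExecutor(conf, team_id="team_b")
```

Because the team id is bound once at construction, the executor code stays identical whether it runs for one team, another team, or with the global configuration (`team_id=None`).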

NOTE: There was an initial proposal to move to TOML format for Airflow config and store all configuration (both team and global) in one file. This is still a possibility in the future, but the approach in this commit was chosen for the following reasons:

  1. It uses the same configuration format that Airflow has always had, so there is a lower barrier to entry and migration for users.
  2. It is a simpler implementation, which simplifies the overall process of releasing an initial version of multi-team Airflow.
  3. Separate files per team are a convenient mechanism for managing the overall cluster. Teams can update their own files as they see fit, and the files only need to be synced onto the scheduler host for pickup. The alternative is a shared file that no team should be able to view (otherwise they would see the configuration from other teams), so teams could not make their own changes to it.


@boring-cyborg boring-cyborg bot added area:Executors-core LocalExecutor & SequentialExecutor area:providers provider:amazon-aws AWS/Amazon - related issues labels Dec 17, 2024
@o-nikolas o-nikolas added the aip-67 multi-team label Dec 17, 2024
@kaxil
Member

kaxil commented Dec 18, 2024

Nice, should we convert it to a draft PR to avoid an accidental merge, since this is for 3.1?

@o-nikolas
Contributor Author

> Nice, should we convert it to a draft PR to avoid an accidental merge, since this is for 3.1

@kaxil
I can update the PR to add more gating logic. Right now there is some in the executor loader, but I can add another gate on the use of the `core.multi_team_configurations` config (so that users cannot enable multi-team configurations).

Otherwise it will be difficult to develop the rest of the features without the merging of this one. WDYT?

@o-nikolas o-nikolas marked this pull request as draft December 20, 2024 00:05