Remove the ability to set parallelism to infinity in Airflow V3 #41162

Open
1 task done
o-nikolas opened this issue Jul 31, 2024 · 0 comments
Labels
airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes airflow3.0:candidate Potential candidates for Airflow 3.0 kind:feature Feature Requests

o-nikolas commented Jul 31, 2024

Description

See this issue and this subsequent PR for context. In short: there was an undocumented, untested, and buggy feature that allowed setting parallelism to infinity (by setting it to zero), which was dropped during the implementation of Multiple Executor Config. This raised a discussion of whether this feature should really be supported. I propose we drop this feature in Airflow 3 because:

  1. It adds unnecessary complexity to the code
  2. This behaviour can be achieved by setting parallelism to a sufficiently high number
  3. (Most importantly) I think it's actually important for the user to have to do 2): it forces them to actually think "hmm, how parallel should I actually run tasks? Is infinity appropriate? Will infinity actually cause degraded performance?" I think allowing 0 gives folks an easy way to set and forget without weighing the implications.
  4. scheduler.max_tis_per_query is a very important config for performance and depends on core.parallelism: if it is set to 0 (which means it tracks the value of parallelism) and parallelism is itself infinite, then we may have unbounded query sizes, which would drastically impact performance. This is an easy trap for users to fall into.
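
To make point 4 concrete, here is a minimal sketch of how the two settings interact. It is not Airflow's actual scheduler code: the function names (`legacy_parallelism`, `query_limit`, `strict_parallelism`) are hypothetical and only model the semantics described above, with `None` standing in for "unbounded".

```python
from typing import Optional


def legacy_parallelism(parallelism: int) -> Optional[int]:
    """Hypothetical model of the legacy behaviour: 0 means 'infinite' (None)."""
    if parallelism < 0:
        raise ValueError("parallelism must not be negative")
    return None if parallelism == 0 else parallelism


def query_limit(max_tis_per_query: int, parallelism: int) -> Optional[int]:
    """Hypothetical model of the per-query limit.

    max_tis_per_query == 0 means 'track the value of parallelism', so an
    infinite parallelism turns into an unbounded query size (None).
    """
    if max_tis_per_query == 0:
        return legacy_parallelism(parallelism)
    return max_tis_per_query


def strict_parallelism(parallelism: int) -> int:
    """Hypothetical model of the proposal: reject 0, require an explicit bound."""
    if parallelism < 1:
        raise ValueError("parallelism must be a positive integer")
    return parallelism


if __name__ == "__main__":
    print(query_limit(max_tis_per_query=0, parallelism=0))   # unbounded (None)
    print(query_limit(max_tis_per_query=0, parallelism=32))  # follows parallelism
```

Under the proposal, a user who really wants "effectively unlimited" would pass a deliberately large value to something like `strict_parallelism`, which keeps the query size finite even when max_tis_per_query is 0.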

Related issues

#41055
#41107

@o-nikolas o-nikolas added kind:feature Feature Requests needs-triage label for new issues that we didn't triage yet labels Jul 31, 2024
@o-nikolas o-nikolas added airflow3.0:candidate Potential candidates for Airflow 3.0 involves core breaking change labels Jul 31, 2024
@eladkal eladkal removed the needs-triage label for new issues that we didn't triage yet label Aug 1, 2024
@uranusjr uranusjr added airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes involves core breaking change and removed involves core breaking change airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes labels Aug 15, 2024