Fix misleading TODO comments about itertools.batched in selective_checks.py#61573
Conversation
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
shahar1
left a comment
There was a problem hiding this comment.
Thanks for your contribution and welcome to Apache Airflow!
Looking forward for more contributions, feel free to pick any unassigned issue labeled as good first issue.
Also, it seems that you're aligned with the Gen. AI policy, which is great - I do recommend though to moderate its usage at least in your first PRs, so you could actually learn what's going on.
Specifically the TODO comment which you modified into a docstring looks like a great question for code interviews :)
Remove TODO comments suggesting itertools.batched can replace _split_list. These have different semantics: batched creates fixed-size batches while _split_list creates a fixed number of groups. Enhanced docstring to document why batched cannot be used and clarify function behavior.
673dafa to
d09a28f
Compare
|
Thank you for the warm welcome and for merging my first PR! 🙏 |
|
I assume the CI errors are unrelated to your change and are infrastructure related, so they migth go away with next commit. Thanks for the docs additions and clarifications - I assume the TODO needs to stay because yes, we can switch to the optimized approach but only once Python 3.12 is the minimum version. Currently we suppoer Python 3.10-3.13 and therefore we need to assume the code needs to be executed on lowest version as well. So in my view we need to keep the TODO comments. |
The thing is that the alleged "optimized approach" doesn't solve the same issue, which is demonstrated in the PR's description :) |
You are right. Would have been better to read the full PR primer. Convinced. |
) (cherry picked from commit 09a15a4) Co-authored-by: Ahsan Ellahi <37433424+MrEhsanEllahi@users.noreply.github.com>
…che#61573) (cherry picked from commit 09a15a4) Co-authored-by: Ahsan Ellahi <37433424+MrEhsanEllahi@users.noreply.github.com>
|
nice :) |
Description
Removed misleading TODO comments that incorrectly suggested
itertools.batched(Python 3.12+) could replace the custom _split_list function in the Airflow Breeze utilities.Problem
Two TODO comments (lines 422 and 1202 in dev/breeze/src/airflow_breeze/utils/selective_checks.py) suggested that Python 3.12's
itertools.batchedcould replace _split_list. However, investigation revealed these functions have fundamentally different semantics:itertools.batched(iterable, n): Creates batches of size n[[1,2,3,4], [5,6,7,8], [9,10]](3 batches)_split_list(list, n): Creates exactly n groups, distributed as evenly as possible
[[1,2,3,4], [5,6,7], [8,9,10]](3 groups)Testing showed 4 out of 7 test cases fail when attempting replacement, including critical edge cases like empty lists and ensuring exactly n groups are returned.
Solution
itertools.batchedcannot be usedChanges Made
File Modified:
Lines Changed:
Testing
Type of Change
Was generative AI tooling used to co-author this PR?
Generated-by: Google Gemini following the guidelines