Skip to content

Fix misleading TODO comments about itertools.batched in selective_checks.py#61573

Merged
shahar1 merged 2 commits intoapache:mainfrom
MrEhsanEllahi:fix/misleading-itertools-batched-todo
Feb 7, 2026
Merged

Fix misleading TODO comments about itertools.batched in selective_checks.py#61573
shahar1 merged 2 commits intoapache:mainfrom
MrEhsanEllahi:fix/misleading-itertools-batched-todo

Conversation

@MrEhsanEllahi
Copy link
Contributor

Description

Removed misleading TODO comments that incorrectly suggested itertools.batched (Python 3.12+) could replace the custom _split_list function in the Airflow Breeze utilities.

Problem

Two TODO comments (lines 422 and 1202 in dev/breeze/src/airflow_breeze/utils/selective_checks.py) suggested that Python 3.12's itertools.batched could replace _split_list. However, investigation revealed these functions have fundamentally different semantics:

  • itertools.batched(iterable, n): Creates batches of size n

    • Example: batched([1,2,3,4,5,6,7,8,9,10], 4) → [[1,2,3,4], [5,6,7,8], [9,10]] (3 batches)
  • _split_list(list, n): Creates exactly n groups, distributed as evenly as possible

    • Example: _split_list([1,2,3,4,5,6,7,8,9,10], 3) → [[1,2,3,4], [5,6,7], [8,9,10]] (3 groups)

Testing showed 4 out of 7 test cases fail when attempting replacement, including critical edge cases like empty lists and ensuring exactly n groups are returned.

Solution

  1. Removed misleading TODO comments (lines 422 and 1202)
  2. Enhanced function docstring with:
    • Clear explanation of the function's behavior
    • Note explaining why itertools.batched cannot be used
    • Detailed Args/Returns documentation
    • Practical usage example
    • Edge case documentation

Changes Made

File Modified:

  • dev/breeze/src/airflow_breeze/utils/selective_checks.py

Lines Changed:

  • Lines 422-440: Enhanced _split_list docstring and removed TODO comment
  • Line 1215: Removed duplicate TODO comment

Testing

  • No logic changes, only documentation improvements
  • Existing tests in dev/breeze/tests/test_selective_checks.py verify function behavior remains unchanged
  • Function is tested at lines 44 and 90 of the test file

Type of Change

  • Documentation update
  • Code quality improvement (removed misleading comments)

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Google Gemini following the guidelines


@boring-cyborg
Copy link

boring-cyborg bot commented Feb 7, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@boring-cyborg boring-cyborg bot added area:dev-tools backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch labels Feb 7, 2026
Copy link
Contributor

@shahar1 shahar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution and welcome to Apache Airflow!
Looking forward for more contributions, feel free to pick any unassigned issue labeled as good first issue.
Also, it seems that you're aligned with the Gen. AI policy, which is great - I do recommend though to moderate its usage at least in your first PRs, so you could actually learn what's going on.
Specifically the TODO comment which you modified into a docstring looks like a great question for code interviews :)

Remove TODO comments suggesting itertools.batched can replace _split_list.
These have different semantics: batched creates fixed-size batches while
_split_list creates a fixed number of groups. Enhanced docstring to
document why batched cannot be used and clarify function behavior.
@shahar1 shahar1 force-pushed the fix/misleading-itertools-batched-todo branch from 673dafa to d09a28f Compare February 7, 2026 08:06
@MrEhsanEllahi
Copy link
Contributor Author

Thank you for the warm welcome and for merging my first PR! 🙏
I appreciate the feedback on AI usage. To clarify, I actually ran all the tests myself (created 7 test cases, executed them on Python 3.12, analyzed the output) and did the investigation work manually. I used AI mainly to help format things according to contribution guidelines and write the PR description properly.
That said, you're absolutely right that I should moderate its use going forward to really learn the codebase patterns. Looking forward to contributing more!
Thanks again for the guidance and the kind words about the TODO investigation!

@jscheffl
Copy link
Contributor

jscheffl commented Feb 7, 2026

I assume the CI errors are unrelated to your change and are infrastructure related, so they migth go away with next commit.

Thanks for the docs additions and clarifications - I assume the TODO needs to stay because yes, we can switch to the optimized approach but only once Python 3.12 is the minimum version. Currently we suppoer Python 3.10-3.13 and therefore we need to assume the code needs to be executed on lowest version as well. So in my view we need to keep the TODO comments.

@shahar1
Copy link
Contributor

shahar1 commented Feb 7, 2026

I assume the CI errors are unrelated to your change and are infrastructure related, so they migth go away with next commit.

Thanks for the docs additions and clarifications - I assume the TODO needs to stay because yes, we can switch to the optimized approach but only once Python 3.12 is the minimum version. Currently we suppoer Python 3.10-3.13 and therefore we need to assume the code needs to be executed on lowest version as well. So in my view we need to keep the TODO comments.

The thing is that the alleged "optimized approach" doesn't solve the same issue, which is demonstrated in the PR's description :)

@jscheffl
Copy link
Contributor

jscheffl commented Feb 7, 2026

I assume the CI errors are unrelated to your change and are infrastructure related, so they migth go away with next commit.
Thanks for the docs additions and clarifications - I assume the TODO needs to stay because yes, we can switch to the optimized approach but only once Python 3.12 is the minimum version. Currently we suppoer Python 3.10-3.13 and therefore we need to assume the code needs to be executed on lowest version as well. So in my view we need to keep the TODO comments.

The thing is that the alleged "optimized approach" doesn't solve the same issue, which is demonstrated in the PR's description :)

You are right. Would have been better to read the full PR primer. Convinced.

@shahar1 shahar1 merged commit 09a15a4 into apache:main Feb 7, 2026
247 of 248 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 7, 2026
)

(cherry picked from commit 09a15a4)

Co-authored-by: Ahsan Ellahi <37433424+MrEhsanEllahi@users.noreply.github.com>
@github-actions
Copy link

github-actions bot commented Feb 7, 2026

Backport successfully created: v3-1-test

Status Branch Result
v3-1-test PR Link

github-actions bot pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Feb 7, 2026
…che#61573)

(cherry picked from commit 09a15a4)

Co-authored-by: Ahsan Ellahi <37433424+MrEhsanEllahi@users.noreply.github.com>
shahar1 pushed a commit that referenced this pull request Feb 7, 2026
) (#61593)

(cherry picked from commit 09a15a4)

Co-authored-by: Ahsan Ellahi <37433424+MrEhsanEllahi@users.noreply.github.com>
@potiuk
Copy link
Member

potiuk commented Feb 7, 2026

nice :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants