@potiuk potiuk commented Nov 5, 2025

#57901)

After testing with Alpha 2 ATR (Apache Trusted Releases) of airflow-ctl, it turned out that our source tarballs did not fully match what ATR expected.

  • The -source tarballs did not have an internal directory with the -source suffix

  • The verification of those tarballs with RAT 0.17.0 was done using the .rat-excludes from airflow sources - not from the package itself

  • For provider releases, the licence checks verified far more than they should (we should really only verify the sources, not the .whl or sdist files).
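
The first point can be checked locally with a few lines of Python. The snippet below is only a sketch and not the check ATR or breeze runs: the helper names `top_level_entries` and `has_single_source_root` are made up for illustration, and it assumes "internal directory" means a single top-level directory whose name ends with `-source`.

```python
import io
import tarfile


def top_level_entries(tar_bytes: bytes) -> set[str]:
    """Return the distinct first path components of all members in a tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        roots = set()
        for name in tar.getnames():
            # Some tools prefix member names with "./"; normalise that away.
            name = name.removeprefix("./")
            if name and name != ".":
                roots.add(name.split("/", 1)[0])
        return roots


def has_single_source_root(tar_bytes: bytes) -> bool:
    """Hypothetical check: exactly one top-level entry, and it ends with -source."""
    roots = top_level_entries(tar_bytes)
    return len(roots) == 1 and next(iter(roots)).endswith("-source")
```

To try it against a file on disk, pass `pathlib.Path(path).read_bytes()` to `has_single_source_root`.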

The commands to generate the tarballs also had some inconsistencies and repetitions:

  • Literals were sprinkled across the code base

  • In some cases we would produce tarballs when generating distributions, which did not exactly match the general "prepare-airflow-tarball" behaviour regarding tags and versions.

  • It was difficult to test tarball preparation locally before tags were created for the version to test

  • Tarballs should never be produced with an rc suffix. We always produce tarballs with the "final version" of the component, because tarballs are not published to PyPI and can always be promoted to the final version.

This PR simplifies the scripts and makes them more consistent:

  • tarballs are only created with the prepare-tarball command
  • Literals are removed and replaced with the TarballType enum (DRY)
  • behaviour of tarball generation is more predictable now and consistent:
    • you specify tarball type
    • if you do not specify a version, the tag is HEAD and the version is automatically retrieved based on the version of the tarball type
    • if you specify a version, the prefix (airflow-ctl/, airflow-task-sdk/, providers/) is added to the version to derive a tag, and that version is used as the tarball version (prefixed with the type)
    • tarball-type-version-source is used as the top-level folder where the sources are placed
  • verification of licences is now consistent and simplified, and produces better results: it shows counts of checked files and the exclusions used, not only the files that have unknown or unapproved licences. You can actually see that RAT did its job. (cherry picked from commit c3669c0)
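
The tag and version rules above can be summarised as a rough sketch. This is not the code from the PR: only the `TarballType` name and the three tag prefixes come from the description, while the enum members and values, `TarballPlan`, `plan_tarball`, `strip_rc_suffix` and the `detected_version` parameter are hypothetical. It also assumes an rc suffix is simply dropped from the tarball version, and the exact folder naming in the real scripts may differ.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from enum import Enum


class TarballType(Enum):
    # Hypothetical members; values double as the tag prefix from the description.
    AIRFLOW_CTL = "airflow-ctl"
    AIRFLOW_TASK_SDK = "airflow-task-sdk"
    PROVIDERS = "providers"


@dataclass(frozen=True)
class TarballPlan:
    tag: str
    version: str
    top_level_dir: str


def strip_rc_suffix(version: str) -> str:
    """Assumption: tarballs always carry the final version, so drop a trailing rcN."""
    return re.sub(r"rc\d+$", "", version)


def plan_tarball(
    tarball_type: TarballType,
    version: str | None = None,
    detected_version: str = "0.0.0",
) -> TarballPlan:
    """Derive tag, tarball version and top-level folder for a tarball type.

    `detected_version` stands in for whatever version lookup the real
    command performs when no version is given (hypothetical parameter).
    """
    if version is None:
        # No version given: build from HEAD with the auto-detected version.
        tag = "HEAD"
        final_version = strip_rc_suffix(detected_version)
    else:
        # Version given: prefix it with the type to derive the tag.
        tag = f"{tarball_type.value}/{version}"
        final_version = strip_rc_suffix(version)
    top_level_dir = f"{tarball_type.value}-{final_version}-source"
    return TarballPlan(tag=tag, version=final_version, top_level_dir=top_level_dir)
```

For example, `plan_tarball(TarballType.AIRFLOW_CTL, "0.1.0")` yields the tag `airflow-ctl/0.1.0` and the top-level folder `airflow-ctl-0.1.0-source` under these assumptions.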


Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
@boring-cyborg boring-cyborg bot added the area:dev-tools and backport-to-v3-1-test labels Nov 5, 2025
@potiuk potiuk merged commit 7514f11 into apache:v3-1-test Nov 6, 2025
69 checks passed
@potiuk potiuk deleted the backport-c3669c0-v3-1-test branch November 6, 2025 00:09
@ephraimbuddy ephraimbuddy added this to the Airflow 3.1.3 milestone Nov 10, 2025
ephraimbuddy pushed a commit that referenced this pull request Nov 10, 2025
…57901) (#57906)
