@potiuk
Member

@potiuk potiuk commented Nov 5, 2025

After testing airflow-ctl with Alpha 2 ATR (Apache Trusted Releases), it turned out that our source tarballs did not fully match what ATR expected.

  • The -source tarballs did not contain an internal directory with the -source suffix

  • Verification of those tarballs with RAT 0.17.0 used the .rat-excludes from the airflow sources, not the one from the package itself

  • For provider releases, the licence checks verified far more than they should (we should really only verify the sources, not the whl or sdist files)
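
The layout problem in the first bullet can be checked locally. The sketch below is only an illustration under assumptions: `top_level_entries` and `has_single_source_dir` are hypothetical helper names, not functions from this PR, and the check (exactly one top-level directory whose name ends in `-source`) is an interpretation of what ATR expects rather than its documented rule.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(tarball_path: str) -> set[str]:
    """Return the distinct first path components of all members in the tarball."""
    with tarfile.open(tarball_path) as tar:
        return {PurePosixPath(name).parts[0] for name in tar.getnames() if name}


def has_single_source_dir(tarball_path: str) -> bool:
    """True if everything sits under one top-level entry named '*-source'.

    Hypothetical check, not the verification ATR itself performs.
    """
    entries = top_level_entries(tarball_path)
    return len(entries) == 1 and next(iter(entries)).endswith("-source")
```

Running `has_single_source_dir` against a freshly built tarball gives a quick signal before the artifact is uploaded anywhere.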

The commands that generate the tarballs also had some inconsistencies and repetitions:

  • Literals were sprinkled across the code base

  • In some cases we produced tarballs while generating distributions, which did not exactly match the general "prepare-airflow-tarball" behaviour regarding tags and versions

  • It was difficult to test tarball preparation locally before tags were created for the version under test

  • Tarballs should never be produced with an rc suffix. We always produce tarballs with the "final version" of the component, because tarballs are not published on PyPI and can always be promoted to the final version.
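
The "final version" rule in the last bullet can be sketched as a small normalisation step. This is a minimal sketch, assuming rc suffixes look like `rc1`, `.rc1` or `-rc1` at the end of the version string; `strip_rc_suffix` is a hypothetical name and not the helper used in this PR.

```python
import re

# Assumed suffix shapes: "rc1", ".rc1", "-rc1" at the very end of the version.
_RC_SUFFIX = re.compile(r"[.-]?rc\d+$")


def strip_rc_suffix(version: str) -> str:
    """Return the version with any trailing release-candidate suffix removed."""
    return _RC_SUFFIX.sub("", version)
```

A version without an rc suffix passes through unchanged, so the function can be applied unconditionally before naming the tarball.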

This PR simplifies the scripts and makes them more consistent:

  • tarballs are only created with the prepare-tarball command
  • Literals are removed and replaced with the TarballType enum (DRY)
  • behaviour of tarball generation is now more predictable and consistent:
    • you specify the tarball type
    • if you do not specify a version, the tag is HEAD and the version is automatically retrieved based on the version of the tarball type
    • if you specify a version, the prefix (airflow-ctl/, airflow-task-sdk/, providers/) is added to the version to derive a tag, and that version is used as the tarball version (prefixed with the type)
    • tarball-type-version-source is used as the top-level folder where the sources are placed
  • verification of licences is now consistent, simplified and produces better results: it shows counts of checked files and the exclusions used, not only the files that have unknown or unapproved licences. You can actually see that RAT did its job.
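
The tag and folder rules listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the enum member names and values are guessed from the tag prefixes mentioned in the description, and `derive_tag_and_version`, `top_level_folder` and the `detected_version` parameter are hypothetical.

```python
from __future__ import annotations

from enum import Enum


class TarballType(Enum):
    # Member names and values are assumptions chosen to match the tag
    # prefixes named in the description; the real enum may differ.
    AIRFLOW_CTL = "airflow-ctl"
    TASK_SDK = "airflow-task-sdk"
    PROVIDERS = "providers"


def derive_tag_and_version(
    tarball_type: TarballType, version: str | None, detected_version: str
) -> tuple[str, str]:
    """Return (git tag, tarball version).

    Without an explicit version the tag is HEAD and the version is the one
    detected for the tarball type (passed in here as `detected_version`).
    """
    if version is None:
        return "HEAD", detected_version
    return f"{tarball_type.value}/{version}", version


def top_level_folder(tarball_type: TarballType, version: str) -> str:
    """Name of the directory inside the tarball: <type>-<version>-source."""
    return f"{tarball_type.value}-{version}-source"
```

Keeping the prefixes on the enum means the tag, the tarball name and the top-level folder are all derived from one value instead of repeated string literals.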


@bugraoz93
Contributor

Thanks Jarek!

@potiuk potiuk force-pushed the make-source-tarball-preparation-prepared-for-rat branch from 569e2b7 to 0f84aa2 Compare November 5, 2025 23:21
@potiuk potiuk merged commit c3669c0 into apache:main Nov 5, 2025
98 checks passed
@potiuk potiuk deleted the make-source-tarball-preparation-prepared-for-rat branch November 5, 2025 23:45
@github-actions

github-actions bot commented Nov 5, 2025

Backport failed to create: v3-1-test. View the failure log Run details

Status Branch Result
v3-1-test Commit Link

You can attempt to backport this manually by running:

cherry_picker c3669c0 v3-1-test

This should apply the commit to the v3-1-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

potiuk added a commit to potiuk/airflow that referenced this pull request Nov 5, 2025
…pache#57901)

(cherry picked from commit c3669c0)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
@potiuk
Member Author

potiuk commented Nov 5, 2025

Manual backport #57906

potiuk added a commit that referenced this pull request Nov 6, 2025
…57901) (#57906)

(cherry picked from commit c3669c0)
xchwan pushed a commit to xchwan/airflow that referenced this pull request Nov 6, 2025
ephraimbuddy pushed a commit that referenced this pull request Nov 10, 2025
…57901) (#57906)

(cherry picked from commit c3669c0)
Copilot AI pushed a commit to jason810496/airflow that referenced this pull request Dec 5, 2025

Labels

area:dev-tools backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch
