Add support for reproducible build date epoch for Airflow releases #36726
Quick follow-up after #36537: adding reproducible build support for Airflow packages.
Had to move the location of reproducible_build.yaml. It is now in the "airflow" root, which is better because it also allows anyone who just gets the Airflow sources to run a reproducible build. I will likely have to contribute a small thing later (or maybe a plugin will be enough) so that hatchling can use that information without the environment variable being set first. Might be a nice contribution to hatchling :)
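For readers following along, here is a minimal sketch of the "set the environment variable first" step mentioned above. It assumes the file holds a flat `key: value` entry named `source-date-epoch`. That key name and the helper names are illustrative assumptions, not confirmed by this thread, and the parsing is deliberately a simple line scan rather than a full YAML load. `SOURCE_DATE_EPOCH` is the standard variable that hatchling honours for reproducible builds.

```python
import os
from pathlib import Path


def read_source_date_epoch(path: Path, key: str = "source-date-epoch") -> int:
    """Return the integer stored under `key` in a flat `key: value` file.

    The key name is an assumption made for this sketch.
    """
    for line in path.read_text().splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return int(value.strip())
    raise KeyError(f"{key} not found in {path}")


def build_env(epoch: int) -> dict:
    """Copy the current environment and pin SOURCE_DATE_EPOCH for the build."""
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = str(epoch)
    return env
```

The returned mapping can then be passed as `env=` to `subprocess.run` for whatever build command is in use, so the calling shell's environment stays untouched.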
Would love to merge that one. Then I could cherry-pick it to 2.8.1 and already get a reproducible 2.8.1 build :)
hussein-awala left a comment
Looks good, just a small nit, LGTM
Hatch has built-in support for reproducible builds. However, it uses a hard-coded 2020 date to generate the reproducible binaries, which produces whl and tar.gz files containing file dates that are quite old. This might be confusing for anyone who looks at the file contents and the timestamps inside.
This PR adds support (similar to the provider approach) for storing the current reproducible date in the repository, so that it can be committed and tagged together with the Airflow sources. It is updated fully automatically by pre-commit whenever the release notes change, which means that whenever the release notes are updated just before a release, the reproducible date is updated to the current date.
For now we only check whether the packages produced by the hatchling build are reproducible.
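The reproducibility check described above can be illustrated with a small sketch: build the packages twice into two directories and compare checksums. This is not the check used in the PR. The function names and the directory layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def non_reproducible_artifacts(first: Path, second: Path) -> list:
    """List file names that are missing from one directory or differ in content."""
    names = {p.name for p in first.iterdir() if p.is_file()}
    names |= {p.name for p in second.iterdir() if p.is_file()}
    mismatches = []
    for name in sorted(names):
        a, b = first / name, second / name
        if not (a.is_file() and b.is_file()) or sha256_of(a) != sha256_of(b):
            mismatches.append(name)
    return mismatches
```

An empty list means every artifact in the two build directories is byte-for-byte identical.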
Resolved both :)
…36726)
Hatch has built-in support for reproducible builds. However, it uses a hard-coded 2020 date to generate the reproducible binaries, which produces whl and tar.gz files containing file dates that are quite old. This might be confusing for anyone who looks at the file contents and the timestamps inside.
This PR adds support (similar to the provider approach) for storing the current reproducible date in the repository, so that it can be committed and tagged together with the Airflow sources. It is updated fully automatically by pre-commit whenever the release notes change, which means that whenever the release notes are updated just before a release, the reproducible date is updated to the current date.
For now we only check whether the packages produced by the hatchling build are reproducible.
(cherry picked from commit a2d6c38)
Following apache#36726, apache#36744, apache#36763, apache#36819, this PR makes the source tarball that we publish as an official ASF release of the Helm Chart a reproducible tarball. This means that anyone should be able to produce such a tarball from the Airflow sources and verify that the tarball pushed to SVN by the release manager was built from our source repositories.
We also do the same with the Helm package. It turns out that gpg signing of the package does not modify the .tgz file. It only adds a .prov file containing the checksum and signature, so we can safely repack the .tar.gz package in a reproducible way. This way both reproducibility and the provenance check work together.
There are a few related changes in this PR:
* Bumped the Helm version in our environment to the latest one and used the `breeze k8s setup-env` environment to run all the release commands. This way we can be sure the same helm version is used to build the package, which makes it more reproducible.
* The reproducible packaging utility has been refactored. It now takes a "source" archive as a parameter rather than a directory and simply repacks it in a reproducible way.
* The tool also removes group/other permissions on its own, because `helm package` has no option to umask the generated files.
* Subcharts are no longer exported to the source tarball package, as we should not include source files from postgres in our source package.
* Both the tarball and the helm package are generated in the `dist` folder, like all our other packages.
* Documentation for releasing the packages and verifying them is updated.
* CI jobs are updated to use the new commands, and the generated packages are produced as artifacts so that we can be sure the commands continue working and produce the right output.
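As an illustration of what "repack the source archive in a reproducible way" can look like, here is a minimal sketch using only the Python standard library. It is not the utility from this PR. The function name, the choice to sort members by name, resetting the owner to uid/gid 0, and reading "group/other removal" as clearing the group/other write bits (the effect of a 022 umask) are all assumptions made for the example.

```python
import gzip
import tarfile


def repack_reproducibly(source: str, target: str, epoch: int) -> None:
    """Rewrite a tar archive so the output bytes depend only on names and contents."""
    with tarfile.open(source, "r:*") as src, open(target, "wb") as raw:
        # filename="" and mtime=0 keep build-time data out of the gzip header.
        with gzip.GzipFile(filename="", fileobj=raw, mode="wb", mtime=0) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as dst:
                # Sort entries so the member order does not depend on how the source was built.
                for member in sorted(src.getmembers(), key=lambda m: m.name):
                    member.mtime = epoch               # fixed timestamp
                    member.uid = member.gid = 0        # drop the builder's identity
                    member.uname = member.gname = ""
                    member.mode &= ~0o022              # assumed: clear group/other write bits
                    member.pax_headers = {}
                    # extractfile() returns None for directories and other non-file entries.
                    data = src.extractfile(member) if member.isfile() else None
                    dst.addfile(member, data)
```

Calling `repack_reproducibly("chart-source.tar.gz", "dist/chart-source.tar.gz", epoch)` (paths are hypothetical) on two archives with the same file names and contents should yield identical bytes regardless of the original timestamps, owners, or member order.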
Following #36726, #36744, #36763, #36819, this PR makes the source tarball that we publish as an official ASF release of the Helm Chart a reproducible tarball. This means that anyone should be able to produce such a tarball from the Airflow sources and verify that the tarball pushed to SVN by the release manager was built from our source repositories.
We also do the same with the Helm package. It turns out that gpg signing of the package does not modify the .tgz file. It only adds a .prov file containing the checksum and signature, so we can safely repack the .tar.gz package in a reproducible way. This way both reproducibility and the provenance check work together.
There are a few related changes in this PR:
* Bumped the Helm version in our environment to the latest one and used the `breeze k8s setup-env` environment to run all the release commands. This way we can be sure the same helm version is used to build the package, which makes it more reproducible.
* The reproducible packaging utility has been refactored. It now takes a "source" archive as a parameter rather than a directory and simply repacks it in a reproducible way.
* The tool also removes group/other permissions on its own, because `helm package` has no option to umask the generated files.
* Subcharts are no longer exported to the source tarball package, as we should not include source files from postgres in our source package.
* Both the tarball and the helm package are generated in the `dist` folder, like all our other packages.
* Documentation for releasing the packages and verifying them is updated.
* CI jobs are updated to use the new commands, and the generated packages are produced as artifacts so that we can be sure the commands continue working and produce the right output.
(cherry picked from commit 48158c9)