Update Airflow release process to include reproducible tarballs #36744
ephraimbuddy
left a comment
LGTM
Yeah. As the next step after that, I want to turn that script into something that runs regularly in our CI, similar to our other release commands in Breeze. That will hopefully prevent any typos and make it ...
amoghrajesh
left a comment
@potiuk 🚢 it!
Only a few nits!
The source tarball is the main artifact produced by the release process - the one that is the "official" release and is named as such by the Apache Software Foundation. This PR makes source tarball generation reproducible - following the reproducibility of the `.whl` and `sdist` packages.

This change:

* vendors in the `reproducible.py` script, which repacks a `.tar.gz` package in a reproducible way using source-date-epoch as the timestamp
* adds the `breeze release-management prepare-airflow-tarball` command
* adds verification of the tarballs to the PMC verification process
* adds `--use-local-hatch` to the package building command to allow a faster, non-docker build of packages for PMC verification
* improves the diagnostic output of the release and build commands
(cherry picked from commit 72a571d)
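The repacking step described in the commit message above can be sketched as follows. This is a minimal illustration, not the vendored `reproducible.py` script: the function name `repack_reproducibly`, the choice of GNU tar format, and the zeroed owner fields are assumptions made for the example. It only shows the general technique of pinning every timestamp to a single source-date-epoch value and fixing the entry order.

```python
import gzip
import io
import tarfile


def repack_reproducibly(source: bytes, timestamp: int) -> bytes:
    """Repack a .tar.gz archive (given as bytes) into a deterministic .tar.gz.

    Hypothetical helper for illustration only.
    """
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(source), mode="r:gz") as src:
        # Sort entries by name so the order does not depend on how the
        # original archive happened to be created.
        members = sorted(src.getmembers(), key=lambda m: m.name)
        # The gzip header carries its own timestamp and file name; pin both.
        with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=timestamp) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as dst:
                for member in members:
                    # Replace per-machine metadata with fixed values.
                    member.mtime = timestamp
                    member.uid = member.gid = 0
                    member.uname = member.gname = ""
                    data = src.extractfile(member) if member.isfile() else None
                    dst.addfile(member, data)
    return out.getvalue()
```

A caller would typically take the timestamp from the `SOURCE_DATE_EPOCH` environment variable, so that two people building from the same commit pass the same value and obtain byte-identical output.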
Following apache#36726, apache#36744, apache#36763, apache#36819, this PR makes the source tarball that we release as an official ASF release for the Helm Chart reproducible. This means that anyone should be able to produce such a tarball from the Airflow sources and verify that the tarball pushed to SVN by the release manager is built from our source repositories.

We also do the same with the Helm package. It turns out that gpg signing of the package does not modify the `.tgz` file - it just adds a `.prov` file containing the checksum and signature - so we can safely repack the `.tar.gz` package in a reproducible way. This way we have both reproducibility and the provenance check working together.

There are a few related changes in this PR:

* Bumped the Helm version in our environment to the latest one and switched to the `breeze k8s setup-env` environment for running all the release commands - this way we can be sure the same Helm version is used to build the package, which makes the build more reproducible.
* The reproducible packaging utility has been refactored - it now takes a "source" archive as a parameter rather than a directory and simply repacks it in a reproducible way.
* The tool also applies group/other ownership removal on its own, because `helm package` has no option to umask the generated files.
* Subcharts are now excluded from the export to the source tarball package, as we should not include source files from postgres in our source package.
* Both the tarball and the Helm package are generated in the `dist` folder, like all our other packages.
* Documentation for releasing the packages and verifying them is updated.
* CI jobs are updated to use the new commands, and the generated packages are produced as artifacts so that we can be sure the commands continue working and produce the right output.
(cherry picked from commit 48158c9)
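Two of the ideas in the commit message above, checking that a locally built artifact matches the released one and normalizing permissions during repacking, can be sketched as below. All names here (`sha512_hex`, `same_artifact`, `drop_group_other_permissions`) are hypothetical. Reading "group/other ownership removal" as clearing the group and other permission bits is an assumption, and the source does not state the exact mask used.

```python
import hashlib
import tarfile
from pathlib import Path


def sha512_hex(path: Path) -> str:
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(local_build: Path, released: Path) -> bool:
    """True when the locally rebuilt package is byte-identical to the released one."""
    return sha512_hex(local_build) == sha512_hex(released)


def drop_group_other_permissions(member: tarfile.TarInfo) -> tarfile.TarInfo:
    """Keep only the owner permission bits (assumed mask, see the note above)."""
    member.mode &= 0o700
    return member
```

With a reproducible build, a verifier only needs to rebuild the package from the tagged sources and compare digests. No inspection of the archive contents is required when the digests match.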