Update Airflow release process to include reproducible tarballs (apache#36744)

The source tarball is the main artifact produced by the release
process - the one that is the "official" release and is named as such
by the Apache Software Foundation.

This PR makes source tarball generation reproducible, following the
reproducibility of the `.whl` and `sdist` packages.

This change adds:

* vendors-in a reproducible.py script that repacks the .tar.gz package
  in a reproducible way, using source-date-epoch as timestamps (see the
  sketch after this list)
* adds the breeze release-management prepare-airflow-tarball command
* adds verification of the tarballs to the PMC verification process
* adds --use-local-hatch to the package building command to allow a
  faster / non-docker build of packages for PMC verification
* improves diagnostic output of the release and build commands
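
As a rough illustration of what the vendored repacking does (a minimal sketch, not the
vendored reproducible.py itself; the function name and exact normalisation details are
assumptions): unpack the archive, clamp every entry's metadata to a fixed timestamp, sort
the entries, and write the gzip header with mtime=0 so two builds of the same sources are
byte-for-byte identical.

```python
import gzip
import tarfile


def repack_deterministically(src_tar_gz: str, dest_tar_gz: str, timestamp: int) -> None:
    """Repack a .tar.gz so repeated builds produce byte-identical output (sketch)."""

    def reset(member: tarfile.TarInfo) -> tarfile.TarInfo:
        # Clamp metadata that normally varies between machines and builds.
        member.mtime = timestamp
        member.uid = member.gid = 0
        member.uname = member.gname = ""
        return member

    with tarfile.open(src_tar_gz) as src:
        members = sorted(src.getmembers(), key=lambda m: m.name)  # stable entry order
        with open(dest_tar_gz, "wb") as out:
            # mtime=0 keeps the gzip header free of the current wall-clock time.
            with gzip.GzipFile(filename="", mode="wb", fileobj=out, mtime=0) as gz:
                with tarfile.open(fileobj=gz, mode="w") as dst:
                    for member in members:
                        fileobj = src.extractfile(member) if member.isfile() else None
                        dst.addfile(reset(member), fileobj)
```

With timestamps fixed this way, re-running the release against the same commit and the
same source-date-epoch yields an identical `-source.tar.gz`.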
potiuk authored Jan 12, 2024
1 parent 512461c commit 72a571d
Showing 19 changed files with 514 additions and 120 deletions.
2 changes: 2 additions & 0 deletions .rat-excludes
@@ -132,6 +132,8 @@ scripts/*
images/*
dev/*
chart/*.iml
out/*
airflow-build-dockerfile*

# Sha files
.*sha256
8 changes: 8 additions & 0 deletions 3rd-party-licenses/LICENSE-reproducible.txt
@@ -0,0 +1,8 @@
# Copyright 2013 The Servo Project Developers.
# Copyright 2017 zerolib Developers.
#
# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
# http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
# <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
# option. This file may not be copied, modified, or distributed
# except according to those terms.
24 changes: 24 additions & 0 deletions BREEZE.rst
@@ -1981,6 +1981,30 @@ default is to build ``both`` type of packages ``sdist`` and ``wheel``.
:alt: Breeze release-management prepare-airflow-package
Preparing airflow tarball
"""""""""""""""""""""""""
You can prepare the airflow source tarball using Breeze:

.. code-block:: bash

    breeze release-management prepare-airflow-tarball

This prepares the airflow ``-source.tar.gz`` package in the ``dist`` folder.

You must specify the ``--version`` flag, which is the pre-release version of Airflow you are preparing the
tarball for.

.. code-block:: bash

    breeze release-management prepare-airflow-tarball --version 2.8.0rc1
.. image:: ./images/breeze/output_release-management_prepare-airflow-tarball.svg
:target: https://raw.githubusercontent.com/apache/airflow/main/images/breeze/output_release-management_prepare-airflow-tarball.svg
:width: 100%
:alt: Breeze release-management prepare-airflow-tarball
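For reproducibility, the repacked tarball uses a single "source date epoch" for all file timestamps.
The snippet below is an illustrative sketch of how such an epoch is commonly derived (the committer
timestamp of the last commit); the actual helper used by Breeze lives in
``airflow_breeze.utils.reproducible`` and may differ:

.. code-block:: python

    import subprocess

    def get_source_date_epoch(ref: str = "HEAD") -> int:
        # Committer timestamp of ``ref`` as a UNIX epoch (illustrative sketch only).
        result = subprocess.run(
            ["git", "log", "-1", "--format=%ct", ref],
            check=True,
            capture_output=True,
            text=True,
        )
        return int(result.stdout.strip())

The resulting value is exported as ``SOURCE_DATE_EPOCH`` for the build, as the release command does
when building packages with ``hatch``.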
Start minor branch of Airflow
"""""""""""""""""""""""""""""
15 changes: 11 additions & 4 deletions dev/README_RELEASE_AIRFLOW.md
@@ -585,16 +585,23 @@ Airflow supports reproducible builds, which means that the packages prepared from the same sources should
produce binary-identical packages in a reproducible way. You should check if the packages can be
binary-reproduced when built from the sources.

Checkout airflow sources and build packages in dist folder:
Checkout airflow sources and build packages in dist folder (replace X.Y.Zrc1 with the version
you are checking):

```shell script
git checkout X.Y.Zrc1
VERSION=X.Y.Zrc1
git checkout ${VERSION}
export AIRFLOW_REPO_ROOT=$(pwd)
rm -rf dist/*
breeze release-management prepare-airflow-package --package-format both
breeze release-management prepare-airflow-tarball --version ${VERSION}
breeze release-management prepare-airflow-package --package-format both --use-local-hatch
```

That should produce `.whl` and `.tar.gz` packages in dist folder.
Note that you need to have `hatch` installed in order to build the packages with the last command.
If you do not have `hatch`, you can remove the `--use-local-hatch` flag and it will build and use a
docker image that has `hatch` and the other necessary tools installed.

That should produce the `-source.tar.gz` source tarball as well as `.whl` and `.tar.gz` packages in the dist folder.

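A quick way to confirm that the locally built artifacts are binary-identical to the ones in svn is to
compare checksums. The snippet below is an illustrative sketch (not part of the official release docs);
the `svn_dir` path is an assumption and should point at the directory holding the artifacts you
downloaded from svn:

```python
import hashlib
from pathlib import Path

dist_dir = Path("dist")
svn_dir = Path("../asf-dist/dev/airflow/X.Y.Zrc1")  # assumed location of the svn checkout


def sha512(path: Path) -> str:
    return hashlib.sha512(path.read_bytes()).hexdigest()


# Compare each locally built package against the file of the same name from svn.
for local in sorted(dist_dir.glob("apache?airflow*")):
    remote = svn_dir / local.name
    status = "OK" if remote.exists() and sha512(local) == sha512(remote) else "MISMATCH"
    print(f"{status}: {local.name}")
```
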
Change to the directory where you have the packages from svn:

136 changes: 100 additions & 36 deletions dev/breeze/src/airflow_breeze/commands/release_candidate_command.py
@@ -17,6 +17,7 @@
from __future__ import annotations

import os
import shutil

import click

@@ -25,6 +26,7 @@
from airflow_breeze.utils.confirm import confirm_action
from airflow_breeze.utils.console import console_print
from airflow_breeze.utils.path_utils import AIRFLOW_SOURCES_ROOT
from airflow_breeze.utils.reproducible import archive_deterministically, get_source_date_epoch
from airflow_breeze.utils.run_utils import run_command

CI = os.environ.get("CI")
@@ -59,21 +61,30 @@ def merge_pr(version_branch):
def git_tag(version):
if confirm_action(f"Tag {version}?"):
run_command(["git", "tag", "-s", f"{version}", "-m", f"Apache Airflow {version}"], check=True)
console_print("Tagged")
console_print("[success]Tagged")


def git_clean():
if confirm_action("Clean git repo?"):
run_command(["breeze", "ci", "fix-ownership"], dry_run_override=DRY_RUN, check=True)
run_command(["git", "clean", "-fxd"], dry_run_override=DRY_RUN, check=True)
console_print("Git repo cleaned")
console_print("[success]Git repo cleaned")


def tarball_release(version, version_without_rc):
if confirm_action("Create tarball?"):
run_command(["rm", "-rf", "dist"], check=True)
DIST_DIR = AIRFLOW_SOURCES_ROOT / "dist"
OUT_DIR = AIRFLOW_SOURCES_ROOT / "out"
REPRODUCIBLE_DIR = OUT_DIR / "reproducible"


run_command(["mkdir", "dist"], check=True)
def tarball_release(version: str, version_without_rc: str, source_date_epoch: int):
if confirm_action("Create tarball?"):
console_print(f"[info]Creating tarball for Airflow {version}")
shutil.rmtree(OUT_DIR, ignore_errors=True)
DIST_DIR.mkdir(exist_ok=True)
OUT_DIR.mkdir(exist_ok=True)
REPRODUCIBLE_DIR.mkdir(exist_ok=True)
archive_name = f"apache-airflow-{version_without_rc}-source.tar.gz"
temporary_archive = OUT_DIR / archive_name
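        # The archive produced by the `git archive` call below is only an intermediate artifact:
        # it is unpacked into REPRODUCIBLE_DIR and then re-archived with entry timestamps clamped
        # to source_date_epoch, so the final -source.tar.gz in dist/ is reproducible.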
run_command(
[
"git",
@@ -82,19 +93,48 @@ def tarball_release(version, version_without_rc):
f"{version}",
f"--prefix=apache-airflow-{version_without_rc}/",
"-o",
f"dist/apache-airflow-{version_without_rc}-source.tar.gz",
temporary_archive.as_posix(),
],
check=True,
)
console_print("Tarball created")
run_command(
[
"tar",
"-xf",
temporary_archive.as_posix(),
"-C",
REPRODUCIBLE_DIR.as_posix(),
"--strip",
"1",
]
)
final_archive = DIST_DIR / archive_name
archive_deterministically(
dir_to_archive=REPRODUCIBLE_DIR.as_posix(),
dest_archive=final_archive.as_posix(),
prepend_path=None,
timestamp=source_date_epoch,
)
console_print(f"[success]Tarball created in {final_archive}")


def create_artifacts_with_sdist():
run_command(["hatch", "build", "-t", "sdist", "-t", "wheel"], check=True)
console_print("Artifacts created")
def create_artifacts_with_hatch(source_date_epoch: int):
console_print("[info]Creating artifacts with hatch")
shutil.rmtree(DIST_DIR, ignore_errors=True)
DIST_DIR.mkdir(exist_ok=True)
env_copy = os.environ.copy()
env_copy["SOURCE_DATE_EPOCH"] = str(source_date_epoch)
run_command(
["hatch", "build", "-c", "-t", "custom", "-t", "sdist", "-t", "wheel"], check=True, env=env_copy
)
console_print("[success]Successfully prepared Airflow packages:")
for file in sorted(DIST_DIR.glob("apache_airflow*")):
        console_print(file.name)
console_print()


def create_artifacts_with_breeze():
def create_artifacts_with_docker():
console_print("[info]Creating artifacts with docker")
run_command(
[
"breeze",
@@ -105,14 +145,14 @@ def create_artifacts_with_breeze():
],
check=True,
)
console_print("Artifacts created")
console_print("[success]Artifacts created")


def sign_the_release(repo_root):
if confirm_action("Do you want to sign the release?"):
os.chdir(f"{repo_root}/dist")
run_command("./../dev/sign.sh *", dry_run_override=DRY_RUN, check=True, shell=True)
console_print("Release signed")
console_print("[success]Release signed")


def tag_and_push_constraints(version, version_branch):
@@ -135,7 +175,7 @@ def tag_and_push_constraints(version, version_branch):
run_command(
["git", "push", "origin", "tag", f"constraints-{version}"], dry_run_override=DRY_RUN, check=True
)
console_print("Constraints tagged and pushed")
console_print("[success]Constraints tagged and pushed")


def clone_asf_repo(version, repo_root):
@@ -146,15 +186,15 @@ def clone_asf_repo(version, repo_root):
check=True,
)
run_command(["svn", "update", "--set-depth=infinity", "asf-dist/dev/airflow"], check=True)
console_print("Cloned ASF repo successfully")
console_print("[success]Cloned ASF repo successfully")


def move_artifacts_to_svn(version, repo_root):
if confirm_action("Do you want to move artifacts to SVN?"):
os.chdir(f"{repo_root}/asf-dist/dev/airflow")
run_command(["svn", "mkdir", f"{version}"], dry_run_override=DRY_RUN, check=True)
run_command(f"mv {repo_root}/dist/* {version}/", dry_run_override=DRY_RUN, check=True, shell=True)
console_print("Moved artifacts to SVN:")
console_print("[success]Moved artifacts to SVN:")
run_command(["ls"], dry_run_override=DRY_RUN)


@@ -171,7 +211,7 @@ def push_artifacts_to_asf_repo(version, repo_root):
dry_run_override=DRY_RUN,
check=True,
)
console_print("Files pushed to svn")
console_print("[success]Files pushed to svn")


def delete_asf_repo(repo_root):
@@ -182,7 +222,7 @@

def prepare_pypi_packages(version, version_suffix, repo_root):
if confirm_action("Prepare pypi packages?"):
console_print("Preparing PyPI packages")
console_print("[info]Preparing PyPI packages")
os.chdir(repo_root)
run_command(["git", "checkout", f"{version}"], dry_run_override=DRY_RUN, check=True)
run_command(
@@ -198,13 +238,13 @@
check=True,
)
run_command(["twine", "check", "dist/*"], check=True)
console_print("PyPI packages prepared")
console_print("[success]PyPI packages prepared")


def push_packages_to_pypi(version):
if confirm_action("Do you want to push packages to production PyPI?"):
run_command(["twine", "upload", "-r", "pypi", "dist/*"], dry_run_override=DRY_RUN, check=True)
console_print("Packages pushed to production PyPI")
console_print("[success]Packages pushed to production PyPI")
console_print(
"Again, confirm that the package is available here: https://pypi.python.org/pypi/apache-airflow"
)
@@ -240,7 +280,7 @@ def push_release_candidate_tag_to_github(version):
)
confirm_action(f"Confirm that {version} is pushed to PyPI(not PyPI test). Is it pushed?", abort=True)
run_command(["git", "push", "origin", "tag", f"{version}"], dry_run_override=DRY_RUN, check=True)
console_print("Release candidate tag pushed to GitHub")
console_print("[success]Release candidate tag pushed to GitHub")


def create_issue_for_testing(version, previous_version, github_token):
@@ -293,10 +333,31 @@ def remove_old_releases(version, repo_root):
dry_run_override=DRY_RUN,
check=True,
)

console_print("[success]Old releases removed")
os.chdir(repo_root)


@release_management.command(
name="prepare-airflow-tarball",
help="Prepare airflow's source tarball.",
)
@click.option(
"--version", required=True, help="The release candidate version e.g. 2.4.3rc1", envvar="VERSION"
)
def prepare_airflow_tarball(version: str):
from packaging.version import Version

airflow_version = Version(version)
if not airflow_version.is_prerelease:
exit("--version value must be a pre-release")
source_date_epoch = get_source_date_epoch()
version_without_rc = airflow_version.base_version
# Create the tarball
tarball_release(
version=version, version_without_rc=version_without_rc, source_date_epoch=source_date_epoch
)


@release_management.command(
name="start-rc-process",
short_help="Start RC process",
@@ -311,7 +372,8 @@ def remove_old_releases(version, repo_root):
def publish_release_candidate(version, previous_version, github_token):
from packaging.version import Version

if not Version(version).is_prerelease:
airflow_version = Version(version)
if not airflow_version.is_prerelease:
exit("--version value must be a pre-release")
if Version(previous_version).is_prerelease:
exit("--previous-version value must be a release not a pre-release")
@@ -320,9 +382,10 @@ def publish_release_candidate(version, previous_version, github_token):
if not github_token:
console_print("GITHUB_TOKEN is not set! Issue generation will fail.")
confirm_action("Do you want to continue?", abort=True)
version_suffix = version[5:]
version_branch = version[:3].replace(".", "-")
version_without_rc = version[:5]

version_suffix = airflow_version.pre[0] + str(airflow_version.pre[1])
version_branch = str(airflow_version.release[0]) + "-" + str(airflow_version.release[1])
version_without_rc = airflow_version.base_version
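    # Worked example (illustrative): for version == "2.8.0rc1", packaging's Version gives
    #   airflow_version.release == (2, 8, 0), airflow_version.pre == ("rc", 1),
    #   airflow_version.base_version == "2.8.0",
    # so version_suffix == "rc1", version_branch == "2-8" and version_without_rc == "2.8.0".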
os.chdir(AIRFLOW_SOURCES_ROOT)
airflow_repo_root = os.getcwd()

@@ -343,20 +406,21 @@ def publish_release_candidate(version, previous_version, github_token):
confirm_action("Pushes will be made to origin. Do you want to continue?", abort=True)
# Merge the sync PR
merge_pr(version_branch)

# Tag & clean the repo
    # Tag & clean the repo
git_tag(version)
git_clean()
# Build the latest image
if confirm_action("Build latest breeze image?"):
run_command(["breeze", "ci-image", "build", "--python", "3.8"], dry_run_override=DRY_RUN, check=True)
source_date_epoch = get_source_date_epoch()
shutil.rmtree(DIST_DIR, ignore_errors=True)
# Create the tarball
tarball_release(version, version_without_rc)
tarball_release(
version=version, version_without_rc=version_without_rc, source_date_epoch=source_date_epoch
)
# Create the artifacts
if confirm_action("Use breeze to create artifacts?"):
create_artifacts_with_breeze()
if confirm_action("Use docker to create artifacts?"):
create_artifacts_with_docker()
elif confirm_action("Use hatch to create artifacts?"):
create_artifacts_with_sdist()
        create_artifacts_with_hatch(source_date_epoch)
# Sign the release
sign_the_release(airflow_repo_root)
# Tag and push constraints