Commit 3a8d6ba

Improve reproducibility for release candidate artifacts
The SBT native packager plugin is used to build helper binaries for release candidates. In some cases these binaries are difficult to check for reproducibility because of metadata embedded in the files. This commit modifies our SBT configuration to remove as much of that variance as possible.

For tar, this adds options (based on the tar reproducibility documentation) that set values such as user IDs and modification times to consistent values (the modification time is taken from SOURCE_DATE_EPOCH when it is set). Note that the --sort=name option does not work with the versions of tar available on the GitHub Windows and macOS systems, so we now generate the tar only on Linux CI.

For rpm, this sets a number of macros (e.g. buildhost) so that the embedded values in the RPM are always the same regardless of the actual environment properties, which can differ between systems. We also change the shebang in the bash script to be more portable. Note that there are still some macros in RPM that cannot be controlled by %defines, so in general the same or a similar environment is needed for reproducible RPMs.

For msi, there is nothing more we can do. There are only a couple of timestamps and UUIDs that cannot be changed. msidiff is a useful tool that shows these are the only differences.

Zip artifacts are already reproducible and do not need changes.

DAFFODIL-2971
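As an illustration of the tar options described in the message, the following is a minimal shell sketch, not the project's build. It archives the same directory twice with those options and compares checksums. It assumes GNU tar 1.28 or later (for --sort=name) and coreutils sha256sum. The names workdir, stage, and make_tar are hypothetical, as is the fallback SOURCE_DATE_EPOCH value of 0.

```shell
#!/usr/bin/env bash
# Sketch: archive the same directory twice and compare checksums.
# Assumes GNU tar >= 1.28 (for --sort=name) and coreutils sha256sum.
set -euo pipefail

# Hypothetical fallback for this demo; a real build would export its own value.
: "${SOURCE_DATE_EPOCH:=0}"
export SOURCE_DATE_EPOCH

# Hypothetical staging directory standing in for the packaged files.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/stage/bin" "$workdir/stage/lib"
echo 'echo hello' > "$workdir/stage/bin/tool"
echo 'placeholder' > "$workdir/stage/lib/example.jar"

make_tar() {
  # $1 = output archive. The options fix member order, ownership, and mtime.
  tar --sort=name --owner=0 --group=0 --numeric-owner \
      --mtime="@${SOURCE_DATE_EPOCH}" \
      -C "$workdir" -cf "$1" stage
}

make_tar "$workdir/a.tar"
touch "$workdir/stage/bin/tool"   # change the on-disk mtime between builds
make_tar "$workdir/b.tar"

sum_a=$(sha256sum "$workdir/a.tar" | cut -d' ' -f1)
sum_b=$(sha256sum "$workdir/b.tar" | cut -d' ' -f1)
echo "$sum_a"
echo "$sum_b"
```

If the two printed checksums match, the archive no longer depends on file system order, the building user, or on-disk modification times. The sketch writes an uncompressed tar to keep compressor metadata out of the comparison.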
1 parent 131bf2c commit 3a8d6ba

3 files changed

+57
-10
lines changed

.github/workflows/main.yml

+4-4
@@ -184,12 +184,12 @@ jobs:
       - name: Build Documentation
         run: $SBT unidoc genTunablesDoc
 
-      - name: Package Zip & Tar
-        run: $SBT daffodil-cli/Universal/packageBin daffodil-cli/Universal/packageZipTarball
+      - name: Package Zip
+        run: $SBT daffodil-cli/Universal/packageBin
 
-      - name: Package RPM (Linux)
+      - name: Package RPM & Tar (Linux)
         if: runner.os == 'Linux'
-        run: $SBT daffodil-cli/Rpm/packageBin
+        run: $SBT daffodil-cli/Rpm/packageBin daffodil-cli/Universal/packageZipTarball
 
       ############################################################
       # Check

daffodil-cli/build.sbt

+52-5
@@ -30,6 +30,27 @@ Linux / packageName := executableScriptName.value
 Rpm / packageName := "apache-" + executableScriptName.value
 Windows / packageName := executableScriptName.value
 
+val optSourceDateEpoch = scala.util.Properties.envOrNone("SOURCE_DATE_EPOCH")
+
+// prepend additional options to the tar command for reproducibility. We prepend because the
+// default value of this setting includes the -f option at the end, which needs to stay at the
+// end since sbt-native-packager provides the archive file immediately after
+Universal / packageZipTarball / universalArchiveOptions := {
+  val optMtime = optSourceDateEpoch.map { epoch =>
+    val fmt = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ssZ")
+    fmt.setTimeZone(java.util.TimeZone.getTimeZone("UTC"))
+    val mtime = fmt.format(new java.util.Date(epoch.toLong * 1000))
+    s"--mtime=$mtime"
+  }
+  val newOptions = Seq(
+    "--sort=name",
+    "--owner=0",
+    "--group=0",
+    "--numeric-owner"
+  ) ++ optMtime
+  newOptions ++ (Universal / packageZipTarball / universalArchiveOptions).value
+}
+
 Universal / mappings ++= Seq(
   baseDirectory.value / "bin.LICENSE" -> "LICENSE",
   baseDirectory.value / "bin.NOTICE" -> "NOTICE",
@@ -83,12 +104,38 @@ carried by data processing frameworks so as to bypass any XML/JSON overheads.
 // rpmbuild behavior, we can simply append them to the RPM description and
 // things still work as expected.
 //
-// In this case, we want to disable zstd compression which isn't supported by
-// older versions of RPM. So we add the following special rpm %define's to use
-// gzip compression instead, which is supported by all versions of RPM.
+// Older versions of RPM do not support zstd compression. To disable this we can
+// define _source_payload and _binary_payload to use gzip compression.
+// Additionally, the bulk of the RPM is jars which are already compressed and
+// won't really compress any further, so we set the compression level to zero
+// for faster builds.
+//
+// _buildhost is set to ensure reproducible builds regardless of the hostname of
+// the system where we are building the RPM.
+//
+// optflags is set to empty for reproducible builds--different systems use
+// different values of optflags and store the value in the RPM metadata. It
+// doesn't matter that we set it to nil because the macro is only used for
+// things like CFLAGS, CXXFLAGS, etc. and the way we use rpmbuild it does not
+// use these flags, since it just packages files already built by SBT.
+//
+// Even with these above settings, different systems still might create RPMs
+// with different internal tags. For example, the CLASSDICT and FILECLASS tags
+// cannot be controlled by the spec file, and include human readable
+// descriptions of each installed file. These descriptions are created by
+// libmagic and can differ depending on the version of libmagic on a system. RPM
+// also includes a PLATFORM tag that usually includes the distribution (e.g.
+// redhat vs debian), again something the spec file cannot change. And RPM also
+// includes the version of RPM used to build the RPM file. All that to say that
+// although we can minimize differences by changing some macros, the same or very
+// similar environment is still needed for byte exact reproducible RPM builds.
+// However, this is usually enough for the rpmdiff tool to report no differences
+// since it doesn't look at tags that don't really matter.
 Rpm / packageDescription := (Rpm / packageDescription).value + """
-%define _source_payload w9.gzdio
-%define _binary_payload w9.gzdio
+%define _source_payload w0.gzdio
+%define _binary_payload w0.gzdio
+%define _buildhost daffodil.build
+%define optflags %{nil}
 """
 
 Rpm / version := {
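
The --mtime value that the build.sbt change above derives from SOURCE_DATE_EPOCH uses the pattern "yyyy-MM-dd HH:mm:ssZ" in UTC. As a rough cross-check outside of SBT, the same string can be produced in shell. This is only a sketch and assumes GNU date (the -d "@N" syntax is not available in BSD date). The function name format_mtime is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: format an epoch value in the same layout as the --mtime option above.
# Assumes GNU date; BSD/macOS date uses a different syntax.
set -euo pipefail

format_mtime() {
  # $1 = seconds since the Unix epoch, e.g. the value of SOURCE_DATE_EPOCH
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S%z'
}

mtime_zero=$(format_mtime 0)
mtime_day=$(format_mtime 86400)
echo "--mtime=$mtime_zero"   # prints --mtime=1970-01-01 00:00:00+0000
echo "--mtime=$mtime_day"    # prints --mtime=1970-01-02 00:00:00+0000
```

Formatting in UTC with an explicit offset keeps the value independent of the build machine's local time zone.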

daffodil-cli/src/templates/bash-template

+1-1
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements. See the NOTICE file distributed with
