-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-35373][BUILD] Check Maven artifact checksum in build/mvn #32505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,36 +26,67 @@ _COMPILE_JVM_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g" | |
|
|
||
| # Installs any application tarball given a URL, the expected tarball name, | ||
| # and, optionally, a checkable binary path to determine if the binary has | ||
| # already been installed | ||
| ## Arg1 - URL | ||
| ## Arg2 - Tarball Name | ||
| ## Arg3 - Checkable Binary | ||
| # already been installed. Arguments: | ||
| # 1 - Mirror host | ||
| # 2 - URL path on host | ||
| # 3 - URL query string | ||
| # 4 - checksum suffix | ||
| # 5 - Tarball Name | ||
| # 6 - Checkable Binary | ||
| install_app() { | ||
| local remote_tarball="$1" | ||
| local local_tarball="${_DIR}/$2" | ||
| local binary="${_DIR}/$3" | ||
| local mirror_host="$1" | ||
| local url_path="$2" | ||
| local url_query="$3" | ||
| local checksum_suffix="$4" | ||
| local local_tarball="${_DIR}/$5" | ||
| local binary="${_DIR}/$6" | ||
| local remote_tarball="${mirror_host}/${url_path}${url_query}" | ||
| local local_checksum="${local_tarball}.${checksum_suffix}" | ||
| local remote_checksum="https://archive.apache.org/dist/${url_path}.${checksum_suffix}" | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One more point : the checksums are not mirrored (well, not consistently). But we very much want to check only the ASF's version of the checksum. If an attacker modified a mirror's artifact, they could just modify the checksum too of course. |
||
|
|
||
| local curl_opts="--silent --show-error -L" | ||
| local wget_opts="--no-verbose" | ||
|
|
||
| if [ -z "$3" -o ! -f "$binary" ]; then | ||
| if [ ! -f "$binary" ]; then | ||
srowen marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| # check if we already have the tarball | ||
| # check if we have curl installed | ||
| # download application | ||
| [ ! -f "${local_tarball}" ] && [ $(command -v curl) ] && \ | ||
| echo "exec: curl ${curl_opts} ${remote_tarball}" 1>&2 && \ | ||
| if [ ! -f "${local_tarball}" -a $(command -v curl) ]; then | ||
srowen marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| echo "exec: curl ${curl_opts} ${remote_tarball}" 1>&2 | ||
| curl ${curl_opts} "${remote_tarball}" > "${local_tarball}" | ||
| if [ ! -z "${checksum_suffix}" ]; then | ||
| echo "exec: curl ${curl_opts} ${remote_checksum}" 1>&2 | ||
| curl ${curl_opts} "${remote_checksum}" > "${local_checksum}" | ||
| fi | ||
| fi | ||
| # if the file still doesn't exist, lets try `wget` and cross our fingers | ||
| [ ! -f "${local_tarball}" ] && [ $(command -v wget) ] && \ | ||
| echo "exec: wget ${wget_opts} ${remote_tarball}" 1>&2 && \ | ||
| if [ ! -f "${local_tarball}" -a $(command -v wget) ]; then | ||
| echo "exec: wget ${wget_opts} ${remote_tarball}" 1>&2 | ||
| wget ${wget_opts} -O "${local_tarball}" "${remote_tarball}" | ||
| if [ ! -z "${checksum_suffix}" ]; then | ||
| echo "exec: wget ${wget_opts} ${remote_checksum}" 1>&2 | ||
| wget ${wget_opts} -O "${local_checksum}" "${remote_checksum}" | ||
| fi | ||
| fi | ||
| # if both were unsuccessful, exit | ||
| [ ! -f "${local_tarball}" ] && \ | ||
| echo -n "ERROR: Cannot download $2 with cURL or wget; " && \ | ||
| echo "please install manually and try again." && \ | ||
| if [ ! -f "${local_tarball}" ]; then | ||
| echo -n "ERROR: Cannot download ${remote_tarball} with cURL or wget; please install manually and try again." | ||
| exit 2 | ||
| cd "${_DIR}" && tar -xzf "$2" | ||
| rm -rf "$local_tarball" | ||
| fi | ||
| # Checksum may not have been specified; don't check if doesn't exist | ||
| if [ -f "${local_checksum}" ]; then | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't need to add
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh really? shoot. That seemed to be the one command that was on Linux and OS X. Well, we can fail outright if not present, which would stop the build. Is that better or worse than just silently letting it proceed without verification? I don't know. Would developers generally build on this distro?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yea, for Amazon Linux2, we need to run
IMO, since this check exists to prevent cyber attacks, it is safer to make
Not sure, but that's the default option in aws ec2. I noticed this issue because my daily build broke after this commit merged.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. I can submit a follow-up to fail if shasum is not present. That might inconvenience some developers, but, this is a security issue, and of course developers are welcome to install mvn / scala directly if desired.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 |
||
| echo " ${local_tarball}" >> ${local_checksum} # two spaces + file are important! | ||
| # Assuming SHA512 here for now | ||
| echo "Veryfing checksum from ${local_checksum}" 1>&2 | ||
| if ! shasum -a 512 -q -c "${local_checksum}" ; then | ||
| echo "Bad checksum from ${remote_checksum}" | ||
| exit 2 | ||
| fi | ||
| fi | ||
|
|
||
| cd "${_DIR}" && tar -xzf "${local_tarball}" | ||
| rm -rf "${local_tarball}" | ||
| rm -f "${local_checksum}" | ||
| fi | ||
| } | ||
|
|
||
|
|
@@ -71,21 +102,26 @@ install_mvn() { | |
| local MVN_DETECTED_VERSION="$(mvn --version | head -n1 | awk '{print $3}')" | ||
| fi | ||
| if [ $(version $MVN_DETECTED_VERSION) -lt $(version $MVN_VERSION) ]; then | ||
| local FILE_PATH="maven/maven-3/${MVN_VERSION}/binaries/apache-maven-${MVN_VERSION}-bin.tar.gz" | ||
| local MVN_TARBALL="apache-maven-${MVN_VERSION}-bin.tar.gz" | ||
| local FILE_PATH="maven/maven-3/${MVN_VERSION}/binaries/${MVN_TARBALL}" | ||
| local APACHE_MIRROR=${APACHE_MIRROR:-'https://www.apache.org/dyn/closer.lua'} | ||
| local MIRROR_URL="${APACHE_MIRROR}/${FILE_PATH}?action=download" | ||
| local MIRROR_URL_QUERY="?action=download" | ||
|
|
||
| if [ $(command -v curl) ]; then | ||
| if ! curl -L --output /dev/null --silent --head --fail "${MIRROR_URL}" ; then | ||
| if ! curl -L --output /dev/null --silent --head --fail "${APACHE_MIRROR}/${FILE_PATH}${MIRROR_URL_QUERY}" ; then | ||
| # Fall back to archive.apache.org for older Maven | ||
| echo "Falling back to archive.apache.org to download Maven" | ||
| MIRROR_URL="https://archive.apache.org/dist/${FILE_PATH}" | ||
| APACHE_MIRROR="https://archive.apache.org/dist" | ||
| MIRROR_URL_QUERY="" | ||
| fi | ||
| fi | ||
|
|
||
| install_app \ | ||
| "${MIRROR_URL}" \ | ||
| "apache-maven-${MVN_VERSION}-bin.tar.gz" \ | ||
| "${APACHE_MIRROR}" \ | ||
| "${FILE_PATH}" \ | ||
| "${MIRROR_URL_QUERY}" \ | ||
| "sha512" \ | ||
| "${MVN_TARBALL}" \ | ||
| "apache-maven-${MVN_VERSION}/bin/mvn" | ||
|
|
||
| MVN_BIN="${_DIR}/apache-maven-${MVN_VERSION}/bin/mvn" | ||
|
|
@@ -101,10 +137,14 @@ install_scala() { | |
| local scala_version=`grep "scala.version" "${_DIR}/../pom.xml" | grep ${scala_binary_version} | head -n1 | awk -F '[<>]' '{print $3}'` | ||
| local scala_bin="${_DIR}/scala-${scala_version}/bin/scala" | ||
| local TYPESAFE_MIRROR=${TYPESAFE_MIRROR:-https://downloads.lightbend.com} | ||
| local SCALA_TARBALL="scala-${scala_version}.tgz" | ||
|
|
||
| install_app \ | ||
| "${TYPESAFE_MIRROR}/scala/${scala_version}/scala-${scala_version}.tgz" \ | ||
| "scala-${scala_version}.tgz" \ | ||
| "${TYPESAFE_MIRROR}" \ | ||
| "scala/${scala_version}/${SCALA_TARBALL}" \ | ||
| "" \ | ||
| "" \ | ||
srowen marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ${SCALA_TARBALL} \ | ||
| "scala-${scala_version}/bin/scala" | ||
|
|
||
| SCALA_COMPILER="$(cd "$(dirname "${scala_bin}")/../lib" && pwd)/scala-compiler.jar" | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to break down the arguments further to make the logic work below. As part of that I tried to refactor this method a bit, which was getting hard to understand.