Commit 2d2e801

[SPARK-18639] Build only a single pip package
## What changes were proposed in this pull request?

We currently build 5 separate pip binary tarballs, doubling the release script runtime. It would be better to build just one, especially for use cases that only run Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

## How was this patch tested?

N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Author: Reynold Xin <rxin@databricks.com>

Closes #16072 from rxin/SPARK-18639.

(cherry picked from commit 37e52f8)
Signed-off-by: Reynold Xin <rxin@databricks.com>
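The gating idea described above can be sketched in isolation. This is a minimal illustration, not the release script itself: `build_release` is a hypothetical stand-in for `make_binary_release` that only reports what it would build, and the `${4:-}` default is an assumption added so the sketch also works under `set -u`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for make_binary_release: it only reports what would be built.
build_release() {
  local name=$1
  local build_pip_package=${4:-}   # optional 4th argument; empty when omitted

  if [ -z "$build_pip_package" ]; then
    echo "$name: tgz only"
  else
    echo "$name: tgz + pip"
  fi
}

# Only the call that passes a 4th argument takes the pip branch.
build_release "hadoop2.6" "-Phadoop-2.6" "3035"
build_release "hadoop2.7" "-Phadoop-2.7" "3036" "withpip"
```

Because the check is `-z`, any non-empty fourth argument enables the pip branch; the string `withpip` carries no special meaning beyond being non-empty.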
1 parent 4746674 commit 2d2e801

1 file changed: +27 -18 lines changed

dev/create-release/release-build.sh

Lines changed: 27 additions & 18 deletions
@@ -150,6 +150,7 @@ if [[ "$1" == "package" ]]; then
   NAME=$1
   FLAGS=$2
   ZINC_PORT=$3
+  BUILD_PIP_PACKAGE=$4
   cp -r spark spark-$SPARK_VERSION-bin-$NAME

   cd spark-$SPARK_VERSION-bin-$NAME
@@ -170,24 +171,32 @@ if [[ "$1" == "package" ]]; then
   # Get maven home set by MVN
   MVN_HOME=`$MVN -version 2>&1 | grep 'Maven home' | awk '{print $NF}'`

-  echo "Creating distribution"
-  ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz --pip $FLAGS \
-    -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
-  cd ..

-  echo "Copying and signing python distribution"
-  PYTHON_DIST_NAME=pyspark-$PYSPARK_VERSION.tar.gz
-  cp spark-$SPARK_VERSION-bin-$NAME/python/dist/$PYTHON_DIST_NAME .
-
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
-    --output $PYTHON_DIST_NAME.asc \
-    --detach-sig $PYTHON_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-    MD5 $PYTHON_DIST_NAME > \
-    $PYTHON_DIST_NAME.md5
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-    SHA512 $PYTHON_DIST_NAME > \
-    $PYTHON_DIST_NAME.sha
+  if [ -z "$BUILD_PIP_PACKAGE" ]; then
+    echo "Creating distribution without PIP package"
+    ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz $FLAGS \
+      -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
+    cd ..
+  else
+    echo "Creating distribution with PIP package"
+    ./dev/make-distribution.sh --name $NAME --mvn $MVN_HOME/bin/mvn --tgz --pip $FLAGS \
+      -DzincPort=$ZINC_PORT 2>&1 > ../binary-release-$NAME.log
+    cd ..
+
+    echo "Copying and signing python distribution"
+    PYTHON_DIST_NAME=pyspark-$PYSPARK_VERSION.tar.gz
+    cp spark-$SPARK_VERSION-bin-$NAME/python/dist/$PYTHON_DIST_NAME .
+
+    echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
+      --output $PYTHON_DIST_NAME.asc \
+      --detach-sig $PYTHON_DIST_NAME
+    echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
+      MD5 $PYTHON_DIST_NAME > \
+      $PYTHON_DIST_NAME.md5
+    echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
+      SHA512 $PYTHON_DIST_NAME > \
+      $PYTHON_DIST_NAME.sha
+  fi

   echo "Copying and signing regular binary distribution"
   cp spark-$SPARK_VERSION-bin-$NAME/spark-$SPARK_VERSION-bin-$NAME.tgz .
@@ -211,7 +220,7 @@ if [[ "$1" == "package" ]]; then
   make_binary_release "hadoop2.3" "-Phadoop-2.3 $FLAGS" "3033" &
   make_binary_release "hadoop2.4" "-Phadoop-2.4 $FLAGS" "3034" &
   make_binary_release "hadoop2.6" "-Phadoop-2.6 $FLAGS" "3035" &
-  make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" &
+  make_binary_release "hadoop2.7" "-Phadoop-2.7 $FLAGS" "3036" "withpip" &
   make_binary_release "hadoop2.4-without-hive" "-Psparkr -Phadoop-2.4 -Pyarn -Pmesos" "3037" &
   make_binary_release "without-hadoop" "-Psparkr -Phadoop-provided -Pyarn -Pmesos" "3038" &
   wait
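
A side note on the `2>&1 > ../binary-release-$NAME.log` redirection that both branches keep unchanged: in bash, redirections are processed left to right, so this order sends stderr to wherever stdout pointed before the file redirect, and only stdout reaches the log. The commit does not say whether that is intended. The sketch below shows the difference with a hypothetical `noisy` function and temporary files; neither is part of the release script.

```shell
#!/usr/bin/env bash
# Hypothetical command that writes one line to stdout and one to stderr.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

log_a=$(mktemp)
log_b=$(mktemp)

# Order used in the diff: stderr is first pointed at the current stdout
# (the terminal or parent pipe), then stdout alone is redirected to the file.
noisy 2>&1 > "$log_a"

# Reversed order: stdout is redirected to the file first, then stderr follows it.
noisy > "$log_b" 2>&1

# Count the lines captured in each log.
count_a=$(grep -c '' "$log_a")
count_b=$(grep -c '' "$log_b")
echo "2>&1 > file captured $count_a line(s)"
echo "> file 2>&1 captured $count_b line(s)"
```

If the per-release log is meant to contain Maven's error output as well, the second form is the one that captures it.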
