
ARROW-1579: [Java] Adding containerized Spark Integration tests #1319

using build/mvn script for spark builds
BryanCutler committed Jan 25, 2018
commit e38d43db3a5206c1562d0478f4c911aca7189d3b
4 changes: 1 addition & 3 deletions dev/spark_integration/Dockerfile
@@ -20,9 +20,7 @@ WORKDIR /apache-arrow
 # Basic OS utilities
 RUN apt-get update && apt-get install -y \
     wget \
-    git \
-    software-properties-common
-
+    git
 # This will install conda in /home/ubuntu/miniconda
 #RUN wget -O /tmp/miniconda.sh \
 #    https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
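For context, here is a hedged sketch of how the image built from this Dockerfile might be run locally. Only the `dev/spark_integration` path comes from the diff above; the `arrow-spark-integration` tag is an illustrative assumption, not something defined in this PR:

```bash
# Hypothetical usage -- the tag name is illustrative, not from this PR.
docker build -t arrow-spark-integration dev/spark_integration
docker run --rm arrow-spark-integration
```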
6 changes: 4 additions & 2 deletions dev/spark_integration/spark_integration.sh
@@ -47,18 +47,20 @@ git clone https://github.com/apache/spark.git
 pushd spark
 sed -i -e "s/\(.*<arrow.version>\).*\(<\/arrow.version>\)/\1$ARROW_VERSION\2/g" ./pom.xml
 echo "Building Spark with Arrow $ARROW_VERSION"
-mvn -DskipTests clean package
+build/mvn -DskipTests clean package

 # Run Arrow-related Scala tests only. NOTE: -Dtest=_NonExist_ enables surefire test discovery without running any tests, so that ScalaTest can run.
 SPARK_SCALA_TESTS="org.apache.spark.sql.execution.arrow,org.apache.spark.sql.execution.vectorized.ColumnarBatchSuite,org.apache.spark.sql.execution.vectorized.ArrowColumnVectorSuite"
 echo "Testing Spark $SPARK_SCALA_TESTS"
-mvn -Dtest=_NonExist_ -DwildcardSuites="'$SPARK_SCALA_TESTS'" test -pl sql/core
+# TODO: should be able to build only the spark-sql tests by adding "-pl sql/core", but that is not currently working
+build/mvn -Dtest=none -DwildcardSuites="$SPARK_SCALA_TESTS" test

 # Run pyarrow-related Python tests only
 #SPARK_TESTING=1 bin/pyspark pyspark.sql.tests ArrowTests GroupbyApplyTests VectorizedUDFTests
 popd

 # Clean up
 echo "Cleaning up.."
Member: No need for these two lines; at the end the environment is thrown away anyway.

 #rm -rf spark .local
+rm -rf spark
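As the NOTE in the script explains, passing a dummy value to `-Dtest` keeps surefire from running any JUnit tests while `-DwildcardSuites` hands the listed suites to the scalatest-maven-plugin. A minimal sketch of selecting a single suite this way, using one of the suite names from `SPARK_SCALA_TESTS` above (whether restricting the build with `-pl sql/core` works here is still open, per the TODO in the diff):

```bash
# Run one ScalaTest suite from the Spark checkout; -Dtest=none skips surefire's
# JUnit tests while still letting test discovery hand off to the scalatest plugin.
cd spark
build/mvn -Dtest=none \
  -DwildcardSuites=org.apache.spark.sql.execution.vectorized.ArrowColumnVectorSuite \
  test
```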
