
ARROW-1579: [Java] Adding containerized Spark Integration tests #1319

now building with ARROW_BUILD_TOOLCHAIN set to conda env
BryanCutler committed Feb 14, 2018
commit 3f9f483d45f0c1b5a9052eb2950b48e8fb328318
3 changes: 0 additions & 3 deletions dev/spark_integration/Dockerfile
@@ -22,9 +22,6 @@ RUN apt-get update && apt-get install -y \
git build-essential \
software-properties-common

#RUN apt-add-repository -y ppa:ubuntu-toolchain-r/test \
# && apt-get update && apt-get install -y gcc-4.9 g++-4.9

# This will install conda in /home/ubuntu/miniconda
RUN wget -O /tmp/miniconda.sh \
https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
33 changes: 9 additions & 24 deletions dev/spark_integration/spark_integration.sh
@@ -16,41 +16,36 @@
# limitations under the License.
#

# Exit on any error
set -e

# Set up environment and working directory
Review comment (Member): Call set -e here; then a command failure anywhere in the script will fail the whole script, and you can get rid of the if [[ $? -ne 0 ]]; then blocks later on.
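A minimal sketch of that suggestion, using a hypothetical build step rather than the exact commands in this script:

#!/usr/bin/env bash
# Exit immediately if any command returns a non-zero status.
set -e

# Without set -e, every step needs a manual exit-code check:
#   make -j4
#   if [[ $? -ne 0 ]]; then
#     exit 1
#   fi

# With set -e, a failing step aborts the script on its own:
make -j4
make install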

cd /apache-arrow

# Activate our pyarrow-dev conda env
source activate pyarrow-dev

export ARROW_BUILD_TYPE=Release
export ARROW_HOME=$(pwd)/arrow
#export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
export BOOST_ROOT=$CONDA_PREFIX
CONDA_BASE=/home/ubuntu/miniconda
export LD_LIBRARY_PATH=${ARROW_HOME}/lib:${CONDA_BASE}/lib:${LD_LIBRARY_PATH}
export PYTHONPATH=${ARROW_HOME}/python:${PYTHONPATH}
export ARROW_BUILD_TYPE=release
export ARROW_BUILD_TOOLCHAIN=$CONDA_PREFIX
export LD_LIBRARY_PATH=${ARROW_HOME}/lib:${LD_LIBRARY_PATH}
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

# Build Arrow C++
pushd arrow/cpp
rm -rf build/*
mkdir -p build
cd build/
cmake -DARROW_PYTHON=on -DARROW_HDFS=on -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_PREFIX=$ARROW_HOME ..
cmake -DCMAKE_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DARROW_PYTHON=on -DARROW_HDFS=on -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE -DCMAKE_INSTALL_PREFIX=$ARROW_HOME ..
make -j4
if [[ $? -ne 0 ]]; then
exit 1
fi
make install
popd

# Build pyarrow and install inplace
export PYARROW_CXXFLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
pushd arrow/python
python setup.py clean
python setup.py build_ext --build-type=release --inplace
if [[ $? -ne 0 ]]; then
exit 1
fi
python setup.py build_ext --build-type=$ARROW_BUILD_TYPE install
popd

# Install Arrow to local maven repo and get the version
@@ -88,20 +83,10 @@ SPARK_SCALA_TESTS="org.apache.spark.sql.execution.arrow,org.apache.spark.sql.exe
echo "Testing Spark: $SPARK_SCALA_TESTS"
# TODO: should be able to only build spark-sql tests with adding "-pl sql/core" but not currently working
build/mvn -Dtest=none -DwildcardSuites="$SPARK_SCALA_TESTS" test
if [[ $? -ne 0 ]]; then
exit 1
fi

# Run pyarrow related Python tests only
SPARK_PYTHON_TESTS="ArrowTests PandasUDFTests ScalarPandasUDFTests GroupedMapPandasUDFTests GroupedAggPandasUDFTests"
echo "Testing PySpark: $SPARK_PYTHON_TESTS"
SPARK_TESTING=1 bin/pyspark pyspark.sql.tests $SPARK_PYTHON_TESTS
if [[ $? -ne 0 ]]; then
exit 1
fi
popd

# Clean up
echo "Cleaning up.."
source deactivate