Commit 4da93b0

[SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in Jenkins
### What changes were proposed in this pull request?

This PR proposes:

- Don't use `--user` in the pip packaging test.
- Pull `source` out of the subshell, and place it first.
- Exclude user site-packages from the Python path during the pip installation test.

These changes address the flakiness of the pip packaging test in Jenkins.

(I think) #29116 caused this flakiness, based on my observation of the Jenkins log. I had to work around it by specifying `--user`, but it turned out that this does not work properly with the old Conda on Jenkins for some reason. Therefore, this PR reverts that change.

(I think) the installation into user site-packages affects other environments created by the old Conda version that Jenkins has. It seems to fail to isolate the environments for some reason. So this PR excludes user site-packages from the Python path during the test.

In addition, #29116 also added fallback logic between `conda (de)activate` and `source (de)activate`, because Conda now prefers `conda (de)activate` per the official documentation and `source (de)activate` doesn't work for some reason in certain environments (see also conda/conda#7980). The problem was that `source` loads things into the shell it runs in, so when it runs inside a subshell it does not affect the current shell. Therefore, this PR pulls `source` out of the subshell.

Disclaimer: I made this analysis purely based on the Jenkins machine's log in this PR. There may be a different cause that I missed during my observation.

### Why are the changes needed?

To make the build and tests pass in Jenkins.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Jenkins tests should test it out.

Closes #29117 from HyukjinKwon/debug-conda.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
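The subshell behaviour described in the commit message can be reproduced in isolation. The following is a minimal sketch and not code from this PR: `fake_activate` is a hypothetical stand-in for `source activate`, and `ACTIVE_ENV` is an illustrative variable. It only shows that a command run inside `( ... )` cannot change the state of the calling shell, whereas the same command inside `{ ...; }` can.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `source activate`: it changes shell state
# by setting a variable in whichever shell runs it.
fake_activate() {
  ACTIVE_ENV="$1"
}

# Fallback wrapped in parentheses: it runs in a subshell, so the
# assignment is lost when the subshell exits.
ACTIVE_ENV=""
false || (fake_activate "in-subshell")
subshell_result="$ACTIVE_ENV"

# Fallback wrapped in braces: it runs in the current shell, so the
# assignment persists.
ACTIVE_ENV=""
false || { fake_activate "in-current-shell"; }
group_result="$ACTIVE_ENV"

echo "after ( ... ): '${subshell_result}'"
echo "after { ...; }: '${group_result}'"
```

In the diff for this commit, the command placed first runs in the current shell, while the parenthesized fallback still runs in a subshell.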
1 parent 8c7d6f9 commit 4da93b0

1 file changed (+7, -6 lines changed)

dev/run-pip-tests

Lines changed: 7 additions & 6 deletions
@@ -63,11 +63,15 @@ fi
 PYSPARK_VERSION=$(python3 -c "exec(open('python/pyspark/version.py').read());print(__version__)")
 PYSPARK_DIST="$FWDIR/python/dist/pyspark-$PYSPARK_VERSION.tar.gz"
 # The pip install options we use for all the pip commands
-PIP_OPTIONS="--user --upgrade --no-cache-dir --force-reinstall "
+PIP_OPTIONS="--upgrade --no-cache-dir --force-reinstall"
 # Test both regular user and edit/dev install modes.
 PIP_COMMANDS=("pip install $PIP_OPTIONS $PYSPARK_DIST"
               "pip install $PIP_OPTIONS -e python/")
 
+# Jenkins has PySpark installed under user sitepackages shared for some reasons.
+# In this test, explicitly exclude user sitepackages to prevent side effects
+export PYTHONNOUSERSITE=1
+
 for python in "${PYTHON_EXECS[@]}"; do
   for install_command in "${PIP_COMMANDS[@]}"; do
     echo "Testing pip installation with python $python"
@@ -81,7 +85,7 @@ for python in "${PYTHON_EXECS[@]}"; do
         source "$CONDA_PREFIX/etc/profile.d/conda.sh"
       fi
       conda create -y -p "$VIRTUALENV_PATH" python=$python numpy pandas pip setuptools
-      conda activate "$VIRTUALENV_PATH" || (echo "Falling back to 'source activate'" && source activate "$VIRTUALENV_PATH")
+      source activate "$VIRTUALENV_PATH" || (echo "Falling back to 'conda activate'" && conda activate "$VIRTUALENV_PATH")
     else
       mkdir -p "$VIRTUALENV_PATH"
       virtualenv --python=$python "$VIRTUALENV_PATH"
@@ -96,8 +100,6 @@ for python in "${PYTHON_EXECS[@]}"; do
     cd "$FWDIR"/python
     # Delete the egg info file if it exists, this can cache the setup file.
     rm -rf pyspark.egg-info || echo "No existing egg info file, skipping deletion"
-    # Also, delete the symbolic link if exists. It can be left over from the previous editable mode installation.
-    python3 -c "from distutils.sysconfig import get_python_lib; import os; f = os.path.join(get_python_lib(), 'pyspark.egg-link'); os.unlink(f) if os.path.isfile(f) else 0"
     python3 setup.py sdist
 
 
@@ -116,7 +118,6 @@ for python in "${PYTHON_EXECS[@]}"; do
     cd /
 
     echo "Run basic sanity check on pip installed version with spark-submit"
-    export PATH="$(python3 -m site --user-base)/bin:$PATH"
     spark-submit "$FWDIR"/dev/pip-sanity-check.py
     echo "Run basic sanity check with import based"
     python3 "$FWDIR"/dev/pip-sanity-check.py
@@ -127,7 +128,7 @@ for python in "${PYTHON_EXECS[@]}"; do
 
     # conda / virtualenv environments need to be deactivated differently
     if [ -n "$USE_CONDA" ]; then
-      conda deactivate || (echo "Falling back to 'source deactivate'" && source deactivate)
+      source deactivate || (echo "Falling back to 'conda deactivate'" && conda deactivate)
     else
       deactivate
     fi
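
The effect of `export PYTHONNOUSERSITE=1` from the first hunk can be checked directly. This is a rough sketch, assuming a `python3` executable on `PATH`; the helper name `no_user_site_flag` is hypothetical and not part of `dev/run-pip-tests`. It reads `sys.flags.no_user_site`, which the interpreter sets when `PYTHONNOUSERSITE` is present in the environment (or when `-s` is passed).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the interpreter's no_user_site flag.
# The flag is non-zero when the user site-packages directory is
# excluded from sys.path at startup.
no_user_site_flag() {
  python3 -c "import sys; print(sys.flags.no_user_site)"
}

# Without the variable, the user site-packages directory is not
# explicitly disabled.
unset PYTHONNOUSERSITE
flag_default="$(no_user_site_flag)"

# With the variable exported, every child Python process started
# from this shell skips the user site-packages directory.
export PYTHONNOUSERSITE=1
flag_excluded="$(no_user_site_flag)"

echo "default: $flag_default, with PYTHONNOUSERSITE=1: $flag_excluded"
```

Because the variable is exported, it is inherited by child processes started from the same shell, which is presumably why the script sets it once before the test loop.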
