The KFP preloaded XGBoost sample is broken and outdated. #5089
Comments
Completely agree with this!
While the issue is mitigated by temporarily removing the sample from preloads and tests, we may still want to rewrite the XGBoost Spark-Dataproc sample using the latest XGBoost library. Keep this issue open for tracking.
…low#5100)
* Revert "fix(samples): Remove broken xgboost sample (kubeflow#5091)" This reverts commit 1dcda80.
* fix(backend): Replaced the XGBoost sample
* Fixed the backend image build
* Updated the frontend tests
TL;DR: The preloaded XGBoost sample is currently broken.
I propose removing this sample from the KFP preloads and from the sample tests until we get a chance to refresh it.
The direct cause was that it used the Dataproc 1.2 image which is based on Python 2.7, and pip 21.0 dropped support for Python 2.7.
The symptom is that `dataproc_create_cluster` fails on initialization, and the specific error is mentioned here.
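As a rough illustration of the version constraint behind this failure, the sketch below picks a pip requirement that can still be installed on a given interpreter. The helper name `pip_requirement_for` is hypothetical, and pinning pip is only one possible workaround, not something proposed in this issue. The only fact it relies on is that pip 21.0 no longer supports Python 2.7.

```python
def pip_requirement_for(python_version):
    """Return a pip requirement string that is installable on this interpreter.

    `python_version` is any (major, minor, ...) sequence, e.g. sys.version_info.
    Hypothetical helper for illustration; not part of KFP or Dataproc.
    """
    major, minor = python_version[:2]
    if (major, minor) < (3, 6):
        # pip 21.0 dropped Python 2.7, so old interpreters must stay on pip 20.x.
        return "pip<21"
    return "pip"


if __name__ == "__main__":
    import sys

    # Show the requirement that applies to the interpreter running this script.
    print(pip_requirement_for(sys.version_info))
```

On a Python 2.7 based image such as Dataproc 1.2, this returns `pip<21`, which is why an unpinned pip upgrade during cluster initialization breaks there.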
#5062 attempted a fix by upgrading to the Dataproc 1.5 image. That fixed the Dataproc cluster creation issue, but we then hit an error later, at the Trainer step.
We were advised that newer versions of Dataproc images likely don't have XGBoost library preinstalled, as there's now an initialization action that goes through extra steps to install XGBoost libraries.
Following that route, I tried installing the default XGBoost version using the rapids script, but then hit the following error:
I then realized that the sample is based on the code from the deprecated component path, which was deleted by #5045.
Specifically, the method that the above error reports as not found was used here:
pipelines/components/deprecated/dataproc/train/src/XGBoostTrainer.scala
Line 121 in 32ce8d8
And `trainWithDataFrame` only exists in XGBoost 0.72; it does not appear in any later version. XGBoost 0.72 is too old and is not even available from https://repo1.maven.org/maven2/com/nvidia/, which the rapids script uses to download XGBoost.
At this point, if we do think it's worth demoing an XGBoost-on-Dataproc pipeline, I think we should invest in rewriting the XGBoost sample using the latest XGBoost library rather than patching the existing one.
Until we have the sample working, I propose we remove it from the KFP preloaded pipelines and sample tests.
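A minimal sketch of what dropping the sample from a preload list could look like, assuming the preloaded samples are described by a JSON list of objects with a `name` field. The entries, field names, and the `drop_samples` helper below are illustrative assumptions, not the actual KFP configuration or tooling.

```python
import json


def drop_samples(config_entries, keyword):
    """Return the entries whose name does not contain `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [
        entry for entry in config_entries
        if needle not in entry.get("name", "").lower()
    ]


# Hypothetical preload list; the names and file paths are made up for this example.
SAMPLES = json.loads("""
[
  {"name": "[Demo] XGBoost - Training with confusion matrix", "file": "xgboost.yaml"},
  {"name": "[Tutorial] Data passing in python components", "file": "data_passing.yaml"}
]
""")

if __name__ == "__main__":
    # Print the list that would remain after removing the XGBoost sample.
    print(json.dumps(drop_samples(SAMPLES, "xgboost"), indent=2))
```

The same filter would need to be applied wherever the sample tests enumerate their samples, so that the preload list and the tests stay in sync.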