[BUG] Test issues with sklearn 1.4 #1062

Closed
tillea opened this issue Feb 20, 2024 · 3 comments · Fixed by #1072

@tillea

tillea commented Feb 20, 2024

Describe the bug

I've updated the Debian package of sklearn to 1.4, which is supposed to work with imbalanced-learn 0.12.0. Unfortunately, the build on Debian fails, as you can see in the build log in our CI.

Steps/Code to Reproduce

Install sklearn 1.4 and run

 python3 -m pytest -k "not test_rusboost"

on Debian sid.

Expected Results

Passing its own test suite.

Actual Results

The errors in the log start with:

=================================== FAILURES ===================================
_______________ test_fit_predict_on_pipeline_without_fit_predict _______________
self = <sklearn.utils._available_if._AvailableIfDescriptor object at 0x7fe089a69950>
obj = Pipeline(steps=[('scaler', StandardScaler()), ('pca', PCA(svd_solver='full'))])
owner = <class 'imblearn.pipeline.Pipeline'>
    def _check(self, obj, owner):
        attr_err_msg = (
            f"This {repr(owner.__name__)} has no attribute {repr(self.attribute_name)}"
        )
        try:
>           check_result = self.check(obj)
/usr/lib/python3/dist-packages/sklearn/utils/_available_if.py:29:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Pipeline(steps=[('scaler', StandardScaler()), ('pca', PCA(svd_solver='full'))])
    def check(self):
        # raise original `AttributeError` if `attr` does not exist
>       getattr(self._final_estimator, attr)
E       AttributeError: 'PCA' object has no attribute 'fit_predict'
/usr/lib/python3/dist-packages/sklearn/pipeline.py:53: AttributeError
The above exception was the direct cause of the following exception:
    def test_fit_predict_on_pipeline_without_fit_predict():
        # tests that a pipeline does not have fit_predict method when final
        # step of pipeline does not have fit_predict defined
        scaler = StandardScaler()
        pca = PCA(svd_solver="full")
        pipe = Pipeline([("scaler", scaler), ("pca", pca)])
        error_regex = "'PCA' object has no attribute 'fit_predict'"
        with raises(AttributeError, match=error_regex):
>           getattr(pipe, "fit_predict")
imblearn/tests/test_pipeline.py:415:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/sklearn/utils/_available_if.py:40: in __get__
    self._check(obj, owner=owner)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <sklearn.utils._available_if._AvailableIfDescriptor object at 0x7fe089a69950>
obj = Pipeline(steps=[('scaler', StandardScaler()), ('pca', PCA(svd_solver='full'))])
owner = <class 'imblearn.pipeline.Pipeline'>
    def _check(self, obj, owner):
        attr_err_msg = (
            f"This {repr(owner.__name__)} has no attribute {repr(self.attribute_name)}"
        )
        try:
            check_result = self.check(obj)
        except Exception as e:
>           raise AttributeError(attr_err_msg) from e
E           AttributeError: This 'Pipeline' has no attribute 'fit_predict'
/usr/lib/python3/dist-packages/sklearn/utils/_available_if.py:31: AttributeError
During handling of the above exception, another exception occurred:
    def test_fit_predict_on_pipeline_without_fit_predict():
        # tests that a pipeline does not have fit_predict method when final
        # step of pipeline does not have fit_predict defined
        scaler = StandardScaler()
        pca = PCA(svd_solver="full")
        pipe = Pipeline([("scaler", scaler), ("pca", pca)])
        error_regex = "'PCA' object has no attribute 'fit_predict'"
>       with raises(AttributeError, match=error_regex):
E       AssertionError: Regex pattern did not match.
E        Regex: "'PCA' object has no attribute 'fit_predict'"
E        Input: "This 'Pipeline' has no attribute 'fit_predict'"
imblearn/tests/test_pipeline.py:414: AssertionError
_____________ test_score_samples_on_pipeline_without_score_samples _____________
...
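
The traceback above shows two chained errors: the inner one names the final step ('PCA'), while the outer one, raised with `raise ... from e`, names the pipeline. A minimal pure-Python sketch of that chaining follows. It is not sklearn's code: `AvailableIfSketch`, the toy `PCA` and `Pipeline` classes, and `describe_missing` are hypothetical stand-ins written only to illustrate why the test's regex sees the outer message.

```python
class AvailableIfSketch:
    """Simplified stand-in for a conditional-method descriptor (hypothetical)."""

    def __init__(self, fn, check, attribute_name):
        self.fn = fn
        self.check = check
        self.attribute_name = attribute_name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self.fn
        msg = f"This {owner.__name__!r} has no attribute {self.attribute_name!r}"
        try:
            available = self.check(obj)
        except Exception as exc:
            # The outer message replaces the inner one; the inner error
            # stays reachable through __cause__.
            raise AttributeError(msg) from exc
        if not available:
            raise AttributeError(msg)
        return self.fn.__get__(obj, owner)


class PCA:
    """Toy final step that defines no fit_predict method."""


def _final_step_has(attr):
    def check(pipeline):
        # Raises the original AttributeError if the final step lacks `attr`.
        getattr(pipeline.final_step, attr)
        return True

    return check


class Pipeline:
    """Toy pipeline that exposes fit_predict only if its final step has it."""

    def __init__(self, final_step):
        self.final_step = final_step

    def _fit_predict(self, X):
        return self.final_step.fit_predict(X)

    fit_predict = AvailableIfSketch(
        _fit_predict, _final_step_has("fit_predict"), "fit_predict"
    )


def describe_missing(obj, name):
    """Return (outer message, inner message) for a missing attribute."""
    try:
        getattr(obj, name)
    except AttributeError as exc:
        return str(exc), str(exc.__cause__)
    return None, None


outer, inner = describe_missing(Pipeline(PCA()), "fit_predict")
print(outer)
print(inner)
```

In this sketch `hasattr(Pipeline(PCA()), "fit_predict")` is still `False`, so code that only checks for the attribute is unaffected; only tests that match the exact message text notice the change.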

Versions

Python 3.11.8 (main, Feb 7 2024, 21:52:08) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

import sklearn; sklearn.show_versions()

System:
python: 3.11.8 (main, Feb 7 2024, 21:52:08) [GCC 13.2.0]
executable: /usr/bin/python3
machine: Linux-6.6.13-amd64-x86_64-with-glibc2.37

Python dependencies:
sklearn: 1.4.1.post1
pip: 24.0
setuptools: 68.1.2
numpy: 1.24.2
scipy: 1.10.1
Cython: 0.29.37
pandas: 2.1.4+dfsg
matplotlib: 3.6.3
joblib: 1.3.2
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so
version: 0.3.26
threading_layer: pthreads
architecture: Haswell
num_threads: 4

user_api: openmp
internal_api: openmp
prefix: libgomp
filepath: /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
version: None
num_threads: 4

import sys; print("Python", sys.version)
Python 3.11.8 (main, Feb 7 2024, 21:52:08) [GCC 13.2.0]

import numpy; print("NumPy", numpy.__version__)
NumPy 1.24.2

import scipy; print("SciPy", scipy.__version__)
SciPy 1.11.4

import sklearn; print("Scikit-Learn", sklearn.__version__)
Scikit-Learn 1.4.1.post1

@mr-c
Contributor

mr-c commented Feb 26, 2024

I was able to fix this for Debian by adapting the tests for the new error message format:

--- imbalanced-learn.orig/imblearn/tests/test_pipeline.py
+++ imbalanced-learn/imblearn/tests/test_pipeline.py
@@ -410,7 +410,7 @@
     scaler = StandardScaler()
     pca = PCA(svd_solver="full")
     pipe = Pipeline([("scaler", scaler), ("pca", pca)])
-    error_regex = "'PCA' object has no attribute 'fit_predict'"
+    error_regex = "This 'Pipeline' has no attribute 'fit_predict'"
     with raises(AttributeError, match=error_regex):
         getattr(pipe, "fit_predict")
 
@@ -1219,7 +1219,7 @@
     pipe.fit(X, y)
     with pytest.raises(
         AttributeError,
-        match="'LogisticRegression' object has no attribute 'score_samples'",
+        match="This 'Pipeline' has no attribute 'score_samples'",
     ):
         pipe.score_samples(X)
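
The patch above pins the tests to the new wording, so they would fail again against an older sklearn. If both versions need to pass, one option is a pattern that accepts either message. This is only a sketch and not necessarily what the upstream fix does; `missing_attr_pattern` is a hypothetical helper name. It relies on `pytest.raises(match=...)` checking the pattern with `re.search`.

```python
import re


def missing_attr_pattern(final_step_name, attr, owner_name="Pipeline"):
    """Build a regex that accepts either AttributeError wording (hypothetical helper)."""
    # Wording seen in the original tests.
    old = f"'{final_step_name}' object has no attribute '{attr}'"
    # Wording reported in this issue with sklearn 1.4.
    new = f"This '{owner_name}' has no attribute '{attr}'"
    return "|".join(re.escape(message) for message in (old, new))


pattern = missing_attr_pattern("PCA", "fit_predict")
print(pattern)
```

In a test, the result would be passed as `match=pattern` in place of the hard-coded `error_regex` string.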

@glemaitre
Member

Yep, the error message that the regex matches has changed. Not a big deal for user-facing code.

I'll fix it in main.

@penguinpee

> I'll fix it in main

Has that happened yet? I was looking for a relevant commit on that file, but couldn't find one. If not, I can apply @mr-c's patch. We are facing the same issue now that sklearn has been updated to 1.4 in Fedora.
