[python-package] fix mypy errors about custom eval and metric functions #5790

jameslamb · 2023-03-17T04:21:27Z

Contributes to #3867.

Fixes the following mypy errors:

engine.py:258: error: Argument 1 to "eval_train" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]
engine.py:259: error: Argument 1 to "eval_valid" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]

sklearn.py:130: error: Too few arguments  [call-arg]
sklearn.py:130: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:132: error: Too many arguments  [call-arg]
sklearn.py:132: error: Too few arguments  [call-arg]
sklearn.py:132: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:132: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Too many arguments  [call-arg]
sklearn.py:134: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Argument 4 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:208: error: Too few arguments  [call-arg]
sklearn.py:208: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:210: error: Too many arguments  [call-arg]
sklearn.py:210: error: Too few arguments  [call-arg]
sklearn.py:210: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:210: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Too many arguments  [call-arg]
sklearn.py:212: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Argument 4 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:814: error: Argument "feval" to "train" has incompatible type "List[_EvalFunctionWrapper]"; expected "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"  [arg-type]
sklearn.py:814: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
sklearn.py:814: note: Consider using "Sequence" instead, which is covariant


jameslamb · 2023-03-19T04:28:26Z

python-package/lightgbm/sklearn.py

    Callable[
-        [np.ndarray, np.ndarray],
+        [Optional[np.ndarray], np.ndarray],


why add all these Optionals?

mypy is struggling with the fact that Dataset.get_field() can return None.

LightGBM/python-package/lightgbm/basic.py

Line 2704 in 2fe2bf0

def get_label(self) -> Optional[np.ndarray]:

LightGBM/python-package/lightgbm/sklearn.py

Line 127 in 2fe2bf0

labels = dataset.get_label()

LightGBM/python-package/lightgbm/basic.py

Line 2393 in 2fe2bf0

def get_field(self, field_name: str) -> Optional[np.ndarray]:

This PR proposes updating the type hints for custom metric and objective functions to match that behavior.

I intentionally chose not to update the user-facing docs about custom metric and objective functions to reflect that the label, group, and weights passed to these functions can technically be None... in almost all situations, they should be non-None. I don't think complicating the docs is worth it.
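For illustration, here is a minimal sketch of what a custom metric could look like if it wanted to be explicit about that edge case. It is not code from this PR: `Array` is a stand-in for `np.ndarray` so the example runs with only the standard library, and `mean_abs_error` is a hypothetical function name.

```python
from typing import List, Optional, Tuple

# Stand-in for np.ndarray (assumption: keeps the sketch stdlib-only).
Array = List[float]


def mean_abs_error(labels: Optional[Array], preds: Array) -> Tuple[str, float, bool]:
    """Hypothetical custom metric that tolerates an Optional label argument."""
    if labels is None:
        # Mirrors the case where the Dataset has no label field set.
        raise ValueError("this metric needs labels, but none were provided")
    total = sum(abs(y - p) for y, p in zip(labels, preds))
    # (metric name, value, is_higher_better)
    return "mae", total / len(labels), False
```

After the `None` check, mypy narrows `labels` to the non-Optional type, so the rest of the function type-checks without any ignores.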

jameslamb · 2023-03-19T04:31:44Z

python-package/lightgbm/sklearn.py

+    Callable[
+        [Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
+        _LGBM_EvalFunctionResultType
+    ],
+    Callable[
+        [Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
+        List[_LGBM_EvalFunctionResultType]
+    ]


Splitting cases like

    Callable[
        [np.ndarray, np.ndarray],
        Union[_LGBM_EvalFunctionResultType, List[_LGBM_EvalFunctionResultType]]
    ]

into

    Union[
        Callable[
            [np.ndarray, np.ndarray],
            _LGBM_EvalFunctionResultType
        ],
        Callable[
            [np.ndarray, np.ndarray],
            List[_LGBM_EvalFunctionResultType]
        ]
    ]

helps mypy with errors like this:

Argument 1 to "eval_valid" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]" [arg-type]

And I think it's slightly more correct... I'd expect people to provide a custom metric function that returns a single tuple on every iteration or one that returns a list of tuples on each iteration, but not one that could return either of those.
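A small sketch of that idea, under the assumption that plain lists stand in for `np.ndarray`. The alias names (`SingleMetric`, `MultiMetric`, `MetricFunction`) and the helper `run_metric` are hypothetical and are not LightGBM's actual definitions.

```python
from typing import Callable, List, Tuple, Union

Array = List[float]  # stand-in for np.ndarray (assumption)
EvalResult = Tuple[str, float, bool]  # (metric name, value, is_higher_better)

# One alias per return shape, instead of a single Callable returning a Union.
SingleMetric = Callable[[Array, Array], EvalResult]
MultiMetric = Callable[[Array, Array], List[EvalResult]]
MetricFunction = Union[SingleMetric, MultiMetric]


def max_error(labels: Array, preds: Array) -> EvalResult:
    """Always returns a single tuple."""
    return "max_error", max(abs(y - p) for y, p in zip(labels, preds)), False


def min_and_max_error(labels: Array, preds: Array) -> List[EvalResult]:
    """Always returns a list of tuples."""
    errors = [abs(y - p) for y, p in zip(labels, preds)]
    return [("min_error", min(errors), False), ("max_error", max(errors), False)]


def run_metric(func: MetricFunction, labels: Array, preds: Array) -> List[EvalResult]:
    """Normalize both return shapes into a list of results."""
    result = func(labels, preds)
    if isinstance(result, list):
        return result
    return [result]
```

Each function commits to one return shape, which is the usage pattern described above, while the caller still handles both.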

jameslamb · 2023-03-19T04:33:27Z

python-package/lightgbm/sklearn.py

        elif argc == 4:
-            grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group())
+            grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group())  # type: ignore [call-arg]


mypy isn't able to break apart the individual Union items based on how many arguments inspect.signature() returns. This PR proposes just ignoring errors like this:

sklearn.py:132: error: Too many arguments [call-arg]
sklearn.py:132: error: Too few arguments [call-arg]
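The pattern being described can be sketched roughly as follows. This is a simplified, hypothetical reduction, not the wrapper in sklearn.py: `Array` stands in for `np.ndarray`, and `ObjectiveFunction`, `call_objective`, and the two example objectives are made-up names.

```python
import inspect
from typing import Callable, List, Optional, Tuple, Union

Array = List[float]  # stand-in for np.ndarray (assumption)
GradHess = Tuple[Array, Array]

# A Union of callables that differ only in how many arguments they take.
ObjectiveFunction = Union[
    Callable[[Optional[Array], Array], GradHess],
    Callable[[Optional[Array], Array, Optional[Array]], GradHess],
]


def call_objective(
    func: ObjectiveFunction,
    labels: Optional[Array],
    preds: Array,
    weight: Optional[Array],
) -> GradHess:
    """Dispatch on the number of parameters found at runtime."""
    argc = len(inspect.signature(func).parameters)
    # A static checker cannot use this runtime count to pick a Union member,
    # so each call is checked against every member and needs an ignore.
    if argc == 2:
        return func(labels, preds)  # type: ignore[call-arg]
    if argc == 3:
        return func(labels, preds, weight)  # type: ignore[call-arg]
    raise TypeError(f"objective should take 2 or 3 arguments, got {argc}")


def squared_error(labels: Optional[Array], preds: Array) -> GradHess:
    assert labels is not None
    return [p - y for y, p in zip(labels, preds)], [1.0] * len(preds)


def weighted_squared_error(
    labels: Optional[Array], preds: Array, weight: Optional[Array]
) -> GradHess:
    assert labels is not None and weight is not None
    grad = [w * (p - y) for y, p, w in zip(labels, preds, weight)]
    return grad, list(weight)
```

The ignore comments only affect static checking; at runtime the `argc` branch picks the call that matches the function's real signature.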

jameslamb · 2023-03-19T04:37:59Z

python-package/lightgbm/sklearn.py

@@ -811,7 +829,7 @@ def _get_meta_data(collection, name, i):
            num_boost_round=self.n_estimators,
            valid_sets=valid_sets,
            valid_names=eval_names,
-            feval=eval_metrics_callable,
+            feval=eval_metrics_callable,  # type: ignore[arg-type]


This # type: ignore fixes this error:

Argument "feval" to "train" has incompatible type "List[_EvalFunctionWrapper]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]" [arg-type]
sklearn.py:814: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
sklearn.py:814: note: Consider using "Sequence" instead, which is covariant

I'm not sure why, but mypy isn't able to figure out that _EvalFunctionWrapper is actually a Callable[[ndarray, Dataset], Tuple[str, float, bool]].

I think that's maybe related to some issues it has with comparing lists to union types including lists? e.g. python/mypy#6463

This PR proposes just ignoring that error for now... the use of custom eval metric functions is well-covered by the unit tests in test_sklearn.py and test_dask.py.
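For context on the invariance note in that error, here is a hedged sketch of the alternative mypy hints at. It is not what this PR does (the PR ignores the error), and it may not fully explain the behavior seen here. `MetricWrapper` is a hypothetical callable class only loosely analogous to `_EvalFunctionWrapper`, and `evaluate_all` is a made-up helper.

```python
from typing import Callable, List, Sequence, Tuple

EvalResult = Tuple[str, float, bool]  # (metric name, value, is_higher_better)
MetricFunction = Callable[[List[float], List[float]], EvalResult]


class MetricWrapper:
    """Hypothetical callable class that satisfies MetricFunction structurally."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __call__(self, labels: List[float], preds: List[float]) -> EvalResult:
        return self.name, sum(abs(y - p) for y, p in zip(labels, preds)), False


def evaluate_all(
    funcs: Sequence[MetricFunction], labels: List[float], preds: List[float]
) -> List[EvalResult]:
    # Sequence is covariant, so a List[MetricWrapper] is accepted here.
    # With `funcs: List[MetricFunction]`, a variable declared as
    # List[MetricWrapper] would be rejected because List is invariant.
    return [func(labels, preds) for func in funcs]


wrappers: List[MetricWrapper] = [MetricWrapper("first"), MetricWrapper("second")]
results = evaluate_all(wrappers, [1.0, 2.0], [1.5, 2.5])
```

Whether switching a public parameter like `feval` to `Sequence` is appropriate is a separate API question; this only shows why the two container types behave differently under a type checker.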

jameslamb · 2023-03-19T04:39:15Z

I was looking into some other mypy errors tonight and they required touching the same type annotations as this PR I'd opened a few days ago. So I just pushed those changes here, in d0d8d9e.

This is ready for review.

github-actions · 2023-08-15T20:17:34Z

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

jameslamb added 2 commits March 16, 2023 23:05

[python-package] ignore mypy errors about custom eval and metric functions

8718901

fix hints

e1146e6

jameslamb added awaiting review maintenance labels Mar 17, 2023

jameslamb requested review from StrikerRUS, shiyu1994 and jmoralez as code owners March 17, 2023 04:21

jameslamb requested a review from guolinke March 19, 2023 02:01

fix issues with Union types

d0d8d9e

jameslamb commented Mar 19, 2023


jameslamb changed the title ~~[python-package] ignore mypy errors about custom eval and metric functions~~ [python-package] fix mypy errors about custom eval and metric functions Mar 19, 2023

Merge branch 'master' into ci/mypy-sklearn-custom-functions

836c339

guolinke approved these changes Mar 30, 2023


jameslamb merged commit a528598 into master Mar 30, 2023

jameslamb deleted the ci/mypy-sklearn-custom-functions branch March 30, 2023 02:47

jameslamb removed the awaiting review label Mar 30, 2023

github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
