[python-package] fix mypy errors about custom eval and metric functions #5790
Conversation
```diff
 Callable[
-    [np.ndarray, np.ndarray],
+    [Optional[np.ndarray], np.ndarray],
```
Why add all these `Optional`s?
`mypy` is struggling with the fact that `Dataset.get_field()` can return `None`.
In `LightGBM/python-package/lightgbm/basic.py` (line 2704 in `2fe2bf0`):

```python
def get_label(self) -> Optional[np.ndarray]:
```

In `LightGBM/python-package/lightgbm/sklearn.py` (line 127 in `2fe2bf0`):

```python
labels = dataset.get_label()
```

In `LightGBM/python-package/lightgbm/basic.py` (line 2393 in `2fe2bf0`):

```python
def get_field(self, field_name: str) -> Optional[np.ndarray]:
```
This PR proposes updating the type hints for custom metric and objective functions to match that behavior.

I intentionally chose not to update the user-facing docs about custom metric and objective functions to reflect that the `label`, `group`, and `weights` passed to these functions can technically be `None`... in almost all situations, they should be non-`None`. I don't think complicating the docs is worth it.
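As a concrete illustration of why the `Optional` matters, here is a minimal sketch of a custom objective whose `labels` parameter is typed `Optional[np.ndarray]` and guarded at runtime. The function name and body are hypothetical, not code from this PR:

```python
from typing import Optional, Tuple

import numpy as np


def squared_error_objective(
    labels: Optional[np.ndarray], preds: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical custom objective: gradient and hessian of squared error.

    labels is Optional because Dataset.get_label() can return None,
    even though in practice it is almost always a non-None array.
    """
    if labels is None:
        raise ValueError("labels must be set on the Dataset")
    grad = preds - labels          # d(loss)/d(pred)
    hess = np.ones_like(preds)     # d2(loss)/d(pred)2
    return grad, hess
```

With the guard in place, the body type-checks: after the `None` check, `mypy` narrows `labels` to `np.ndarray`.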
```python
Callable[
    [Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
    _LGBM_EvalFunctionResultType
],
Callable[
    [Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
    List[_LGBM_EvalFunctionResultType]
]
```
Splitting cases like

```python
Callable[
    [np.ndarray, np.ndarray],
    Union[_LGBM_EvalFunctionResultType, List[_LGBM_EvalFunctionResultType]]
]
```

into

```python
Union[
    Callable[
        [np.ndarray, np.ndarray],
        _LGBM_EvalFunctionResultType
    ],
    Callable[
        [np.ndarray, np.ndarray],
        List[_LGBM_EvalFunctionResultType]
    ]
]
```

helps `mypy` with errors like this:

```
Argument 1 to "eval_valid" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]" [arg-type]
```
And I think it's slightly more correct... I'd expect people to provide a custom metric function that returns a single tuple on every iteration or one that returns a list of tuples on each iteration, but not one that could return either of those.
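A minimal sketch of that split form, with a toy metric committing to exactly one of the two return types (the `rmse_metric` helper is hypothetical, not code from this PR):

```python
from typing import Callable, List, Tuple, Union

import numpy as np

_EvalResult = Tuple[str, float, bool]

# The "split" form: a custom metric either always returns a single tuple,
# or always returns a list of tuples -- never sometimes one, sometimes the other.
_EvalFunction = Union[
    Callable[[np.ndarray, np.ndarray], _EvalResult],
    Callable[[np.ndarray, np.ndarray], List[_EvalResult]],
]


def rmse_metric(preds: np.ndarray, labels: np.ndarray) -> _EvalResult:
    """Toy metric that always returns a single (name, value, is_higher_better) tuple."""
    return ("rmse", float(np.sqrt(np.mean((preds - labels) ** 2))), False)


feval: _EvalFunction = rmse_metric  # matches the first Union member exactly
```

Because `rmse_metric` is annotated with a single concrete return type, `mypy` can match it against one `Callable` in the `Union` instead of having to prove it satisfies a `Callable` returning a `Union`.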
```diff
 elif argc == 4:
-    grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group())
+    grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group())  # type: ignore [call-arg]
```
`mypy` isn't able to break apart the individual `Union` items based on how many arguments `inspect.signature()` reports. This PR proposes just ignoring errors like this:

```
sklearn.py:132: error: Too many arguments [call-arg]
sklearn.py:132: error: Too few arguments [call-arg]
```
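The arity-based dispatch being described can be sketched like this (hypothetical names, assuming the wrapper counts parameters with `inspect.signature()`; this is not the PR's actual code):

```python
import inspect
from typing import Any, Callable


def call_user_func(func: Callable[..., Any], labels, preds, weight=None, group=None):
    """Dispatch to a user-supplied function based on how many parameters it takes.

    This is exactly the kind of runtime narrowing mypy cannot follow: it sees a
    Union of Callables with different arities and flags both branches, so the
    real code silences it with "# type: ignore[call-arg]".
    """
    argc = len(inspect.signature(func).parameters)
    if argc == 2:
        return func(labels, preds)
    elif argc == 4:
        return func(labels, preds, weight, group)
    raise TypeError(f"custom function must take 2 or 4 arguments, got {argc}")
```

At runtime the branch taken is always consistent with the function's signature, which is why the `# type: ignore` is safe in practice.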
```diff
@@ -811,7 +829,7 @@ def _get_meta_data(collection, name, i):
     num_boost_round=self.n_estimators,
     valid_sets=valid_sets,
     valid_names=eval_names,
-    feval=eval_metrics_callable,
+    feval=eval_metrics_callable,  # type: ignore[arg-type]
```
This `# type: ignore` fixes this error:

```
Argument "feval" to "train" has incompatible type "List[_EvalFunctionWrapper]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]" [arg-type]
sklearn.py:814: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
sklearn.py:814: note: Consider using "Sequence" instead, which is covariant
```

I'm not sure why, but `mypy` isn't able to figure out that `_EvalFunctionWrapper` is actually a `Callable[[ndarray, Dataset], Tuple[str, float, bool]]`. I think that's maybe related to some issues it has comparing lists to union types that include lists, e.g. python/mypy#6463.

This PR proposes just ignoring that error for now... the use of custom eval metric functions is well covered by the unit tests in `test_sklearn.py` and `test_dask.py`.
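The invariance issue `mypy` points at can be reproduced in miniature with toy types (these are not LightGBM's types): a `Sequence`-typed parameter accepts a `List[int]`, whereas a `List[Union[int, str]]`-typed parameter would reject the very same value.

```python
from typing import List, Sequence, Union


def count_items_seq(items: Sequence[Union[int, str]]) -> int:
    """Accepts List[int]: Sequence is covariant in its element type."""
    return len(items)


def count_items_list(items: List[Union[int, str]]) -> int:
    """Would be rejected by mypy for a List[int] argument: List is invariant.

    The reasoning: a function holding a List[Union[int, str]] could append a
    str, which would corrupt the caller's List[int]. Sequence has no mutators,
    so the covariant reading is safe.
    """
    return len(items)


ints: List[int] = [1, 2, 3]
n = count_items_seq(ints)       # OK under mypy
m = count_items_list(ints)      # mypy: incompatible type "List[int]" [arg-type]
```

Both calls succeed at runtime; only the static check differs, which mirrors why the `feval` error is safe to ignore.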
I was looking into some other … This is ready for review.
This pull request has been automatically locked because there has not been any recent activity since it was closed.
Contributes to #3867. Fixes the following `mypy` errors: