-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STYLE: fix pylint unneeded-not warnings #49382
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Moisan, nice catch disabling those purposeful not
usages instead of removing!
assert not p1 == p3 | ||
assert not p1 == p3 # pylint: disable=unneeded-not | ||
assert hash(p1) == hash(p2) | ||
assert not hash(p1) == hash(p3) | ||
assert hash(p1) != hash(p3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just for my understanding, why does the first one need the check turning off, but the second one can be changed to LSH != RHS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have been a bit too quick on that one. The first one seems to be a bug either on Pandas or Pyarrow side:
>>> p1 = ArrowIntervalType(pa.int64(), "left")
>>> p3 = ArrowIntervalType(pa.int64(), "right")
>>> p1 == p3
False
>>> p1 != p3
False
>>> p1.equals(p3)
False
I looked trough the ArrowIntervalType
implementation and the pyarrow.ExtensionType
source, and I cannot pinpoint why p1 != p3
would return False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 @jorisvandenbossche is this a bug? Does ArrowIntervalType
need something like
def __ne__(self, other):
return not self.__eq__(other)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @WillAyd in case you have comments on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that seems a bug in ArrowIntervalType, and probably needs such a __ne__
(for example the base ExtensionDtype also has such a method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea very interesting. My guess is that this stems from pyarrow somewhere
>>> getattr(pandas.core.arrays.arrow.extension_types.ArrowIntervalType, "__ne__")
<slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects>
If you traverse the mro its interesting to see this change through the inheritance hierarchy:
for cls in pandas.core.arrays.arrow.extension_types.ArrowIntervalType.__mro__:
print(cls, getattr(cls, '__ne__'))
<class 'pandas.core.arrays.arrow.extension_types.ArrowIntervalType'> <slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects>
<class 'pyarrow.lib.ExtensionType'> <slot wrapper '__ne__' of 'pyarrow.lib.ExtensionType' objects>
<class 'pyarrow.lib.BaseExtensionType'> <slot wrapper '__ne__' of 'pyarrow.lib.DataType' objects>
<class 'pyarrow.lib.DataType'> <slot wrapper '__ne__' of 'pyarrow.lib.DataType' objects>
<class 'pyarrow.lib._Weakrefable'> <slot wrapper '__ne__' of 'object' objects>
<class 'object'> <slot wrapper '__ne__' of 'object' objects>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly enough I didn't see anything in Arrow's build system that would set the language_level for the Cython extension to 3. We do this in setup.py, I think they would ideally do it in their CMake file.
So if its a matter of 2/3 compat with the generated source file could be a nice upstream patch:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said for the scope of this PR I think should just ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly enough I didn't see anything in Arrow's build system that would set the language_level for the Cython extension to 3.
I don't think that matters; if you run Python 3, it has the effect that the default __ne__
will use __eq__
automatically.
But what is happening here is that pyarrow.lib.ExtensionType is a cython class that implements __eq__
, and cython defaults to implement a richcompare
slot that also adds __ne__
based on the __eq__
of that class. So just like when subclassing a builtin, if you override __eq__
you also need to override __ne__
when subclassing a cython class with a __eq__
.
This also happens in pandas, for example we have PeriodDtypeBase in cython with __eq__
, and that initially caused the same issue (#37265), and that was fixed by adding __ne__
in the python subclass
Now, we could maybe fix this on the PyArrow side as well by adding an explicit __ne__
that is defined to be return not self == other
, which might ensure that subclasses would only need to override __eq__
, because the parent __ne__
might then work correctly (something to test).
(but regardless of that we should fix this on the pandas side)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your investigations
If this can be resolved as simply as adding
def __ne__(self, other):
return not self.__eq__(other)
(with type annotations), then I'd suggest we do it directly in this PR
The primary motivation for using linters and all it to help catch issues, and I'm glad pylint
has helped find this one - I'd suggest it just be fixed in this PR
assert not p1 == p3 | ||
assert not p1 == p3 # pylint: disable=unneeded-not | ||
assert hash(p1) == hash(p2) | ||
assert not hash(p1) == hash(p3) | ||
assert hash(p1) != hash(p3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @WillAyd in case you have comments on this
assert not p1 == p3 | ||
assert not p1 == p3 # pylint: disable=unneeded-not | ||
assert hash(p1) == hash(p2) | ||
assert not hash(p1) == hash(p3) | ||
assert hash(p1) != hash(p3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your investigations
If this can be resolved as simply as adding
def __ne__(self, other):
return not self.__eq__(other)
(with type annotations), then I'd suggest we do it directly in this PR
The primary motivation for using linters and all it to help catch issues, and I'm glad pylint
has helped find this one - I'd suggest it just be fixed in this PR
549c23d
to
89185f5
Compare
@@ -35,6 +35,9 @@ def __eq__(self, other): | |||
else: | |||
return NotImplemented | |||
|
|||
def __ne__(self, other) -> bool: | |||
return not self.__eq__(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return not self.__eq__(other) | |
return not self == other |
Based on the links I read about this, they recommended using this instead of __eq__
(https://stackoverflow.com/questions/4352244/should-ne-be-implemented-as-the-negation-of-eq)
@@ -91,6 +94,9 @@ def __eq__(self, other): | |||
else: | |||
return NotImplemented | |||
|
|||
def __ne__(self, other) -> bool: | |||
return not self.__eq__(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return not self.__eq__(other) | |
return not self == other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me pending approval from Will and Joris, thanks @Moisan !
Awesome, thanks again @Moisan |
* STYLE: fix pylint unneeded-not warnings * fixup! STYLE: fix pylint unneeded-not warnings * Add __ne__ method to ArrowPeriodType and ArrowIntervalType * fixup! Add __ne__ method to ArrowPeriodType and ArrowIntervalType
* STYLE: fix pylint unneeded-not warnings * fixup! STYLE: fix pylint unneeded-not warnings * Add __ne__ method to ArrowPeriodType and ArrowIntervalType * fixup! Add __ne__ method to ArrowPeriodType and ArrowIntervalType
Related to #48855
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.I added
# pylint: disable=unneeded-not
for tests where the intention was clearly to test thenot
operator.