-
-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concat
of types Iterable[Any]
cause Pyright to think subsequent code is unreachable because of recent stubs change
#888
Comments
Interesting. We were trying to prevent Asking @twoertwein to look at this. |
Note that returning |
In this case, if you do If we move that overload down, then it will never match, because The main issue here is the "wildcard" nature of |
I'd rather say the main issue here is that you're trying to support a case (i.e. all Instead when one omits type annotations from function returns it seems that many type checkers will implicitly consider the return type to be Given type annotations are purely optional, I'd say no combination of type checkers and stubs should ever declare code unreachable when no type annotations are present and no obvious inference of unreachability can be made (like e.g. |
This scenario may be more interesting than explicit pd.concat([])
print("Unreachable?!") FWIW, Pylance had two issues filed on this (one with 5 up votes) within about 12 hours of shipping this change. |
I think that is debatable. If the type checker can find the error before getting to runtime, that's a good thing.
Yes, that's true, but the question is whether our goal is to support well-typed code that calls pandas, or untyped code that calls pandas. We've chosen the former from a philosophical point of view.
The unreachable conclusion made by The python typing page says the following about
That's exactly what we used it for. |
And that code IS unreachable!! If you try to execute
|
Ah, ok. I thought the exception was only thrown if all of the collection's (one or more) elements were
I believe the "never be called" case is talking about using |
Apologies for my ignorance, but shouldn't untyped code be considered the default case? Unless these stubs are widely used by different IDEs and type checkers then what is the point of making them? And unless the latter don't behave sanely and usefuly for the base-case of untyped code, then what's the point of those in turn? My point being, again, it seems to me that you are prioritising here a very particular edge case purely for the sake of prioritising that niche case. I on the other hand have work to do, and I need a solid cross-OS IDE for various languages and in particular Python. Apologies for the likely palpable frustration on my side, but I hope I'm making it clear where I'm coming from on this. You said you guys made a philosophical choice, I'm instead urging you to consider pragmatism. |
That's unclear. See https://mypy.readthedocs.io/en/latest/more_types.html#the-noreturn-type with respect to So our interpretation of |
No!!! When my team had a large untyped code, and we then converted it to have types, the static typing checks actually found bugs that we didn't know existed. Here's a really simple example: s = pd.Series([3.4, 2.3])
r: Any = 3.2
s.round(r) This code will fail, because the argument for If you remove the The problem with using |
@torext I cannot reproduce your error on main with the latest pyright (even when adding |
I understand that the If it is "only" a non-default issue, I would expect that people who care about type checks (which is persumably why they enabled a non-default setting) to also minimize the number of If it is an issue in the default configuration, we should change it. |
Pyright reports unreachability as a diagnostic hint. In VS Code that causes the unreachable code to be dimmed. You won't see a diagnostic in the Problems pane. You'll see this even with |
Thank you for the explanation - I'm using pyright on the command line. So, is it an issue in PyLance default configuration? |
It's an issue in the default configurations of both Pylance and Pyright (when run as a language server). |
How does pyright/pylance handle "Unknown" in this case? Is it also flagged as unreachable? I think Assuming Unknown triggers the unreachable diagnostic: I'm open to removing the
Is there an option to get these warnings when running Pyright outside an editor? If so, we can enable that on the CI. |
It would be better if the illegal cases could be reported as errors by a type checker. Returning Unfortunately, I don't think the Python typing system currently has a way to represent the concept of an @erictraut might have ideas.
No. Currently they're filtered out. If you think it's important, you could file an enhancement request. |
This is a problem with pylance. See microsoft/pylance-release#5640 (comment) for a simple example that illustrates the issue on code that has nothing to do with I'm closing this issue here. |
The issue you created (microsoft/pylance-release#5640 (comment)) doesn't address the bug I'm reporting here! I'm reporting here an issue where I don't have type checking set to off in Pylance and I don't plan to. I want some basic type checking and stubs that do the right thing, even if there's some |
This is what I call the "Goldilocks problem." If the stubs are too wide, then you don't catch errors in typical code. If the stubs are too narrow, then the type checker flags errors on valid code. So we try to find something that is "just right", i.e., something in the middle that will catch the most typical use cases. You're arguing that the stub in this case is too narrow. I think it is better that we don't accept untyped arguments, which IMHO, type stubs are there to help people who have declared types in their code. |
One other point here. The type checker ( |
Alright, I give up. I'm thoroughly convinced at this point that I've made a clear and strong argument here that you guys are far from "trying to find something that is 'just right'" and rather obsessing over a narrow corner-case in a way that provably isn't helping anyone in this particular instance. Rationalize it all you will, fact is people are having issues because of your changes and are hacking the stubs locally to circumvent them. Also the fact remains that logically what you did is plain wrong: it's not true that |
It only returns something if you have Consider the following code: from typing import Iterable, Any
import pandas as pd
def foo() -> Iterable[Any]:
return [3, 4, 5]
pd.concat(foo())
print("unreachable") Both From a type checking perspective, we've done the right thing, because type checkers don't report unreachable code. VS Code does. That's an issue with that tool, not the type stubs.
No, pyright does NOT report that. If you run the code with command line pyright, no errors are reported. The issue here is that within VS Code, pyright is configured differently and that's why I opened up the issue there. |
Consider the following code: # pyright: basic
from typing import Iterable, Any
import pandas as pd
def foo() -> Iterable[Any]:
return [3, 4, 5]
x: int = "str"
pd.concat(foo())
y: int = "str" Although pyright (on the command line) does not report that the last line is unreachable, it also doesn't report that there's a typing error there -- pyright will not report diagnostics within unreachable code. Pyright run on the code above emits this:
Whereas, if the
So the call to I'd argue that the |
That's an interesting example, but it points out that it is the VSCode/pylance integration that is problematic.
That's only true if you do it within VS Code! Which is my point - the reachability analysis in VS Code is what is problematic. From the command line on the above code for both
Reopening to see if @twoertwein has any further thoughts on this. |
Hmm, that's not what I'm seeing. I'm guessing you ran pyright without access to the pandas stubs? Or with old stubs? I copied the diagnostic output above from runs of pyright on the command line. VS Code was not involved.
|
I ran with the released version of Let's see what @twoertwein has to say. |
I think pandas-dev/pandas#57846 would be the ideal solution as it would avoid the false-negatives and false-positives mentioned in this issue: the As long as there is no pushback at pandas, I wouldn't mind making the change here at pandas-stubs before pandas releases a new version that has this change |
Or we can go ahead and make the change to not accept |
That's because mypy is evaluating from typing import Iterable, Any
import pandas as pd
#
# concat with Iterable[Any]
#
def foo() -> Iterable[Any]:
return [3, 4, 5]
x = pd.concat(foo())
reveal_type(x) # Any
#
# concat with Iterable[None]
#
reveal_type(pd.concat([None])) # Never
y: Any = pd.concat([None])
reveal_type(y) # Mypy never reveals this type
#
# Obvious error
#
z: int = "str" Mypy emits the following. Note that there's no error for the last line and no result for
But if you comment out this part: #
# concat with Iterable[None]
#
reveal_type(pd.concat([None])) # Never
y: Any = pd.concat([None])
reveal_type(y) # Mypy never reveals this type ...you'll get:
|
For those following this thread, I found a fix in #890 that supports our use case (disallow |
Sorry, I know that this is closed, but, I want to understand, so... Why is it important to explicitly create an overloaded function signature for the case the caller provides the wrong input types? To make a contrived example, why not also have one for |
Ok, I'm with you. Giving type hints allow type checkers to see if you have given an incorrect type. If you have labelled your types, and they don't match, then the type checker flags it and an editor like VSCode will show it with a red squiggly line. Cool, good, normal type checking.
Sure, makes sense. Whichever signature matches an Any case is the default when type checking is off. If your Any case says it will Never return, and you get a false positive (incorrectly state there will be a problem). I think I understand what you are showing me. But, my question was actually maybe a little more about the design philosophy. If type checkers can already tell you that the types don't match any of the signatures, then why is there a signature that catches the case that it would not match any of the other signatures? Obviously incorrect types will cause problems; that's why type checkers flag it as a problem. Or to put it another way: what is the situation where a type checker telling you there's an error in your program is insufficient? |
This more relates to what we are trying to do for We want to accept So to put it another way, it is acceptable to pass a list where you mix If we only had
I don't understand what you mean by "insufficient". My team uses the stubs for pandas (and other libraries) to help us flag incorrect code. There are occasions where the code is correct, and the type checker will say it isn't. That's usually an issue with the stubs, and in some cases it is because the type of the object cannot be fully inferred. For example, if you have a There are also occasions where the type checker will say the code is correct, but the code will fail at runtime. In those cases, it's either because variables of our program don't have all the types set up right (or a variable is inferred as A good example is with parameters that need to be a sequence of strings. If we declare the type of a parameter as |
It seems like you are really using the typing system to do a lot!
Ah, I see. My initial instinct tells me this should be a runtime check instead. But, of course, if you can do it with typing, why not do it with typing? I misunderstood the purpose of the signature, and thus, this realisation answers my query. Although, from my understanding of inferred types, this really only covers the case that the user creates a list that is guaranteed to only contain Nones. Which is a pretty weird edge case to check for. But, whatever, if you detect can such subcases as being problematic in typing, then why not? Side note: I actually would have expected passing
If I recall correctly, those are called "runtime errors". (Sorry, I had to. I do appreciate you answering my questions, but this was too good an opportunity) Anyway, I see what you mean that it is an issue with pylance. Either the argument is strictly an Thanks for answering my questions. I learned something! |
I have discovered lots of bugs by converting untyped code to typed code. In addition, turning on type checking has prevented runtime errors from occurring.
It is a runtime check in pandas.
It's more that if we can catch it with typing, then we can avoid the runtime error from occurring.
Agreed, but it is what the pandas source does.
You could contribute to the discussion at pandas-dev/pandas#57846 . It's not clear why the design of the pandas API accepts
A rose is a rose by any other name!
You're welcome. |
The fix took care of most of the concat calls in my codebase (thanks!), but there's one stubborn case in one of my files for for an import pandas as pd
# no param
def foo():
return pd.DataFrame()
df_acc = None
df_new = foo()
pd.concat([df_acc, df_new])
print("reachable")
# the param makes the second concat call unreachable if the return is stored
def foo2(df):
return pd.DataFrame()
pd.concat([df_acc, foo2()]) # direct use works
print("reachable")
# the param makes the second concat call unreachable
df_new = foo2()
pd.concat([df_acc, df_new]) # var use breaks
print("unreachable") I'll open a new issue. |
Describe the bug
In the following code the last line is wrongly considered to be unreachable by Pyright:
This is caused by the recent change #858 adding a stub returning
Never
. This is presumably a misuse ofNever
or the stub as a whole should be moved to the very end so thatconcat
on typesIterable[Any]
can first match a stub with valid return.To Reproduce
See microsoft/pylance-release#5631 and microsoft/pylance-release#5630 for issues reported on the Pylance repo.
Please complete the following information:
pandas-stubs
: the one shipped with above Pylance versionThe text was updated successfully, but these errors were encountered: