-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
An experiment in disallowing expression statements (or "unused values"), must_use/MustUse etc. #8951
Comments
One more data perspective: there are ~21.5k
So those 409 "unused function call values" are ~1.87% of all function calls returning something non- |
This is a lot of work, nice. I have a couple questions but don't take this as me trying to knock what you've done, that's not my intention at all I've not used Rust so I'm having trouble imagining a case where you're required to use the return value of a function. I can imagine something like returning a file handler and then needing to close it after, but that's usually handled with Also, does this make sense to have as a feature in mypy or would it be better to use in a general purpose linter, where you check if a var goes unused at all? For example, in Go, unused variables are not allowed, whether that be from a return or anything else. package main
func fn() string {
return "foo"
}
func main() {
result := fn()
}
// $ go run main.go
// # command-line-arguments
// ./main.go:8:2: result declared but not used In other words, is it worth implementing a check to selectively flag unused vars over just checking everything? |
Hey, sure, that's fair and I totally understand the questions, even more so considering how controversial I expect the topic to be (although maybe I overestimated how controversial, since yours is the only response so far – should I post this somewhere else? I still feel like this is more linter-territory than typing-territory). So, about the questions at hand:
My motivation is that the more I think about it the more cases I see where, if one doesn't use a value returned by an expression (a function call in particular) it's an indicator of a programming error and Mypy (and other linters) are all about helping to eliminate those. Let's say you divide all functions into three categories:
Now, the first class of functions can be ignored here, since they don't return anything. :) The second class is almost certainly an error if you ignore the values they return, for example all (I think?) The third class is tricky and we have two subclasses: 3a. Functions that return an "important" value, like So, to summarize, my claim is this: the majority of values returned from functions (either considering the number of unique functions returning something or the number of function calls) belong to classes 2 and 3a, so it makes sense for linters to complain about ignored return values as they indicate programming errors. Now: unfortunately there's no way for linters to distinguish between classes 3a and 3b currently and ideally there would be a way to opt out of this on a per-function basis, like: class list[T]:
def pop(self, index) -> Ignorable[T]:
... without this we have false-positives like I listed in the original post.
(...)
Linters already complain about unused variables and that's fine, I think this should be kept together with something like the error I propose here. If you have an unused variable at least you can see that a function you use return something. If you just call name = "jonathan"
name.capitalize()
print(name) Flake8 will be very happy with this – no unused variables after all – yet the code contains a programming error. Since non-type-checking linters can't know if a function returns something this needs to be checked in a static type checker. Also: I'd think assigning to name = "Kółko"
# Just verifying the name is ASCII-only, we don't care about the result
_ = name.encode('ascii') |
Thanks for the reply. The capitalization example would make sense for this since, like you said, it has no side effects and relies solely on the return value. So if you're expecting it to mutate the string that is in fact a programming error. I understand the use-case but I keep coming back to the idea that, if a function returns something, it's assumed you'll use it. Hence, any function you don't use the return value of would potentially be a programming error. The set of functions that have a so-called "optional return value" is not very large. It's safe to assume the vast majority of returning functions have a value which is intended to be used. If you were going to flag functions with something about return value usage, it would make more sense to flag the functions whose return value can be ignored since they are the minority. I guess this brings me back to my original question about just checking all returns. If you're not assigning or otherwise using a return value in some way, that is almost certainly an overlooked error. If you as the programmer know what you're doing you can silence the error with Making it a general purpose linter check would also mean it's opt-out not opt-in (assuming you enable the check). The problem with making it opt-in with some sort of flag is that:
It has some practical uses and looks good on paper but that doesn't mean much if people don't use it which is my concern. |
I think it's an interesting proposal and a nice use of In Rust, afaik, flagging something as must_use is opt-in. Simple mistakes like calling a function that is immutable without using its result I know it seems like it might take awhile, but it might be an idea and less controversial to instead propose an opt-in MustUse annotation and push for getting the relevant functions in typeshed annotated with this. Not sure if https://github.com/python/typing/issues is a better place to discuss this though. |
@Kangaroux I feel like we almost totally agree but for some reason you think we don't – I'd like to understand the confusion. What exactly do you mean by "general purpose linter check"? Maybe we're using different words for the same thing here. :) @syastrov I considered opt-in approach but I did some back of the envelope mental calculations and ended up an estimation of 95+% functions having that
So we have 54 relevant functions. I just went through all of them and there are only 2 which return something that can be ignored and it's not a programing error, so that's 2 out of 54 = 3.7% false positive. But by looking at this I discovered that the fluent API that's exposed by those functions is not used anyway, so those return values can be safely removed. So for I expect my estimate (95+% of non-special functions returning something relevant as long as they return anything) to be reasonably accurate and, considering this, I don't believe in the opt-in way here (opt-in as in Mypy strictness flag – yes; opt-in function marker – no) so I can't really be the one to push things in this direction. |
What I mean by general purpose linter would be something like PyLint or flake8. Something that could be applicable to all Python code, not just projects that use mypy |
I would love to see this functionality in mypy. It's one of the bigger concerns I have with unchecked corners of the language for the reasons @jstasiak has listed in previous comments. If I wanted to dive into the codebase to attempt an implementation, where should I start? Would a new (off by default) option be accepted? I may be way out of my depth in trying to implement this but I do think it's an important feature to have. |
I don't think implementing this is the problem (you can find my experimental implementation here, I used it to generate the stats above), it's rather about getting people on board. I don't know how/whether to proceed here and I'd like some input from the mypy devs, I'm happy to submit this as a pull request (a mypy switch disabled by default) but this will require more work so I want to gather the necessary information first. @ilevkivskyi @JukkaL @gvanrossum do you have any thoughts on this? |
Ah I'm sorry for missing that in your original post @jstasiak - thank you for the pointer. |
If prior art helps at all, pyright added this in 1.1.89 with a |
We have class that is immutable data structure, and its |
Having worked with Rust for a while I grew fond of Rust telling me I'm not using some value returned from a function. After reading #6936 I started thinking about a possible approach to get something similar in Mypy, possibly by adding metadata to return types using PEP 593's
Annotated
:But then I considered it some more and figured out that really the majority of return values from the functions I deal with should be handled one way or another, so it's possible the "must use" behavior should be the default. Well, not necessarily default-default, but gated with a Mypy flag possibly. I implemented a dirty version of this where an error is raised on every expression in statement context (not only function calls), you can find it in the following commit: jstasiak@8e8667b (I'm skipping literal values reporting /because docstrings/, ellipsis (because it can't possibly be an error in this context) and -values).
I also added some logging code in the same branch so I can gather some stats and open discussion on this: jstasiak@d685a01
I ran mypy self-test with this and here are the results:
So ~0.18% of all expressions in Mypy are currently used in statement context. The counts of specific types of expressions are as follows:
The only expression statements in Mypy are call expressions, which isn't surprising:
I log the following data in those cases so we can dig deeper:
Numbers by the function called:
This is clearly wrong because I fail to extract the full names of the majority of the callees here but it's not obvious to me why.
Numbers by the type returned (ignore the string-cutting artifacts in complex types containing whitespace, I just use
cut
here for simplicity):I can't comment on Mypy-specific types, but as for some others:
argparse.Action
is returned by all those argparse calls:and since there are a lot of arguments there's a lot of those false-positives.
stderror.write()
,file.write()
and other similar calls:Those are almost entirely false-positives – all cases I looked at here were text tiles and
TextIO.write()
never performs partial writes so its return value is irrelevant.__init__()
methods returnAny
(?)dict.pop()
andlist.pop()
return values but they're commonly used to just remove values from the collectionsI'm attaching a full report of running modified Mypy on itself for your consideration. After looking at the output of this experiment I'm tempted to claim that instead of having an opt-in way to force use of values it makes more sense to have a flag to make it default and also provide an opt-out mechanism.
report.txt
Edit: by an opt-out mechanism I mean something like:
The text was updated successfully, but these errors were encountered: