-
Notifications
You must be signed in to change notification settings - Fork 103
fix overloads for read_csv and read_table #109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this; I had tried to fix this but similarly scratched my head.
There are a few errors that show when I'm backporting these, noted in the review. We really should get some sort of CI running here.
@@ -147,7 +250,7 @@ def read_table( | |||
date_parser: Optional[Callable] = ..., | |||
dayfirst: bool = ..., | |||
cache_dates: bool = ..., | |||
iterator: bool = ..., | |||
iterator: Literal[True], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
iterator: Literal[False], | ||
chunksize: int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And lastly these two.
iterator: Literal[False], | ||
chunksize: int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here.
@jakebailey - what are you doing to see those error messages? |
I'm just plopping them into our internal tree (i.e. what is copied and turns into Pylance's bundled/stub directory) and the diagnostics show up in VS Code with Pylance enabled; these rules aren't type checking rules, but language rules (the code should crash at runtime when the signature is processed), so there shouldn't be anything needed to see them. |
Let me test and see if they appear if I just open this repo for editing. |
Yeah, it reproduces when you open this repo, though you'll have to disable the |
FWIW I can add the requisite |
Well, adding the @overload
def myfun(
fake: str,
first: float = ...,
i1: Literal[True] = ...,
cs: Optional[int] = ...,
ext: str = ...,
) -> str:
...
@overload
def myfun(
fake: str,
first: float = ...,
i1: Literal[False] = ...,
cs: int = ...,
ext: str = ...,
) -> str:
...
@overload
def myfun(
fake: str,
first: float = ...,
i1: Literal[False] = ...,
cs: None = ...,
ext: str = ...,
) -> int:
...
@overload
def myfun(
fake: str, first: float = ..., i1: bool = ..., cs: int = ..., ext: str = ...
) -> str:
...
def myfun(
fake: str,
first: float = -33.0,
i1: bool = False,
cs: Optional[int] = None,
ext: str = "",
) -> Union[str, int]:
print(f"i1 is {i1} cs is {cs} result is ", end="")
if i1:
if cs is not None:
return "TextFileReader"
else:
return "TextFileReader"
else:
if cs is not None:
return "TextFileReader"
else:
return -1
res1: int = myfun("meh")
res2: int = myfun("meh", i1=False)
res3: int = myfun("meh", i1=False, cs=None)
res4: str = myfun("meh", i1=False, cs=23)
res5: str = myfun("meh", i1=True)
res6: str = myfun("meh", i1=True, cs=None)
res7: str = myfun("meh", i1=True, cs=23)
res8: int = myfun("meh", cs=None)
res9: str = myfun("meh", cs=23) Just by adding the Looking for suggestions on how to handle that. It seems that if the two arguments that control the return type are arguments with default values, and they have other arguments before and after the those 2 arguments, that's where you run into difficulty. |
My assumption is that most people don't use read_csv with 50+ positional arguments; they typically specify the first few arguments positionally and then use keyword arguments for the remainder. If my assumption is correct, then we should model it like that in the overload list. It's important for the overloads that support keyword arguments to return the correct (specific) return type. A final catch-all overload can handle the positional argument case, perhaps with a return type of So my suggestion is to add the |
Actually most people use
Little confused here - above you said add Having said that, since I think your assumption is wrong, does that change things? Or are you saying use 4 overloads, all with a |
Sorry, the markdown processor ate the My assumption is in line with what you said. Place the |
Sorry to misinterpret the original |
Just did a commit 28473c3 that I think works. Thanks for the suggestions @jakebailey and @erictraut . Tested with the following: from io import StringIO
import pandas as pd
from pandas.io.parsers import TextFileReader
dio = StringIO("a,b\n 1,2\n 3,4\n")
df: pd.DataFrame = pd.read_csv(dio)
print(df)
dio.seek(0)
tr: pd.DataFrame = pd.read_csv(dio)
dio.seek(0)
tr2: TextFileReader = pd.read_csv(dio, iterator=True)
dio.seek(0)
tr3: pd.DataFrame = pd.read_csv(dio, iterator=False)
dio.seek(0)
tr7: pd.DataFrame = pd.read_csv(dio, header="infer")
dio.seek(0)
tr8: TextFileReader = pd.read_csv(dio, header="infer", iterator=True)
ftr1: TextFileReader = pd.read_csv("hey.csv")
ftr2: pd.DataFrame = pd.read_csv("hey.csv", iterator=True)
ftr3: TextFileReader = pd.read_csv("hey.csv", iterator=False)
ftr5: pd.DataFrame = pd.read_csv("hey.csv", chunksize=10)
ftr6: TextFileReader = pd.read_csv("he.csv", chunksize=None)
ftr7: TextFileReader = pd.read_csv("hey.csv", header="infer")
ftr8: pd.DataFrame = pd.read_csv("hey.csv", header="infer", iterator=True) The last 7 examples all report incompatible assignments, as expected. |
Thanks, I'll go import it and see. We (luckily) delayed our weekly release, so I should be able to sneak this one in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good to me.
Closes #87 based on suggestions there