DOC: Private-looking symbols in the public API? #56874
Comments
I definitely think these should not be renamed, since users should not call these methods directly; they are called internally instead. So I guess I would support removing them from the API reference, but IMO it would be nice for these methods to have exposure somewhere in the docs, to signal that they are important for EA authors to implement.
I think in other programming languages, this would correspond to a "protected" symbol. Not sure whether there is a PEP for public/protected/private in Python, but I understand now why it would be good to keep them documented.
Should they then be exposed in pandas-stubs? I guess it depends on whether we expect/want EA authors to use pandas or pandas-stubs @Dr-Irv
I would expect that an EA author would use pandas-stubs. You'd want to make sure your EA code and tests of your EA were properly typed. This issue came from a report in pandas-stubs where someone is trying to write an EA using the stubs to support typing. Having said that, one of the issues with the EA interface design is that EA authors have to implement a bunch of methods and some of them have "private" signatures, and others do not. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.api.extensions.ExtensionArray.html#pandas.api.extensions.ExtensionArray Why do we tell EA authors to implement |
Because the public vs. private distinction is for users of the EAs, not the authors of the EAs. Unfortunately, Python only provides public vs. internal indications, but we need three levels: public to the EA user, internal to the EA user but "public" to the EA author, and internal to the EA author.
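To make the three levels concrete, here is a minimal pure-Python sketch. It does not use pandas, and every name in it (`BaseArray`, `UpperStringArray`, `_from_values`, `_normalize`) is hypothetical, chosen only for illustration; it is not the actual EA interface.

```python
from abc import ABC, abstractmethod


class BaseArray(ABC):
    """Hypothetical stand-in for an extension-array base class."""

    def __init__(self, values):
        # Level 3: storage detail, internal to whoever writes the array.
        self._data = list(values)

    # Level 2: "protected" hook. The leading underscore tells array *users*
    # to keep out, but array *authors* must implement it, so it still
    # needs to be documented somewhere.
    @classmethod
    @abstractmethod
    def _from_values(cls, scalars):
        ...

    # Level 1: public to array users. It calls the hook internally.
    def take(self, indices):
        return self._from_values([self._data[i] for i in indices])

    def __len__(self):
        return len(self._data)


class UpperStringArray(BaseArray):
    """Hypothetical array written by a third-party author."""

    @staticmethod
    def _normalize(value):
        # Level 3 again: a helper only this author cares about.
        return str(value).upper()

    @classmethod
    def _from_values(cls, scalars):
        return cls(cls._normalize(s) for s in scalars)


arr = UpperStringArray._from_values(["a", "b", "c"])
taken = arr.take([2, 0])  # the user only ever calls the public method
```

The point of the sketch is that `_from_values` and `_normalize` look identical under Python's naming convention, even though only the first is part of the contract an author has to fulfil.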
We don't want EA users to call |
Maybe I'm just missing something here, but let's say we have the following:
My question is then would there be any reason for Bob to call the methods on the And, if that's the case, is there really an "EA user", especially since EA is documented as an abstract class, which requires you to be an EA Author (like Mary) to use it? |
The common case would probably be that the user would just use the Series methods, yes. But there is the public |
And the underlying EA could also expose additional functionality that isn't directly used by Series/DataFrame.
Yes, but in my example, wouldn't that be part of the documentation of |
Copying that text here, because it is quite relevant (thanks @jbrockmendel ): For historical reasons we've built up an EA namespace without much internal
Thoughts?
I'm giving a reason why a user might need access to the underlying EA via
In both of the proposed conventions, EA authors need to implement |
...and typed.
Pandas version checks
main
Location of the documentation
doc/source/reference/extensions.rst
Documentation problem
doc/source/reference/extensions.rst (and probably some of the other reference pages) documents methods as public even though their names start with a leading _ (which typically indicates a private symbol). I know pandas declares that the documentation dictates what is public (as opposed to typical Python conventions), but it seems very confusing to document private-looking symbols as part of the public API.
xref pandas-dev/pandas-stubs#850
Suggested fix for documentation
It would be good to either
After that, it would be great to have a script in pre-commit to check for private-looking symbols in the public documentation.
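A rough sketch of what such a check could look like, assuming the reference files list one dotted symbol name per line (as in an `autosummary` block). The function name `find_private_looking` and the sample entries are hypothetical; wiring this into pre-commit and choosing which files to scan are left out.

```python
import re

# A line holding only a dotted name whose last component starts with
# exactly one underscore. Dunder names such as __len__ are not flagged.
_PRIVATE_LOOKING = re.compile(r"^\s*((?:\w+\.)*_(?!_)\w*)\s*$")


def find_private_looking(rst_text):
    """Return (line_number, symbol) pairs for private-looking entries."""
    hits = []
    for lineno, line in enumerate(rst_text.splitlines(), start=1):
        match = _PRIVATE_LOOKING.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical reference listing, used only to exercise the check.
sample = """.. autosummary::

   pkg.Array._hook
   pkg.Array.take
   pkg.Array.__len__
"""

for lineno, symbol in find_private_looking(sample):
    print(f"line {lineno}: {symbol} looks private but is documented")
```

A real hook would probably also need an allow-list, since the discussion above suggests some underscore-prefixed methods are documented on purpose.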