Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cut disused recode_encoding logic in _PyBytes_DecodeEscape. #16013

Merged
merged 2 commits into from
Sep 12, 2019

Conversation

gnprice
Copy link
Contributor

@gnprice gnprice commented Sep 12, 2019

All call sites pass NULL for recode_encoding, so this path is
completely untested. That's been true since before Python 3.0.
It adds significant complexity to this logic, so it's best to
take it out.

All call sites now have a literal NULL, and that's been true since
commit 768921c eliminated a conditional (foo ? bar : NULL) at
the call site in Python/ast.c where we're parsing a bytes literal.
But even before then, that condition foo had been a constant
since unadorned string literals started meaning Unicode, in commit
572dbf8 aka v3.0a1~1035 .

The unicode parameter is already unused, so mark it as unused too.
The code that acted on it was also taken out before Python 3.0, in
commit 8d30cc0 aka v3.0a1~1031 .

The function (PyBytes_DecodeEscape) is exposed in the API, but it's
never been documented.

All call sites pass NULL for `recode_encoding`, so this path is
completely untested.  That's been true since before Python 3.0.
It adds significant complexity to this logic, so it's best to
take it out.

All call sites now have a literal NULL, and that's been true since
commit 768921c eliminated a conditional (`foo ? bar : NULL`) at
the call site in Python/ast.c where we're parsing a bytes literal.
But even before then, that condition `foo` had been a constant
since unadorned string literals started meaning Unicode, in commit
572dbf8 aka v3.0a1~1035 .

The `unicode` parameter is already unused, so mark it as unused too.
The code that acted on it was also taken out before Python 3.0, in
commit 8d30cc0 aka v3.0a1~1031 .

The function (PyBytes_DecodeEscape) is exposed in the API, but it's
never been documented.
@benjaminp
Copy link
Contributor

Is there a reason not to eliminate the unicode and recode_encoding parameters from _PyBytes_DecodeEscape completely?

@gpshead gpshead requested a review from benjaminp September 12, 2019 13:07
@gpshead
Copy link
Member

gpshead commented Sep 12, 2019

agreed, this is a static internal only function. just remove the unused parameters.

@gpshead gpshead self-requested a review September 12, 2019 13:09
@benjaminp
Copy link
Contributor

_PyBytes_DecodeEscape isn't static but the internal criteria still applies.

@gnprice
Copy link
Contributor Author

gnprice commented Sep 12, 2019

Thanks both for the reviews!

just remove the unused parameters.

That sounds even better 🙂 The reason I hesitated is that these do appear in Include/bytesobject.h:

PyAPI_FUNC(PyObject *) PyBytes_DecodeEscape(const char *, Py_ssize_t,
                                            const char *, Py_ssize_t,
                                            const char *);
#ifndef Py_LIMITED_API
/* Helper for PyBytes_DecodeEscape that detects invalid escape chars. */
PyAPI_FUNC(PyObject *) _PyBytes_DecodeEscape(const char *, Py_ssize_t,
                                             const char *, Py_ssize_t,
                                             const char *,
                                             const char **);
#endif

and in particular PyBytes_DecodeEscape is outside any #ifndef Py_LIMITED_API block.

I wasn't sure changing the signatures would be acceptable, so I went for the most minimal change.

I'll update this PR to drop these parameters from _PyBytes_DecodeEscape, but leave them on PyBytes_DecodeEscape (I think that's what you're both suggesting.)

What are the criteria for saying an API is internal and can be freely changed? I see this one has two internal-flavored markings: the leading _ and the #ifndef Py_LIMITED_API. Is one of those considered definitive?

@benjaminp
Copy link
Contributor

As described in https://docs.python.org/3/c-api/intro.html#include-files, no extensions should use _Py-prefixed names. Those names are not part of the public C API. (It would be better if installed headers defined only public interfaces, but that's not where we are today.) Py_LIMITED_API refers to a restricted subset of the public C API defined by PEP 384.

@benjaminp benjaminp merged commit 3a4f667 into python:master Sep 12, 2019
@gnprice gnprice deleted the pr-decodeescape-unused branch September 18, 2019 07:45
tacaswell added a commit to tacaswell/typed_ast that referenced this pull request Nov 10, 2019
python/cpython#16013
python/cpython@3a4f667

removed two arguments from _PyBytes_DecodeEscape .

This calls _PyBytes_DecodeEscape with the new signature if the minor
version is 9 or greater.
@gvanrossum
Copy link
Member

Is there a reason not to eliminate the unicode and recode_encoding parameters from _PyBytes_DecodeEscape completely?

Well, it was also called from typed_ast/ast3/Python/ast.c (which is an external backport of ast.c and uses some internals like this one), but we're fixing that now, so no need to roll this back or anything. See python/typed_ast#128

gvanrossum pushed a commit to python/typed_ast that referenced this pull request Nov 11, 2019
This PR and commit to cpython:

python/cpython#16013
python/cpython@3a4f667

removed two arguments from _PyBytes_DecodeEscape .

We now call _PyBytes_DecodeEscape with the new signature if the (CPython) minor
version is 9 or greater.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants