-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make regex methods accept all buffer types as inputs #7158
Make regex methods accept all buffer types as inputs #7158
Conversation
stdlib/mmap.pyi
Outdated
@@ -27,7 +27,7 @@ if sys.platform != "win32": | |||
|
|||
PAGESIZE: int | |||
|
|||
class mmap(AbstractContextManager[mmap], Iterable[int], Sized): | |||
class mmap(AbstractContextManager[mmap], bytearray, Iterable[int], Sized): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, why does bytearray
have to be in this position in the class signature? I initially wanted to place it after Iterable[int]
and before Sized
, but received this error:
~/Development/PoCs/typing
❯ mypy --custom-typeshed-dir ../../OSS/typeshed scratch.py
scratch.py:5: error: "mmap" has no attribute "__enter__"; maybe "__iter__"?
scratch.py:5: error: "mmap" has no attribute "__exit__"
Found 2 errors in 1 file (checked 1 source file)
This comment has been minimized.
This comment has been minimized.
mmap.mmap
extend bytearray
mmap.mmap
extend bytearray
mmap.mmap
extend bytearray
mmap.mmap
extend bytearray
In the past we've had problems whenever we tried to "fudge" types and made them differ from runtime types. I think a better solution is python/typing#593. |
I see that I approved the previous instance of this, but I no longer think it's a good idea to lie in the stubs. The proper solution is python/typing#593. Until that is implemented, we can just use the aliases from https://github.com/python/typeshed/blob/master/stdlib/_typeshed/__init__.pyi#L185. |
@srittau @JelleZijlstra I'll update this PR to use the currently-implemented aliases then. Regarding python/typing#593 (and python/typing#593 (comment)), since I've never participated in any Python maling list discussion nor worked on CPython, it'll probably take me a while until I'll be comfortable with leading the implementation of the proper solution 😬 (Any tips would be much appreciated :)) |
stdlib/mmap.pyi
Outdated
@@ -27,7 +27,7 @@ if sys.platform != "win32": | |||
|
|||
PAGESIZE: int | |||
|
|||
class mmap(AbstractContextManager[mmap], Iterable[int], Sized): | |||
class mmap(AbstractContextManager[mmap], ReadableBuffer, Iterable[int], Sized): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought: how does this even work? ReadableBuffer
is a Union
which includes mmap.mmap
, which leads me to the conclusion that this should result in some sort of infinite type loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: this does lead to an error :) Will probably split up WriteableBuffer
to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add it to the union, not use the union as a base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? mmap.mmap
is already part of WriteableBuffer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. Then we shouldn't change it at all.
The original problem was that mmap wasn't accepted in re.match. If re.match accepts all buffers, then we should change its signature, not anything about mmap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Will look into re.match
. Note that having mmap.mmap
extend ReadOnlyBuffer
does make some sense to me, though you're probably right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon further research, it seems to me like a way to solve this would be to have mmap.mmap
extend ByteString
, and change the definition of AnyStr
to:
AnyStr = TypeVar("AnyStr", str, ByteString)
However, that would require a change in CPython. I could also create an AnyStr
alternative, specific to Pattern
, with this definition. One last option would be to have mmap.mmap
extend bytes
(aka ReadOnlyBuffer
). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not change AnyStr. That would require CPython runtime changes. We could use a separate typevar specific to regexes, but I'd have to look more into it to figure out whether that's feasible.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the definition of AnyStr is not an option.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Cool, I didn't even notice 🧐 Will take a look. Also, defining |
mmap.mmap
extend bytearray
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Note that the PR does not fix the original issue at the moment, as StrOrBuffer = TypeVar("StrOrBuffer", str, ReadableBuffer) However that didn't work for me earlier, so I'll look into it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't exactly describe re's real behavior in a few cases. I think Pattern and Match should continue to be generic over AnyStr, but additional overloads can be added so that e.g. Pattern[bytes].search() accepts other ReadableBuffers.
stdlib/typing.pyi
Outdated
@overload | ||
def group(self, __group: str | int) -> AnyStr | Any: ... | ||
def group(self, __group: str | int) -> StrOrBuffer | Any: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't right: if you match on a memoryview, you get bytes objects from .group(), not memoryviews.
stdlib/re.pyi
Outdated
@@ -52,66 +53,90 @@ if sys.version_info < (3, 7): | |||
_pattern_type: type | |||
|
|||
@overload | |||
def compile(pattern: AnyStr, flags: _FlagsType = ...) -> Pattern[AnyStr]: ... | |||
def compile(pattern: StrOrBuffer, flags: _FlagsType = ...) -> Pattern[StrOrBuffer]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re.compile actually works only on str or bytes
This comment has been minimized.
This comment has been minimized.
Diff from mypy_primer, showing the effect of this PR on open source code: black (https://github.com/psf/black)
+ src/blib2to3/pgen2/literals.py:25:15: error: "Match" expects 2 type arguments, but 1 given
urllib3 (https://github.com/urllib3/urllib3)
+ src/urllib3/util/ssl_match_hostname.py:23: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ src/urllib3/util/url.py:242: error: Returning Any from function declared to return "str" [no-any-return]
websockets (https://github.com/aaugustin/websockets)
+ src/websockets/version.py:74: error: Returning Any from function declared to return "str"
pytest (https://github.com/pytest-dev/pytest)
+ src/_pytest/junitxml.py:51: error: "Match" expects 2 type arguments, but 1 given [type-arg]
rich (https://github.com/willmcgugan/rich)
+ rich/_emoji_replace.py:7: error: "Match" expects 2 type arguments, but 1 given
+ rich/_emoji_replace.py:23: error: "Match" expects 2 type arguments, but 1 given
+ rich/markup.py:43: error: "Match" expects 2 type arguments, but 1 given
+ rich/markup.py:60: error: "Match" expects 2 type arguments, but 1 given
zulip (https://github.com/zulip/zulip)
+ tools/lib/capitalization.py:213: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/help_settings_links.py:149: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/help_relative_links.py:115: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/url_preview/preview.py:46: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/mention.py:137: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/openapi/markdown_extension.py:476: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/openapi/markdown_extension.py:489: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/help_emoticon_translations_table.py:61: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:606: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1375: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1486: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1498: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1509: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1535: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1624: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1826: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1846: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1902: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1942: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:1973: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ zerver/lib/markdown/__init__.py:2082: error: "Match" expects 2 type arguments, but 1 given [type-arg]
flake8 (https://github.com/pycqa/flake8)
+ src/flake8/violation.py:19: error: "Match" expects 2 type arguments, but 1 given
vision (https://github.com/pytorch/vision)
+ torchvision/prototype/datasets/_builtin/imagenet.py:95: error: "Match" expects 2 type arguments, but 1 given [type-arg]
sphinx (https://github.com/sphinx-doc/sphinx)
+ sphinx/util/matching.py: note: In function "compile_matchers":
+ sphinx/util/matching.py:63:76: error: "Match" expects 2 type arguments, but 1 given
+ sphinx/util/matching.py: note: In function "patmatch":
+ sphinx/util/matching.py:92:47: error: "Match" expects 2 type arguments, but 1 given
werkzeug (https://github.com/pallets/werkzeug)
+ src/werkzeug/debug/__init__.py:90: error: Returning Any from function declared to return "Union[str, bytes, None]" [no-any-return]
+ src/werkzeug/routing.py:2004: error: "Match" expects 2 type arguments, but 1 given [type-arg]
cwltool (https://github.com/common-workflow-language/cwltool)
+ cwltool/job.py: note: In function "neverquote":
+ cwltool/job.py:110:72: error: "Match" expects 2 type arguments, but 1 given [type-arg]
aiohttp (https://github.com/aio-libs/aiohttp)
+ aiohttp/helpers.py:366: error: Returning Any from function declared to return "str" [no-any-return]
edgedb (https://github.com/edgedb/edgedb)
+ edb/edgeql/codegen.py:52:26: error: "Match" expects 2 type arguments, but 1 given
attrs (https://github.com/python-attrs/attrs)
+ src/attr/validators.pyi:60: error: "Match" expects 2 type arguments, but 1 given
tornado (https://github.com/tornadoweb/tornado)
+ tornado/util.py:199: error: "Match" expects 2 type arguments, but 1 given
mypy (https://github.com/python/mypy)
+ mypy/checkstrformat.py:44: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ mypy/checkstrformat.py:117: error: "Match" expects 2 type arguments, but 1 given [type-arg]
core (https://github.com/home-assistant/core)
+ homeassistant/components/acer_projector/switch.py:115: error: Returning Any from function declared to return "str" [no-any-return]
steam.py (https://github.com/Gobot1234/steam.py)
+ steam/utils.py:368: error: "Match" expects 2 type arguments, but 1 given [type-arg]
porcupine (https://github.com/Akuli/porcupine)
+ porcupine/textutils.py:553: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ porcupine/plugins/aboutdialog.py:18: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ porcupine/plugins/run/no_terminal.py:231: error: "Match" expects 2 type arguments, but 1 given [type-arg]
pylint (https://github.com/pycqa/pylint)
+ pylint/checkers/utils.py:697: error: "Match" expects 2 type arguments, but 1 given [type-arg]
psycopg (https://github.com/psycopg/psycopg)
+ psycopg/psycopg/copy.py:640: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ psycopg/psycopg/copy.py:648: error: "Match" expects 2 type arguments, but 1 given [type-arg]
+ psycopg/psycopg/_queries.py:217: error: "Match" expects 2 type arguments, but 1 given [type-arg]
pydantic (https://github.com/samuelcolvin/pydantic)
+ pydantic/color.py:259: error: Unused "type: ignore" comment
dragonchain (https://github.com/dragonchain/dragonchain)
- dragonchain/broadcast_processor/broadcast_functions.py:95:32: error: Item "None" of "Optional[Match[str]]" has no attribute "group"
+ dragonchain/broadcast_processor/broadcast_functions.py:95:32: error: Item "None" of "Optional[Match[Any, Any]]" has no attribute "group"
- dragonchain/lib/authorization.py:315:19: error: Item "None" of "Optional[Match[str]]" has no attribute "group"
+ dragonchain/lib/authorization.py:315:19: error: Item "None" of "Optional[Match[Any, Any]]" has no attribute "group"
- dragonchain/lib/authorization.py:317:25: error: Item "None" of "Optional[Match[str]]" has no attribute "group"
+ dragonchain/lib/authorization.py:317:25: error: Item "None" of "Optional[Match[Any, Any]]" has no attribute "group"
- dragonchain/lib/authorization.py:340:31: error: Item "None" of "Optional[Match[str]]" has no attribute "group"
+ dragonchain/lib/authorization.py:340:31: error: Item "None" of "Optional[Match[Any, Any]]" has no attribute "group"
git-revise (https://github.com/mystor/git-revise)
+ gitrevise/odb.py:113: error: Returning Any from function declared to return "bytes"
+ gitrevise/odb.py:120: error: Returning Any from function declared to return "bytes"
+ gitrevise/odb.py:127: error: Returning Any from function declared to return "bytes"
+ gitrevise/odb.py:134: error: Returning Any from function declared to return "bytes"
+ gitrevise/odb.py:141: error: Returning Any from function declared to return "bytes"
|
I noticed that the type of |
Had it been possible to provide a default value for |
Fixes #1467
Refs #3014