Deprecate then remove regex formatted string parsing #112

@sarayourfriend

Pook has a feature that allows passing specially formatted strings (for example, "re/hello/"), which pook interprets as a regex pattern to be parsed during matching. In all cases where this feature is supported, pook also accepts a compiled regex pattern just fine.

This violates PEP 20 from an API perspective and significantly complicates the matching code, which needs many lines to support it.

This issue tracks the deprecation and eventual removal of this feature in favour of users passing a compiled regex.

There also exist bugs/unexpected behaviour in pook that are caused by this. For example, passing a regex-ish string to enable_network results in the following issue:

>>> import pook
>>> from urllib.request import urlopen
>>> pook.on()
>>> pook.get("https://httpbin.org/404").reply(200).body("hello from pook")
Response(
    headers=HTTPHeaderDict({}),
    status=200,
    body=hello from pook
)
>>> pook.enable_network("re/hello/")
>>> urlopen("https://example.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.12/urllib/request.py", line 215, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 515, in open
    response = self._open(req, data)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 532, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 492, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 1392, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 1344, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/var/home/sara/projects/pook/src/pook/interceptors/http.py", line 112, in handler
    return self._on_request(
           ^^^^^^^^^^^^^^^^^
  File "/var/home/sara/projects/pook/src/pook/interceptors/http.py", line 61, in _on_request
    mock = self.engine.match(req)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sara/projects/pook/src/pook/engine.py", line 431, in match
    if not self.should_use_network(request):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sara/projects/pook/src/pook/engine.py", line 385, in should_use_network
    return self.networking and all((fn(request) for fn in self.network_filters))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/home/sara/projects/pook/src/pook/engine.py", line 385, in <genexpr>
    return self.networking and all((fn(request) for fn in self.network_filters))
                                    ^^^^^^^^^^^
  File "/var/home/sara/projects/pook/src/pook/engine.py", line 100, in hostname_filter
    return hostname.match(req.url.hostname)
           ^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'match'

This is because isregex, used in the code below, returns True when the string is formatted as re/{}/, but the call site doesn't account for that and assumes the value is a compiled pattern that is safe to call match on!

pook/src/pook/engine.py

Lines 98 to 101 in ff30ca7

def hostname_filter(hostname, req):
    if isregex(hostname):
        return hostname.match(req.url.hostname)
    return req.url.hostname == hostname
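
The failure can be reproduced without pook or any network access. The sketch below is an assumption-laden stand-in: `isregex` here is a hypothetical reimplementation written only from the description above (true for compiled patterns and for "re/.../" strings), not pook's actual code, and the request is faked with `SimpleNamespace`. `hostname_filter` has the same logic as the snippet quoted above.

```python
import re
from types import SimpleNamespace


def isregex(value):
    # Hypothetical stand-in for pook's isregex, based on the issue text:
    # true for compiled patterns AND for strings shaped like "re/.../".
    if isinstance(value, re.Pattern):
        return True
    return (
        isinstance(value, str)
        and len(value) > 3
        and value.startswith("re/")
        and value.endswith("/")
    )


def hostname_filter(hostname, req):
    # Same logic as the quoted snippet: assumes isregex() implies .match exists.
    if isregex(hostname):
        return hostname.match(req.url.hostname)
    return req.url.hostname == hostname


# Fake request exposing only the attribute the filter reads.
req = SimpleNamespace(url=SimpleNamespace(hostname="example.com"))

# A compiled pattern takes the regex branch and works.
compiled_ok = bool(hostname_filter(re.compile("example"), req))

# A regex-ish string also takes the regex branch, but str has no .match.
error = None
try:
    hostname_filter("re/example/", req)
except AttributeError as exc:
    error = str(exc)
```

With a compiled pattern the filter matches; with the regex-ish string it raises the same AttributeError as in the traceback.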

We could fix that bug, but overall it's much simpler to just deprecate and then remove the regex-ish/regex-patterned string feature.
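
One possible shape for the deprecation period, as a rough sketch only: a single helper that still accepts the "re/.../" form but emits a DeprecationWarning and hands back a compiled pattern. The name `coerce_pattern` and the warning text are hypothetical and not part of pook; the slicing assumes the "re/{}/" format described above.

```python
import re
import warnings


def coerce_pattern(value):
    """Hypothetical helper: normalise a matcher value during deprecation."""
    # Compiled patterns are the supported form; pass them through untouched.
    if isinstance(value, re.Pattern):
        return value
    # Regex-formatted strings still work for now, but warn the caller.
    if (
        isinstance(value, str)
        and len(value) > 3
        and value.startswith("re/")
        and value.endswith("/")
    ):
        warnings.warn(
            "Regex-formatted strings ('re/.../') are deprecated; "
            "pass re.compile(...) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Strip the leading "re/" and the trailing "/" before compiling.
        return re.compile(value[3:-1])
    # Anything else (e.g. a plain hostname string) is returned as-is.
    return value
```

Call sites would then only ever see a compiled pattern or a plain value, so a check like `isinstance(value, re.Pattern)` is enough before calling `match`. Once the deprecation window closes, the middle branch can be deleted.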
