Fix pickling of httpclient.HTTPError subclasses and web.HTTPError #3439

Open
wants to merge 1 commit into
base: master

andersk
Contributor

@andersk andersk commented Nov 2, 2024

The args member variable is set by BaseException.__new__ and used by BaseException.__reduce__ for pickling. To avoid interfering with it, we need to avoid calling BaseException.__init__ from classes that have subclasses with incompatible constructors, and rename our own tornado.web.HTTPError.args member.

>>> pickle.loads(pickle.dumps(tornado.simple_httpclient.HTTPTimeoutError("message")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: HTTPTimeoutError.__init__() takes 2 positional arguments but 4 were given
>>> str(pickle.loads(pickle.dumps(tornado.web.HTTPError(500, "%s", "foo"))))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anders/python/tornado/tornado/web.py", line 2488, in __str__
    return message + " (" + (self.log_message % self.args) + ")"
                             ~~~~~~~~~~~~~~~~~^~~~~~~~~~~
TypeError: not enough arguments for format string
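
The first failure can be reproduced without Tornado. The sketch below uses two hypothetical classes, `ClientError` and `ClientTimeoutError`, that only imitate the constructor shapes described above; they are not Tornado's actual code. It replays by hand what `pickle.loads` does with the result of `__reduce__`, assuming the default `BaseException.__reduce__` is in effect.

```python
class ClientError(Exception):
    """Hypothetical base error; not Tornado's actual class."""

    def __init__(self, code, message=None):
        self.code = code
        self.message = message
        # Exception.__init__ replaces self.args with (code, message).
        super().__init__(code, message)


class ClientTimeoutError(ClientError):
    """Hypothetical subclass with a narrower constructor."""

    def __init__(self, message):
        super().__init__(599, message)


err = ClientTimeoutError("message")
print(err.args)  # (599, 'message'), not ('message',)

# pickle.dumps records the result of __reduce__; pickle.loads then calls
# cls(*ctor_args) and restores the instance attributes from state.
cls, ctor_args, state = err.__reduce__()

try:
    cls(*ctor_args)  # i.e. ClientTimeoutError(599, "message")
except TypeError as exc:
    rebuild_error = exc
    print("rebuild failed:", exc)
else:
    rebuild_error = None
```

The subclass accepts one argument, but `args` holds the two values the base class passed up to `Exception.__init__`, so the reconstruction call has the wrong arity.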

@andersk andersk force-pushed the pickle branch 2 times, most recently from c9309d0 to 9cb31f2 November 2, 2024 02:13
@bdarnell
Member

args is part of the de facto public interface of this class. We just jumped through some hoops to preserve backwards compatibility in this class in #3465. What exactly are the rules of pickling as applied to exceptions? (I've never knowingly used pickle and I'm surprised to learn that exceptions are pickleable)

@bdarnell bdarnell added the web label Apr 28, 2025
@andersk
Contributor Author

andersk commented Apr 28, 2025

args isn’t our interface to define; it was, and still is, part of the public interface of BaseException since before Tornado existed. It’s specified to be the tuple of arguments given to the exception constructor. Overriding it with any other meaning is a bug, and breaks pickling.
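
As a rough illustration of why giving `args` another meaning breaks pickling, the sketch below uses a hypothetical `FormattedError` class that stores %-format arguments in `self.args`. It is modeled loosely on the behavior described in this thread, not copied from Tornado. The reconstruction steps are written out by hand and mirror what `pickle.loads` does with the default `BaseException.__reduce__` result.

```python
class FormattedError(Exception):
    """Hypothetical error that stores %-format arguments in self.args."""

    def __init__(self, status_code, log_message=None, *args):
        self.status_code = status_code
        self.log_message = log_message
        # Reusing the name `args` replaces the constructor-argument tuple
        # that BaseException.__new__ stored.
        self.args = args

    def __str__(self):
        return "%d (%s)" % (self.status_code, self.log_message % self.args)


err = FormattedError(500, "%s", "foo")
print(str(err))  # 500 (foo)
print(err.args)  # ('foo',)

# Same steps pickle.loads performs: call cls(*ctor_args), then restore state.
cls, ctor_args, state = err.__reduce__()
clone = cls(*ctor_args)    # FormattedError("foo"), so clone.args becomes ()
clone.__setstate__(state)  # restores status_code and log_message, not args

try:
    str(clone)
except TypeError as exc:
    format_error = exc
    print("formatting failed:", exc)
else:
    format_error = None
```

The clone ends up with the original `log_message` but an empty `args`, which is the "not enough arguments for format string" failure from the pull request description.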

One common situation where exception pickling comes up is multiprocessing:

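import multiprocessing
import tornado.simple_httpclient

def f():
    # The worker has to pickle this exception to send it back to the parent.
    raise tornado.simple_httpclient.HTTPTimeoutError("message")

if __name__ == "__main__":  # needed where workers are spawned rather than forked
    with multiprocessing.Pool() as pool:
        pool.apply(f)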

This not only fails to propagate the raised error, but in fact wedges the whole process while trying.
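
Until the exception classes themselves pickle cleanly, one possible caller-side workaround is to check inside the worker whether an exception survives a pickle round trip, and to substitute a plain `RuntimeError` when it does not. The helper name `call_with_picklable_errors` below is hypothetical and is not part of Tornado or `multiprocessing`; this is a minimal sketch, not a recommended API.

```python
import pickle


def call_with_picklable_errors(func, *args, **kwargs):
    """Call func; replace any exception that cannot survive pickling."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        try:
            pickle.loads(pickle.dumps(exc))
        except Exception:
            # Fall back to a built-in exception type that is known to pickle.
            raise RuntimeError(f"{type(exc).__name__}: {exc}") from None
        # The round trip worked, so re-raise the original exception unchanged.
        raise
```

With a pool, the worker function would be wrapped at the call site, for example `pool.apply(call_with_picklable_errors, (f,))`. The original exception type is lost in the fallback case, which is the price of this approach.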

The `args` member variable is set by `BaseException.__new__` and used
by `BaseException.__reduce__` for pickling.  To avoid interfering with
it, we need to avoid calling `BaseException.__init__` from classes
that have subclasses with incompatible constructors, and rename our
own `tornado.web.HTTPError.args` member.

    >>> pickle.loads(pickle.dumps(tornado.simple_httpclient.HTTPTimeoutError("message")))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: HTTPTimeoutError.__init__() takes 2 positional arguments but 4 were given
    >>> str(pickle.loads(pickle.dumps(tornado.web.HTTPError(500, "%s", "foo"))))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/anders/python/tornado/tornado/web.py", line 2488, in __str__
        return message + " (" + (self.log_message % self.args) + ")"
                                 ~~~~~~~~~~~~~~~~~^~~~~~~~~~~
    TypeError: not enough arguments for format string

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
@bdarnell
Member

Right. It was a mistake to use self.args in this way, but it's a fifteen-year-old mistake now (I appreciate the link to the Python 2.5 docs since that was the current release back when the project that became Tornado was started). The question is what to do about it now. Pickling is the first known instance of this misuse of self.args actually causing problems, so is there a pickling-specific solution? Can we define __reduce__ and/or __setstate__ somehow to make this work, for example? Or do we have to make a breaking change to avoid this problem?
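
One shape a pickling-specific fix might take is sketched below: the legacy meaning of `self.args` is kept, and `__reduce__` is overridden to hand pickle the real constructor arguments. `FormattedError` is a hypothetical stand-in, not Tornado's class, and the sketch covers positional arguments only. Keyword arguments and subclasses with different constructor signatures would need extra handling.

```python
class FormattedError(Exception):
    """Hypothetical error that keeps the legacy meaning of self.args."""

    def __init__(self, status_code=500, log_message=None, *args):
        self.status_code = status_code
        self.log_message = log_message
        self.args = args  # legacy: %-format arguments, not constructor arguments

    def __str__(self):
        if self.log_message is None:
            return str(self.status_code)
        return "%d (%s)" % (self.status_code, self.log_message % self.args)

    def __reduce__(self):
        # Hand pickle the real constructor arguments instead of self.args.
        ctor_args = (self.status_code, self.log_message, *self.args)
        return (type(self), ctor_args, dict(self.__dict__))


err = FormattedError(500, "%s", "foo")

# The steps pickle.loads performs with the tuple returned by __reduce__.
cls, ctor_args, state = err.__reduce__()
clone = cls(*ctor_args)
clone.__setstate__(state)

print(str(clone))  # 500 (foo)
print(clone.args)  # ('foo',)
```

Because the constructor is called again with its original arguments, `self.args` is rebuilt with the same legacy contents, so existing readers of `args` see no change.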
