Pickles

Jim Fulton edited this page Mar 18, 2013 · 12 revisions

Pickle interoperability between Python 2 and Python 3

It's useful to be able to support accessing databases from both Python 2 and Python 3 because:

  • You may have multiple applications accessing a database, or multiple installations of the same application that are moved to Python 3 at different times. Supporting both Python versions will make the transition much easier.
  • Eventually, ZODB may support other languages, especially JavaScript. It would be a shame if we could support JavaScript but not Python 2.
  • Some ZODB users have massive databases that cannot easily (or even realistically) go through a migration process before moving to Python 3.

Issues

  1. Python 3 uses different pickle opcodes than Python 2. In particular, Python 3 DWIMily interprets the Python 2 byte-string opcodes (STRING, BINSTRING, and SHORT_BINSTRING) as text in some encoding, and saves bytes with Python 3-specific opcodes (BINBYTES and SHORT_BINBYTES).
  2. Names (attribute and global) are unicode in Python 3 but bytes in Python 2.
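
The first issue can be reproduced with the standard library alone. The following is a minimal sketch, not part of any proposal: the two byte literals are hand-written protocol 2 pickles standing in for data written by Python 2, and opcode_names is a hypothetical helper introduced only for this illustration.

```python
import pickle
import pickletools

# Hand-written protocol 2 pickle of the Python 2 str 'abc' (SHORT_BINSTRING).
PY2_ASCII = b"\x80\x02U\x03abcq\x00."
# Same shape, but the payload is the single non-ASCII byte 0xff.
PY2_BINARY = b"\x80\x02U\x01\xffq\x00."


def opcode_names(data):
    """Return the names of the pickle opcodes found in data."""
    return [op.name for op, arg, pos in pickletools.genops(data)]


# Python 2 style data uses SHORT_BINSTRING; Python 3 writes bytes
# with SHORT_BINBYTES when protocol 3 is requested.
print(opcode_names(PY2_ASCII))
print(opcode_names(pickle.dumps(b"abc", protocol=3)))

# By default Python 3 guesses that the Python 2 string is ASCII text.
as_text = pickle.loads(PY2_ASCII)
# encoding="bytes" turns the guess off and returns the raw bytes.
as_bytes = pickle.loads(PY2_ASCII, encoding="bytes")

# The guess fails outright when the payload is real binary data.
try:
    pickle.loads(PY2_BINARY)
    binary_guess_failed = False
except UnicodeDecodeError:
    binary_guess_failed = True
```

Passing encoding="bytes" keeps the data intact, but it applies to every Python 2 string in the pickle, which is why names then need separate handling.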

Proposals

Python2 pickle with name conversion

Read and store byte data using the Python 2 opcodes, via zodbpickle, a forked version of pickle. In Python 3, fix up names when necessary.

  • When finding globals or setting instance state, convert byte names to unicode using an ascii encoding.

    We can only fix up attribute names when no custom __setstate__ is used, so this is only a partial solution. Applications with custom __setstate__ methods may not be interoperable across Python versions, or may need to be modified.

  • Note that Python 2 attribute names can be stored as unicode. (They can only be accessed with attribute notation if they're ASCII.)
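
The attribute-name fix-up described above might look roughly like the sketch below. It is an illustration under stated assumptions, not zodbpickle's actual code: to_native_name and apply_state are hypothetical names, and the state is assumed to be a plain dict such as the one a default Python 2 pickle carries.

```python
def to_native_name(name):
    """Decode a Python 2 byte name to text; leave text names untouched."""
    if isinstance(name, bytes):
        # Raises UnicodeDecodeError for names that are not ASCII.
        return name.decode("ascii")
    return name


def apply_state(obj, state):
    """Set instance state, fixing up names only when it is safe to do so."""
    if getattr(type(obj), "__setstate__", None) is not None:
        # A custom __setstate__ may interpret the state in any way,
        # so it is passed through unchanged (the partial-solution case).
        obj.__setstate__(state)
        return
    # Default behaviour: keys are attribute names, so byte keys are decoded.
    # Values are left alone because they may be genuine binary data.
    obj.__dict__.update(
        {to_native_name(key): value for key, value in state.items()}
    )


class Plain:
    """Hypothetical persistent class without a custom __setstate__."""


class Custom:
    """Hypothetical class whose __setstate__ sees the state as stored."""

    def __setstate__(self, state):
        self.raw = state


plain = Plain()
apply_state(plain, {b"title": b"\x00\x01", u"count": 1})

custom = Custom()
apply_state(custom, {b"title": b"\x00\x01"})
```

Here plain.title is reachable with attribute notation, while custom.raw still holds the byte key b"title", which is the situation a custom __setstate__ would have to cope with itself.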

Issues:

  • Cookie.Morsel is a dict subclass that has unicode keys on Python 3, but byte keys in Python 2. Reading a Python 2 Morsel pickle in Python 3 requires the byte->unicode DWIM.

    This is a case this proposal can't handle, because it reads Python 2 strings as bytes.
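
The shape of the problem can be seen with a plain dict standing in for Morsel. This is only a sketch: PY2_DICT is a hand-written protocol 2 pickle of the Python 2 dict {'path': '/'}, not a real Morsel pickle.

```python
import pickle

# Hand-written protocol 2 pickle of the Python 2 dict {'path': '/'}.
PY2_DICT = b"\x80\x02}q\x00U\x04pathq\x01U\x01/q\x02s."

# Reading Python 2 strings as bytes leaves the keys as bytes, so code
# that looks entries up with text keys no longer finds them.
as_bytes = pickle.loads(PY2_DICT, encoding="bytes")
found_with_text_key = "path" in as_bytes

# The default byte->unicode guess restores text keys, but it also
# converts every value, whether or not the value was meant as text.
as_text = pickle.loads(PY2_DICT)
```

Dict keys are ordinary data rather than attribute or global names, so a name fix-up has no way to tell that they should be converted.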
