bpo-46906: Add PyFloat_Pack8() to the C API #31657

Merged
merged 2 commits into from
Mar 11, 2022
82 changes: 82 additions & 0 deletions Doc/c-api/float.rst
@@ -76,3 +76,85 @@ Floating Point Objects
.. c:function:: double PyFloat_GetMin()

Return the minimum normalized positive float *DBL_MIN* as C :c:type:`double`.


Pack and Unpack functions
=========================

The pack and unpack functions provide an efficient platform-independent way to
store floating-point values as byte strings. The Pack routines produce a bytes
string from a C :c:type:`double`, and the Unpack routines produce a C
:c:type:`double` from such a bytes string. The suffix (2, 4 or 8) specifies the
number of bytes in the bytes string.

On platforms that appear to use IEEE 754 formats these functions work by
copying bits. On other platforms, the 2-byte format is identical to the IEEE
754 binary16 half-precision format, the 4-byte format (32-bit) is identical to
the IEEE 754 binary32 single precision format, and the 8-byte format to the
IEEE 754 binary64 double precision format, although the packing of INFs and
NaNs (if such things exist on the platform) isn't handled correctly, and
attempting to unpack a bytes string containing an IEEE INF or NaN will raise an
exception.

On non-IEEE platforms with more precision, or larger dynamic range, than IEEE
754 supports, not all values can be packed; on non-IEEE platforms with less
precision, or smaller dynamic range, not all values can be unpacked. What
happens in such cases is partly accidental (alas).

Review comments on this paragraph:

Member: Presumably once we're assuming IEEE 754 for CPython, all this text can be dropped?

Member Author: My plan is to first fix the Python 3.11 "regression" by exposing these functions as public C API functions, and then to deprecate or remove support for non-IEEE 754 platforms.

Member Author: Requiring IEEE 754 support is tracked by https://bugs.python.org/issue46917

.. versionadded:: 3.11

Pack functions
--------------

The pack routines write 2, 4 or 8 bytes, starting at *p*. *le* is an
:c:type:`int` argument, non-zero if you want the bytes string in little-endian
format (exponent last, at ``p+1``, ``p+3``, or ``p+6`` and ``p+7``), zero if you
want big-endian format (exponent first, at *p*).

Return value: ``0`` if all is OK, ``-1`` if error (and an exception is set,
most likely :exc:`OverflowError`).
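
The error path described above can be observed from pure Python through the ``struct`` module, which the removed header comment names as a user of these routines. The sketch below is only an illustration under that assumption: ``try_pack`` is a hypothetical helper, and the standard-size format codes ``e``, ``f`` and ``d`` stand in for the 2-, 4- and 8-byte pack functions.

```python
import struct


def try_pack(fmt: str, value: float) -> str:
    """Return the packed bytes as hex, or the name of the exception raised."""
    try:
        return struct.pack(fmt, value).hex()
    except OverflowError as exc:
        return type(exc).__name__


large = 2.0 ** 100   # representable in binary32 and binary64, not in binary16
huge = 1e300         # representable in binary64 only

for fmt, value in (("<e", large), ("<f", large), ("<f", huge), ("<d", huge)):
    print(fmt, value, try_pack(fmt, value))
```

In C the same condition shows up as a ``-1`` return value with an exception set, so callers are expected to check the result before using the buffer.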

There are two problems on non-IEEE platforms:

* What this does is undefined if *x* is a NaN or infinity.
* ``-0.0`` and ``+0.0`` produce the same bytes string.

.. c:function:: int PyFloat_Pack2(double x, char *p, int le)

   Pack a C double as the IEEE 754 binary16 half-precision format.

.. c:function:: int PyFloat_Pack4(double x, char *p, int le)

   Pack a C double as the IEEE 754 binary32 single precision format.

.. c:function:: int PyFloat_Pack8(double x, char *p, int le)

   Pack a C double as the IEEE 754 binary64 double precision format.
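
To make the byte layouts concrete, here is a small Python sketch that mirrors the ``(x, p, le)`` calling convention with the ``struct`` module. The ``pack`` helper and the ``FORMATS`` table are hypothetical names introduced for illustration; the expected bytes for ``1.5`` are the ones asserted by the tests added in this pull request.

```python
import struct

# struct format codes for the three sizes: e=binary16, f=binary32, d=binary64
FORMATS = {2: "e", 4: "f", 8: "d"}


def pack(size: int, x: float, le: int) -> bytes:
    """Pack x into `size` bytes; a non-zero `le` selects little-endian."""
    prefix = "<" if le else ">"
    return struct.pack(prefix + FORMATS[size], x)


for size in (2, 4, 8):
    big = pack(size, 1.5, 0)      # exponent first, at offset 0
    little = pack(size, 1.5, 1)   # exponent last
    print(size, big.hex(), little.hex())
```

For a single value the little-endian result is simply the big-endian result reversed, which is why the exponent ends up in the last byte or bytes.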


Unpack functions
----------------

The unpack routines read 2, 4 or 8 bytes, starting at *p*. *le* is an
:c:type:`int` argument, non-zero if the bytes string is in little-endian format
(exponent last, at ``p+1``, ``p+3`` or ``p+6`` and ``p+7``), zero if big-endian
(exponent first, at *p*).

Return value: The unpacked double. On error, this is ``-1.0`` and
:c:func:`PyErr_Occurred` is true (and an exception is set, most likely
:exc:`OverflowError`).

Note that on a non-IEEE platform this will refuse to unpack a bytes string that
represents a NaN or infinity.

.. c:function:: double PyFloat_Unpack2(const char *p, int le)

   Unpack the IEEE 754 binary16 half-precision format as a C double.

.. c:function:: double PyFloat_Unpack4(const char *p, int le)

   Unpack the IEEE 754 binary32 single precision format as a C double.

.. c:function:: double PyFloat_Unpack8(const char *p, int le)

   Unpack the IEEE 754 binary64 double precision format as a C double.
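
The unpack direction can be sketched the same way. The hypothetical ``unpack`` helper below chooses the format from the buffer length, as the ``_testcapi.float_unpack`` test helper in this pull request appears to do, and it assumes an IEEE 754 platform, where infinities and NaNs unpack without error.

```python
import math
import struct

# struct format codes keyed by buffer length in bytes
FORMATS = {2: "e", 4: "f", 8: "d"}


def unpack(data: bytes, le: int) -> float:
    """Unpack 2, 4 or 8 bytes; a non-zero `le` means little-endian input."""
    prefix = "<" if le else ">"
    (value,) = struct.unpack(prefix + FORMATS[len(data)], data)
    return value


one_and_half = unpack(b"?\xf8\x00\x00\x00\x00\x00\x00", 0)
pos_inf = unpack(b"\x7f\xf0\x00\x00\x00\x00\x00\x00", 0)
quiet_nan = unpack(b"\x7f\xf8\x00\x00\x00\x00\x00\x00", 0)
print(one_and_half, pos_inf, quiet_nan)
```

Because ``-1.0`` is also a legitimate unpacked value, C callers need the ``PyErr_Occurred()`` check described above to tell a real ``-1.0`` from an error.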
6 changes: 6 additions & 0 deletions Doc/whatsnew/3.11.rst
@@ -759,6 +759,12 @@ New Features
available directly (via :c:type:`PyCMethod`).
(Contributed by Petr Viktorin in :issue:`46613`.)

* Add new functions to pack and unpack C double (serialize and deserialize):
:c:func:`PyFloat_Pack2`, :c:func:`PyFloat_Pack4`, :c:func:`PyFloat_Pack8`,
:c:func:`PyFloat_Unpack2`, :c:func:`PyFloat_Unpack4` and
:c:func:`PyFloat_Unpack8`.
(Contributed by Victor Stinner in :issue:`46906`.)


Porting to Python 3.11
----------------------
9 changes: 9 additions & 0 deletions Include/cpython/floatobject.h
@@ -10,3 +10,12 @@ typedef struct {
// Macro version of PyFloat_AsDouble() trading safety for speed.
// It doesn't check if op is a double object.
#define PyFloat_AS_DOUBLE(op) (((PyFloatObject *)(op))->ob_fval)


PyAPI_FUNC(int) PyFloat_Pack2(double x, char *p, int le);
PyAPI_FUNC(int) PyFloat_Pack4(double x, char *p, int le);
PyAPI_FUNC(int) PyFloat_Pack8(double x, char *p, int le);

PyAPI_FUNC(double) PyFloat_Unpack2(const char *p, int le);
PyAPI_FUNC(double) PyFloat_Unpack4(const char *p, int le);
PyAPI_FUNC(double) PyFloat_Unpack8(const char *p, int le);
48 changes: 0 additions & 48 deletions Include/internal/pycore_floatobject.h
@@ -38,54 +38,6 @@ struct _Py_float_state {
#endif
};

/* _PyFloat_{Pack,Unpack}{4,8}
*
* The struct and pickle (at least) modules need an efficient platform-
* independent way to store floating-point values as byte strings.
* The Pack routines produce a string from a C double, and the Unpack
* routines produce a C double from such a string. The suffix (4 or 8)
* specifies the number of bytes in the string.
*
* On platforms that appear to use (see _PyFloat_Init()) IEEE-754 formats
* these functions work by copying bits. On other platforms, the formats the
* 4- byte format is identical to the IEEE-754 single precision format, and
* the 8-byte format to the IEEE-754 double precision format, although the
* packing of INFs and NaNs (if such things exist on the platform) isn't
* handled correctly, and attempting to unpack a string containing an IEEE
* INF or NaN will raise an exception.
*
* On non-IEEE platforms with more precision, or larger dynamic range, than
* 754 supports, not all values can be packed; on non-IEEE platforms with less
* precision, or smaller dynamic range, not all values can be unpacked. What
* happens in such cases is partly accidental (alas).
*/

/* The pack routines write 2, 4 or 8 bytes, starting at p. le is a bool
* argument, true if you want the string in little-endian format (exponent
* last, at p+1, p+3 or p+7), false if you want big-endian format (exponent
* first, at p).
* Return value: 0 if all is OK, -1 if error (and an exception is
* set, most likely OverflowError).
* There are two problems on non-IEEE platforms:
* 1): What this does is undefined if x is a NaN or infinity.
* 2): -0.0 and +0.0 produce the same string.
*/
PyAPI_FUNC(int) _PyFloat_Pack2(double x, unsigned char *p, int le);
PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le);
PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le);

/* The unpack routines read 2, 4 or 8 bytes, starting at p. le is a bool
* argument, true if the string is in little-endian format (exponent
* last, at p+1, p+3 or p+7), false if big-endian (exponent first, at p).
* Return value: The unpacked double. On error, this is -1.0 and
* PyErr_Occurred() is true (and an exception is set, most likely
* OverflowError). Note that on a non-IEEE platform this will refuse
* to unpack a string that represents a NaN or infinity.
*/
PyAPI_FUNC(double) _PyFloat_Unpack2(const unsigned char *p, int le);
PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le);
PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le);


PyAPI_FUNC(void) _PyFloat_DebugMallocStats(FILE* out);

74 changes: 73 additions & 1 deletion Lib/test/test_float.py
@@ -12,7 +12,14 @@
from test.test_grammar import (VALID_UNDERSCORE_LITERALS,
INVALID_UNDERSCORE_LITERALS)
from math import isinf, isnan, copysign, ldexp
import math

try:
    import _testcapi
except ImportError:
    _testcapi = None

HAVE_IEEE_754 = float.__getformat__("double").startswith("IEEE")
INF = float("inf")
NAN = float("nan")

@@ -652,8 +659,9 @@ def test_float_specials_do_unpack(self):
             struct.unpack(fmt, data)

     @support.requires_IEEE_754
+    @unittest.skipIf(_testcapi is None, 'needs _testcapi')
     def test_serialized_float_rounding(self):
-        FLT_MAX = import_helper.import_module('_testcapi').FLT_MAX
+        FLT_MAX = _testcapi.FLT_MAX
         self.assertEqual(struct.pack("<f", 3.40282356e38), struct.pack("<f", FLT_MAX))
         self.assertEqual(struct.pack("<f", -3.40282356e38), struct.pack("<f", -FLT_MAX))

@@ -1488,5 +1496,69 @@ def __init__(self, value):
self.assertEqual(getattr(f, 'foo', 'none'), 'bar')


# Test PyFloat_Pack2(), PyFloat_Pack4() and PyFloat_Pack8()
# Test PyFloat_Unpack2(), PyFloat_Unpack4() and PyFloat_Unpack8()
BIG_ENDIAN = 0
LITTLE_ENDIAN = 1
EPSILON = {
    2: 2.0 ** -11,  # binary16
    4: 2.0 ** -24,  # binary32
    8: 2.0 ** -53,  # binary64
}

@unittest.skipIf(_testcapi is None, 'needs _testcapi')
class PackTests(unittest.TestCase):
    def test_pack(self):
        self.assertEqual(_testcapi.float_pack(2, 1.5, BIG_ENDIAN),
                         b'>\x00')
        self.assertEqual(_testcapi.float_pack(4, 1.5, BIG_ENDIAN),
                         b'?\xc0\x00\x00')
        self.assertEqual(_testcapi.float_pack(8, 1.5, BIG_ENDIAN),
                         b'?\xf8\x00\x00\x00\x00\x00\x00')
        self.assertEqual(_testcapi.float_pack(2, 1.5, LITTLE_ENDIAN),
                         b'\x00>')
        self.assertEqual(_testcapi.float_pack(4, 1.5, LITTLE_ENDIAN),
                         b'\x00\x00\xc0?')
        self.assertEqual(_testcapi.float_pack(8, 1.5, LITTLE_ENDIAN),
                         b'\x00\x00\x00\x00\x00\x00\xf8?')

    def test_unpack(self):
        self.assertEqual(_testcapi.float_unpack(b'>\x00', BIG_ENDIAN),
                         1.5)
        self.assertEqual(_testcapi.float_unpack(b'?\xc0\x00\x00', BIG_ENDIAN),
                         1.5)
        self.assertEqual(_testcapi.float_unpack(b'?\xf8\x00\x00\x00\x00\x00\x00', BIG_ENDIAN),
                         1.5)
        self.assertEqual(_testcapi.float_unpack(b'\x00>', LITTLE_ENDIAN),
                         1.5)
        self.assertEqual(_testcapi.float_unpack(b'\x00\x00\xc0?', LITTLE_ENDIAN),
                         1.5)
        self.assertEqual(_testcapi.float_unpack(b'\x00\x00\x00\x00\x00\x00\xf8?', LITTLE_ENDIAN),
                         1.5)

    def test_roundtrip(self):
        large = 2.0 ** 100
        values = [1.0, 1.5, large, 1.0/7, math.pi]
        if HAVE_IEEE_754:
            values.extend((INF, NAN))
        for value in values:
            for size in (2, 4, 8,):
                if size == 2 and value == large:
                    # too large for 16-bit float
                    continue
                rel_tol = EPSILON[size]
                for endian in (BIG_ENDIAN, LITTLE_ENDIAN):
                    with self.subTest(value=value, size=size, endian=endian):
                        data = _testcapi.float_pack(size, value, endian)
                        value2 = _testcapi.float_unpack(data, endian)
                        if isnan(value):
                            self.assertTrue(isnan(value2), (value, value2))
                        elif size < 8:
                            self.assertTrue(math.isclose(value2, value, rel_tol=rel_tol),
                                            (value, value2))
                        else:
                            self.assertEqual(value2, value)
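
The tolerances in ``EPSILON`` correspond to the significand widths of the three formats (11, 24 and 53 bits), so a value rounded to the nearest representable number is off by at most that relative amount. The sketch below checks this without ``_testcapi``, using ``struct`` as a stand-in for the C functions; ``roundtrip`` and ``errors`` are hypothetical names used only for this illustration.

```python
import math
import struct

# Relative tolerance per struct format code, mirroring EPSILON in the test
EPSILON = {"e": 2.0 ** -11, "f": 2.0 ** -24, "d": 2.0 ** -53}


def roundtrip(code: str, value: float) -> float:
    """Pack value in the given format and unpack it again (little-endian)."""
    (result,) = struct.unpack("<" + code, struct.pack("<" + code, value))
    return result


errors = {}
for code in EPSILON:
    back = roundtrip(code, math.pi)
    errors[code] = abs(back - math.pi) / math.pi
    print(code, back, errors[code])
```

The 8-byte case round-trips exactly on IEEE 754 platforms, which is why the test uses ``assertEqual`` rather than ``math.isclose`` for ``size == 8``.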


if __name__ == '__main__':
    unittest.main()
@@ -0,0 +1,4 @@
Add new functions to pack and unpack C double (serialize and deserialize):
:c:func:`PyFloat_Pack2`, :c:func:`PyFloat_Pack4`, :c:func:`PyFloat_Pack8`,
:c:func:`PyFloat_Unpack2`, :c:func:`PyFloat_Unpack4` and
:c:func:`PyFloat_Unpack8`. Patch by Victor Stinner.
17 changes: 8 additions & 9 deletions Modules/_ctypes/cfield.c
@@ -10,7 +10,6 @@

 #include "pycore_bitutils.h" // _Py_bswap32()
 #include "pycore_call.h" // _PyObject_CallNoArgs()
-#include "pycore_floatobject.h" // _PyFloat_Pack8()

 #include <ffi.h>
 #include "ctypes.h"
@@ -1009,10 +1008,10 @@ d_set_sw(void *ptr, PyObject *value, Py_ssize_t size)
     if (x == -1 && PyErr_Occurred())
         return NULL;
 #ifdef WORDS_BIGENDIAN
-    if (_PyFloat_Pack8(x, (unsigned char *)ptr, 1))
+    if (PyFloat_Pack8(x, ptr, 1))
         return NULL;
 #else
-    if (_PyFloat_Pack8(x, (unsigned char *)ptr, 0))
+    if (PyFloat_Pack8(x, ptr, 0))
         return NULL;
 #endif
     _RET(value);
@@ -1022,9 +1021,9 @@ static PyObject *
 d_get_sw(void *ptr, Py_ssize_t size)
 {
 #ifdef WORDS_BIGENDIAN
-    return PyFloat_FromDouble(_PyFloat_Unpack8(ptr, 1));
+    return PyFloat_FromDouble(PyFloat_Unpack8(ptr, 1));
 #else
-    return PyFloat_FromDouble(_PyFloat_Unpack8(ptr, 0));
+    return PyFloat_FromDouble(PyFloat_Unpack8(ptr, 0));
 #endif
 }

@@ -1057,10 +1056,10 @@ f_set_sw(void *ptr, PyObject *value, Py_ssize_t size)
     if (x == -1 && PyErr_Occurred())
         return NULL;
 #ifdef WORDS_BIGENDIAN
-    if (_PyFloat_Pack4(x, (unsigned char *)ptr, 1))
+    if (PyFloat_Pack4(x, ptr, 1))
         return NULL;
 #else
-    if (_PyFloat_Pack4(x, (unsigned char *)ptr, 0))
+    if (PyFloat_Pack4(x, ptr, 0))
         return NULL;
 #endif
     _RET(value);

Review comments on the #ifdef WORDS_BIGENDIAN line:

Member: Can the le argument of PyFloat_{Un}Pack* be replaced with a three-value endianness (big, little, and native)? The native option would allow moving all the #ifdef WORDS_BIGENDIAN fiddling inside the (un)pack functions.

Member Author: If this parameter is changed, it will be much harder to provide this function in pythoncapi_compat for older Python versions: python/pythoncapi-compat#26. It may also become more complicated to write a single code base supporting old and new Python versions.

It's easy to implement the "native" behavior by checking whether your system is little-endian in whatever way you prefer.

  • marshal uses little endian
  • struct supports big and little endian
  • pickle uses big endian
  • ctypes supports big and little endian
  • msgpack uses big endian

So far, I haven't seen a module using the "native endian". For a serialization function, it's not convenient.

Member Author: I wrote #31832 to suggest using the PY_BIG_ENDIAN macro to select the native endianness.
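
The "native" behavior mentioned in the review discussion can be layered on top of a two-valued ``le`` flag by the caller. The Python sketch below illustrates the idea with ``sys.byteorder`` and ``struct``; ``NATIVE_LE`` and ``pack8_native`` are hypothetical names, and a C caller would use a compile-time check such as the macros discussed in the thread instead.

```python
import struct
import sys

# Non-zero when the host is little-endian; plays the role of the `le` argument.
NATIVE_LE = 1 if sys.byteorder == "little" else 0


def pack8_native(x: float) -> bytes:
    """Pack x as binary64 in the host byte order by choosing `le` up front."""
    prefix = "<" if NATIVE_LE else ">"
    return struct.pack(prefix + "d", x)


print(sys.byteorder, pack8_native(1.5).hex())
```

Keeping the flag two-valued leaves the function signature unchanged, which is the compatibility argument the author makes above.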
@@ -1070,9 +1069,9 @@ static PyObject *
 f_get_sw(void *ptr, Py_ssize_t size)
 {
 #ifdef WORDS_BIGENDIAN
-    return PyFloat_FromDouble(_PyFloat_Unpack4(ptr, 1));
+    return PyFloat_FromDouble(PyFloat_Unpack4(ptr, 1));
 #else
-    return PyFloat_FromDouble(_PyFloat_Unpack4(ptr, 0));
+    return PyFloat_FromDouble(PyFloat_Unpack4(ptr, 0));
 #endif
 }

5 changes: 2 additions & 3 deletions Modules/_pickle.c
@@ -9,7 +9,6 @@
 #endif

 #include "Python.h"
-#include "pycore_floatobject.h" // _PyFloat_Pack8()
 #include "pycore_moduleobject.h" // _PyModule_GetState()
 #include "pycore_runtime.h" // _Py_ID()
 #include "pycore_pystate.h" // _PyThreadState_GET()
@@ -2244,7 +2243,7 @@ save_float(PicklerObject *self, PyObject *obj)
     if (self->bin) {
         char pdata[9];
         pdata[0] = BINFLOAT;
-        if (_PyFloat_Pack8(x, (unsigned char *)&pdata[1], 0) < 0)
+        if (PyFloat_Pack8(x, &pdata[1], 0) < 0)
             return -1;
         if (_Pickler_Write(self, pdata, 9) < 0)
             return -1;
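
The effect of this hunk is visible in pickle output: a binary-protocol float is one opcode byte followed by 8 big-endian bytes, because ``le`` is ``0``. The sketch below checks this from Python; the opcode value ``b"G"`` and the choice of protocol 2 are assumptions made for the illustration, not something shown in the diff.

```python
import pickle
import struct

BINFLOAT = b"G"  # assumed opcode byte stored in pdata[0]

payload = pickle.dumps(1.5, protocol=2)
expected = BINFLOAT + struct.pack(">d", 1.5)  # opcode + 8 big-endian bytes

print(payload)
print(expected in payload)
```

The 9-byte ``pdata`` buffer in ``save_float`` matches this layout: one byte for the opcode and eight for the packed double.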
@@ -5395,7 +5394,7 @@ load_binfloat(UnpicklerObject *self)
     if (_Unpickler_Read(self, &s, 8) < 0)
         return -1;

-    x = _PyFloat_Unpack8((unsigned char *)s, 0);
+    x = PyFloat_Unpack8(s, 0);
     if (x == -1.0 && PyErr_Occurred())
         return -1;
