[WIP] gh-129813, PEP 782: Add PyBytesWriter C API #131681
vstinner wants to merge 30 commits into python:main
Add functions:
* PyBytesWriter_Create()
* PyBytesWriter_Discard()
* PyBytesWriter_Finish()
* PyBytesWriter_FinishWithSize()
* PyBytesWriter_FinishWithEndPointer()
* PyBytesWriter_Data()
* PyBytesWriter_Allocated()
* PyBytesWriter_SetSize()
* PyBytesWriter_Resize()
Convert _PyBytes_FromHex().
Replace PyBytes_FromStringAndSize(NULL, 0) with Py_GetConstant(Py_CONSTANT_EMPTY_BYTES).
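To make the replacement above concrete, here is a small ctypes sketch that calls both spellings from Python and compares the results. It is only an illustration, not code from this PR. It assumes that `Py_CONSTANT_EMPTY_BYTES` has the value 8 (as in the CPython 3.13 headers) and skips the new call when the running interpreter does not export `Py_GetConstant()`.

```python
from ctypes import pythonapi, py_object, c_char_p, c_ssize_t, c_uint

# Old spelling: ask for a zero-length bytes object.
PyBytes_FromStringAndSize = pythonapi.PyBytes_FromStringAndSize
PyBytes_FromStringAndSize.argtypes = (c_char_p, c_ssize_t)
PyBytes_FromStringAndSize.restype = py_object
old_way = PyBytes_FromStringAndSize(None, 0)

# New spelling: only tried when the runtime exports Py_GetConstant().
# ASSUMPTION: Py_CONSTANT_EMPTY_BYTES == 8, as in CPython 3.13's headers.
PY_CONSTANT_EMPTY_BYTES = 8
new_way = None
if hasattr(pythonapi, "Py_GetConstant"):
    Py_GetConstant = pythonapi.Py_GetConstant
    Py_GetConstant.argtypes = (c_uint,)
    Py_GetConstant.restype = py_object
    new_way = Py_GetConstant(PY_CONSTANT_EMPTY_BYTES)

print("PyBytes_FromStringAndSize(NULL, 0):", old_way)
print("Py_GetConstant(Py_CONSTANT_EMPTY_BYTES):", new_way)
```

Both calls are expected to yield an empty bytes object; the second one states the intent directly instead of going through a size-0 allocation request.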
This change has no impact on performance, even though the new public API allocates memory on the heap instead of on the stack. It uses a freelist to optimize
Microbenchmark on 3 functions, to compare the private
import pyperf
import binascii
runner = pyperf.Runner()
runner.bench_func('from list 100', bytes, list(b'x' * 100))
runner.bench_func('from list 1,000', bytes, list(b'x' * 1_000))
runner.bench_func('from hex 100', bytes.fromhex, bytes(range(100)).hex())
runner.bench_func('from hex 1,000', bytes.fromhex, (b'x' * 1_000).hex())
runner.bench_func('b2a_uu', binascii.b2a_uu, b'x' * 45)
Result:
Benchmark hidden because not significant (1): from list 1,000
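For readers without pyperf installed, the same five calls can be sanity-checked with the standard library. This is a rough sketch, not the method used for the numbers in this PR: `timeit` is far less rigorous than pyperf, the lambda wrapper adds call overhead, and `time_case` and `CASES` are hypothetical names introduced only for this example.

```python
import binascii
import timeit

# Same callables and inputs as the pyperf script above.
CASES = {
    "from list 100": (bytes, list(b"x" * 100)),
    "from list 1,000": (bytes, list(b"x" * 1_000)),
    "from hex 100": (bytes.fromhex, bytes(range(100)).hex()),
    "from hex 1,000": (bytes.fromhex, (b"x" * 1_000).hex()),
    "b2a_uu": (binascii.b2a_uu, b"x" * 45),
}


def time_case(func, arg, number=2_000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(arg))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


results = {name: time_case(func, arg) for name, (func, arg) in CASES.items()}
for name, ns in results.items():
    print(f"{name}: {ns:.0f} ns per call")
```

Running it once on a build of `main` and once on a build of the PR branch gives a coarse comparison; small differences should be treated as noise.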
Benchmark comparing
Benchmark:
import pyperf
SIZES = (10, 100, 500)
runner = pyperf.Runner()
for size in SIZES:
    large_int = (2 ** (size * 8) - 1)
    runner.bench_func(f'to_bytes({size})', large_int.to_bytes, size)
for size in SIZES:
    mem = memoryview(b'x' * size)
    runner.bench_func(f'memoryview({size}).tobytes()', mem.tobytes)
Result:
It's hard to beat
There is an overhead of around 10 ns when using
Could you please benchmark the following?
I wrote a big PR to show what PEP 782 would look like and how it would be used. But if PEP 782 is accepted, I will start by adding only the API, without using it. Then I will write separate changes to use the new API and run benchmarks on each change.
I didn't modify these encoders, they still use the private
Same. If I modify these encoders and error handlers later, I will run benchmarks to decide if it's acceptable to use the public API or not.
Microbenchmark on
import pyperf
runner = pyperf.Runner()
import ctypes
from ctypes import pythonapi, py_object
from ctypes import (
c_int, c_uint,
c_long, c_ulong,
c_size_t, c_ssize_t,
c_char_p)
PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
PyBytes_FromFormat.argtypes = (c_char_p,)
PyBytes_FromFormat.restype = py_object
PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
PyBytes_DecodeEscape.argtypes = (c_char_p, c_size_t, c_char_p, c_size_t, c_char_p)
PyBytes_DecodeEscape.restype = py_object
runner.bench_func('Format hello world', PyBytes_FromFormat, b'Hello %s !', b'world')
fmt = (b'Hell%c' + b' ' * 1024 + b' %s')
runner.bench_func('Format long format', PyBytes_FromFormat, fmt, c_int(ord('o')), b'world')
s = b'abc\\ndef\\x40.'
runner.bench_func('Decode simple', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
s = b'x' * 1024
runner.bench_func('Decode long copy', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
s = b'\\x40' * 1024
runner.bench_func('Decode long \\x40', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
Results:
Benchmark hidden because not significant (1): Format hello world
I'm not sure why PEP 782 is faster, but at least it's not slower :-)
I built Python with
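As a complement to the timings, the sketch below checks what the two benchmarked C functions actually return for small inputs, using the same ctypes setup as the benchmark script. It is an illustration only. It uses `c_ssize_t` for the length arguments and passes `None` for the last argument of `PyBytes_DecodeEscape()`, which are choices made for this example rather than something taken from the script.

```python
from ctypes import pythonapi, py_object, c_char_p, c_ssize_t, c_int

PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
# Only the fixed argument is declared; the remaining ones are variadic.
PyBytes_FromFormat.argtypes = (c_char_p,)
PyBytes_FromFormat.restype = py_object

PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
PyBytes_DecodeEscape.argtypes = (c_char_p, c_ssize_t, c_char_p, c_ssize_t, c_char_p)
PyBytes_DecodeEscape.restype = py_object

# %s consumes a C string, %c consumes an int.
formatted = PyBytes_FromFormat(b"Hello %s !", b"world")
with_char = PyBytes_FromFormat(b"Hell%c %s", c_int(ord("o")), b"world")

# The input holds literal backslashes: abc\ndef\x40.
escaped = b"abc\\ndef\\x40."
decoded = PyBytes_DecodeEscape(escaped, len(escaped), None, 0, None)

print(formatted)
print(with_char)
print(decoded)
```

A check like this is useful when converting a function to a new writer API: a benchmark only says the code is fast, not that the output is unchanged.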
I started to split this huge PR into smaller PRs; see the PRs attached to issue #129813.