Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 20 additions & 24 deletions pep-0467.txt
Original file line number Diff line number Diff line change
Expand Up @@ -14,47 +14,43 @@ Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07 2016-09-01 2021-04-13
Abstract
========

During the initial development of the Python 3 language specification, the
core ``bytes`` type for arbitrary binary data started as the mutable type
that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python
3 series.

This PEP proposes five small adjustments to the APIs of the ``bytes`` and
``bytearray`` types to make it easier to operate entirely in the binary domain:

* Discourage passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors
* Add ``bytes.fromint`` and ``bytearray.fromint`` alternative constructors
* Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
* Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators


Proposals
Rationale
=========

Discourage use of current "zero-initialised sequence" behavior
---------------------------------------------------------------
During the initial development of the Python 3 language specification, the
core ``bytes`` type for arbitrary binary data started as the mutable type
that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python
3 series, for example with PEP 461.


Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
of the given size::
Motivation
==========

>>> bytes(3)
b'\x00\x00\x00'
>>> bytearray(3)
bytearray(b'\x00\x00\x00')
With Python 3 and the split between ``str`` and ``bytes``, one small but
important area of programming became slightly more difficult, and much more
painful -- wire format protocols.

This PEP proposes to update the documentation to discourage making use of that
input type dependent behavior in Python 3.11, suggesting to use a new, more
explicit, ``bytes.fromsize(n)`` or ``bytearray.fromsize(n)`` spelling instead
(see next section).
This area of programming is characterized by a mixture of binary data and
ASCII compatible segments of text (aka ASCII-encoded text). The addition of
the new constructors, methods, and iterators will aid both in writing new
wire format code, and in porting any remaining Python 2 wire format code.

However, the current handling of numeric inputs in the default constructors
would remain in place indefinitely to avoid introducing a compatibility break.
Common use-cases include ``dbf`` and ``pdf`` file formats, ``email``
formats, and ``FTP`` and ``HTTP`` communications, among many others.

No other changes are proposed to the existing constructors.

Proposals
=========

Addition of explicit "count and byte initialised sequence" constructors
-----------------------------------------------------------------------
Expand Down