Commit 5c4235c

gh-119182: Add PyUnicodeWriter C API (#119184)
1 parent 2c7209a commit 5c4235c

6 files changed, +533 -18 lines changed
Doc/c-api/unicode.rst

Lines changed: 84 additions & 0 deletions
@@ -1502,3 +1502,87 @@ They all return ``NULL`` or ``-1`` if an exception occurs.
    :c:func:`PyUnicode_InternInPlace`, returning either a new Unicode string
    object that has been interned, or a new ("owned") reference to an earlier
    interned string object with the same value.
+
+PyUnicodeWriter
+^^^^^^^^^^^^^^^
+
+The :c:type:`PyUnicodeWriter` API can be used to create a Python :class:`str`
+object.
+
+.. versionadded:: 3.14
+
+.. c:type:: PyUnicodeWriter
+
+   A Unicode writer instance.
+
+   The instance must be destroyed by :c:func:`PyUnicodeWriter_Finish` on
+   success, or :c:func:`PyUnicodeWriter_Discard` on error.
+
+.. c:function:: PyUnicodeWriter* PyUnicodeWriter_Create(Py_ssize_t length)
+
+   Create a Unicode writer instance.
+
+   Set an exception and return ``NULL`` on error.
+
+.. c:function:: PyObject* PyUnicodeWriter_Finish(PyUnicodeWriter *writer)
+
+   Return the final Python :class:`str` object and destroy the writer instance.
+
+   Set an exception and return ``NULL`` on error.
+
+.. c:function:: void PyUnicodeWriter_Discard(PyUnicodeWriter *writer)
+
+   Discard the internal Unicode buffer and destroy the writer instance.
+
+.. c:function:: int PyUnicodeWriter_WriteChar(PyUnicodeWriter *writer, Py_UCS4 ch)
+
+   Write the single Unicode character *ch* into *writer*.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.
+
+.. c:function:: int PyUnicodeWriter_WriteUTF8(PyUnicodeWriter *writer, const char *str, Py_ssize_t size)
+
+   Decode the string *str* from UTF-8 in strict mode and write the output into *writer*.
+
+   *size* is the string length in bytes. If *size* is equal to ``-1``, call
+   ``strlen(str)`` to get the string length.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.
+
+   To use a different error handler than ``strict``,
+   :c:func:`PyUnicode_DecodeUTF8` can be used with
+   :c:func:`PyUnicodeWriter_WriteStr`.
+
+.. c:function:: int PyUnicodeWriter_WriteStr(PyUnicodeWriter *writer, PyObject *obj)
+
+   Call :c:func:`PyObject_Str` on *obj* and write the output into *writer*.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.
+
+.. c:function:: int PyUnicodeWriter_WriteRepr(PyUnicodeWriter *writer, PyObject *obj)
+
+   Call :c:func:`PyObject_Repr` on *obj* and write the output into *writer*.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.
+
+.. c:function:: int PyUnicodeWriter_WriteSubstring(PyUnicodeWriter *writer, PyObject *str, Py_ssize_t start, Py_ssize_t end)
+
+   Write the substring ``str[start:end]`` into *writer*.
+
+   *str* must be Python :class:`str` object. *start* must be greater than or
+   equal to 0, and less than or equal to *end*. *end* must be less than or
+   equal to *str* length.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.
+
+.. c:function:: int PyUnicodeWriter_Format(PyUnicodeWriter *writer, const char *format, ...)
+
+   Similar to :c:func:`PyUnicode_FromFormat`, but write the output directly into *writer*.
+
+   On success, return ``0``.
+   On error, set an exception, leave the writer unchanged, and return ``-1``.

Doc/whatsnew/3.14.rst

Lines changed: 15 additions & 0 deletions
@@ -283,6 +283,21 @@ New Features
 * Add :c:func:`PyLong_GetSign` function to get the sign of :class:`int` objects.
   (Contributed by Sergey B Kirpichev in :gh:`116560`.)
 
+* Add a new :c:type:`PyUnicodeWriter` API to create a Python :class:`str`
+  object:
+
+  * :c:func:`PyUnicodeWriter_Create`.
+  * :c:func:`PyUnicodeWriter_Discard`.
+  * :c:func:`PyUnicodeWriter_Finish`.
+  * :c:func:`PyUnicodeWriter_WriteChar`.
+  * :c:func:`PyUnicodeWriter_WriteUTF8`.
+  * :c:func:`PyUnicodeWriter_WriteStr`.
+  * :c:func:`PyUnicodeWriter_WriteRepr`.
+  * :c:func:`PyUnicodeWriter_WriteSubstring`.
+  * :c:func:`PyUnicodeWriter_Format`.
+
+  (Contributed by Victor Stinner in :gh:`119182`.)
+
 Porting to Python 3.14
 ----------------------
 

Include/cpython/unicodeobject.h

Lines changed: 35 additions & 2 deletions
@@ -444,7 +444,40 @@ PyAPI_FUNC(PyObject*) PyUnicode_FromKindAndData(
     Py_ssize_t size);
 
 
-/* --- _PyUnicodeWriter API ----------------------------------------------- */
+/* --- Public PyUnicodeWriter API ----------------------------------------- */
+
+typedef struct PyUnicodeWriter PyUnicodeWriter;
+
+PyAPI_FUNC(PyUnicodeWriter*) PyUnicodeWriter_Create(Py_ssize_t length);
+PyAPI_FUNC(void) PyUnicodeWriter_Discard(PyUnicodeWriter *writer);
+PyAPI_FUNC(PyObject*) PyUnicodeWriter_Finish(PyUnicodeWriter *writer);
+
+PyAPI_FUNC(int) PyUnicodeWriter_WriteChar(
+    PyUnicodeWriter *writer,
+    Py_UCS4 ch);
+PyAPI_FUNC(int) PyUnicodeWriter_WriteUTF8(
+    PyUnicodeWriter *writer,
+    const char *str,
+    Py_ssize_t size);
+
+PyAPI_FUNC(int) PyUnicodeWriter_WriteStr(
+    PyUnicodeWriter *writer,
+    PyObject *obj);
+PyAPI_FUNC(int) PyUnicodeWriter_WriteRepr(
+    PyUnicodeWriter *writer,
+    PyObject *obj);
+PyAPI_FUNC(int) PyUnicodeWriter_WriteSubstring(
+    PyUnicodeWriter *writer,
+    PyObject *str,
+    Py_ssize_t start,
+    Py_ssize_t end);
+PyAPI_FUNC(int) PyUnicodeWriter_Format(
+    PyUnicodeWriter *writer,
+    const char *format,
+    ...);
+
+
+/* --- Private _PyUnicodeWriter API --------------------------------------- */
 
 typedef struct {
     PyObject *buffer;
@@ -466,7 +499,7 @@ typedef struct {
     /* If readonly is 1, buffer is a shared string (cannot be modified)
        and size is set to 0. */
     unsigned char readonly;
-} _PyUnicodeWriter ;
+} _PyUnicodeWriter;
 
 // Initialize a Unicode writer.
 //

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Add a new :c:type:`PyUnicodeWriter` API to create a Python :class:`str` object:
+
+* :c:func:`PyUnicodeWriter_Create`.
+* :c:func:`PyUnicodeWriter_Discard`.
+* :c:func:`PyUnicodeWriter_Finish`.
+* :c:func:`PyUnicodeWriter_WriteChar`.
+* :c:func:`PyUnicodeWriter_WriteUTF8`.
+* :c:func:`PyUnicodeWriter_WriteStr`.
+* :c:func:`PyUnicodeWriter_WriteRepr`.
+* :c:func:`PyUnicodeWriter_WriteSubstring`.
+* :c:func:`PyUnicodeWriter_Format`.
+
+Patch by Victor Stinner.
