
Vec<u8>: support u8-compatible buffer exporters #5774

Open
espressolee wants to merge 1 commit into PyO3:main from espressolee:vec-u8-buffer-fast-path

@espressolee

Summary

This PR extends Vec<u8> extraction to support u8-compatible buffer-protocol exporters (e.g. memoryview, array('B'), and custom types implementing __getbuffer__) by performing a contiguous copy via PyBuffer<u8>.

When the object exports a buffer that is not compatible with u8 (e.g. array('I')), extraction falls back to the existing sequence semantics, preserving current behavior.
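The dispatch described above can be sketched without PyO3 at all. The following is a simplified model, not the PR's code: `Exporter`, its fields, and `extract_vec_u8` are hypothetical names, and treating only format code `'B'` with item size 1 as u8-compatible is an assumption made for illustration.

```rust
/// Hypothetical stand-in for a Python object as seen by the extractor.
struct Exporter {
    /// Struct-style format code of the exported buffer, e.g. 'B' or 'I'.
    format: char,
    /// Size in bytes of one buffer item.
    itemsize: usize,
    /// Raw bytes exposed through the buffer.
    raw: Vec<u8>,
    /// Items seen through the sequence protocol; None if not a sequence.
    items: Option<Vec<u64>>,
}

/// Model of the proposed order: buffer copy first, sequence fallback second.
fn extract_vec_u8(obj: &Exporter) -> Result<Vec<u8>, String> {
    // Assumption for this sketch: only 'B' with itemsize 1 counts as u8-compatible.
    if obj.format == 'B' && obj.itemsize == 1 {
        return Ok(obj.raw.clone());
    }
    // Incompatible buffer: fall back to element-wise sequence semantics.
    match &obj.items {
        Some(items) => items
            .iter()
            .map(|&v| u8::try_from(v).map_err(|_| format!("{} does not fit in u8", v)))
            .collect(),
        None => Err("object is neither a u8 buffer nor a sequence".to_string()),
    }
}
```

In this model an `array('I')`-like object yields its element values through the fallback (each checked against the u8 range), never the raw 4-byte items, which is the behavior the fallback is meant to preserve.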

Why

Vec<u8> is a common "byte payload" type. Today:

  • bytes / bytearray already have a specialized path via u8::sequence_extractor.
  • Other buffer exporters can still hit element-wise sequence extraction, and non-sequence buffer exporters cannot be extracted into Vec<u8> at all.

This change treats the buffer protocol as a first-class source for Vec<u8> when it is semantically valid (u8-compatible).

Implementation notes

  • Extends the u8::sequence_extractor specialization to recognize buffer exporters.
  • Makes the internal FromPyObjectSequence::to_vec fallible (PyResult<Vec<_>>) so buffer copies can propagate errors cleanly.
  • Uses the existing PyBuffer<u8> implementation to perform the copy.
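Why a fallible `to_vec` helps can be shown with a small standalone model. This is a sketch under stated assumptions, not PyO3's internal trait: `FallibleToVec`, `BytesSource`, `StridedSource`, and `CopyError` are hypothetical, and the out-of-bounds failure is only an illustrative error case, not one PyO3 is claimed to report.

```rust
/// Illustrative error for a failed copy (hypothetical).
#[derive(Debug, PartialEq)]
enum CopyError {
    OutOfBounds { index: usize },
}

/// Hypothetical stand-in for a sequence-extraction trait whose
/// `to_vec` returns a Result instead of a bare Vec.
trait FallibleToVec<T> {
    fn to_vec(&self) -> Result<Vec<T>, CopyError>;
}

/// Borrowed bytes: the copy cannot fail, so it always returns Ok.
struct BytesSource<'a>(&'a [u8]);

impl FallibleToVec<u8> for BytesSource<'_> {
    fn to_vec(&self) -> Result<Vec<u8>, CopyError> {
        Ok(self.0.to_vec())
    }
}

/// Strided view: gathering items into a contiguous Vec can fail
/// if the described layout reaches past the underlying data.
struct StridedSource<'a> {
    data: &'a [u8],
    stride: usize,
    len: usize,
}

impl FallibleToVec<u8> for StridedSource<'_> {
    fn to_vec(&self) -> Result<Vec<u8>, CopyError> {
        (0..self.len)
            .map(|i| {
                self.data
                    .get(i * self.stride)
                    .copied()
                    .ok_or(CopyError::OutOfBounds { index: i })
            })
            .collect()
    }
}

/// Callers propagate copy errors with `?` instead of panicking.
fn payload_len(src: &impl FallibleToVec<u8>) -> Result<usize, CopyError> {
    let bytes = src.to_vec()?;
    Ok(bytes.len())
}
```

Sources that cannot fail simply wrap their result in `Ok`, so the infallible paths keep their behavior while the buffer path gains a way to report errors.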

Compatibility

  • No change to str -> Vec<_> rejection.
  • bytes / bytearray behavior unchanged.
  • For incompatible buffers (e.g. array('I')), behavior is preserved via fallback to sequence semantics.

Tests

Added to tests/test_buffer_protocol.rs:

  • test_extract_vec_u8_from_buffer_exporter (custom buffer exporter, not a sequence)
  • test_extract_vec_u8_falls_back_when_buffer_incompatible (array('I') fallback)

@codspeed-hq

codspeed-hq bot commented Feb 4, 2026

CodSpeed Performance Report

Merging this PR will degrade performance by 11.58%

Comparing espressolee:vec-u8-buffer-fast-path (b734b6a) with main (dfd6eaf)

Summary

❌ 1 regressed benchmark
✅ 98 untouched benchmarks
⏩ 1 skipped benchmark [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark                        BASE    HEAD     Efficiency
vec_bytes_from_py_bytes_medium   2 µs    2.3 µs   -11.58%

Footnotes

  1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports.

@davidhewitt
Member

I'm not sure I'm comfortable building this into Vec<u8> extraction (see https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/): there is no guarantee that the buffer object is properly synchronized. Leaving this to user code lets users make that choice themselves.

Hopefully the Python C API will one day support better guarantees for buffer objects.
