-
-
Notifications
You must be signed in to change notification settings - Fork 112
Description
I’m proposing some API changes to ScreenShot that break compatibility in some ways, although common usages will still be fine. This is to help us with our overall release planning, as being detailed in #471.
Change 1: Buffers
Goal
The goal here is to prevent unnecessary copies. Currently, the API design requires us copy the pixel data in many conditions.
Many of these changes refer to the Buffer type. This is in collections.abc as of Python 3.12, and also in typing_extensions if we want to use that for earlier versions. We can alternatively alias it to bytes | bytearray | memoryview on earlier versions.
Proposed API Changes
_raw: Renamerawto_raw. I’m unclear on whether this was ever intended to be exposed. In the proposed new API, it’s the same to the user asbgraanyway, but making it private gives us more flexibility.__init__,from_size: Change thedataargument frombytearraytoBuffer. This lets_grab_implpass us amemoryviewrather than having to make a new copy of the data.bgra,rgb: Return a read-onlyBufferinstead of abytesobject, avoiding a copy.
Also of note:
__array_interface__: Not a type change, but I’m thinking we might make the provided array read-only.buffer,__buffer__: New method.__buffer__is a Python 3.12-only feature that lets the ScreenShot object act as a Buffer itself, while thebuffermethod can be used on any version of Python.
Impact
With just these simple API changes, and no backend-specific changes, the cat-detector and video-capture demos are 12% and 15% faster, respectively. That’s for the whole program, not just the screen capture part. The MSS code I used for this test is at https://github.com/jholveck/python-mss/tree/d38af8a36abe2d7ce9f87f95bca78c7502f2a7ca. The proposed API further allows backends to avoid yet another copy, improving performance even more.
Question: read-write or read-only?
One topic to discuss: currently, the user could modify the pixel data. They had access to the read-write bytearray as sct.raw, and also could modify the raw data via NumPy using __array_interface__.
I’m thinking we may not want to make any guaranteed read-write mechanisms available, since that would mean we can’t cache pixels or rgb. In my current implementation, the .buffer() method can be used to request a read-write buffer, since PyTorch ignores the read-only flag (with a warning). But in general, I don’t see much sense in giving users a way to edit the buffer in-place.
Change 2: ScreenShot types
In anticipation of us offering both CPU- and GPU-side screenshot buffers, I propose we make preparations. I propose we create a new ScreenShot parent class for all possible screenshot types, and a ScreenShotCPU for the CPU-side screenshot objects we have today, which derives from ScreenShot. Some methods, like size, would be on ScreenShot itself. Methods that are only applicable to CPU-side screenshots, like bgra, would be defined on ScreenShotCPU. (A future ScreenShotCUDA or whatever would need to define its own way of giving the user access to the GPU buffer containing pixel data. While it could be named bgra as well, ScreenShotCUDA.bgra method would be a different method interface with a different signature, unrelated to ScreenShotCPU.bgra.)
While there’s not yet a GPU-side capture API available, we may want to make ScreenShot a more generic parent for the future class, so we don’t have to break the API when we do add those.
Python typing generics would allow us to give type checkers the information they need to infer the correct return type from grab. For users who use explicit types in their code, though, this class organization will let them declare the correct types today. That would save us from needing a major release to change the API when we do add GPU capabilities.