lib: pixel: towards gstreamer/ffmpeg/opencv/libcamera

Implementation:

- https://github.com/zephyrproject-rtos/zephyr/pull/86671

## Introduction

[SIMD brings video abilities to MCUs](https://www.synopsys.com/designware-ip/technical-bulletin/signal-processing-risc-v-dsp-extensions.html).

Take advantage of vector extensions and software to process video data on very low-RAM targets (~1 MByte, small for video) by introducing libraries that help with this.

### Problem description

Color image sensors produce [bayer data](https://en.wikipedia.org/wiki/Bayer_filter), which grows 3x in size once converted to RGB. An MCU with 1 MByte of RAM cannot store both the input and the output VGA frame under these conditions: `640 * 480 * (1 + 3) = 1228800` bytes.
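
The arithmetic above can be checked with a small sketch. The helper names `frame_bytes()` and `lines_bytes()` and the `RAM_BYTES` constant are hypothetical and only serve the illustration; they are not part of the proposed library.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: "1 MByte" is taken as 1024 * 1024 bytes. */
#define RAM_BYTES (1024u * 1024u)

/* Bytes needed to hold a full frame at a given number of bytes per pixel. */
size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
	return width * height * bytes_per_pixel;
}

/* Bytes needed when only a few lines of the frame are buffered at once. */
size_t lines_bytes(size_t width, size_t lines, size_t bytes_per_pixel)
{
	return width * lines * bytes_per_pixel;
}

/* Compile-time check of the figure quoted in the text. */
static_assert(640u * 480u * (1u + 3u) == 1228800u, "bayer + RGB24 VGA frames");
static_assert(640u * 480u * (1u + 3u) > RAM_BYTES, "both frames exceed 1 MByte");
```

By comparison, buffering only 2 bayer lines and 8 RGB24 lines (the line counts used as an illustration in this RFC) takes `lines_bytes(640, 2, 1) + lines_bytes(640, 8, 3) = 16640` bytes.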

### Proposed change

By inserting a ring buffer between line-processing functions, the flow becomes very linear:

```
convert_1line_awb() --(gearbox)--> convert_2lines_debayer_3x3() --(gearbox)--> convert_8lines_jpeg()
```

Such a gearbox is proposed on top of `ring_buf` and turned into an API.
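
A minimal sketch of the gearbox idea follows. It is written in plain C so that it compiles outside of Zephyr, whereas the proposal builds on Zephyr's `ring_buf`; the `struct gearbox` type and the `gearbox_*()` functions are hypothetical names for illustration, not the API of the PR. The producer pushes one line at a time, and the consumer only pops once the number of lines it needs is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical gearbox: a ring buffer accessed one full line at a time. */
struct gearbox {
	uint8_t *buf;     /* backing storage of pitch * max_lines bytes */
	size_t pitch;     /* bytes per line */
	size_t max_lines; /* capacity in lines, any value, not only powers of two */
	size_t head;      /* index of the next line to write */
	size_t tail;      /* index of the oldest line not yet read */
	size_t count;     /* number of lines currently stored */
};

void gearbox_init(struct gearbox *gb, uint8_t *buf, size_t pitch, size_t max_lines)
{
	assert(pitch > 0 && max_lines > 0);
	gb->buf = buf;
	gb->pitch = pitch;
	gb->max_lines = max_lines;
	gb->head = gb->tail = gb->count = 0;
}

/* Producer side: store one converted line, or fail if the buffer is full. */
bool gearbox_put_line(struct gearbox *gb, const uint8_t *line)
{
	if (gb->count == gb->max_lines) {
		return false;
	}
	memcpy(&gb->buf[gb->head * gb->pitch], line, gb->pitch);
	gb->head = (gb->head + 1) % gb->max_lines;
	gb->count++;
	return true;
}

/* Consumer side: copy out n lines, but only once all n are available. */
bool gearbox_get_lines(struct gearbox *gb, uint8_t *out, size_t n)
{
	if (gb->count < n) {
		return false;
	}
	for (size_t i = 0; i < n; i++) {
		memcpy(&out[i * gb->pitch], &gb->buf[gb->tail * gb->pitch], gb->pitch);
		gb->tail = (gb->tail + 1) % gb->max_lines;
	}
	gb->count -= n;
	return true;
}
```

With this shape, a 1-line stage can call `gearbox_put_line()` after each line while a 2-line stage polls `gearbox_get_lines(gb, out, 2)`, so neither function needs to know about the other.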

## Detailed RFC

Processing the image end-to-end one line at a time works. For instance:

```
convert_1line_awb() --+--> convert_2lines_debayer_3x3() --+--> convert_8lines_jpeg()
convert_1line_awb() --'                                   |
convert_1line_awb() --+--> convert_2lines_debayer_3x3() --+
convert_1line_awb() --'                                   |
convert_1line_awb() --+--> convert_2lines_debayer_3x3() --+
convert_1line_awb() --'                                   |
convert_1line_awb() --+--> convert_2lines_debayer_3x3() --'
convert_1line_awb() --'
```

But this becomes complex when working with a non-power-of-two number of lines, and is tedious to stitch by hand.

In addition, a library that automates the process makes it possible to define "stream processors" that are independent of their context:

- Facilitates writing new line conversion functions and stitching them into streams (**[gstreamer](https://gstreamer.freedesktop.org/)** style).
- This will be extended in the future to cover a complete ISP pipeline for Zephyr (**[libcamera](https://libcamera.org/)** style).
- Most image pre-processing algorithms can be implemented in a line-based fashion (**[opencv](https://opencv.org/)** style).
- A driver using this to automatically convert data between input and output formats is provided (**[ffmpeg](https://ffmpeg.org/)** style).

Performance-wise:

- Having this extra gearbox library between line conversion functions adds some overhead, but only once per fully converted line. For instance, once every `640 * 3` bytes at VGA resolution, which is considered low impact.
- This plays well with SIMD instructions, which rarely process data one *column* at a time but instead work on contiguous bytes.
- This permits the transfer of the converted image to start as soon as the first line of data is converted, without waiting for the full conversion.

### Proposed change (Detailed)

No SIMD support for now:

- `<zephyr/pixel/stream.h>`: the gearbox library that stitches stream processors together.
- `<zephyr/pixel/formats.h>`: format conversion from/to RGB24, RGB565 and YUYV.
- `<zephyr/pixel/bayer.h>`: format conversion for bayer input: RGGB8, BGGR8, GRBG8, GBRG8.
- `<zephyr/pixel/resize.h>`: resize a full frame by using fast but low-quality subsampling.
- `<zephyr/pixel/stats.h>`: collect statistics from a frame: RGB or Y (luma) channel averages or histograms.
- `<zephyr/pixel/print.h>`: utilities calling `printf()` to display a hexdump of the data, as well as the actual colored image, and histograms.
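
To give an idea of what a line conversion function can look like, here is a minimal sketch of an RGB24 to RGB565 conversion of one line. The function name `rgb24_line_to_rgb565()` and its signature are hypothetical and not taken from the PR. The output is written as native `uint16_t` words, so the byte order in memory is left out of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line converter: packs one line of RGB24 into RGB565 words. */
void rgb24_line_to_rgb565(const uint8_t *rgb24, uint16_t *rgb565, size_t width)
{
	assert(rgb24 != NULL && rgb565 != NULL);

	for (size_t i = 0; i < width; i++) {
		uint8_t r = rgb24[i * 3 + 0];
		uint8_t g = rgb24[i * 3 + 1];
		uint8_t b = rgb24[i * 3 + 2];

		/* Keep the 5, 6 and 5 most significant bits of R, G and B. */
		rgb565[i] = (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
	}
}
```

Because the function only reads and writes contiguous bytes of a single line, it needs no knowledge of the frame height and can be called as soon as one input line is available.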

### Dependencies

None (mostly `<stdint.h>`, and `<sys/util.h>`).

### Concerns and Unresolved Questions

Should this be part of Zephyr or be put in a separate repo as a module?

## Alternatives

- Let the application process everything.
- Use the vendor-specific libraries directly. However, [CMSIS-CV](https://github.com/ARM-software/CMSIS-CV/tree/main/Source) is almost empty compared to e.g. OpenCV, [others](https://github.com/alifsemi/alif_image-processing-lib) have a simple API but no standard behind it (which this PR strives to deliver), and CPU architectures other than ARM also have extensions, so these libraries would not count as a generic front-end for them.

Open to suggestions!

Example of what an end-to-end flow looks like on `native_sim`:

![Image](https://github.com/user-attachments/assets/45c13ad8-e38e-486b-8a1f-22e5ba6082f3)
![Image](https://github.com/user-attachments/assets/6ee71291-531d-4764-9d04-cc54ea1263fd)

P.S. thanks @VynDragon (MASSDRIVER EI) for the help!

lib: pixel: towards gstreamer/ffmpeg/opencv/libcamera #86669
