
Video processing in inference server #679

Merged

Conversation

@PawelPeczek-Roboflow commented Sep 25, 2024

Description

The goal of this feature is to bring video processing capabilities into the inference server - long story short, Workflows should run against videos without any additional scripts.

State of the work:

🟢 Old enterprise stream management components copied and adjusted to process Workflows
🟢 Basic endpoints to manage stream states enabled (initialise, list, get state, consume, pause, resume, terminate) - see the illustrative client sketch after this list
🟢 Basic test coverage
🔴 Full support for old enterprise features (old stream management was running InferencePipeline without Workflows)
🔴 True integration tests
🔴 Functionality to start video processing on container start-up
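
For illustration, a minimal client sketch against the endpoints listed above - the paths, payload fields, and port are assumptions here, not the merged API:

```python
import requests

BASE = "http://localhost:9001"  # assumed local inference server address

# Initialise a pipeline that runs a workflow against a video source.
# Endpoint paths and payload fields are illustrative, not the merged API.
response = requests.post(
    f"{BASE}/inference_pipelines/initialise",
    json={
        "video_reference": "rtsp://camera.local:554/stream",
        "workspace_name": "my-workspace",
        "workflow_id": "my-workflow",
        "api_key": "<YOUR_API_KEY>",
    },
)
pipeline_id = response.json()["context"]["pipeline_id"]  # field names are assumptions

# The remaining management operations map 1:1 onto the endpoints listed above.
requests.get(f"{BASE}/inference_pipelines/list")
requests.get(f"{BASE}/inference_pipelines/{pipeline_id}/status")
requests.get(f"{BASE}/inference_pipelines/{pipeline_id}/consume")
requests.post(f"{BASE}/inference_pipelines/{pipeline_id}/pause")
requests.post(f"{BASE}/inference_pipelines/{pipeline_id}/resume")
requests.post(f"{BASE}/inference_pipelines/{pipeline_id}/terminate")
```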

Issues spotted:

Performance

The same workflow was tested in each setup, reporting only the latency of single-frame processing inside the WorkflowRunner.run_workflow(...) function:

  • MacBook, bare metal in a script using InferencePipeline directly - ~40ms
  • MacBook, inside a docker container, behind the API - ~110ms
  • Jetson Orin Nano, bare metal in a script using InferencePipeline directly - not measured precisely this time, but older tests indicated the same performance as on the MacBook with yolov8n-640, which is the model used in this test case
  • Jetson Orin Nano, inside a docker container, behind the API - ~50ms

We have docker/API overhead - not 100% sure whether it is visible on Jetson devices, but the MacBook one drops throughput from 27fps to <10fps 😢
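
For context, the numbers above come from timing a single call - roughly the pattern below; the run_workflow signature is an assumption, only the measurement approach matters:

```python
import time

def timed_run_workflow(runner, video_frame, **kwargs):
    # Illustrative wrapper - the actual run_workflow(...) signature in the
    # server may differ; only the timing pattern matters here.
    start = time.monotonic()
    result = runner.run_workflow(video_frame, **kwargs)
    elapsed_ms = (time.monotonic() - start) * 1000
    # ~40ms bare metal vs ~110ms behind the API on a MacBook in the tests above
    print(f"single-frame latency: {elapsed_ms:.1f}ms")
    return result
```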

Passing localhost camera to docker

On a MacBook it is very hard to pass the device camera to a container (it requires a lot of configuration - https://medium.com/@jijupax/connect-the-webcam-to-docker-on-mac-or-windows-51d894c44468), yet that would be required for nice demos without UI streaming into the container. @grzegorz-roboflow suggested passing frames through a Unix socket, which seems feasible - please clarify if I should allocate time to implement that.
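
For reference, a minimal sketch of what the suggested Unix-socket hand-off could look like on the host side - the socket path, JPEG encoding, and length-prefix framing are all illustrative assumptions:

```python
import os
import socket
import struct

import cv2

SOCKET_PATH = "/tmp/camera.sock"  # illustrative path, bind-mounted into the container

def stream_frames() -> None:
    """Host side: read the local webcam and push length-prefixed JPEG frames
    over a Unix domain socket that the container can read from."""
    if os.path.exists(SOCKET_PATH):
        os.unlink(SOCKET_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCKET_PATH)
    server.listen(1)
    connection, _ = server.accept()
    capture = cv2.VideoCapture(0)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            _, jpeg = cv2.imencode(".jpg", frame)
            payload = jpeg.tobytes()
            # Length prefix lets the receiver split the byte stream back into frames.
            connection.sendall(struct.pack(">I", len(payload)) + payload)
    finally:
        capture.release()
        connection.close()
```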

Open questions

  • Do we want to port all of the old functionality?
  • Are we fine with completing this feature without re-streaming? (I managed to verify that polling results from the buffer via the consume-result HTTP endpoint is OK - not great / not terribly bad.)
  • How should we deal with auth on endpoints that do not touch API-key-gated resources (for instance, manipulation of pipeline state)? I would assume we should validate the API key in our backend to avoid malicious attacks, right? But then - what about offline use-cases?
  • We have the following setup now:
    • We initialise processing by starting the pipeline in a separate process
    • Each init reports success as soon as the process starts
    • We do not wait for the pipeline to connect to the source (that would block other requests)
    • We end up in a state where the client needs to check the pipeline status - an initial video-source connection error terminates the inference pipeline process; the pipeline will re-connect only if the connection to the camera breaks after the initial connection is established
    • Waiting for the initial connection to complete is problematic in a general sense - if we wanted to retry, we would block the stream manager socket for longer
    • I am not really sure how to address the problem, but I see it becoming an issue once we enable the "on-start processing" feature - people may sometimes be surprised by their pipelines failing on temporary connectivity issues with cameras (a hypothetical client-side polling loop is sketched after this list)
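
To make the last point concrete, a hypothetical client-side polling loop - the status endpoint path and the state values are assumptions about the payload, not the actual API:

```python
import time

import requests

def wait_until_running(base_url: str, pipeline_id: str, timeout_s: float = 30.0) -> bool:
    """Poll the pipeline status until it is running; returns False if the
    pipeline process died (e.g. the camera was unreachable on the initial
    connect) or the timeout elapsed."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        report = requests.get(
            f"{base_url}/inference_pipelines/{pipeline_id}/status"
        ).json()
        state = report.get("state")  # field name is an assumption
        if state == "RUNNING":
            return True
        if state in ("ERROR", "TERMINATED"):
            # an initial source-connection failure terminates the pipeline process
            return False
        time.sleep(1.0)
    return False
```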

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

YOUR_ANSWER

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

@hansent commented Sep 25, 2024

Do we want to port all of the old functionality?
What things are part of this?

I think we can treat it as a separate feature for Workflows / we don't need to port all stream management features until we need them. E.g. I am not sure we need pause/resume explicitly (the benefit is really just not having to create/configure the stream again, which is ideally done as part of the workflow spec now anyway?)

Are we fine with completing this feature without re-streaming? (I managed to verify that polling results from the buffer via the consume-result HTTP endpoint is OK - not great / not terribly bad.)
I think yes, let's go without re-streaming for now. As long as we can get video back out to display for now, we can focus on building the API and workflow processing. It seems like the kind of thing that can be added somewhat cleanly later if we need it, but it increases overall scope/complexity if we do it all at once?

The reason for doing it now would be if it required a different architecture for how we process / start / manage streams, but I don't think that's the case.

Another thought: could re-streaming be a separate sink / stateful block that creates a stream?

How should we deal with auth on endpoints that do not touch API-key-gated resources (for instance, manipulation of pipeline state)? I would assume we should validate the API key in our backend to avoid malicious attacks, right? But then - what about offline use-cases?

Can we do it the same way we do for inference / workflow endpoints? For dedicated deployments I think we have a check that the API key matches the owner of the deployment. For local / user-managed deployments we can allow the requests, but auth on model access / other API calls that need API keys.
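
A rough sketch of that policy as a FastAPI dependency - the env var and the ownership check are hypothetical; this only shows the shape of the gate:

```python
import os

from fastapi import HTTPException, Query

def verify_api_key(api_key: str | None = Query(default=None)) -> None:
    """Hypothetical gate: enforce the key only when the server is configured
    with an owner key (dedicated deployment); otherwise let the request
    through and rely on downstream model / API auth."""
    owner_key = os.getenv("DEPLOYMENT_OWNER_API_KEY")  # hypothetical env var
    if owner_key is None:
        return  # local / user-managed / offline deployment: allow the request
    if api_key != owner_key:
        raise HTTPException(status_code=401, detail="API key does not match deployment owner")
```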

@hansent commented Sep 25, 2024

Passing localhost camera to docker
On a MacBook it is very hard to pass the device camera to a container (it requires a lot of configuration - https://medium.com/@jijupax/connect-the-webcam-to-docker-on-mac-or-windows-51d894c44468), yet that would be required for nice demos without UI streaming into the container. @grzegorz-roboflow suggested passing frames through a Unix socket, which seems feasible - please clarify if I should allocate time to implement that.

I think we need:

  • A way to use a USB / local device as a video source for device deployments (Jetson, Roboflow box). We have that for bare metal, I think. It should work in docker for Jetsons / Roboflow boxes / field deployments; it is less important for it to work in docker on a Mac (especially if we also have a way to use network input)
  • A way to get video input via the network. I think RTSP and WebRTC would be ideal - they would cover a lot of use cases and allow building adapters in front when needed (a minimal RTSP-reading sketch follows)
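
For the RTSP path, ingestion on the container side can be as simple as the OpenCV sketch below; the URL is a placeholder:

```python
import cv2

# Placeholder URL; any RTSP source reachable from inside the container works.
capture = cv2.VideoCapture("rtsp://camera.local:554/stream")

while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break
    # Hand each frame to the pipeline / workflow here.
    print(frame.shape)

capture.release()
```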

@PawelPeczek-Roboflow PawelPeczek-Roboflow marked this pull request as ready for review September 27, 2024 07:46
@PawelPeczek-Roboflow PawelPeczek-Roboflow merged commit 6acb030 into main Sep 27, 2024
57 checks passed
@PawelPeczek-Roboflow PawelPeczek-Roboflow deleted the feature/video_processing_in_inference_server branch September 27, 2024 10:48