Add remote exec capability for foundation models missing it by hansent · Pull Request #1968 · roboflow/inference

hansent · 2026-02-04T19:40:39Z

Manual testing results

I tested the workflow blocks with the added remote exec functionality on a dev GPU, with the following results:

Working:
Gaze: works with both local and remote exec
Depth estimation: works with both local and remote exec
SAM2: works with both local and remote exec
Florence2: works with both local and remote exec (some workflow block configs produce broken results, but the results are the same for local and remote)
Moondream2: works with both local and remote exec

Not working / couldn't test:

SmolVLM: doesn't work locally or remotely because the model download fails (currently disabled on serverless anyway)
Qwen: doesn't work locally or remotely because the model download fails (currently disabled on serverless anyway)
SAM3 3D: couldn't install the dependencies to get the model working (disabled on serverless anyway)

I think it's OK to add the missing remote exec functionality and endpoint handling for these now as well.

1. HTTP Client Methods Added (`inference_sdk/http/client.py`)

infer_lmm() - Generic LMM endpoint for Florence2, Moondream2, SmolVLM, Qwen models
depth_estimation() - Depth estimation endpoint
sam2_segment_image() - SAM2 segmentation endpoint
sam3_3d_infer() - SAM3 3D object generation endpoint
Added async variants for all methods

2. HTTP API Endpoint Added (`inference/core/interfaces/http/http_api.py`)

/sam3_3d/infer - New endpoint for SAM3 3D object generation with JSON-serializable response (base64-encoded binary data)
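
To illustrate what a JSON-serializable response carrying base64-encoded binary data can look like, here is a minimal sketch using only the standard library. The field names (`mesh_glb`, `encoding`) and helper names are hypothetical; they are not taken from the actual `/sam3_3d/infer` response schema.

```python
import base64
import json


def encode_binary_field(raw: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string that JSON can carry."""
    return base64.b64encode(raw).decode("ascii")


def decode_binary_field(encoded: str) -> bytes:
    """Reverse of encode_binary_field: recover the original bytes."""
    return base64.b64decode(encoded)


def build_response(mesh_bytes: bytes) -> str:
    """Serialize a response body; "mesh_glb" is a hypothetical field name."""
    payload = {
        "mesh_glb": encode_binary_field(mesh_bytes),
        "encoding": "base64",
    }
    return json.dumps(payload)


if __name__ == "__main__":
    original = b"\x00\x01binary mesh data\xff"
    body = build_response(original)
    # A client would parse the JSON and decode the field back to bytes.
    restored = decode_binary_field(json.loads(body)["mesh_glb"])
    print(restored == original)
```

Raw bytes cannot be placed in JSON directly, so the round trip through base64 is what lets the client recover the binary output unchanged.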

3. Workflow Blocks Updated with Remote Execution

All blocks now support StepExecutionMode.REMOTE:

| Block | Client Method Used |
| --- | --- |
| Gaze v1 | `detect_gazes()` |
| Depth Estimation v1 | `depth_estimation()` |
| Segment Anything 2 v1 | `sam2_segment_image()` |
| Florence2 v1 | `infer_lmm()` |
| Moondream2 v1 | `infer_lmm()` |
| SmolVLM v1 | `infer_lmm()` |
| Qwen2.5-VL v1 | `infer_lmm()` |
| Qwen3-VL v1 | `infer_lmm()` |
| Segment Anything 3 3D v1 | `sam3_3d_infer()` |
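
For readers unfamiliar with the pattern, the following is a minimal sketch of how a block might route between local and remote execution. It is not the code from this PR: `StepExecutionMode` here is a stand-in enum (only the `LOCAL` and `REMOTE` member names are assumed), and `ExampleBlock`, `run_locally`, and `run_remotely` are hypothetical names.

```python
from enum import Enum
from typing import Any, Callable


class StepExecutionMode(Enum):
    """Stand-in for the real enum; only LOCAL and REMOTE are assumed."""

    LOCAL = "local"
    REMOTE = "remote"


class ExampleBlock:
    """Hypothetical block that picks an execution path from its configured mode."""

    def __init__(
        self,
        step_execution_mode: StepExecutionMode,
        run_locally: Callable[[Any], Any],
        run_remotely: Callable[[Any], Any],
    ) -> None:
        self._step_execution_mode = step_execution_mode
        self._run_locally = run_locally
        self._run_remotely = run_remotely

    def run(self, image: Any) -> Any:
        # Route to the in-process model or to the HTTP client call.
        if self._step_execution_mode is StepExecutionMode.LOCAL:
            return self._run_locally(image)
        if self._step_execution_mode is StepExecutionMode.REMOTE:
            return self._run_remotely(image)
        raise ValueError(
            f"Unknown step execution mode: {self._step_execution_mode}"
        )


if __name__ == "__main__":
    block = ExampleBlock(
        StepExecutionMode.REMOTE,
        run_locally=lambda image: f"local:{image}",
        run_remotely=lambda image: f"remote:{image}",
    )
    print(block.run("frame-1"))  # prints "remote:frame-1"
```

In the remote branch, the callable would wrap the client method listed for that block in the table above.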

4. Unit Tests Added

test_gaze_remote.py - Gaze detection remote execution tests
test_depth_estimation.py - Depth estimation tests
test_segment_anything2.py - SAM2 tests
test_vlm_remote_execution.py - Tests for Florence2, Moondream2, SmolVLM, Qwen2.5-VL, Qwen3-VL
test_segment_anything3_3d.py - SAM3 3D tests
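
As a rough illustration of how remote execution can be unit tested without a running server, the sketch below replaces the HTTP client with a `unittest.mock.MagicMock`. The helper `run_remotely` and the keyword arguments passed to `infer_lmm()` are assumptions made for this example; the real method signature and the tests in this PR may differ.

```python
from typing import Any, Dict
from unittest.mock import MagicMock


def run_remotely(client: Any, image: str, prompt: str) -> Dict[str, Any]:
    """Hypothetical helper that forwards a request to the client's infer_lmm().

    The keyword names used here are assumptions, not the SDK's confirmed API.
    """
    return client.infer_lmm(image=image, prompt=prompt)


def test_run_remotely_forwards_to_client() -> None:
    # The mock stands in for the HTTP client, so no network call is made.
    client = MagicMock()
    client.infer_lmm.return_value = {"response": "a cat on a sofa"}

    result = run_remotely(client, image="base64-image", prompt="Describe the image")

    assert result == {"response": "a cat on a sofa"}
    client.infer_lmm.assert_called_once_with(
        image="base64-image", prompt="Describe the image"
    )


if __name__ == "__main__":
    test_run_remotely_forwards_to_client()
    print("ok")
```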

5. Documentation Updated

Added "Execution Modes" sections to:

docs/foundation/florence2.md
docs/foundation/gaze.md
docs/foundation/depth_estimation.md
docs/foundation/sam2.md
docs/foundation/moondream2.md
docs/foundation/smolvlm.md
docs/foundation/sam3_3d.md

codeflash-ai · 2026-02-04T20:51:57Z

⚡️ Codeflash found optimizations for this PR

📄 45% (0.45x) speedup for `Qwen3VLBlockV1.run` in `inference/core/workflows/core_steps/models/foundation/qwen3vl/v1.py`

⏱️ Runtime : 5.64 milliseconds → 3.90 milliseconds (best of 71 runs)
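
The reported figures are consistent with speedup being measured as time saved relative to the optimized runtime. That definition is inferred from the numbers above rather than stated in this comment, so treat the sketch below as an assumption.

```python
def relative_speedup(original_ms: float, optimized_ms: float) -> float:
    """Time saved, expressed as a fraction of the optimized runtime."""
    return (original_ms - optimized_ms) / optimized_ms


if __name__ == "__main__":
    speedup = relative_speedup(5.64, 3.90)
    print(f"{speedup:.2f}x ({speedup:.0%})")  # prints "0.45x (45%)"
```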

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up method Qwen3VLBlockV1.run by 45% in PR #1968 (remote-exec-for-all-models) #1971

If you approve, it will be merged into this PR (branch remote-exec-for-all-models).

bigbitbus

LGTM.

grzegorz-roboflow

LGTM

hansent added 3 commits February 4, 2026 13:38

add remote exec for foundation models missing it

a73b220

make style

3f3b1d2

fix missing name in unit tests

e294e3b

codeflash-ai bot mentioned this pull request Feb 4, 2026

⚡️ Speed up method Qwen3VLBlockV1.run by 45% in PR #1968 (remote-exec-for-all-models) #1971

Open

hansent and others added 5 commits February 4, 2026 15:37

fix depth estimation endpoint to return propper base64

758ecf8

allow passing model id explicitly in /infer/llm endpoint

28500f7

Merge branch 'main' into remote-exec-for-all-models

f1a89a1

fix image returned by depth estimation block on remote exec

7e0d67b

Merge branch 'main' into remote-exec-for-all-models

21200b7

hansent marked this pull request as ready for review February 5, 2026 19:08

hansent requested review from PawelPeczek-Roboflow, grzegorz-roboflow, probicheaux and yeldarby as code owners February 5, 2026 19:08

bigbitbus approved these changes Feb 5, 2026

grzegorz-roboflow approved these changes Feb 5, 2026

hansent wants to merge 8 commits into `main` from `remote-exec-for-all-models`
