QVAC-19797 test[api]: ocr-ggml Metal GPU perf coverage + easyocr gallocr memory fix by olyasir · Pull Request #2483 · tetherto/qvac

olyasir · 2026-06-08T12:38:02Z

Summary

Adds Metal GPU performance coverage to the ocr-ggml CI perf table plus an EasyOCR backend-memory fix. Rebased onto current main (after #2457 landed the Adreno guard + Android-Vulkan, and alongside in-flight #2458 for the CPU-vs-Vulkan benchmark).

Commits

fix[api]: reuse ggml_gallocr across input sizes (easyocr detection + recognition steps). Keeps the gallocr and its backing buffer alive and resizes in place across region widths instead of freeing and recreating them per size. This cuts the backend-heap churn that fragments the Metal heap and triggers OOM command-buffer failures on memory-constrained devices.
test[api]: select Metal on Apple desktop. getBackendDevice() returns metal on darwin (was vulkan|cpu only), so the macOS leg records Metal [GPU] rows. Merged with main's Android-Vulkan selection; the OCR_GGML_BACKEND override now also accepts metal.
test[api]: run DocTR on CPU under Metal. The DocTR recognizer's per-region ggml compute is non-deterministically unstable on the constrained macos-15-xlarge runner (aborts with status -1, or silent detection collapse). DocTR forces CPU on Metal; EasyOCR (stable on Metal) keeps its Metal GPU pass. This is a deliberate scoping decision, not a fix; see the follow-up below.

Backend coverage in the combined perf table

| Backend | Coverage |
|---|---|
| CPU | all platforms |
| Vulkan `[GPU]` | Linux (ubuntu-24.04) + Windows (EasyOCR + DocTR); Android Mali via #2458 |
| Metal `[GPU]` | macOS: EasyOCR (DocTR CPU-only there by design) |

Follow-up (separate work)

DocTR-on-Metal stability — the real fix is an open GPU-memory-pressure investigation (couldn't reproduce on the 192 GB M3 Ultra; next step is repro on the constrained mac-mini-1 M4 with GPU-allocation tracking). Tracked separately so this PR isn't blocked on it.
Mobile GPU beyond Android-Vulkan and the Adreno guard already on main.

The EasyOCR detection (CRAFT) and recognition (CRNN) steps rebuild their ggml graph whenever the input size changes: detection on each new image size, recognition on each distinct text-region width. Previously each rebuild freed the ggml_gallocr and allocated a brand-new, differently-sized backend (Metal/Vulkan) buffer. That repeated alloc/free of varying-size GPU buffers churns and fragments the device heap, which can surface as out-of-memory command-buffer failures on memory-constrained devices (phones, small CI runners) even though steady-state process footprint stays flat.

Keep a single gallocr per step and let ggml_gallocr_alloc_graph resize its backing buffer in place across sizes (the canonical llama.cpp pattern). Only the size-specific ggml_context is freed and rebuilt; the gallocr is freed just once, in the destructor / on alloc failure.

Output is unchanged: verified identical region counts on Metal (M3 Ultra). DocTR is fixed-size (graphs allocated once) so it is unaffected.
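The effect of keeping one allocator alive can be illustrated with a toy model. This is only a sketch and does not call ggml: `recreatePerSize` and `reuseAndGrow` are hypothetical names, sizes are abstract units, and the model assumes the kept allocator replaces its backing buffer only when a graph needs more than the current capacity.

```typescript
// Toy model (NOT the ggml API): counts backend buffer allocations and frees
// for the two strategies described in the commit message.

interface AllocStats {
  allocations: number; // how many backend buffers were created
  frees: number;       // how many backend buffers were released
}

// Old behaviour: every change of input size frees the allocator and
// creates a new, exactly-sized backend buffer.
function recreatePerSize(graphSizes: number[]): AllocStats {
  const stats: AllocStats = { allocations: 0, frees: 0 };
  let current: number | null = null; // size of the live buffer, if any
  for (const size of graphSizes) {
    if (current === size) continue; // same size: nothing is rebuilt
    if (current !== null) stats.frees++;
    stats.allocations++;
    current = size;
  }
  if (current !== null) stats.frees++; // released in the destructor
  return stats;
}

// New behaviour: one allocator per step. Assumption of this model: the
// backing buffer is replaced only when a graph exceeds the current capacity.
function reuseAndGrow(graphSizes: number[]): AllocStats {
  const stats: AllocStats = { allocations: 0, frees: 0 };
  let capacity = 0; // 0 means no buffer has been allocated yet
  for (const size of graphSizes) {
    if (size <= capacity) continue; // fits in the existing buffer
    if (capacity > 0) stats.frees++;
    stats.allocations++;
    capacity = size;
  }
  if (capacity > 0) stats.frees++; // released once, in the destructor
  return stats;
}
```

For the region widths `[64, 128, 96, 128, 32, 256]`, the first strategy performs six allocations and six frees of differing sizes, while the second performs three of each, and none at all once the largest width has been seen.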

…gration suite getBackendDevice() previously resolved only 'vulkan' or 'cpu', so the macOS matrix leg always ran CPU-only and the cross-platform perf table never carried Metal GPU numbers (only the Linux/Windows Vulkan runners recorded [GPU] rows). ggml's Metal backend is compiled into the addon on darwin — there is no loadable ggml-vulkan lib to probe — so resolve 'metal' directly on Apple desktop. The addon's backend selection falls back to CPU when no Metal device is present, and the suite's GPU detection is backend-agnostic (backendDevice === 'GPU'/'IGPU', stats.backendIsGpu === 1), so Metal passes are tagged [GPU] and compared against a forced-CPU pass exactly like Vulkan. The OCR_GGML_BACKEND override now also accepts 'metal' so a leg can force it.
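The selection logic described above might look roughly like the following. This is a hedged sketch, not the suite's actual getBackendDevice(): the function name `resolveBackendDevice`, its parameters, and the choice to ignore unrecognised override values are all assumptions made for illustration. Platform, override, and Vulkan availability are passed in so the function stays pure.

```typescript
type BackendDevice = "metal" | "vulkan" | "cpu";

const KNOWN_BACKENDS: BackendDevice[] = ["metal", "vulkan", "cpu"];

// Hypothetical stand-in for the suite's backend resolution.
//   platform         - e.g. "darwin", "linux", "win32", "android"
//   override         - value of OCR_GGML_BACKEND, if set
//   vulkanLibPresent - result of probing for a loadable ggml-vulkan lib
function resolveBackendDevice(
  platform: string,
  override: string | undefined,
  vulkanLibPresent: boolean
): BackendDevice {
  // An explicit override wins; 'metal' is accepted alongside 'vulkan' and 'cpu'.
  // Assumption: unrecognised values are ignored rather than rejected.
  const forced = KNOWN_BACKENDS.find((b) => b === override);
  if (forced !== undefined) return forced;

  // Metal is compiled into the addon on darwin, so there is nothing to probe.
  // The addon itself falls back to CPU when no Metal device is present.
  if (platform === "darwin") return "metal";

  // Elsewhere, prefer Vulkan only when its library can be loaded.
  return vulkanLibPresent ? "vulkan" : "cpu";
}
```

Under these assumptions, `resolveBackendDevice("darwin", undefined, false)` yields `"metal"`, while `resolveBackendDevice("darwin", "cpu", false)` lets a leg force the CPU pass.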

… CI) The DocTR recognizer runs ggml_backend_graph_compute once per detected region, and on the constrained macos-15-xlarge CI runner the Metal backend is non-deterministically unstable under that sustained load: it either aborts the whole suite ("[DoctrRecognitionGGML] ggml backend graph compute failed with status -1", exit 134) or silently collapses detection to garbage. The failure is not size-bound: the BMP pass for an image succeeds while the JPEG/PNG passes for the same image fail, and clinical_chemistry (previously stable) failed too, so there is no stable per-test subset to scope around.

Force CPU for ALL DocTR comparison passes when the auto-selected backend is Metal (in runDoctrComparison), plus the doctr-models batch-equivalence test that calls runDoctrOCR directly. Vulkan keeps its DocTR [GPU] pass, and EasyOCR keeps its Metal [GPU] pass (runOcrComparison is a different recognizer path that is stable on Metal; the dense canvasSize page passes). macOS therefore records Metal numbers for EasyOCR while DocTR stays CPU-only there, and the suite no longer aborts.

Real Metal-on-Apple DocTR stability remains a separate GPU-memory-pressure investigation (repro on the constrained mac-mini-1 M4).
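The resulting pass matrix can be summarised in a few lines. The sketch below is illustrative only: `passesForComparison` and its types are hypothetical names, not code from runDoctrComparison or runOcrComparison, and it assumes each comparison consists of a forced-CPU pass plus, where allowed, one pass on the auto-selected GPU backend.

```typescript
type OcrBackend = "metal" | "vulkan" | "cpu";
type OcrPipeline = "easyocr" | "doctr";

// Hypothetical helper: which backends a comparison run should exercise.
function passesForComparison(
  pipeline: OcrPipeline,
  autoSelected: OcrBackend
): OcrBackend[] {
  // No GPU backend was selected: only the CPU pass exists.
  if (autoSelected === "cpu") return ["cpu"];

  // DocTR is deliberately scoped to CPU whenever Metal was auto-selected.
  if (pipeline === "doctr" && autoSelected === "metal") return ["cpu"];

  // Otherwise compare a forced-CPU pass against the GPU pass.
  return ["cpu", autoSelected];
}
```

With this shape, DocTR on macOS yields `["cpu"]`, EasyOCR on macOS yields `["cpu", "metal"]`, and both pipelines keep `["cpu", "vulkan"]` on the Vulkan runners.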

olyasir requested review from a team as code owners June 8, 2026 12:38

olyasir added 3 commits June 8, 2026 16:15

olyasir force-pushed the ocr-ggml-gpu-test-coverage branch from acd92d5 to 364067b Compare June 8, 2026 13:18

olyasir changed the title ~~QVAC-19797 test[api]: ocr-ggml GPU (Vulkan + Metal) perf coverage + Adreno/gallocr backend fixes~~ QVAC-19797 test[api]: ocr-ggml Metal GPU perf coverage + easyocr gallocr memory fix Jun 8, 2026

olyasir temporarily deployed to release June 8, 2026 13:19 — with GitHub Actions Inactive

olyasir temporarily deployed to release June 8, 2026 13:27 — with GitHub Actions Inactive

olyasir wants to merge 3 commits into main from ocr-ggml-gpu-test-coverage
