Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
d5c8bca
Approve passthrough when clearing no-bin incident
mneuhaus May 16, 2026
9b4d796
Handle channel stuck incidents and C4 exit deadlocks
mneuhaus May 16, 2026
0aa28de
Compose Hive model metadata on publish and rework detail page
mneuhaus May 17, 2026
89c1c89
Fix model variant downloads
mneuhaus May 17, 2026
c012cdf
Show self-describing filename in variant download dropdown
mneuhaus May 17, 2026
b5affc4
Let machines see their owner's private models, fix Hive label
mneuhaus May 17, 2026
33c60b6
Drop Hive selector in Browse Hive, aggregate across all targets
mneuhaus May 17, 2026
9e962e6
Show source Hive on installed models
mneuhaus May 17, 2026
b4cc6bf
Link model titles to their source Hive detail page
mneuhaus May 17, 2026
e800e08
Add RUNBOOK.md capturing the full pull → train → publish recipe
mneuhaus May 17, 2026
4eecd22
Improve sorter incident handling and media pipeline
mneuhaus May 17, 2026
39ce8cb
Make Hive samples public by default and surface machine context
mneuhaus May 20, 2026
14af0b7
Thread machine_id through the training pipeline + add --balance-machine
mneuhaus May 20, 2026
3fc5af4
Add Hive teacher backfill: re-run Gemini/Perceptron/etc. on stored sa…
mneuhaus May 20, 2026
5a2d490
Fix teacher worker DB connection-pool exhaustion under parallel load
mneuhaus May 20, 2026
d4cceb6
Paginate the teacher job detail page + add status filter
mneuhaus May 20, 2026
c9b3158
Hide Machine filter from non-admin samples sidebar
mneuhaus May 20, 2026
7bd7121
Carry samples-list filters into the Review queue
mneuhaus May 20, 2026
a069b6f
Add admin teacher re-run panel inside the Review queue
mneuhaus May 20, 2026
4e8efde
Persist the Review-queue teacher model choice in localStorage
mneuhaus May 20, 2026
02783bb
Fix annotator save in review queue (stale handler after sample swap)
mneuhaus May 21, 2026
5d7b8d5
Merge origin/main into pr-136
spencerhhubert May 21, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions docs/_data/nav.yml
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,8 @@ sections:
pages:
- title: Architecture
url: /sorter/architecture/
- title: Camera media pipeline
url: /sorter/camera-media-pipeline/
- title: Sorting profile reference
url: /sorter/profile-reference/

Expand Down
85 changes: 85 additions & 0 deletions docs/sorter/camera-media-pipeline.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
---
layout: default
title: Camera Media Pipeline
parent: Sorter
section: sorter
---

# Camera Media Pipeline

The sorter treats camera video as two separate products:

- **Computer-vision frames:** Python/OpenCV-owned, full quality, used for detection,
tracking, samples, calibration, and still captures.
- **Browser live preview:** WebRTC transport, hardware encoded where possible,
optimized for low latency and stable inspection quality.

Python should stay the **control plane**. It owns camera assignment, capture mode,
picture settings, detection orchestration, incidents, sample capture, and metadata.
It should not be the long-term 4K live-video encoder.

## Target Architecture

```text
Camera
-> Python/OpenCV CaptureThread
-> detection / tracking / samples / high-quality stills
-> metadata events

Camera
-> platform media pipeline
-> hardware H.264 encoder
-> WebRTC browser stream

Browser
-> video element
-> metadata overlay canvas/SVG for boxes, zones, incidents, telemetry
```

The browser overlay is deliberately not burned into the video. That keeps the video
encoder focused on the image and lets the UI render boxes/zones crisply at display
resolution.

## Platform Targets

### macOS

Preferred stack:

```text
avfvideosrc -> videoconvert -> vtenc_h264_hw -> h264parse -> WebRTC handoff
```

The encoder target is Apple VideoToolbox through GStreamer.

### Orange Pi 5 / RK3588

Preferred stack:

```text
v4l2src -> videoconvert/RGA -> mpph264enc or rkv4l2h264enc -> h264parse -> WebRTC handoff
```

The encoder target is the Rockchip media pipeline. Element names vary by image,
so the planner accepts `mpph264enc`, `rkv4l2h264enc`, or `v4l2h264enc`.

## Current Implementation Marker

`GET /api/cameras/media-pipeline` reports the desired backend per camera role.
It returns:

- selected backend: `gstreamer_hardware` or the current `python_aiortc` fallback
- required tools and GStreamer elements
- the planned GStreamer pipeline stage
- why the system is falling back

This endpoint is the migration boundary: UI and runtime code should depend on this
capability state instead of hardcoding transport decisions across components.

## Migration Rules

- Prefer lowering preview FPS over lowering capture resolution.
- Keep full-resolution still capture available even when live preview is reduced.
- Do not duplicate camera opens for normal operation.
- Do not burn overlays into the long-term live-video stream.
- Keep Python aiortc as a fallback, not as the final 4K preview foundation.
2 changes: 1 addition & 1 deletion docs/sorter/happy-path-incident-inventory.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ Anything that requires recovery motion, operator judgement, hardware repair, or
| `feeder_detection_unavailable` | C2/C3/C4 | Feeder camera detections are unavailable past the grace window. | Operator restores detection or clears the incident. |
| `distribution_chute_jam` | Distribution | Chute/servo motion exceeds the move-time budget. | Operator clears the chute/servo path and clears the incident. |
| `distribution_servo_bus_offline` | Distribution | Every configured distribution layer servo is offline. | Operator restores the bus; incident clears when a servo reports healthy or can be manually cleared. |
| `distribution_no_bin_available` | Distribution | No matching bin/capacity is available for the piece. | Operator assigns capacity, frees a bin, or disables the incident to allow bottom-tray passthrough. |
| `distribution_no_bin_available` | Distribution | No matching bin/capacity is available for the piece. | Operator assigns capacity, frees a bin, or clears the incident to approve one-shot bottom-tray passthrough for that piece. |
| `classification_unresolved` | C4 | C4 reaches the drop deadline or Brickognize timeout before the piece is resolved. | Operator reviews the fallback-to-unknown and clears the incident. |
| `classification_multi_drop_collision` | C4 | Multiple pieces reach the C4 drop window together. | Operator inspects the collision/fallback and clears the incident. |
| `classification_intake_request_timeout` | C4 | C4 requested a piece from C3, but no intake track arrived before timeout. | Operator checks the C3→C4 handoff and clears the incident to retry. |
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
"""add teacher jobs

Revision ID: a9b0c1d2e3f4
Revises: f8b9c0d1e2f3
Create Date: 2026-05-20 22:30:00.000000

"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql


revision: str = "a9b0c1d2e3f4"
down_revision: Union[str, None] = "f8b9c0d1e2f3"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
op.create_table(
"teacher_jobs",
sa.Column("id", sa.UUID(), nullable=False),
sa.Column("owner_id", sa.UUID(), nullable=False),
sa.Column("status", sa.String(), nullable=False, server_default="pending"),
sa.Column("openrouter_model", sa.String(), nullable=False),
sa.Column("filter_json", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
sa.Column("total", sa.Integer(), nullable=False, server_default="0"),
sa.Column("processed", sa.Integer(), nullable=False, server_default="0"),
sa.Column("succeeded", sa.Integer(), nullable=False, server_default="0"),
sa.Column("failed", sa.Integer(), nullable=False, server_default="0"),
sa.Column("last_error", sa.String(), nullable=True),
sa.Column("created_at", sa.DateTime(timezone=True), nullable=False),
sa.Column("started_at", sa.DateTime(timezone=True), nullable=True),
sa.Column("finished_at", sa.DateTime(timezone=True), nullable=True),
sa.ForeignKeyConstraint(["owner_id"], ["users.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
sa.CheckConstraint(
"status IN ('pending', 'running', 'done', 'cancelled')",
name="ck_teacher_jobs_status",
),
)
op.create_index("ix_teacher_jobs_owner_id", "teacher_jobs", ["owner_id"], unique=False)
op.create_index("ix_teacher_jobs_status", "teacher_jobs", ["status"], unique=False)

op.create_table(
"teacher_job_items",
sa.Column("id", sa.UUID(), nullable=False),
sa.Column("job_id", sa.UUID(), nullable=False),
sa.Column("sample_id", sa.UUID(), nullable=False),
sa.Column("status", sa.String(), nullable=False, server_default="queued"),
sa.Column("error_message", sa.String(), nullable=True),
sa.Column("detection_count", sa.Integer(), nullable=True),
sa.Column("detection_score", sa.String(), nullable=True),
sa.Column("created_at", sa.DateTime(timezone=True), nullable=False),
sa.Column("processed_at", sa.DateTime(timezone=True), nullable=True),
sa.ForeignKeyConstraint(["job_id"], ["teacher_jobs.id"], ondelete="CASCADE"),
sa.ForeignKeyConstraint(["sample_id"], ["samples.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
sa.CheckConstraint(
"status IN ('queued', 'running', 'done', 'error', 'skipped')",
name="ck_teacher_job_items_status",
),
)
op.create_index(
"ix_teacher_job_items_job_id_status",
"teacher_job_items",
["job_id", "status"],
unique=False,
)
op.create_index(
"ix_teacher_job_items_status",
"teacher_job_items",
["status"],
unique=False,
)


def downgrade() -> None:
op.drop_index("ix_teacher_job_items_status", table_name="teacher_job_items")
op.drop_index("ix_teacher_job_items_job_id_status", table_name="teacher_job_items")
op.drop_table("teacher_job_items")
op.drop_index("ix_teacher_jobs_status", table_name="teacher_jobs")
op.drop_index("ix_teacher_jobs_owner_id", table_name="teacher_jobs")
op.drop_table("teacher_jobs")
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
"""add teacher job cost columns

Revision ID: b0c1d2e3f4a5
Revises: a9b0c1d2e3f4
Create Date: 2026-05-20 23:30:00.000000

"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


revision: str = "b0c1d2e3f4a5"
down_revision: Union[str, None] = "a9b0c1d2e3f4"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
op.add_column(
"teacher_jobs",
sa.Column("cost_usd", sa.Float(), nullable=False, server_default="0"),
)
op.add_column(
"teacher_jobs",
sa.Column("tokens_input", sa.BigInteger(), nullable=False, server_default="0"),
)
op.add_column(
"teacher_jobs",
sa.Column("tokens_output", sa.BigInteger(), nullable=False, server_default="0"),
)
op.add_column("teacher_job_items", sa.Column("cost_usd", sa.Float(), nullable=True))
op.add_column("teacher_job_items", sa.Column("tokens_input", sa.Integer(), nullable=True))
op.add_column("teacher_job_items", sa.Column("tokens_output", sa.Integer(), nullable=True))


def downgrade() -> None:
op.drop_column("teacher_job_items", "tokens_output")
op.drop_column("teacher_job_items", "tokens_input")
op.drop_column("teacher_job_items", "cost_usd")
op.drop_column("teacher_jobs", "tokens_output")
op.drop_column("teacher_jobs", "tokens_input")
op.drop_column("teacher_jobs", "cost_usd")
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
"""add user perceptron api key

Revision ID: c1d2e3f4a5b6
Revises: b0c1d2e3f4a5
Create Date: 2026-05-21 00:30:00.000000

"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


revision: str = "c1d2e3f4a5b6"
down_revision: Union[str, None] = "b0c1d2e3f4a5"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
op.add_column(
"users",
sa.Column("perceptron_api_key_encrypted", sa.String(), nullable=True),
)


def downgrade() -> None:
op.drop_column("users", "perceptron_api_key_encrypted")
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
"""add preferred teacher model

Revision ID: d2e3f4a5b6c7
Revises: c1d2e3f4a5b6
Create Date: 2026-05-21 00:50:00.000000

"""

from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


revision: str = "d2e3f4a5b6c7"
down_revision: Union[str, None] = "c1d2e3f4a5b6"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
op.add_column(
"users",
sa.Column("preferred_teacher_model", sa.String(), nullable=True),
)


def downgrade() -> None:
op.drop_column("users", "preferred_teacher_model")
6 changes: 6 additions & 0 deletions software/hive/backend/app/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,12 @@ class Settings(BaseSettings):
PROFILE_CATALOG_AUTO_SYNC_COLORS_MAX_AGE_HOURS: int = 24
PROFILE_CATALOG_AUTO_SYNC_CATEGORIES_MAX_AGE_HOURS: int = 168
OPENROUTER_BASE_URL: str = "https://openrouter.ai/api/v1"
# Perceptron exposes an OpenAI-compatible /v1/chat/completions endpoint.
PERCEPTRON_BASE_URL: str = "https://api.perceptron.inc/v1"
# Hard cap on in-flight teacher items across all providers combined. Per-provider
# concurrency is also capped (see adapter.max_concurrent) so a single noisy provider
# can't starve the pool — this is just the upper bound on overall thread count.
TEACHER_WORKER_PARALLELISM: int = 6
DEFAULT_AI_MODEL: str = "anthropic/claude-sonnet-4.6"
PROFILE_AI_PROMPT_CACHE_ENABLED: bool = True
PROFILE_AI_PROMPT_CACHE_TTL: str | None = None
Expand Down
12 changes: 11 additions & 1 deletion software/hive/backend/app/database.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,5 +3,15 @@

from app.config import settings

engine = create_engine(settings.DATABASE_URL, pool_pre_ping=True)
# Default pool (5 + 10 overflow = 15) is too tight once the teacher worker pool runs
# alongside normal web traffic. The worker briefly holds a session per item-claim + one
# per result-write across ``TEACHER_WORKER_PARALLELISM`` threads, plus FastAPI handlers
# grab their own. Raising the ceiling avoids the QueuePool timeouts we hit in prod.
engine = create_engine(
settings.DATABASE_URL,
pool_pre_ping=True,
pool_size=20,
max_overflow=20,
pool_recycle=1800,
)
SessionLocal = sessionmaker(bind=engine, autoflush=False, expire_on_commit=False)
5 changes: 5 additions & 0 deletions software/hive/backend/app/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,9 +20,11 @@
samples,
sets,
stats,
teacher,
upload,
)
from app.services.profile_catalog import get_existing_profile_catalog_service, get_profile_catalog_service
from app.services.teacher_worker import get_teacher_worker

limiter = Limiter(key_func=get_remote_address)

Expand All @@ -31,9 +33,11 @@
async def lifespan(_app: FastAPI):
if settings.PROFILE_CATALOG_AUTO_SYNC_ENABLED and settings.REBRICKABLE_API_KEY:
get_profile_catalog_service().start_auto_sync_loop()
get_teacher_worker().start()
try:
yield
finally:
get_teacher_worker().stop()
service = get_existing_profile_catalog_service()
if service is not None:
service.stop_auto_sync_loop()
Expand Down Expand Up @@ -66,6 +70,7 @@ async def lifespan(_app: FastAPI):
app.include_router(models_router.router)
app.include_router(machine_models.router)
app.include_router(api_keys.router)
app.include_router(teacher.router)


@app.get("/api/health")
Expand Down
1 change: 1 addition & 0 deletions software/hive/backend/app/models/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,3 +24,4 @@ class Base(DeclarativeBase):
from app.models.machine_set_progress import MachineSetProgress # noqa: E402, F401
from app.models.detection_model import DetectionModel, DetectionModelVariant # noqa: E402, F401
from app.models.user_api_key import UserApiKey # noqa: E402, F401
from app.models.teacher_job import TeacherJob, TeacherJobItem # noqa: E402, F401
Loading