Hive teacher backfill, Perceptron integration, sorter incident hardening, model publish polish #136
Merged
- train publish --dataset-dir auto-fills the structured training_metadata the Hive frontend expects (model.best_metrics, dataset.selection, benchmarks, variant_sizes_bytes) from build.json + track_*_results.json
- Add train compose-metadata as a standalone preview command
- train build --max-empty-fraction caps empties as a share of the final dataset so --keep-empty doesn't pull in all negatives
- ModelTrainingReport adapts to available data: hide HOLDOUT F1 / Decision Match / audit / precheck / count-spectrum sections when those fields are missing, and add an Inference Performance section + training-setup chips
serve_model_variant set Content-Length=<file_size> on the response, which
is wrong when the storage backend returns a 307 redirect to a presigned
S3 URL: the redirect body is empty and starlette aborts the connection
with 'Response content shorter than Content-Length'. Move the file size
out to an X-Model-Size header and let the framework set Content-Length.
Also drop the download={file_name} attribute on the variant link so the
server's Content-Disposition (built by build_download_filename, format
{slug}_v{version}_{date}_{runtime}.ext) is what the browser saves as.
Mirror build_download_filename in the SvelteKit page so the dropdown
shows the same {slug}_v{version}_{date}_{runtime}{ext} the browser will
save, instead of the raw on-disk name like best.onnx. Also fix the
backend to preserve .tar.gz as a compound suffix (Path.suffix alone
returned just .gz for ncnn bundles).
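The compound-suffix fix can be illustrated with a minimal sketch. Only the `{slug}_v{version}_{date}_{runtime}{ext}` format and the `.gz` behaviour of `Path.suffix` come from the text above; the function names, the argument list, and the `COMPOUND_SUFFIXES` tuple are assumptions made for illustration, not the real `build_download_filename`.

```python
from pathlib import Path

# Suffixes that must be kept whole. This tuple is an assumption for the sketch.
COMPOUND_SUFFIXES = (".tar.gz",)


def split_extension(file_name: str) -> str:
    """Return the extension, keeping known compound suffixes intact."""
    lowered = file_name.lower()
    for compound in COMPOUND_SUFFIXES:
        if lowered.endswith(compound):
            return compound
    # Path.suffix only returns the last component, e.g. ".gz" for "x.tar.gz".
    return Path(file_name).suffix


def build_download_filename_sketch(slug, version, date, runtime, file_name):
    """Hypothetical stand-in that follows the documented name format."""
    ext = split_extension(file_name)
    return f"{slug}_v{version}_{date}_{runtime}{ext}"
```

With this shape, an on-disk `best.onnx` keeps `.onnx` and an ncnn bundle keeps the full `.tar.gz`.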
- /api/machine/models filtered out is_public=False, so a freshly published private model couldn't appear in the Sorter's 'Browse Hive' tab. Widen the filter to also include models owned by the same account the machine is registered under (public for everyone else, private for the owner).
- Apply the same filter to the per-model GET and variant download routes so the detail/download paths stay consistent with the list.
- Sorter UI: relabel the Browse Hive selector from 'Target / <machine name>' to 'Hive / <url>'. The dropdown identifies which Hive instance we're pulling from, not the machine name we registered as.
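The widened visibility rule can be sketched as a single predicate shared by the list, detail, and download paths. This is an in-memory illustration only: the `Model` dataclass, the `owner_id` field, and both function names are hypothetical, and the real routes presumably express the same condition in their database query.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    id: str
    is_public: bool
    owner_id: str  # assumed field name for the publishing account


def visible_to_machine(model: Model, machine_owner_id: Optional[str]) -> bool:
    """Public models are visible to everyone; private ones only to their owner."""
    if model.is_public:
        return True
    return machine_owner_id is not None and model.owner_id == machine_owner_id


def list_models_for_machine(models, machine_owner_id):
    """Apply the same predicate the detail and download routes would use."""
    return [m for m in models if visible_to_machine(m, machine_owner_id)]
```

Routing all three paths through one predicate is what keeps a model that shows up in the list from returning a 404 on detail or download.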
The Sorter previously forced you to pick one configured Hive at a time and then browsed that one in isolation. Merge the view: when no target_id is supplied the backend now fans out across every enabled target, tags each item with its source (target_id / target_url / target_name), and returns one combined list sorted by published_at. Per-target failures land in a non-fatal `errors` array so one unreachable Hive doesn't blank the catalog. Each model row in the UI now shows the source host (e.g. `hive.basically.website`) instead of hiding it behind a selector, and the per-row Download / detail calls use the model's own target_id rather than a global selectedTargetId.
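A minimal sketch of the merged view, assuming each target is a dict with `id`, `url`, `name`, and `enabled` keys and that `fetch_models` is a caller-supplied function (all hypothetical). The loop is sequential for clarity, and sorting newest-first is an assumption, since the text only says the list is sorted by published_at.

```python
def browse_all_targets(targets, fetch_models):
    """Merge model lists from every enabled target, collecting failures."""
    items, errors = [], []
    for target in targets:
        if not target.get("enabled", True):
            continue
        try:
            models = fetch_models(target)
        except Exception as exc:
            # A failing Hive is recorded, not raised, so the others still show.
            errors.append({"target_id": target["id"], "error": str(exc)})
            continue
        for model in models:
            # Tag each item with its source so per-row actions know where to go.
            items.append({
                **model,
                "target_id": target["id"],
                "target_url": target["url"],
                "target_name": target["name"],
            })
    # ISO-8601 timestamps in one timezone sort correctly as strings.
    items.sort(key=lambda m: m["published_at"], reverse=True)
    return {"items": items, "errors": errors}
```

Because every item carries its own `target_id`, the Download and detail calls no longer depend on a globally selected target.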
Mirror the Browse Hive view: each installed row now carries its source Hive's host inline in the meta line (e.g. `hive.basically.website`, or `bundled` for repo-shipped models), and the expanded details panel labels the value 'Hive' with the full URL instead of the old machine name. Same helper renders both spots so they stay in sync.
In both Browse Hive and Installed, the model name is now a link to <target_url>/models/<model_id> on the source Hive (new tab, rel=noopener). Bundled models keep the plain text label since they don't live on any Hive.
Documents the concrete workflow we ran for the 2026-05-17 c-channel-combined yolo26s-320 model so the next training run can be reproduced from a single document instead of reconstructing it from the README's high-level diagram. Includes the new --max-empty-fraction, --dataset-dir auto-compose, --benchmark-json, and Vast.ai destroy step, plus the prod deploy command and a note on the still-TBD Hailo preset.
- Reads (list/detail/diversity/assets) drop the owner-restricted query: any signed-in member can browse all samples; writes still gated to owner or reviewer/admin. Add ?scope=mine opt-in for the legacy view.
- Samples list URL gains scope + max_age_hours filter keys via sampleListContext.
- Sample detail sidebar shows machine name + owner avatar/display name and links to that machine's sample list across users.
- Machine list endpoint exposes nested owner block (auto-coerced from the SQLAlchemy relationship via a pydantic before-validator).
- Diversity overview gains a machine_factor: coverage is multiplied by min(distinct_machines, 3) / 3 so a reason captured from one rig only can't read as "done". Trends + ETA fold the same factor in.
- test_samples.py updated to assert the new public-default behaviour and the per-scope filter.
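The machine_factor arithmetic is small enough to show directly. The formula is taken from the bullet above; the function names and the `target` parameter are assumptions for illustration.

```python
def machine_factor(distinct_machines: int, target: int = 3) -> float:
    """min(distinct_machines, target) / target, as described in the commit."""
    return min(distinct_machines, target) / target


def adjusted_coverage(coverage: float, distinct_machines: int) -> float:
    """Scale raw coverage so single-rig reasons cannot reach 100%."""
    return coverage * machine_factor(distinct_machines)
```

A reason with full raw coverage captured on one rig reads as one third covered, on two rigs as two thirds, and only reaches full coverage from three rigs upward.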
- train pull writes machine_id into each manifest entry (taken from the Hive sample detail response).
- _LabeledSample carries machine_id; build.py records per-machine counts in build.json regardless of flags so the diversity audit always shows how skewed the selection is.
- New --balance-machine flag adds machine to the equal-quota balance group key alongside source_role and piece_count, so one rig can't dominate the FPS-sampled selection.
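How the flag changes the grouping can be sketched as follows. The dict-shaped samples and both function names are hypothetical; only the key components (source_role, piece_count, and optionally the machine) and the always-on per-machine counts come from the bullets above.

```python
from collections import Counter


def balance_key(sample: dict, balance_machine: bool) -> tuple:
    """Group key used to hand out equal quotas per group."""
    key = (sample["source_role"], sample["piece_count"])
    if balance_machine:
        # With the flag set, each rig gets its own quota within a group.
        key += (sample["machine_id"],)
    return key


def per_machine_counts(samples) -> dict:
    """Counts recorded regardless of flags, for the diversity audit."""
    return dict(Counter(s["machine_id"] for s in samples))
```

Without the flag, two rigs share one group and the busier rig can fill its whole quota; with it, they land in separate groups.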
…mples
Admin-only flow to re-detect bounding boxes on existing samples through a
pluggable provider adapter. Built around a typed adapter Protocol so each
model gets isolated request/response handling instead of one detector
branching on model id.
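As a rough sketch of the adapter idea, a `typing.Protocol` can describe what every adapter must provide while each class keeps its own parsing. The `Box` type, the `parse_boxes` method, and the example model ids are hypothetical; the real adapters also handle request building, which is omitted here. The YXYX versus XYXY split mirrors the Grok override described in the backend notes.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float


class TeacherAdapter(Protocol):
    """Structural type: any class with these members counts as an adapter."""
    model_id: str

    def parse_boxes(self, raw: list) -> list: ...


class YxyxAdapter:
    model_id = "example-yxyx"  # placeholder id, not a real registry entry

    def parse_boxes(self, raw):
        # Provider returns [y1, x1, y2, x2]; reorder into Box fields.
        return [Box(x1=x1, y1=y1, x2=x2, y2=y2) for y1, x1, y2, x2 in raw]


class XyxyAdapter:
    model_id = "example-xyxy"  # placeholder id, not a real registry entry

    def parse_boxes(self, raw):
        # Provider already returns [x1, y1, x2, y2].
        return [Box(x1=x1, y1=y1, x2=x2, y2=y2) for x1, y1, x2, y2 in raw]


# Adding a model means adding one entry here.
REGISTRY = {a.model_id: a for a in (YxyxAdapter(), XyxyAdapter())}
```

Callers look up `REGISTRY[model_id]` and never branch on the model id themselves.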
Backend:
- TeacherJob + TeacherJobItem models with status_counts aggregation, cost
tracking (real usage from provider + projected total), and a parallel
worker that claims items via SELECT...FOR UPDATE SKIP LOCKED.
- Worker uses a ThreadPoolExecutor (TEACHER_WORKER_PARALLELISM=6) with
per-adapter max_concurrent + min_interval_s caps so a single provider
can't monopolize the pool. 429 responses raise TeacherRateLimitError
with Retry-After parsed, triggering exponential-backoff retry +
jitter.
- Adapter registry: OpenRouterChatAdapter (Gemini 3/3.1/3.5, Qwen 3.6,
Kimi, MiMo, Nemotron), GrokAdapter (overrides bbox parse to XYXY
instead of YXYX), PerceptronAdapter (calls Perceptron's native API
directly at api.perceptron.inc/v1/chat/completions instead of via
OpenRouter, which couldn't reliably trigger grounded mode).
- Per-user secrets via secret_kind: openrouter_api_key and
perceptron_api_key both encrypted; resolver picks the right one per
adapter.
- preferred_teacher_model on the user separates teacher fallback from
the AI chat assistant's preferred_ai_model.
- Endpoints: POST jobs (filter-driven), GET jobs (list + detail),
POST cancel, POST samples/{id}/rerun (sync single-sample bypassing the
worker), POST samples/{id}/preview (non-destructive for the compare
page), GET samples/{id}/prompt, GET models (registry).
- Migrations 0xa9-0xd2 add teacher_jobs + teacher_job_items + cost
columns + the two new user secret/preference columns.
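The 429 handling can be sketched with the standard library alone. The class below is a stand-in that reuses the `TeacherRateLimitError` name from the notes above; its constructor, the helper names, and the base/cap values are assumptions. Only the delta-seconds form of Retry-After is parsed here; the HTTP-date form is ignored.

```python
import random
import time


class TeacherRateLimitError(Exception):
    """Stand-in for the error raised on a 429 response."""

    def __init__(self, retry_after_s=None):
        super().__init__("rate limited")
        self.retry_after_s = retry_after_s


def parse_retry_after(value):
    """Parse a Retry-After header given in seconds; return None otherwise."""
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        return None
    return seconds if seconds >= 0 else None


def backoff_delay(attempt, retry_after_s, base_s=1.0, cap_s=60.0, rng=random.random):
    """Exponential delay, never shorter than Retry-After, plus jitter."""
    exponential = min(cap_s, base_s * (2 ** attempt))
    delay = max(exponential, retry_after_s or 0.0)
    return delay + rng() * base_s


def call_with_retry(call, max_attempts=5, sleep=time.sleep):
    """Retry `call` on rate limits; re-raise after the last attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TeacherRateLimitError as exc:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, exc.retry_after_s))
```

The jitter term spreads out retries so parallel workers that were throttled together do not all wake at the same instant.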
Frontend:
- /samples gains an admin "Re-run teacher" button + Jobs link + a
live-polling status banner that auto-restores on page reload by
fetching the latest in-flight job.
- /samples/[id] gains "Compare models" + "Re-run teacher" buttons in
the header toolbar (real Button primitives, not text links). Sidebar
Machine row links to the owning machine's sample list.
- /admin/teacher-jobs splits Active (big cards with cost + tokens) from
History (compact rows). Each job links to a detail view with the
remaining items section, status counts, and a finished-items tail.
- /samples/[id]/compare runs every supported model on the same sample
side-by-side. Per-tile image + bboxes, per-tile metrics (boxes, score,
cost, latency), "Show raw response" expander for debugging coord
formats, and an editable prompt textarea that overrides the default
for chat-style adapters (Perceptron ignores override since its native
short instruction is what triggers grounded XML output).
- /settings gains a Perceptron API Key card + a Default Teacher Model
select populated from /api/admin/teacher/models.
- Samples list filters: Capture Reason dropped (rarely useful), source
labels renamed (c_channel_2 → C2, classification_channel → C-Channel
4 (Classification)), new Age filter (24h / 7d / 30d / All) wired
through to the backend max_age_hours param.
- New warning-strong + warning-bg color tokens fix the unreadable yellow
text in error banners.
Sorter:
- gemini_sam_detector.py gets the same compact classification_channel
prompt as the Hive copy (lock-step per project_teacher_zones).
The new parallel worker held one SQLAlchemy session per in-flight item for
the entire adapter call — including the multi-second Perceptron/Gemini
round-trip plus 429 backoff sleeps. With 6 workers each pinning a
connection for 5-30s the default pool (5+10) timed out, taking the whole
backend with it ("QueuePool limit of size 5 overflow 10 reached").
Restructure _run_item into three short-lived transactions:
1. _load_item_context: open session, fetch item+job+sample+owner, decrypt
the API key, read the image bytes, close session
2. (no session) throttle + adapter.detect() with backoff retry
3. _write_result: reopen session, re-fetch by id, apply mutations, commit
Plus bump the engine pool to 20+20 with pool_recycle=1800 so a noisy
backfill can't elbow user-facing API requests out of the way.
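The three-phase shape can be sketched without a database by counting open sessions. `SessionTracker` and `run_item` are hypothetical stand-ins for the real session factory and `_run_item`; the point is only that no session is open while the slow `detect` call runs.

```python
from contextlib import contextmanager


class SessionTracker:
    """Fake session factory that records how many sessions are open."""

    def __init__(self):
        self.open_now = 0
        self.events = []

    @contextmanager
    def session(self, label):
        self.open_now += 1
        self.events.append(f"open:{label}")
        try:
            yield self
        finally:
            self.open_now -= 1
            self.events.append(f"close:{label}")


def run_item(tracker, item_id, detect):
    # Phase 1: short transaction that copies everything the call needs.
    with tracker.session("load"):
        context = {"item_id": item_id, "image": b"..."}

    # Phase 2: no session held during the slow provider round-trip.
    result = detect(context)

    # Phase 3: fresh short transaction to persist the outcome.
    with tracker.session("write"):
        return {"item_id": item_id, "result": result}
```

In SQLAlchemy terms, the "20+20" bump presumably corresponds to `pool_size=20, max_overflow=20` alongside `pool_recycle=1800` on `create_engine`, matching the "size 5 overflow 10" wording of the original error.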
At 4k+ items the single-page item dump made the detail view unusable.
Backend gains items_status, items_page, items_page_size query params on
GET /api/admin/teacher/jobs/{id}. status_counts still scans the whole
job so the header badges stay accurate; the items list is paginated
50/page server-side with smart ordering (queued/running oldest-first,
finished states newest-first).
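An in-memory sketch of the ordering and paging rule follows. The real endpoint presumably does this in SQL; the `created_at` / `updated_at` fields, the function names, and the choice to list active items before finished ones in the unfiltered view are assumptions.

```python
from collections import Counter

ACTIVE = {"queued", "running"}


def order_items(items):
    """Active items oldest-first, then finished items newest-first."""
    active = sorted(
        (i for i in items if i["status"] in ACTIVE),
        key=lambda i: i["created_at"],
    )
    finished = sorted(
        (i for i in items if i["status"] not in ACTIVE),
        key=lambda i: i["updated_at"],
        reverse=True,
    )
    return active + finished


def paginate(items, status=None, page=1, page_size=50):
    """Filter by status, order, then slice out one page."""
    filtered = [i for i in items if status is None or i["status"] == status]
    ordered = order_items(filtered)
    start = (max(page, 1) - 1) * page_size
    return {
        "items": ordered[start:start + page_size],
        "total": len(ordered),
        "page": page,
        "page_size": page_size,
    }


def status_counts(items):
    """Always computed over the whole job, independent of the page."""
    return dict(Counter(i["status"] for i in items))
```

Keeping `status_counts` separate from `paginate` is what lets the header badges stay accurate while only 50 rows are returned.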
Frontend replaces the old Remaining + Recently-finished sections with a
single Items section: a filter chip strip (all / queued / running /
done / error / skipped, each with its live count) drives the query;
classic prev/next pagination with first/last/window page links. Filter
+ page are component-local state, not URL-persistent — a refresh always
lands you on page 1 of the default view.
Browsing/filtering by individual machines exposes other users' rig names and owner display names once samples are public-default. Gate the sidebar section behind isAdmin until we've thought through what the per-user view should look like. The detail-page Machine link + URL ?machine_id= filter still work for anyone who already has the id.
"Review Samples" on /samples now forwards the active sidebar filter (scope, machine_id, source_role, capture_reason, max_age_hours) into the URL, and /review reads them back on every loadNext call so the reviewer only sees samples from the slice they picked, the same affordance the Re-run teacher button already had.
- review.py /queue/next gains the same five filter params with identical semantics to /api/samples.
- /samples Review button label becomes "Review filtered" when any filter is active, with a tooltip explaining the scope.
- /review header shows the active filter chips with a Clear link that drops back to the unfiltered queue.
Plain /review (no query string) keeps the original behaviour: any unreviewed/in-review sample, oldest-first.
Reviewers can already filter the queue by sidebar selection; admins now get a small "Re-run teacher" card next to the action pad on /review with a model dropdown + Run button. Clicking it calls the same sync single-sample endpoint /samples/[id] already uses, swaps the sample state with the fresh detection, and clears the local review history so the action pad shows an unreviewed slate (the backend already resets review_status). Compare → link drops to /samples/[id]/compare for the full side-by-side view. Defaults to the admin's preferred_teacher_model when set, otherwise the first registered adapter. Members see no change.
The dropdown now resolves in order: localStorage > user.preferred_teacher_model > first registered. Saved whenever the value changes after initial prefill, so the reviewer's last-used model sticks across reloads and tabs without needing to touch the global preference in Settings.
The annotator action buttons wired event handlers as bare property
references (`onclick={annotatorApi.save}`). The AnnotatorApi class
fields aren't $state, so when SampleAnnotator's $effect remounts a new
sample and reassigns `externalApi.save = saveAnnotations`, the buttons
keep firing the previous closure (or the default no-op stub) because
Svelte captured the reference at render time.
Affected the Review queue most visibly — every accept/reject re-fetches
a new sample so the annotator remounts every turn. Sample-detail page
worked by accident because it usually stays on one sample.
Indirect through arrow functions (`onclick={() => annotatorApi.save()}`)
so the binding resolves at click time. Applied to all action buttons on
both panels for consistency.
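The same stale-reference effect can be reproduced outside Svelte. The sketch below is a Python analogy, not the component code: `AnnotatorApi` here is a toy class whose `save` attribute is reassigned after a handler has been wired up.

```python
class AnnotatorApi:
    def __init__(self):
        # Default no-op stub, replaced when a sample mounts.
        self.save = lambda: "noop"


api = AnnotatorApi()

# Bare reference: captures whatever function object `save` holds right now.
bare_handler = api.save

# Indirect call: looks up `api.save` each time the handler runs.
late_handler = lambda: api.save()

# A new sample mounts and installs its own save function.
api.save = lambda: "saved sample 2"

print(bare_handler())  # still the old stub
print(late_handler())  # the newly assigned function
```

The bare reference keeps calling the stub, while the wrapped call picks up the reassignment, which is the behaviour the arrow-function indirection restores.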
spencerhhubert
approved these changes
May 21, 2026
Summary
16 commits accumulated on `sorthive` since #129 merged. Most of the surface area
is the new Hive teacher backfill stack (queueable Gemini/Perceptron
re-detection on stored samples) plus a handful of Hive UX + sorter reliability
fixes.
Hive
Teacher backfill (the big one)
with isolated request/response handling per provider — adding a model = one
registry entry.
Nemotron), Grok (XYXY bbox parse), Perceptron Mk1 (calls
`api.perceptron.inc/v1/chat/completions` directly; OpenRouter's shim couldn't
reliably trigger grounded mode).
`max_concurrent` + `min_interval_s` caps, 429 → Retry-After + jitter backoff, atomic item claim
via `SELECT … FOR UPDATE SKIP LOCKED`. Sessions held only for the short
load + write transactions so the connection pool doesn't get monopolized by
multi-second adapter calls.
`openrouter_api_key` (existing) + new `perceptron_api_key`, both encrypted. `secret_kind` on each adapter routes
to the right key.
preview (for compare page), model registry, default prompt fetch.
user perceptron_api_key + preferred_teacher_model.
Teacher UI
`/samples` admin "Re-run teacher" button + live-polling status banner that auto-restores on reload by fetching the latest in-flight job.
`/samples/[id]` header gets "Compare models" + "Re-run teacher" (real Button primitives, not text links).
`/admin/teacher-jobs` splits Active (big cards with cost + tokens) from History (compact rows). Detail view paginates items 50/page server-side with
status filter chips (queued / running / done / error / skipped).
`/samples/[id]/compare` runs every supported model on the same sample with per-tile image + bboxes, metrics, cost, latency, raw-response inspector, and
an editable prompt textarea.
`/settings` admin section adds Perceptron API Key + Default Teacher Model selectors (separate from the AI Assistant chat model).
Samples list
owner-or-reviewer/admin gated.
`?scope=mine` opts back into the private view.
to that machine's sample list.
4 (Classification)), Capture Reason filter dropped, new Age filter
(24h/7d/30d/all) wired through to `max_age_hours`.
single-rig reason can't read as "done".
Model publishing + browsing
Sorter
Training
`machine_id` end-to-end through pull → build with `--balance-machine` CLI flag, so dataset composition can balance across rigs (and report
per-machine counts in build.json regardless).
Test plan
`pytest software/hive/backend/tests`: 112/112 green locally
`pnpm --dir software/hive/frontend check`: 0 errors
`hive.basically.website` (sorthive branch): backend + worker stable under 4k-item parallel backfill after the connection-pool
restructure
live status banner, compare page across all 10 registered models, jobs
pagination + status filter