Chat dispatcher ignores body `model` field — always routes to `primary` slot

## Summary

`POST /v1/chat/completions` resolves every request through the legacy
slot resolver and dispatches to the `primary` slot, regardless of what
`model` the request body specifies.

Observed in production logs (2026-05-28):

```
2026-05-28T06:51:24 [info] dispatch.decision [hal0-dispatch]
    cache_state=legacy
    latency_ms=0.157
    model=qwen3-coder-reap-25b-a3b-q5km     ← caller asked for agent-hermes' model
    resolution_path=legacy_slot:primary     ← but we sent it to primary
    upstream=primary
```

The agent-hermes slot was loaded with `qwen3-coder-reap-25b-a3b-q5km`
on port 8002.  The primary slot was loaded with the 40b coder on 8001.
A chat request asking for the 25b model was still forwarded to primary.

## Root cause

`Dispatcher.dispatch()` (`src/hal0/dispatcher/router.py`) reaches Step 4
(legacy heuristics) because:

1. The model isn't in the upstream registry (Lemonade-loaded models don't auto-register).
2. No upstream's cached `/v1/models` advertises it (or the cache is cold).
3. `resolve_slot()` in `dispatcher/proxy.py` matches by path, not by
   model name, and `/v1/chat/completions` always resolves to `primary`.

So the dropdown in the WebUI suggesting "talk to agent-hermes" is
effectively cosmetic for chat requests — they all land on `primary`.

## Impact

- Users can't route chat to specific slots by model name.
- Multi-slot setups (primary + agent-hermes) effectively share the
  primary slot for all `/v1/chat/completions` traffic.
- Will compound once we expose more chat-capable slots (NPU, FLM, etc.).

## Proposed direction (not in scope of this issue — defer)

Either:
- Auto-register Lemonade-loaded models into the model registry on slot transition to READY, so Step 1 finds them; OR
- Make `resolve_slot()` consult slot manifests' `[model] default` + `models` lists when the path is `/v1/chat/completions`.

Deferred from a debug session on 2026-05-28 where we fixed the
swap-window 503 race; see related branch `fix/swap-window-503`.

## Related

- ADR-0006 (Lemonade migration) — registry/catalog drift was noted but not closed
- Memory [[hal0_lemonade_hf_cache_gotchas]] — model catalog surfaces

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chat dispatcher ignores body `model` field — always routes to `primary` slot #377

Summary

Root cause

Impact

Proposed direction (not in scope of this issue — defer)

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Chat dispatcher ignores body model field — always routes to primary slot #377

Description

Summary

Root cause

Impact

Proposed direction (not in scope of this issue — defer)

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Chat dispatcher ignores body `model` field — always routes to `primary` slot #377