Wire QwenNetworkLoader into CLI for proper Qwen3 inference #46

## Context

The CLI (`Main.kt`) always routes GGUF models through `LlamaIngestion` → `LlamaRuntime`, which works for Llama-architecture models. Qwen3 models load successfully because they use the same tensor names, but they produce garbage output because `LlamaRuntime` doesn't handle these Qwen3-specific features:

- **QK-norm** (query/key normalization via `attn_q_norm.weight` / `attn_k_norm.weight`)
- **RoPE base frequency** (1,000,000 vs Llama's 10,000)
- **BOS token** differences

The correct loader (`QwenNetworkLoader` in `:llm-inference:qwen`) already exists but isn't wired into the CLI.
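
To illustrate why the first two differences matter numerically, here is a minimal standalone sketch. It is not code from `LlamaRuntime` or `QwenNetworkLoader`; the function names `ropeInvFreq` and `rmsNorm`, the head dimension of 128, and the epsilon value are assumptions chosen for illustration. It also assumes QK-norm is an RMSNorm applied per attention head to the query and key vectors, scaled by the `attn_q_norm.weight` / `attn_k_norm.weight` tensors.

```kotlin
import kotlin.math.pow
import kotlin.math.sqrt

// Inverse rotary frequencies: invFreq[i] = base^(-2i / headDim) for i in 0 until headDim / 2.
// Using base 10,000 for a model trained with base 1,000,000 rotates every position by the wrong angle.
fun ropeInvFreq(headDim: Int, base: Double): DoubleArray =
    DoubleArray(headDim / 2) { i -> base.pow(-2.0 * i / headDim) }

// RMSNorm over a single head vector, scaled elementwise by a learned weight.
// Assumption: this is how QK-norm is applied to each query/key head.
fun rmsNorm(x: FloatArray, weight: FloatArray, eps: Float = 1e-6f): FloatArray {
    require(x.size == weight.size) { "weight must match head dimension" }
    var sumSq = 0.0f
    for (v in x) sumSq += v * v
    val scale = 1.0f / sqrt(sumSq / x.size + eps)
    return FloatArray(x.size) { i -> x[i] * scale * weight[i] }
}

fun main() {
    val headDim = 128 // illustrative value, not read from a model file
    val llamaFreq = ropeInvFreq(headDim, 10_000.0)
    val qwenFreq = ropeInvFreq(headDim, 1_000_000.0)
    println("lowest frequency, base 1e4: ${llamaFreq.last()}")
    println("lowest frequency, base 1e6: ${qwenFreq.last()}")

    // A runtime that skips QK-norm would feed the un-normalized vector into attention.
    val q = floatArrayOf(3f, 4f)
    val normed = rmsNorm(q, floatArrayOf(1f, 1f))
    println("q before norm: ${q.toList()}, after norm: ${normed.toList()}")
}
```

Either mismatch alone changes the attention scores at every layer, which is consistent with the model loading cleanly and still producing incoherent text.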

## Scope

- Add `:llm-inference:qwen` dependency to `:llm-runtime:kllama`
- Detect `qwen*` architecture from GGUF metadata in `Main.kt`
- Route to `QwenNetworkLoader.fromGguf()` for Qwen models
- Wire the Qwen DSL network module into a runtime compatible with `AgentLoop`
- Validate end-to-end with `Qwen3-1.7B-Q8_0.gguf --demo`
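
The detection and routing steps above could be sketched roughly as follows. This is a hedged illustration, not the actual `Main.kt` code: `ModelFamily`, `detectFamily`, and `describeRoute` are hypothetical names, and the metadata is modeled as a plain map rather than whatever type the project's GGUF reader returns. The only name taken from outside this issue is the standard GGUF metadata key `general.architecture`.

```kotlin
// Hypothetical type for illustration; the real CLI may model this differently.
enum class ModelFamily { LLAMA, QWEN }

// Reads the standard GGUF key "general.architecture" and matches the qwen* prefix.
// Anything else falls back to the existing Llama path so current behavior is unchanged.
fun detectFamily(metadata: Map<String, Any?>): ModelFamily {
    val arch = (metadata["general.architecture"] as? String)?.lowercase() ?: ""
    return if (arch.startsWith("qwen")) ModelFamily.QWEN else ModelFamily.LLAMA
}

// Stand-in for the real dispatch: returns a description of the path that would be taken.
fun describeRoute(family: ModelFamily): String = when (family) {
    ModelFamily.QWEN -> "QwenNetworkLoader.fromGguf()"
    ModelFamily.LLAMA -> "LlamaIngestion -> LlamaRuntime"
}

fun main() {
    for (arch in listOf("llama", "qwen3", "qwen2")) {
        val family = detectFamily(mapOf("general.architecture" to arch))
        println("$arch -> ${describeRoute(family)}")
    }
}
```

Defaulting to the Llama path when the key is missing or unrecognized keeps the third acceptance criterion (Llama models unchanged) easy to reason about.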

## Related

- Parent: #35 (Generalize tool-calling support)
- Depends on the architecture detection added in the #35 commits

## Acceptance Criteria

- [ ] `Qwen3-1.7B-Q8_0.gguf --demo` produces coherent output
- [ ] Tool calling works through the Qwen chat template
- [ ] Llama models continue working unchanged
