feat(skainet-cli): swap LLaMA/Qwen branch to DSL path by michalharakal · Pull Request #125 · SKaiNET-developers/SKaiNET-transformers

michalharakal · 2026-05-04T18:06:59Z

Phase 5b consumer migration. Mirrors #121, #122, #123. After this merge, no top-level CLI in this repo constructs LlamaRuntime for the GGUF path (only kllama-cli's BIN fallback still does).

What changes

skainet-cli previously routed Gemma + Apertus through DSL but kept LLaMA / Qwen / Mistral on the legacy LlamaRuntime + CpuAttentionBackend + LlamaWeightMapper + MemSegWeightConverter chain. This PR collapses the else branch:

DecoderGgufWeightLoader(NATIVE_OPTIMIZED, family.architectures + [arch]) → DecoderGgufMemSegConverter.convert → per-family network loader → OptimizedLLMRuntime DIRECT mode.
DSL-side family dispatch: ModelFamily.QWEN → QwenNetworkLoader.fromWeights (NEOX RoPE + QK-norm); else → LlamaNetworkLoader.fromWeights.
This CLI previously handled Qwen via the LlamaRuntime-with-detected-flags hybrid that the kllama CLI also used before #121 (feat(kllama-cli): swap Qwen branch to DSL path (Phase 4)); this PR applies the same architectural collapse here.
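
As a rough illustration of the family dispatch described above, here is a minimal Kotlin sketch. It is not the code from this PR: the enum below is a hypothetical stand-in (only `ModelFamily.QWEN` is named in the description; the other members are assumed), and the function returns loader names as strings instead of calling the real `QwenNetworkLoader.fromWeights` / `LlamaNetworkLoader.fromWeights`, whose signatures are not shown here.

```kotlin
// Hypothetical stand-in for the repo's ModelFamily. Only QWEN is named in the
// PR text; LLAMA and MISTRAL are assumed members used for illustration.
enum class ModelFamily { LLAMA, QWEN, MISTRAL }

// Returns the name of the network loader a family would be routed to on the
// DSL path. Qwen gets its own loader (NEOX RoPE + QK-norm); everything else
// reaching this branch falls through to the LLaMA loader.
fun networkLoaderFor(family: ModelFamily): String = when (family) {
    ModelFamily.QWEN -> "QwenNetworkLoader.fromWeights"
    else -> "LlamaNetworkLoader.fromWeights"
}
```

Under these assumptions, `networkLoaderFor(ModelFamily.MISTRAL)` resolves to the LLaMA loader, matching the "else" arm in the description.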

Imports + deps cleaned

Removed: CpuAttentionBackend, LlamaRuntime, LlamaWeightMapper, MemSegWeightConverter.
Added: :llm-inference:qwen dep (was missing — the legacy hybrid-Qwen path didn't need an explicit dep on the Qwen module).
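
For orientation, the added dependency might look roughly like the fragment below. This is a sketch under assumptions: the PR only names the `:llm-inference:qwen` module, so the Gradle Kotlin DSL syntax, the `implementation` configuration, and the top-level `dependencies` block are all guesses about the build layout.

```kotlin
// Build script for skainet-cli (assumed to use the Gradle Kotlin DSL).
dependencies {
    // Assumption: a plain `implementation` project dependency; the real build
    // may declare it inside a Kotlin Multiplatform source set instead.
    implementation(project(":llm-inference:qwen"))
}
```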

Test plan

:llm-apps:skainet-cli:build, :llm-runtime:kllama:jvmTest, :llm-inference:qwen:jvmTest, :llm-inference:llama:jvmTest — all pass.
CI green on PR.
Manual (post-merge): skainet-cli with a real Qwen3 / Llama / Mistral GGUF; verify coherent output.

Numerical equivalence with the legacy path on identical weights is pinned by QwenDslLegacyParityTest (#120).

🤖 Generated with Claude Code

Phase 5b consumer migration. Mirrors PR #122 (kllama CLI) and #123 (KLlamaJava facade). After this merge, no top-level CLI in this repo constructs `LlamaRuntime` for the GGUF path.

`skainet-cli` previously routed Gemma + Apertus through DSL but kept LLaMA / Qwen / Mistral on the legacy `LlamaRuntime` + `CpuAttentionBackend` + `LlamaWeightMapper` + `MemSegWeightConverter` chain. This PR collapses the else branch onto the DSL path:

- `DecoderGgufWeightLoader(NATIVE_OPTIMIZED, family.architectures + [arch])` → `DecoderGgufMemSegConverter.convert` → per-family network loader → `OptimizedLLMRuntime` DIRECT mode.
- Family dispatch on the DSL side: `ModelFamily.QWEN` → `QwenNetworkLoader.fromWeights` (NEOX RoPE + QK-norm), else → `LlamaNetworkLoader.fromWeights`.

Previously this CLI handled Qwen via the `LlamaRuntime`-with-detected-flags hybrid that the kllama CLI also used pre-#121; the same architectural collapse applies here.

Imports cleaned: removed `CpuAttentionBackend`, `LlamaRuntime`, `LlamaWeightMapper`, `MemSegWeightConverter`. Added `:llm-inference:qwen` to the build.gradle dependencies (it was missing because the legacy hybrid-Qwen path didn't need it).

Numerical equivalence with the legacy path on identical weights is pinned by `QwenDslLegacyParityTest` (#120).

Tests pass: `:llm-apps:skainet-cli:build`, `:llm-runtime:kllama:jvmTest`, `:llm-inference:qwen:jvmTest`, `:llm-inference:llama:jvmTest`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

michalharakal merged commit b6ab9e4 into develop May 4, 2026
2 checks passed

michalharakal deleted the feat/skainet-cli-dsl-swap branch May 4, 2026 18:14
