QA Report: Qwen2.5-Coder-1.5B-Instruct Qualified #171

@noahgift

Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct

Date: 2026-01-30
Qualified By: apr-model-qa-playbook v0.1.0
Model: Qwen/Qwen2.5-Coder-1.5B-Instruct
Format: GGUF Q4_K_M (1.04 GB)

Summary

| Metric | Value |
| --- | --- |
| MQS Score | 200/1000 |
| Grade | F (Partial - QUAL only) |
| Gateways | 4/4 PASSED |
| Tests | 50/50 PASSED (100%) |
| Duration | 195.4 s |

Gateway Status

| Gateway | Status | Description |
| --- | --- | --- |
| G1-LOAD | ✅ PASS | Model loads successfully |
| G2-INFER | ✅ PASS | Basic inference works |
| G3-STABLE | ✅ PASS | No crashes or panics |
| G4-VALID | ✅ PASS | Output is not garbage |

Performance Metrics

| Metric | Value |
| --- | --- |
| Throughput | 5.9 - 21.2 tok/s |
| Generation time (32 tokens) | ~1.5 s |
| Total latency (incl. load) | ~3.8 s |
| Backend | CPU |
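
As a sanity check, the timing figures above can be related to the reported throughput range. The sketch below is illustrative arithmetic only: the function and constant names are hypothetical, the inputs are the approximate values from the table, and treating the gap between total latency and generation time as load and startup overhead is an assumption.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Approximate values copied from the Performance Metrics table.
GENERATED_TOKENS = 32
GENERATION_SECONDS = 1.5  # "~1.5 s" in the report
TOTAL_SECONDS = 3.8       # "~3.8 s", includes model load

# Rate while generating, excluding load time.
decode_rate = tokens_per_second(GENERATED_TOKENS, GENERATION_SECONDS)

# Rate as seen by a caller who also pays the load cost.
end_to_end_rate = tokens_per_second(GENERATED_TOKENS, TOTAL_SECONDS)

# Assumption: whatever is not generation time is load/startup overhead.
overhead_seconds = TOTAL_SECONDS - GENERATION_SECONDS

print(f"decode rate:     {decode_rate:.1f} tok/s")
print(f"end-to-end rate: {end_to_end_rate:.1f} tok/s")
print(f"overhead:        {overhead_seconds:.1f} s")
```

With these rounded inputs the decode rate comes out near the top of the reported 5.9 - 21.2 tok/s range, while the end-to-end rate for a single short request is considerably lower because load time dominates.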

Test Matrix

  • Modalities: run, chat
  • Backends: cpu
  • Formats: gguf
  • Scenarios per combination: 25
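
The total of 50 tests follows from the matrix above: 2 modalities × 1 backend × 1 format × 25 scenarios. A minimal sketch of that expansion is shown here; the variable names are hypothetical and this is not taken from the playbook's own code.

```python
from itertools import product

# Matrix dimensions as listed in the report.
MODALITIES = ["run", "chat"]
BACKENDS = ["cpu"]
FORMATS = ["gguf"]
SCENARIOS_PER_COMBINATION = 25

# Every (modality, backend, format) combination gets the same number of scenarios.
combinations = list(product(MODALITIES, BACKENDS, FORMATS))
total_tests = len(combinations) * SCENARIOS_PER_COMBINATION

for modality, backend, fmt in combinations:
    print(f"{modality}/{backend}/{fmt}: {SCENARIOS_PER_COMBINATION} scenarios")
print(f"total tests: {total_tests}")
```

Adding a backend or format multiplies the number of combinations, which is why the proposed GPU and quantization coverage grows the test count quickly.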

Artifacts

  • evidence.json - Full test evidence (50 entries)
  • junit.xml - JUnit XML for CI integration
  • mqs.json - Machine-readable MQS score
  • report.html - Interactive HTML dashboard
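
For CI integration, a JUnit XML file can be summarized with the Python standard library. The sketch below assumes the common `testsuite`/`testcase` layout with `failure` and `error` child elements; the exact structure of this report's junit.xml is not shown here, and the sample document is invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the common JUnit XML layout (not taken from the report).
SAMPLE_JUNIT = """<testsuites>
  <testsuite name="qualification" tests="3" failures="1" errors="0">
    <testcase name="h1"/>
    <testcase name="h2"><failure message="falsified"/></testcase>
    <testcase name="h3"/>
  </testsuite>
</testsuites>"""


def summarize_junit(xml_text: str) -> dict:
    """Count total, failed, and passed test cases in a JUnit XML string."""
    root = ET.fromstring(xml_text)
    total = 0
    failed = 0
    # iter() walks the whole tree, so this works whether the root is
    # <testsuites> or a single <testsuite>.
    for case in root.iter("testcase"):
        total += 1
        if case.find("failure") is not None or case.find("error") is not None:
            failed += 1
    return {"total": total, "failed": failed, "passed": total - failed}


summary = summarize_junit(SAMPLE_JUNIT)
print(summary)
```

To use it on the real artifact, read the file contents (for example with `pathlib.Path("junit.xml").read_text()`) and pass them to `summarize_junit`.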

Methodology

Tests follow the Popperian Falsification protocol:

  • Each test is a falsifiable hypothesis
  • Outcome: Corroborated (survived refutation) or Falsified (refuted)
  • All 50 hypotheses were corroborated
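
The protocol above can be pictured as a small data model: each test carries a claim and a refutation attempt, and the outcome is one of two values. This is a minimal sketch, not the playbook's implementation; the class names, the `attempt_refutation` callback, and the choice to treat a raised exception as a refutation are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    CORROBORATED = "corroborated"  # survived the refutation attempt
    FALSIFIED = "falsified"        # the refutation attempt succeeded


@dataclass
class Hypothesis:
    claim: str
    # Returns True if the claim was refuted, False if it survived.
    attempt_refutation: Callable[[], bool]


def evaluate(hypothesis: Hypothesis) -> Outcome:
    """Run one refutation attempt and classify the result."""
    try:
        refuted = hypothesis.attempt_refutation()
    except Exception:
        # Assumption: a crash during the attempt counts as refutation.
        refuted = True
    return Outcome.FALSIFIED if refuted else Outcome.CORROBORATED


# Hypothetical example: the claim is refuted if the output is empty.
fake_output = "def add(a, b): return a + b"
non_empty = Hypothesis(
    claim="model output is non-empty",
    attempt_refutation=lambda: len(fake_output.strip()) == 0,
)
print(non_empty.claim, "->", evaluate(non_empty).value)
```

Framing each test as an attempt to refute a claim means a passing result is reported as "corroborated" rather than "proven": it survived this attempt, nothing more.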

Recommendations

  1. Production Ready: Yes, for CPU inference, within the scope of the 50 QUAL tests run (the MQS grade is marked Partial - QUAL only)
  2. Performance: Acceptable (5.9 to 21.2 tok/s observed on CPU)
  3. Stability: No crashes observed in 50 tests

Next Steps

  • Run full qualification (1800 tests) for comprehensive coverage
  • Add GPU backend testing
  • Test additional quantizations (Q5_K_M, Q8_0)

Generated by apr-model-qa-playbook
