QA Report: Qwen2.5-Coder-1.5B-Instruct Qualified

# Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct

**Date:** 2026-01-30
**Qualified By:** apr-model-qa-playbook v0.1.0
**Model:** Qwen/Qwen2.5-Coder-1.5B-Instruct
**Format:** GGUF Q4_K_M (1.04 GB)

## Summary

| Metric | Value |
|--------|-------|
| **MQS Score** | 200/1000 |
| **Grade** | F (Partial - QUAL only) |
| **Gateways** | 4/4 PASSED |
| **Tests** | 50/50 PASSED (100%) |
| **Duration** | 195.4s |

## Gateway Status

| Gateway | Status | Description |
|---------|--------|-------------|
| G1-LOAD | ✅ PASS | Model loads successfully |
| G2-INFER | ✅ PASS | Basic inference works |
| G3-STABLE | ✅ PASS | No crashes or panics |
| G4-VALID | ✅ PASS | Output is not garbage |

## Performance Metrics

| Metric | Value |
|--------|-------|
| Tokens/second | 5.9 - 21.2 tok/s |
| Generation time (32 tokens) | ~1.5s |
| Total latency (incl. load) | ~3.8s |
| Backend | CPU |

## Test Matrix

- **Modalities:** run, chat
- **Backends:** cpu
- **Formats:** gguf
- **Scenarios per combination:** 25

## Artifacts

- `evidence.json` - Full test evidence (50 entries)
- `junit.xml` - JUnit XML for CI integration
- `mqs.json` - Machine-readable MQS score
- `report.html` - Interactive HTML dashboard

## Methodology

Tests follow the **Popperian Falsification** protocol:
- Each test is a falsifiable hypothesis
- Outcome: `Corroborated` (survived refutation) or `Falsified` (refuted)
- All 50 hypotheses were corroborated

## Recommendations

1. **Production Ready:** Yes, for CPU inference
2. **Performance:** Acceptable (5.9+ tok/s on CPU)
3. **Stability:** No crashes observed in 50 tests

## Next Steps

- [ ] Run full qualification (1800 tests) for comprehensive coverage
- [ ] Add GPU backend testing
- [ ] Test additional quantizations (Q5_K_M, Q8_0)

---
*Generated by apr-model-qa-playbook*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QA Report: Qwen2.5-Coder-1.5B-Instruct Qualified #171

Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct

Summary

Gateway Status

Performance Metrics

Test Matrix

Artifacts

Methodology

Recommendations

Next Steps

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Metric	Value
MQS Score	200/1000
Grade	F (Partial - QUAL only)
Gateways	4/4 PASSED
Tests	50/50 PASSED (100%)
Duration	195.4s

Gateway	Status	Description
G1-LOAD	✅ PASS	Model loads successfully
G2-INFER	✅ PASS	Basic inference works
G3-STABLE	✅ PASS	No crashes or panics
G4-VALID	✅ PASS	Output is not garbage

Metric	Value
Tokens/second	5.9 - 21.2 tok/s
Generation time (32 tokens)	~1.5s
Total latency (incl. load)	~3.8s
Backend	CPU

QA Report: Qwen2.5-Coder-1.5B-Instruct Qualified #171

Description

Model Qualification Report: Qwen2.5-Coder-1.5B-Instruct

Summary

Gateway Status

Performance Metrics

Test Matrix

Artifacts

Methodology

Recommendations

Next Steps

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions