Open
Labels
enhancement (New feature or request)
Description
Overview
Implement GPU performance benchmarks for the APR format, per Section Y.2 of the spec.
Requirement
- Y7: APR decode speed must be ≥200 tok/s on GPU (RTX 4090 reference)
- Must match or exceed GGUF decode speed on same hardware
Falsification Condition
APR decodes at < 200 tok/s while GGUF decodes at ≥ 200 tok/s on the same GPU.
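The requirement and the falsification condition could be encoded as a small check, sketched below. Every name here (`Y7Verdict`, `evaluate_y7`, `Y7_MIN_TOK_PER_SEC`) is hypothetical and not part of realizar. The `BelowGguf` and `Inconclusive` cases are an interpretation of the two requirement bullets, not something the spec text quoted above spells out.

```rust
/// Minimum decode throughput required by Y7, in tokens per second.
const Y7_MIN_TOK_PER_SEC: f64 = 200.0;

/// Outcome of comparing APR and GGUF decode throughput on the same GPU.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum Y7Verdict {
    /// APR is at or above 200 tok/s and at least as fast as GGUF.
    Pass,
    /// The falsification condition: APR < 200 tok/s while GGUF >= 200 tok/s.
    Falsified,
    /// APR is at or above 200 tok/s but slower than GGUF (assumed to violate
    /// the "match or exceed GGUF" requirement).
    BelowGguf,
    /// Both formats are below 200 tok/s, so the falsification condition
    /// does not apply (for example, the GPU is slower than the reference).
    Inconclusive,
}

/// Classifies a pair of measurements taken on the same hardware.
fn evaluate_y7(apr_tok_s: f64, gguf_tok_s: f64) -> Y7Verdict {
    if apr_tok_s < Y7_MIN_TOK_PER_SEC && gguf_tok_s >= Y7_MIN_TOK_PER_SEC {
        Y7Verdict::Falsified
    } else if apr_tok_s < Y7_MIN_TOK_PER_SEC {
        Y7Verdict::Inconclusive
    } else if apr_tok_s < gguf_tok_s {
        Y7Verdict::BelowGguf
    } else {
        Y7Verdict::Pass
    }
}
```

Keeping `Falsified` separate from `BelowGguf` lets a report distinguish the spec's explicit falsification condition from the stricter parity requirement.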
Implementation Tasks
- Add CUDA feature flag to realizar
- Implement GPU kernels for APR inference
- Create benchmark harness for GPU performance
- Verify parity with GGUF on RTX 4090 or equivalent
- Add to CI with GPU runner (optional)
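A minimal sketch of what the benchmark harness task might look like, using only the Rust standard library. `DecodeStats` and `bench_decode` are hypothetical names, and the `decode_step` closure stands in for whatever APR or GGUF decode call realizar ends up exposing. It returns the number of tokens produced by that step. Because CUDA kernel launches are asynchronous, the closure is assumed to block until the GPU work has finished. Otherwise the timer measures launch overhead instead of decode time.

```rust
use std::time::{Duration, Instant};

/// Token count and wall-clock time for the measured iterations.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy)]
struct DecodeStats {
    tokens: usize,
    elapsed: Duration,
}

impl DecodeStats {
    /// Throughput in tokens per second.
    fn tok_per_sec(&self) -> f64 {
        self.tokens as f64 / self.elapsed.as_secs_f64()
    }
}

/// Runs `decode_step` for `warmup` untimed iterations, then `measured`
/// timed iterations, summing the tokens each timed call reports.
///
/// Assumption: `decode_step` returns only after its GPU work has completed.
fn bench_decode<F>(warmup: usize, measured: usize, mut decode_step: F) -> DecodeStats
where
    F: FnMut() -> usize,
{
    // Warm-up iterations absorb one-time costs such as kernel loading
    // and allocator growth.
    for _ in 0..warmup {
        decode_step();
    }

    let start = Instant::now();
    let mut tokens = 0usize;
    for _ in 0..measured {
        tokens += decode_step();
    }

    DecodeStats {
        tokens,
        elapsed: start.elapsed(),
    }
}
```

Running the same harness once with an APR decode closure and once with a GGUF decode closure on the same GPU would yield the two tok/s figures needed for the parity check.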
Blocked By
- Requires GPU hardware for development and testing
References
- Spec: docs/specifications/apr-whisper-and-cookbook-support-eoy-2025.md, Section Y.2
- Related: Y6 (CPU benchmarks), ✅ verified at 206.4 tok/s
Priority
P2 - Deferred (no GPU hardware available currently)