Local Python tool for evaluating LLM-generated QA findings with manual validation and false-positive analysis
Topics: python, qa, model-comparison, manual-review, ai-qa, prompt-testing, openrouter, llm-evaluation, technical-qa
Updated May 16, 2026 - Python