The only open-source verification layer for production AI.
Welcome to QWED - The Enterprise Deterministic Verification Engine.
This repository contains the Client SDKs and Benchmarks to help you integrate and independently verify QWED's capabilities.
We believe in transparent AI verification. While our core verification engines remain proprietary, we're open-sourcing:
- ✅ Client SDKs - Integrate QWED into any application
- ✅ Benchmarks - Independently verify our accuracy claims
- ✅ Examples - See real-world use cases
Philosophy: "Trust, but verify." We give you the tools to test our claims yourself.
Sign up at qwed.tech to get your API key.
- Python: `pip install qwed`
- JavaScript/TypeScript: `npm install qwed-sdk`
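Rather than hardcoding the key in source, it can be read from the environment. A minimal sketch, assuming an environment variable named `QWED_API_KEY` (a hypothetical name chosen for this example; nothing here implies the SDK reads it automatically) and a hypothetical helper `load_api_key`:

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None, var: str = "QWED_API_KEY") -> str:
    """Return the API key stored in `var`, or raise if it is missing or blank.

    QWED_API_KEY is an assumed variable name, not one documented by QWED.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} to the API key from qwed.tech")
    return key


# Usage (requires the qwed package to be installed):
# from qwed import QwedClient
# client = QwedClient(api_key=load_api_key())
```

Failing early on a missing key keeps the error close to its cause instead of surfacing later as a rejected request.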
```python
from qwed import QwedClient

client = QwedClient(api_key="YOUR_KEY")
result = client.verify_natural_language("What is 10 + 10?")
print(result.final_answer)  # 20.0
print(result.status)  # "VERIFIED"
```

```bash
pip install qwed
```

```python
# Verify code security
code_result = client.verify_code("print('Hello World')")

# Verify logic constraints
logic_result = client.verify_logic("x > 5 AND x < 10")

# Verify statistical claims (csv_data is supplied by your application)
stats_result = client.verify_stats(csv_data, "Average sales increased by 15%")
```

```bash
npm install @qwed/sdk
```

```javascript
import { QwedClient } from '@qwed/sdk';

const client = new QwedClient("sk_live_...");
const result = await client.verifyNaturalLanguage("What is 15% of $200?");
console.log(result.finalAnswer); // 30.0
```

```python
from qwed import QwedClient

client = QwedClient(api_key="YOUR_KEY")

# User queries AI for loan payment
query = "What's the monthly payment for a $500k loan at 3.5% over 30 years?"

# QWED verifies the calculation
result = client.verify_natural_language(query)

if result.status == "VERIFIED":
    print(f"Payment: ${result.final_answer}/month")
    print(f"Proof: {result.verification}")
else:
    print("ERROR: LLM hallucinated")
```

```python
query = "Calculate pediatric dosage: patient weighs 25kg, adult dose 500mg"
result = client.verify_natural_language(query)

# QWED uses symbolic math to ensure correctness.
# approve_dosage and flag_for_human_review are defined by your application.
if result.status == "VERIFIED":
    approve_dosage(result.final_answer)
else:
    flag_for_human_review()
```

```javascript
const code = "os.remove('important.txt')";
const result = await client.verifyCode(code);

// executeCode is defined by your application
if (result.status === "UNSAFE") {
  console.log("Blocked dangerous code:", result.issues);
} else {
  executeCode(code);
}
```

We tested QWED against raw LLM outputs (GPT-4, Claude) on 1000+ queries.
| Test Category | Raw LLM Accuracy | QWED Accuracy | Improvement (percentage points) |
|---|---|---|---|
| Math | 83% | 99.2% | +16.2 |
| Security | 0% (all exploits missed) | 100% (all caught) | +100 |
| Logic | 67% | 98.5% | +31.5 |
| Overall | 75% | 99.2% | +24.2 |
- 1000+ queries across 8 domains (Math, Logic, Security, Stats, Facts, SQL, Reasoning, Image)
- Edge cases included (division by zero, ambiguous queries, code injection)
- Reproducible (run `python benchmarks/api_runner.py` yourself)
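The Improvement column in the table above is the difference, in percentage points, between the two accuracy columns. A minimal sketch that recomputes it from the published figures (the names `RESULTS` and `improvement_pp` are illustrative and are not taken from `benchmarks/api_runner.py`):

```python
# Accuracy figures copied from the benchmark table (percent).
# Each entry maps a category to (raw LLM accuracy, QWED accuracy).
RESULTS = {
    "Math": (83.0, 99.2),
    "Security": (0.0, 100.0),
    "Logic": (67.0, 98.5),
    "Overall": (75.0, 99.2),
}


def improvement_pp(raw: float, verified: float) -> float:
    """Return the accuracy gain in percentage points, rounded to one decimal."""
    return round(verified - raw, 1)


IMPROVEMENTS = {
    name: improvement_pp(raw, qwed) for name, (raw, qwed) in RESULTS.items()
}

for name, gain in IMPROVEMENTS.items():
    print(f"{name}: +{gain} pp")
```

Running the benchmark script yourself and substituting your own measured accuracies into `RESULTS` gives a direct comparison against these numbers.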
We welcome contributions to improve the SDKs and benchmarks!
- Additional language SDKs (Go, Ruby, Java)
- More benchmark test cases
- Documentation improvements
- Example applications
1. Fork this repo
2. Create a feature branch (`git checkout -b feature/amazing`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to branch (`git push origin feature/amazing`)
5. Open a Pull Request
Be kind. We're all learning.
SDKs: MIT License (free to use, modify, distribute)
QWED Core Engines: Proprietary (contact rahul@qwed.tech for licensing)
Benchmarks: CC0 (public domain - use freely)
Built with ❤️ in Pune by Rahul Dass
Questions? Email rahul@qwed.tech