evaluation-framework

Here are 2 public repositories matching this topic...

symflower / eval-dev-quality

DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.

evaluation software-development software-quality evaluation-framework llms

Updated May 15, 2025
Go

The DataSnack AI Agent Evaluator is a CLI tool that automates the testing of AI agents by generating test prompts, creating test documents, and evaluating basic functionality, consistency, and vulnerability to data leakage and prompt injection attacks.

ai evaluation-framework ai-agents data-leakage prompt-injection

Updated Sep 25, 2025
Go

Improve this page

Add a description, image, and links to the evaluation-framework topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the evaluation-framework topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

evaluation-framework

Here are 2 public repositories matching this topic...

symflower / eval-dev-quality

brianjmarvin / datasnack-ai

Improve this page

Add this topic to your repo