#llm-testing

Here is 1 public repository matching this topic...

A production-grade platform for evaluating and comparing the performance of large language models (LLMs) from providers such as OpenAI, Anthropic, and Google (PaLM). It features real-time analytics, hallucination detection, and cost-performance benchmarking on standardized datasets (e.g., GSM8K).

  • Updated Sep 11, 2025
  • TypeScript
