A high-performance, blazingly-fast evaluation platform for Large Language Models, built with enterprise-grade architecture and real-time capabilities. This platform enables systematic assessment of LLM performance through comprehensive test suites, sophisticated prompt management, and detailed analytics.
LLM Tournament addresses the critical challenge of evaluating and comparing language model performance at scale. Built with a focus on reliability and real-time processing, it provides a robust framework for managing complex evaluation workflows while maintaining high performance and data integrity.
Key technical highlights:
- Lightweight and blazingly fast: pure Go templates with no bloat, shipped as a single binary
- Real-time evaluation engine powered by WebSocket
- Horizontally scalable architecture with stateless components
- Efficient data persistence layer with JSON-based storage
- Responsive frontend built on modern web standards
- 🔑 Key Features
- 🛠️ Stack
- 🖼️ UI
- 🏃 Run
- 🛠️ Develop
- 🤝 Contribute
- 📝 TODO/Roadmap
- 🏆 Badges
- 👥 Contributors
- 📜 License
- 📞 Contact
- Real-time Evaluation Engine: WebSocket-powered instant updates for results and metrics
- Modular Test Suites: Independent prompt and model configurations for different scenarios
- Comprehensive Data Management: JSON-based storage with CSV import/export capabilities
- Full Lifecycle Control: Create, edit, delete, and reorder prompts
- Rich Content Support: Markdown formatting and multiline input
- Advanced Filtering: Search by text, filter by profile and order
- Bulk Operations: Delete multiple prompts at once
- Solution Tracking: Attach reference solutions to each prompt
- Profile Association: Tag prompts with evaluation profiles
- Performance Tracking: Pass/fail results with detailed metrics
- Real-time Analytics: Scores and pass percentages updated instantly
- Flexible Filtering: View results by model or profile
- Data Portability: Import/export results in CSV format
- Evaluation Management: Reset or refresh results as needed
- Prompt Suites: Create and switch between different prompt sets
- Model Suites: Manage different model configurations
- Profile System: Define and manage evaluation profiles
- Data Integrity: Automatic backups and version control
- Responsive UI: Modern interface optimized for all devices
- Bulk Operations: Manage multiple items simultaneously
- Template System: Reuse configurations across evaluations
- Data Migration: Easy import/export of prompts and results
- Real-time Sync: Instant updates across all connected clients
- Tech: Go, WebSockets, Go's built-in templates, HTML, CSS, JS, and a JSON file as the database.
- Assistant: Aider with
  - free/unlimited APIs: Gemini 2.0 Advanced, Gemini 2.0 Flash, Codestral 2501, and Mistral Large Latest
  - the paid deepseek-3-chat API since v1.1
Run `make run` or the release binary `./release/llm-tournament-v1.0`, then go to http://localhost:8080.
Requires a Linux environment with Python and Go installed (preferably via Homebrew).
Run `make aiderupdate`, then copy `./.aider.conf.yml.example` to `./.aider.conf.yml` and fill in your own API key.
Anyone can submit a PR and we'll discuss it there.
- Make another prompt suite for vision LLMs.
- Add model search.
- Add model reordering.
- Add a RAG and web-search agentic system under `./tools/ragweb_agent/`.
- Update the features section to cover these tools.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, suggestions, or collaboration/job inquiries, feel free to reach out at cariyaputta@gmail.com.