LLM Benchmark Experiments

A collection of benchmark experiments for evaluating different aspects of Large Language Models (LLMs).

Benchmark Projects

Evaluates code generation capabilities by prompting models to create Python raytracers. Includes visual comparisons and consistency tests across multiple models.

Analyzes the reasoning patterns and chain-of-thought processes of various LLMs by examining statistical patterns in their thinking traces.
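As a rough illustration of this kind of analysis (a minimal sketch, not the repository's actual script; the directory layout and word list are assumptions), one could tally word frequencies across saved thinking traces and compare how often different models use characteristic reasoning markers:

```python
# Hypothetical sketch: count word frequencies in saved thinking traces
# (one plain-text trace per .txt file) to compare models statistically.
from collections import Counter
from pathlib import Path
import re

def trace_word_counts(trace_dir: str) -> Counter:
    """Count word occurrences across all .txt thinking traces in a directory."""
    counts = Counter()
    for path in Path(trace_dir).glob("*.txt"):
        words = re.findall(r"[a-z']+", path.read_text(encoding="utf-8").lower())
        counts.update(words)
    return counts

# Example: compare how often two models use typical reasoning markers.
# for word in ("wait", "alternatively", "hmm"):
#     print(word, trace_word_counts("traces/model_a")[word])
```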

Tests LLM consistency and uniqueness across various topic domains by analyzing generation statistics for prompts that ask for a single-word response. Helps identify model-specific response patterns that can serve as "fingerprints" for different LLMs.
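The basic idea can be sketched as follows (a hypothetical example, assuming a placeholder `ask_model` callable for whatever LLM API is used; this is not the repository's actual code):

```python
# Hypothetical sketch: build a response "fingerprint" by sampling a
# single-word answer many times and recording the answer distribution.
from collections import Counter

def fingerprint(ask_model, prompt: str, n: int = 50) -> Counter:
    """Sample the model n times and tally the normalized single-word replies."""
    counts = Counter()
    for _ in range(n):
        reply = ask_model(prompt).strip().lower().rstrip(".")
        counts.update([reply])
    return counts

# Example usage with a dummy model:
# fp = fingerprint(lambda p: "blue", "Name one color. Answer with a single word.")
# print(fp.most_common(3))
```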

Evaluation Projects

A vibe coding project to test the capabilities of the mystery model "Optimus Alpha". It implements real-time path tracing for realistic lighting and shadows.

License

This project is licensed under CC0 1.0 Universal.
