OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python · 4,915 stars · 523 forks · Updated Mar 13, 2025

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

22,058 stars · 1,809 forks · Updated Mar 4, 2025

idavidrein / gpqa

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Jupyter Notebook · 306 stars · 23 forks · Updated Sep 30, 2024

kaistAI / FLASK

[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

Python · 214 stars · 18 forks · Updated Dec 24, 2023

tatsu-lab / alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook · 1,681 stars · 261 forks · Updated Dec 27, 2024

WooooDyy / LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

7,315 stars · 430 forks · Updated Jul 28, 2024

bleedline / aimoneyhunter

A collection of ways to make money with AI side hustles, showing how to use AI for side projects and earn extra income. The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English versi…

14,401 stars · 1,319 forks · Updated Dec 21, 2024

evalplus / evalplus

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

Python · 1,395 stars · 141 forks · Updated Jan 6, 2025

OpenBMB / UltraEval

[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.

Python · 233 stars · 21 forks · Updated Oct 30, 2024

Jack-Cherish / PythonPark

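A Python open-source project, "The Road to Self-Taught Programming": step-by-step tutorials covering an AI lab, curated videos, data structures, study guides, hands-on machine learning, hands-on deep learning, web crawlers, interview experiences from major tech companies, programmer life, and resource sharing.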

Python · 9,960 stars · 1,630 forks · Updated Nov 26, 2024

MoreAgentsIsAllYouNeed / AgentForest

We present the first systematic study on the scaling property of raw agents instantiated by LLMs. We find that performance scales with the increase in the number of agents, using the simple(st) way…

Python · 111 stars · 13 forks · Updated Oct 8, 2024

QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python · 17,400 stars · 1,442 forks · Updated Feb 25, 2025

CS-BAOYAN / CSSummerCamp2023

Python · 1,714 stars · 187 forks · Updated Aug 21, 2023

v2ray / v2ray-core

A platform for building proxies to bypass network restrictions.

Go · 45,825 stars · 8,943 forks · Updated Jan 21, 2025

liuwei881 / Introduction_to_Algorithms_result

Solutions to Introduction to Algorithms, Third Edition (collected from other Git repositories, for my own study and reference)

HTML · 42 stars · 20 forks · Updated Jan 3, 2019

moyi moyi-qwq

Stars

zou-group / textgrad

SakanaAI / CycleQD

confident-ai / deepeval

EleutherAI / lm-evaluation-harness

open-compass / opencompass