A MNIST-like fashion product database. Benchmark 👇
OpenMMLab Pose Estimation Toolbox and Benchmark.
Benchmarks of approximate nearest neighbor libraries in Python
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
A series of large language models developed by Baichuan Intelligent Technology
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus, and leaderboard
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) on 100+ datasets.
Python package for the evaluation of odometry and SLAM
A 13B large language model developed by Baichuan Intelligent Technology
A unified evaluation framework for large language models
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
Reference implementations of MLPerf™ training benchmarks
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
A machine learning toolkit for log parsing [ICSE'19, DSN'16]
📊 Benchmark multiple object trackers (MOT) in Python
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Efficient Retrieval Augmentation and Generation Framework
Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
py.test fixture for benchmarking code
Reference implementations of MLPerf™ inference benchmarks