Open
45 commits
bc915db
readme updates
Dec 11, 2024
e290211
fix links
Dec 12, 2024
72b9b9e
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 8, 2025
3a20480
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 17, 2025
f58154e
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 22, 2025
9b9c5c0
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 24, 2025
ce1b2fe
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 29, 2025
b2a8376
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 29, 2025
645eefa
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jan 31, 2025
89daef5
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Feb 11, 2025
d9988df
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Feb 27, 2025
12dffdf
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Mar 1, 2025
8fa0931
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Mar 14, 2025
536b6f7
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 2, 2025
10b0aba
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 4, 2025
f9c2a48
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 4, 2025
e2947ba
allow sys msg as part of model config
Apr 4, 2025
57504cb
merge
Apr 7, 2025
f676e6e
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 8, 2025
9efef0d
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 9, 2025
269d296
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 9, 2025
fd84d48
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 9, 2025
81504c2
pull
Apr 9, 2025
3a1cd28
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Apr 15, 2025
f758195
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 13, 2025
3d53322
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 14, 2025
559d495
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 22, 2025
4f5e487
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 22, 2025
f71926e
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 22, 2025
80a533a
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
May 23, 2025
7804577
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jun 12, 2025
12b9984
docstr pipeline
Jun 25, 2025
aef86b1
prompt tmplt for docstring
Jun 25, 2025
f0ea3db
add/improve docstring
Jun 25, 2025
0093a1f
minor readme updates
Jun 26, 2025
60db9e4
minor readme updates
Jun 26, 2025
7778902
updates to doc strs
Jul 8, 2025
4936033
fix import
Jul 8, 2025
e8d347f
rm file
Jul 8, 2025
01d21fe
Merge branch 'main' of https://github.com/microsoft/eureka-ml-insights
Jul 8, 2025
26d0cc3
formatting
Jul 8, 2025
4786eb1
rm deprecated type
Jul 8, 2025
a3b7325
rm commented code
Jul 8, 2025
4e9598b
revert user configs
Jul 8, 2025
8ab63a1
eureka_ml_insights/user_configs/doc_str.py
Jul 8, 2025
14 changes: 7 additions & 7 deletions README.md
@@ -3,18 +3,18 @@
<a href='https://arxiv.org/abs/2409.10566'><img src=https://img.shields.io/badge/arXiv-2409.10566-b31b1b.svg></a>
<a href='https://arxiv.org/pdf/2504.00294'><img src=https://img.shields.io/badge/arXiv-2504.00294-b31b1b.svg></a>
<a href='https://huggingface.co/datasets/microsoft/Eureka-Bench-Logs/tree/main'><img src=https://huggingface.co/front/assets/huggingface_logo-noborder.svg width="16">Eureka Evaluation Logs</a>
-<a href='https://microsoft.github.io/eureka-ml-insights'><img src=docs/figures/github.png width="16">Project Website</a>
+<a href='https://microsoft.github.io/eureka-ml-insights'><img src=readme_docs/figures/github.png width="16">Project Website</a>
</p>

This repository contains the code for the Eureka ML Insights framework. The framework is designed to help researchers and practitioners run reproducible evaluations of generative models using a variety of benchmarks and metrics efficiently. The framework allows the user to define custom pipelines for data processing, inference, and evaluation, and provides a set of pre-defined evaluation pipelines for key benchmarks.

## 📰 News

-- **[2025/5/20]**: We have uploaded logs from all experiment reported in our papers on [HuggingFace](https://huggingface.co/datasets/microsoft/Eureka-Bench-Logs/tree/main).
-- **[2025/4/29]**: New blog post out [Eureka Inference-Time Scaling Insights: Where We Stand and What Lies Ahead](https://www.microsoft.com/en-us/research/articles/eureka-inference-time-scaling-insights-where-we-stand-and-what-lies-ahead/)
-- **[2025/3/31]**: We have a new paper out [Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead](https://arxiv.org/abs/2504.00294)
-- **[2024/9/17]**: New blog post out [Eureka: Evaluating and understanding progress in AI](https://aka.ms/eureka-ml-insights-blog)
-- **[2024/9/17]**: New paper out [Eureka: Evaluating and Understanding Large Foundation Models](https://arxiv.org/abs/2409.10566)
+- **[2025/5/20]**: <img src=https://huggingface.co/front/assets/huggingface_logo-noborder.svg width="16"> We have uploaded logs from all experiment reported in our papers on [HuggingFace](https://huggingface.co/datasets/microsoft/Eureka-Bench-Logs/tree/main).
+- **[2025/4/29]**: <img src=readme_docs/figures/msr_blog.png width="16"> New blog post out [Eureka Inference-Time Scaling Insights: Where We Stand and What Lies Ahead](https://www.microsoft.com/en-us/research/articles/eureka-inference-time-scaling-insights-where-we-stand-and-what-lies-ahead/)
+- **[2025/3/31]**: <img src=readme_docs/figures/arxiv_logo.svg width="16"> We have a new technical report out [Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead](https://arxiv.org/abs/2504.00294)
+- **[2024/9/17]**: <img src=readme_docs/figures/msr_blog.png width="16"> New blog post out [Eureka: Evaluating and understanding progress in AI](https://aka.ms/eureka-ml-insights-blog)
+- **[2024/9/17]**: <img src=readme_docs/figures/arxiv_logo.svg width="16"> New technical report out [Eureka: Evaluating and Understanding Large Foundation Models](https://arxiv.org/abs/2409.10566)
## Table of Contents
- [Eureka ML Insights Framework](#eureka-ml-insights-framework)
- [📰 News](#-news)
@@ -98,7 +98,7 @@ The results of the experiment will be saved in a directory under `logs/FlenQA_Ex
For other available experiment pipelines and model configurations, see the `eureka_ml_insights/user_configs` and `eureka_ml_insights/configs` directories, respectively. In [model_configs.py](eureka_ml_insights/configs/model_configs.py) you can configure the model classes to use your API keys, Key Vault urls, endpoints, and other model-specific configurations.

## 🗺️ Overview of Experiment Pipelines
-![Components](./docs/figures/transparent_uml.png)
+![Components](./readme_docs/figures/transparent_uml.png)
Experiment pipelines define the sequence of components that are run to process data, run inference, and evaluate the model outputs. You can find examples of experiment pipeline configurations in the `user_configs` directory. To create a new experiment configuration, you need to define a class that inherits from `ExperimentConfig` and implements the `configure_pipeline` method. In the `configure_pipeline` method you define the Pipeline config (arrangement of Components) for your Experiment. Once your class is ready, add it to `user_configs/__init__.py` import list.
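
The subclass-and-override pattern described above can be sketched as follows. This is a minimal illustration, not the framework's code: `ExperimentConfig`, `PipelineConfig`, and `ComponentConfig` here are simplified stand-ins defined locally so the example runs on its own, and their fields (`name`, `output_dir`, `log_dir`) and the `MyExperiment` class are hypothetical. The real classes in `eureka_ml_insights` likely take different arguments, so check the examples in `user_configs` before adapting this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComponentConfig:
    """Hypothetical stand-in for one pipeline step (e.g. data processing, inference, evaluation)."""
    name: str
    output_dir: str


@dataclass
class PipelineConfig:
    """Hypothetical stand-in: an ordered arrangement of components."""
    components: List[ComponentConfig] = field(default_factory=list)


class ExperimentConfig:
    """Hypothetical stand-in for the base class that experiment configs inherit from."""

    def __init__(self, exp_logdir: str = "logs"):
        self.log_dir = exp_logdir
        # The base class asks the subclass for its pipeline definition.
        self.pipeline_config = self.configure_pipeline()

    def configure_pipeline(self) -> PipelineConfig:
        raise NotImplementedError("Subclasses must implement configure_pipeline")


class MyExperiment(ExperimentConfig):
    """Example experiment: three components run in sequence."""

    def configure_pipeline(self) -> PipelineConfig:
        steps = ["data_processing", "inference", "evaluation"]
        return PipelineConfig(
            components=[
                ComponentConfig(name=step, output_dir=f"{self.log_dir}/{step}")
                for step in steps
            ]
        )


if __name__ == "__main__":
    exp = MyExperiment(exp_logdir="logs/MyExperiment")
    for component in exp.pipeline_config.components:
        print(component.name, "->", component.output_dir)
```

In the actual repository, the final step would be to import the new class in `user_configs/__init__.py`, as the README states.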


Binary file removed docs/figures/Benchmarks.png
Binary file not shown.
37 changes: 0 additions & 37 deletions docs/figures/huggingface_logo-noborder.svg

This file was deleted.
