spiceai · lukekim · Feb 10, 2025
diff --git a/website/docs/components/models/index.md b/website/docs/components/models/index.md
@@ -25,7 +25,9 @@ Spice supports various model providers for traditional machine learning (ML) mod
 [ant]: ./anthropic.md
 [xai]: ./xai.md
 
-- LLM Format(s) may require additional files (e.g. `tokenizer_config.json`).
+Spice also tests and evaluates common models and grades their ability to integrate with Spice. See the [Models Grade Report](./report.md).
+
+\*LLM Format(s) may require additional files (e.g., `tokenizer_config.json`).
 
 The model type is inferred based on the model source and files. For more detail, refer to the `model` [reference specification](/docs/reference/spicepod/models.md).
 
@@ -44,7 +46,7 @@ For more details, refer to the [Large Language Models documentation](/docs/featu
 
 ## Model Examples
 
-The following examples demonstrate how to configure and use various models or model features with Spice. Each example provides a specific use case to help you understand the configuration options available.
+The following examples demonstrate how to configure and use various models or model features with Spice. Each example covers a specific use case to illustrate the available configuration options.
 
 ### Example: Configuring an OpenAI Model
 

diff --git a/website/docs/components/models/report.md b/website/docs/components/models/report.md
@@ -0,0 +1,24 @@
+---
+title: 'Models Grade Report'
+description: 'Evaluation report for large language models (LLMs) graded by Spice AI'
+sidebar_label: 'Report'
+sidebar_position: 4
+---
+
+This report presents Spice AI's evaluation grades for various large language models (LLMs). Models are assessed on their basic capabilities, the quality of their tool calls, and the accuracy of their output when integrated with Spice.
+
+For more details on how model grades are evaluated in Spice, refer to the [model grading criteria](https://github.com/spiceai/spiceai/blob/f6039123028209e20469b342791fa85d52b7771e/docs/criteria/models/grading.md).
+
+| Model                                           | Spice Grade | Model Provider | Context Window | Max Output Tokens | Chat Completion | Response Format | Tools | Recursive Tool Call | Reasoning | Streaming Response | Evaluation Date | Spice Version |
+| ----------------------------------------------- | ----------- | -------------- | -------------- | ----------------- | --------------- | --------------- | ----- | ------------------- | --------- | ------------------ | --------------- | ------------- |
+| `o3-mini-2025-01-31 (Reasoning effort: high)`   | A           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2025-01-31      | v1.0.2        |
+| `o3-mini-2025-01-31 (Reasoning effort: medium)` | B           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2025-01-31      | v1.0.2        |
+| `o3-mini-2025-01-31 (Reasoning effort: low)`    | C           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2025-01-31      | v1.0.2        |
+| `o1-2024-12-17 (Reasoning effort: high)`        | A           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2024-12-17      | v1.0.2        |
+| `o1-2024-12-17 (Reasoning effort: medium)`      | A           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2024-12-17      | v1.0.2        |
+| `o1-2024-12-17 (Reasoning effort: low)`         | C           | `openai`       | 200k tokens    | 100k tokens       | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2024-12-17      | v1.0.2        |
+| `gpt-4o-2024-08-06`                             | B           | `openai`       | 128k tokens    | 16384 tokens      | ✅              | ✅              | ✅    | ✅                  | ✅        | ✅                 | 2024-08-06      | v1.0.2        |
+| `claude-3-5-sonnet-20241022`                    | C           | `anthropic`    | 200k tokens    | 8192 tokens       | ✅              | ❌              | ✅    | ✅                  | ✅        | ✅                 | 2024-10-22      | v1.0.2        |
+| `grok-2-1212`                                   | Ungraded    | `xai`          | 128k tokens    | Not Available     | ✅              | ❌              | ❌    | ❌                  | ✅        | ✅                 | Not Available   | v1.0.2        |
+| `deepseek-ai/DeepSeek-R1-Distill-Llama-8B`      | Ungraded    | `huggingface`  | 128k tokens    | Not Available     | ✅              | ❌              | ❌    | ❌                  | ✅        | ✅                 | Not Available   | v1.0.2        |
+| `meta-llama/Llama-3.2-3B-Instruct`              | Ungraded    | `huggingface`  | 128k tokens    | Not Available     | ✅              | ❌              | ✅    | ✅                  | ✅        | ✅                 | Not Available   | v1.0.2        |