2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/cluster.md
@@ -9,6 +9,8 @@ pagination_next: null

A full cluster-based deployment leveraging **Spice.ai Enterprise**, which includes advanced services and integrations for Kubernetes. This method is ideal for organizations requiring large-scale or complex deployments, including specialized clustering capabilities.

+<img width="740" alt="cluster" src="https://github.com/user-attachments/assets/643e0a5c-6745-40c0-8695-0955c795179b" />

**Benefits**

- Provides **enterprise-grade features**: advanced security, monitoring, and support.
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/hosted.md
@@ -9,6 +9,8 @@ pagination_next: null

The Spice Runtime is deployed on a fully managed service within the Spice Cloud Platform, minimizing the operational burden of managing clusters, upgrades, and infrastructure.

+<img width="740" alt="hosted" src="https://github.com/user-attachments/assets/a985527b-3481-40f4-a689-f784c893b314" />

**Benefits**

- Reduced overhead for deployment, scaling, and maintenance.
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/index.md
@@ -7,6 +7,8 @@ pagination_prev: null
pagination_next: null
---

+<img width="740" alt="Spice ai OSS as a data and AI compute engine over disaggregated storage" src="https://github.com/user-attachments/assets/da3c0e90-4c48-48ca-b4bd-72eda816cfec" />

- [Sidecar Deployment](sidecar.md)
- [Microservice Deployment (Single or Multiple Replicas)](microservice.md)
- [Tiered Deployment](tiered.md)
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/microservice.md
@@ -9,6 +9,8 @@ pagination_next: null

The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and handle spikes in demand.

+<img width="740" alt="microservice" src="https://github.com/user-attachments/assets/b46f050b-e500-4d53-b354-24f0ab30cad3" />
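
As a rough sketch of this topology on Kubernetes, a multi-replica Deployment fronted by a Service provides the load-balanced entry point. The image tag, port numbers, and replica count below are assumptions to verify against your runtime configuration:

```yaml
# Illustrative only — image tag and ports are assumptions, not verified defaults.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spice-runtime
spec:
  replicas: 3                       # scale horizontally for HA and demand spikes
  selector:
    matchLabels:
      app: spice-runtime
  template:
    metadata:
      labels:
        app: spice-runtime
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          ports:
            - containerPort: 8090   # HTTP API (assumed default)
            - containerPort: 50051  # Arrow Flight (assumed default)
---
apiVersion: v1
kind: Service
metadata:
  name: spice-runtime
spec:
  selector:
    app: spice-runtime
  ports:
    - name: http
      port: 8090
    - name: flight
      port: 50051
```

Applications then address the runtime through the `spice-runtime` Service, and the replica count can be adjusted (or driven by an autoscaler) independently of the application.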

**Benefits**

- Loose coupling between the application and the Spice Runtime.
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/sharded.md
@@ -9,6 +9,8 @@ pagination_next: null

Spice Runtime instances can be sharded on specific criteria, such as customer, state, or other logical partitions. Each shard operates independently, giving a 1:N ratio of application to Spice instances.

+<img width="740" alt="sharded" src="https://github.com/user-attachments/assets/5730d108-6d22-4ea4-8c14-8e87ad6d0079" />
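
A minimal sketch of one shard on Kubernetes is shown below; the shard name, labels, mount path, and per-shard spicepod ConfigMap are hypothetical, and one Deployment/Service pair would be repeated per shard:

```yaml
# One Deployment/Service pair per shard (names, labels, and mount path are hypothetical).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spice-shard-us-east          # e.g. sharded by customer region
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spice
      shard: us-east
  template:
    metadata:
      labels:
        app: spice
        shard: us-east
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          volumeMounts:
            - name: spicepod
              mountPath: /app        # shard-specific spicepod.yaml (assumed path)
      volumes:
        - name: spicepod
          configMap:
            name: spicepod-us-east   # datasets and settings for this shard only
---
apiVersion: v1
kind: Service
metadata:
  name: spice-shard-us-east
spec:
  selector:
    app: spice
    shard: us-east
  ports:
    - port: 8090                     # HTTP API port (assumed default)
```

The application resolves the Service for the shard that owns each request — for example, by customer region — so every Spice instance loads and serves only its own partition of the data.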

**Benefits**

- Helps distribute load across multiple instances, improving performance and scalability.
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/sidecar.md
@@ -9,6 +9,8 @@ pagination_next: null

Run the Spice Runtime in a separate container or process on the same machine as the main application. For example, in Kubernetes as a [Sidecar Container](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/). This approach minimizes communication overhead as requests to the Spice Runtime are transported over local loopback.

+<img width="740" alt="sidecar" src="https://github.com/user-attachments/assets/716f7c23-1939-4947-85f5-b0ee2bbd63fc" />
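
A minimal sketch of the pattern follows; the application image, the environment variable it reads, and the port numbers are assumptions, and the native sidecar-container mechanism linked above (a restartable init container) can be used instead of a plain second container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: my-app:latest             # hypothetical application image
      env:
        - name: SPICE_HTTP_ENDPOINT    # hypothetical variable read by the app
          value: http://localhost:8090 # loopback — both containers share the pod network
    - name: spiced
      image: spiceai/spiceai:latest    # assumed image tag
      ports:
        - containerPort: 8090          # HTTP API (assumed default)
        - containerPort: 50051         # Arrow Flight (assumed default)
```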

**Benefits**

- Low-latency communication between the application and the Spice Runtime.
2 changes: 2 additions & 0 deletions website/docs/deployment/architectures/tiered.md
@@ -9,6 +9,8 @@ pagination_next: null

A hybrid approach combining sidecar deployments for performance-critical tasks and a shared microservice for batch processing or less time-sensitive workloads.

+<img width="740" alt="tiered" src="https://github.com/user-attachments/assets/e602bad4-bd0d-4069-bc91-5b5678a10710" />
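
One way to express the split in application configuration is sketched below; the variable names and endpoints are purely illustrative:

```yaml
# Hypothetical application ConfigMap: latency-sensitive calls use the in-pod
# sidecar over loopback, while batch/analytical work goes to the shared service.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  SPICE_REALTIME_ENDPOINT: http://localhost:8090   # sidecar (assumed default port)
  SPICE_BATCH_ENDPOINT: http://spice-runtime:8090  # shared microservice Service
```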

**Benefits**

- Real-time responsiveness where needed (sidecar).
2 changes: 1 addition & 1 deletion website/docs/features/large-language-models/index.md
@@ -11,7 +11,7 @@ tags:

Spice provides a high-performance, OpenAI API-compatible AI Gateway optimized for managing and scaling large language models (LLMs). It offers tools for Enterprise Retrieval-Augmented Generation (RAG), such as SQL querying across federated datasets and an advanced search feature (see [Search](/docs/features/search)).

-![Spice.ai Large-Language-Model (LLM) AI-Gateway](/img/features/ai-gateway.png).
+<img width="740" alt="ai-gateway" src="https://github.com/user-attachments/assets/4a45cd62-ebfc-4a73-956d-661f1ab44cd8" />
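
As a hedged sketch, a model can be registered with the gateway in `spicepod.yaml`; the version string, model id, component name, and secret reference below are assumptions to verify against the current Spicepod reference:

```yaml
# Illustrative spicepod.yaml fragment — field values are assumptions.
version: v1beta1
kind: Spicepod
name: llm-gateway
models:
  - from: openai:gpt-4o-mini   # hosted model routed through the gateway
    name: assistant            # exposed as the `model` name on the OpenAI-compatible API
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
```

Any OpenAI-compatible client can then typically target the runtime's `/v1/chat/completions` endpoint and request the `assistant` model.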

Spice supports **full OpenTelemetry observability**, enabling detailed tracking of model tool use, recursion, data flows, and requests for full transparency and easier debugging.

2 changes: 1 addition & 1 deletion website/docs/features/observability/index.md
@@ -9,7 +9,7 @@ pagination_next: null

Spice can be monitored through its [Prometheus-compatible metrics endpoint](https://prometheus.io/docs/instrumenting/exposition_formats/#basic-info).

-![Spice.ai Open Source Monitoring & Observability](/img/features/observability.png)
+<img width="740" alt="observability" src="https://github.com/user-attachments/assets/2468e3e7-4fb4-4a74-8b26-45eeeee90310" />
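
A minimal Prometheus scrape configuration for this endpoint might look like the sketch below; the target host and port are assumptions and should match the runtime's configured metrics address:

```yaml
# prometheus.yml fragment — target host/port are assumptions.
scrape_configs:
  - job_name: spice
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9090"]   # Spice metrics endpoint (assumed default port)
```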

Monitoring client configuration:
