30 changes: 30 additions & 0 deletions website/docs/deployment/architectures/cluster.md
@@ -0,0 +1,30 @@
---
title: 'Cluster-Based Deployment (Spice.ai Enterprise)'
sidebar_label: 'Cluster'
description: 'Deploying Spice as a cluster'
sidebar_position: 6
pagination_prev: null
pagination_next: null
---

A full cluster-based deployment leverages **Spice.ai Enterprise**, which includes advanced services and integrations for Kubernetes. This approach is ideal for organizations that require large-scale or complex deployments, including specialized clustering capabilities.

**Benefits**

- Provides **enterprise-grade features**: advanced security, monitoring, and support.
- Simplifies **managing multiple nodes** for high availability and large workloads.
- Offers **direct integration** with Spice Cloud or on-prem Kubernetes clusters.

**Considerations**

- **Requires a commercial license** or subscription to Spice Enterprise.
- More **complex initial setup**, typically involving specialized DevOps expertise.

**Use This Approach When**

- You operate at **significant scale** or have stringent availability requirements.
- You need **enterprise-level support** and advanced monitoring, security, or compliance features.
- Your team can manage a **robust Kubernetes environment** or you plan to integrate with the Spice Cloud at scale.

**Example Use Case**
A large financial services firm requiring a highly available, secure environment. They run Spice.ai across multiple clusters using Spice Enterprise for advanced monitoring, role-based access control, and dedicated support.
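
As a rough sketch, a cluster deployment of this kind is typically driven by Helm values that enable multiple replicas plus the monitoring and access-control features described above. The key names below are illustrative assumptions, not the actual Spice.ai Enterprise chart schema:

```yaml
# values.yaml — hypothetical Helm values for a multi-node cluster deployment.
# Key names are assumptions; consult the Spice.ai Enterprise documentation
# for the real chart schema.
replicaCount: 3 # multiple runtime nodes for high availability
resources:
  requests:
    cpu: '2'
    memory: 4Gi
monitoring:
  enabled: true # advanced monitoring (enterprise feature)
rbac:
  create: true # role-based access control
```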
32 changes: 32 additions & 0 deletions website/docs/deployment/architectures/hosted.md
@@ -0,0 +1,32 @@
---
title: 'Cloud Hosted'
sidebar_label: 'Hosted'
description: 'Deploying Spice cloud hosted in the Spice Cloud Platform'
sidebar_position: 5
pagination_prev: null
pagination_next: null
---

The Spice Runtime is deployed on a fully managed service within the Spice Cloud Platform, minimizing the operational burden of managing clusters, upgrades, and infrastructure.

**Benefits**

- Reduced overhead for deployment, scaling, and maintenance.
- Access to specialized hosting features and quick setup.
- Helps reduce operational complexity and cost.

**Considerations**

- Reliance on external hosting and associated terms or limits.
- Potential compliance or data residency considerations for certain industries.
- May introduce latency depending on the cloud provider's infrastructure.

**Use This Approach When**

- DevOps resources are limited, or the team prefers to focus on application logic over infrastructure.
- A fully managed environment with minimal setup time is desired.
- A single managed solution is preferred over running your own clusters.
- Minimizing operational complexity and cost is the goal.

**Example Use Case**
A startup or team with limited DevOps support that needs a reliable, managed environment. Quick deployment and minimal in-house infrastructure responsibilities are priorities.
15 changes: 15 additions & 0 deletions website/docs/deployment/architectures/index.md
@@ -0,0 +1,15 @@
---
title: 'Deployment Architectures'
sidebar_label: 'Architectures'
description: 'Spice.ai Open Source Deployment architectures'
sidebar_position: 1
pagination_prev: null
pagination_next: null
---

- [Sidecar Deployment](sidecar.md)
- [Microservice Deployment (Single or Multiple Replicas)](microservice.md)
- [Tiered Deployment](tiered.md)
- [Cloud-Hosted in the Spice Cloud Platform](hosted.md)
- [Sharded Deployment](sharded.md)
- [Cluster Deployment (Spice.ai Enterprise)](cluster.md)
33 changes: 33 additions & 0 deletions website/docs/deployment/architectures/microservice.md
@@ -0,0 +1,33 @@
---
title: 'Microservice Deployment (Single or Multiple Replicas)'
sidebar_label: 'Microservice'
description: 'Deploying Spice as a microservice'
sidebar_position: 2
pagination_prev: null
pagination_next: null
---

The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and handle spikes in demand.

**Benefits**

- Loose coupling between the application and the Spice Runtime.
- Independent scaling and upgrades.
- Can serve multiple applications or services within an organization.
- Helps achieve high availability and redundancy.

**Considerations**

- An additional network hop introduces latency compared to a sidecar deployment.
- More complex infrastructure, requiring service discovery and load balancing.
- Potentially higher cost due to additional infrastructure components.

**Use This Approach When**

- A loosely coupled architecture and the ability to independently scale the AI service are desired.
- Multiple services or teams need to share the same AI engine.
- Heavy or varying traffic is anticipated, requiring independent scaling of the Spice Runtime.
- Resiliency and redundancy are prioritized over simplicity.

**Example Use Case**
A large organization where multiple services (recommendations, analytics, etc.) need to share AI insights. A centralized Spice Runtime microservice cluster lets separate teams consume AI outputs without duplicating effort.
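
A deployment of this shape can be sketched with standard Kubernetes objects. The image tag mirrors the Docker example later in these docs, and the ports match its compose file (8090 for HTTP, 50051 for Arrow Flight); treat the manifest as an illustrative starting point, not a reference configuration:

```yaml
# Illustrative: Spice Runtime as a shared microservice behind a Service
# acting as the load balancer. Names and replica count are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spiced
spec:
  replicas: 3 # multiple replicas for availability and demand spikes
  selector:
    matchLabels:
      app: spiced
  template:
    metadata:
      labels:
        app: spiced
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          ports:
            - containerPort: 8090 # HTTP
            - containerPort: 50051 # Arrow Flight
---
apiVersion: v1
kind: Service
metadata:
  name: spiced
spec:
  selector:
    app: spiced
  ports:
    - name: http
      port: 8090
    - name: flight
      port: 50051
```

Client applications then address the runtime by the Service name (e.g. `http://spiced:8090`) rather than a local endpoint, which is where the extra network hop noted above comes from.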
32 changes: 32 additions & 0 deletions website/docs/deployment/architectures/sharded.md
@@ -0,0 +1,32 @@
---
title: 'Sharded'
sidebar_label: 'Sharded'
description: 'Deploying Spice with shards'
sidebar_position: 4
pagination_prev: null
pagination_next: null
---

Spice Runtime instances can be sharded by specific criteria, such as customer, state, or another logical partition. Each shard operates independently, with a 1:N application-to-Spice-instance ratio.

**Benefits**

- Helps distribute load across multiple instances, improving performance and scalability.
- Isolates failures to specific shards, enhancing resiliency.
- Allows tailored configurations and optimizations for different shards.

**Considerations**

- More complex deployment and management due to multiple instances.
- Requires effective sharding strategy to balance load and avoid hotspots.
- Potentially higher cost due to multiple instances.

**Use This Approach When**

- Distributing load across multiple instances for better performance is needed.
- Isolating failures to specific shards to improve resiliency is desired.
- The application can benefit from tailored configurations for different logical partitions.
- The complexity of managing multiple instances can be handled.

**Example Use Case**
A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.
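
One way to realize shards in Kubernetes is to run one Spice Runtime Deployment per shard, each mounting its own Spicepod configuration, with the application routing requests by shard key. A minimal sketch for a single shard (resource names and the mount path are assumptions):

```yaml
# Illustrative: one Deployment per shard; each mounts a shard-specific
# Spicepod ConfigMap. Repeat per shard (customer-b, customer-c, ...).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spiced-customer-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spiced
      shard: customer-a
  template:
    metadata:
      labels:
        app: spiced
        shard: customer-a
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          volumeMounts:
            - name: spicepod
              mountPath: /app # assumed working directory for spicepod.yaml
      volumes:
        - name: spicepod
          configMap:
            name: spicepod-customer-a # shard-specific datasets and tuning
```

A matching per-shard Service (e.g. `spiced-customer-a`) then gives the application a stable endpoint per tenant.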
34 changes: 34 additions & 0 deletions website/docs/deployment/architectures/sidecar.md
@@ -0,0 +1,34 @@
---
title: 'Sidecar Deployment'
sidebar_label: 'Sidecar'
description: 'Deploying Spice as an application sidecar'
sidebar_position: 1
pagination_prev: null
pagination_next: null
---

Run the Spice Runtime in a separate container or process on the same machine as the main application. For example, in Kubernetes as a [Sidecar Container](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/). This approach minimizes communication overhead as requests to the Spice Runtime are transported over local loopback.

**Benefits**

- Low-latency communication between the application and the Spice Runtime.
- Simplified lifecycle management (same pod).
- Isolated environment without needing a separate microservice.
- Helps ensure resiliency and redundancy by replicating data across sidecars.

**Considerations**

- Each application pod includes a copy of the Spice Runtime, increasing resource usage.
- Updating the Spice Runtime independently requires updating each pod.
- Accelerated data is replicated to each sidecar, adding resiliency and redundancy but increasing resource usage and requests to data sources.
- May increase overall cost due to resource duplication.

**Use This Approach When**

- Fast, low-latency interactions between the application and the Spice Runtime are needed (e.g., real-time decision-making).
- Scaling needs are small or moderate, making duplication of the Spice Runtime in each pod acceptable.
- Keeping the architecture simple without additional services or load balancers is preferred.
- Performance and latency are prioritized over cost and complexity.

**Example Use Case**
A real-time trading bot or a data-intensive application that relies on immediate feedback, where minimal latency is critical. Both containers in the same pod ensure very fast data exchange.
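
In Kubernetes 1.28+, the linked sidecar-container mechanism runs the runtime as an init container with `restartPolicy: Always`. A minimal sketch, where the application image is a placeholder:

```yaml
# Illustrative: Spice Runtime as a native Kubernetes sidecar container.
apiVersion: v1
kind: Pod
metadata:
  name: trading-bot
spec:
  initContainers:
    - name: spiced
      image: spiceai/spiceai:latest
      restartPolicy: Always # marks this init container as a sidecar
      ports:
        - containerPort: 8090 # HTTP endpoint
  containers:
    - name: app
      image: example/trading-bot:latest # placeholder application image
```

Because both containers share the pod's network namespace, the application reaches the Spice Runtime over local loopback (e.g. `http://localhost:8090`) with no extra network hop.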
33 changes: 33 additions & 0 deletions website/docs/deployment/architectures/tiered.md
@@ -0,0 +1,33 @@
---
title: 'Tiered Deployment'
sidebar_label: 'Tiered'
description: 'Deploying Spice in tiers'
sidebar_position: 3
pagination_prev: null
pagination_next: null
---

A hybrid approach combining sidecar deployments for performance-critical tasks and a shared microservice for batch processing or less time-sensitive workloads.

**Benefits**

- Real-time responsiveness where needed (sidecar).
- Centralized microservice handles broader or shared tasks.
- Balances resource usage by limiting sidecar instances to high-priority operations.
- Helps balance performance and latency with cost and complexity.

**Considerations**

- More complex deployment structure, mixing two patterns.
- Must ensure consistent versioning between sidecar and microservice instances.
- Potentially higher operational complexity and cost.

**Use This Approach When**

- Certain application components require ultra-low-latency responses, while others do not.
- Centralized AI or analytics is needed, but localized real-time decision-making is also required.
- The system can handle the operational complexity of running multiple deployment patterns.
- Balancing performance and latency with cost and complexity is the goal.

**Example Use Case**
A logistics application that calculates routing decisions in real time (sidecar) while a microservice component processes aggregated data for periodic analysis or re-training models.
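
A tiered layout combines the two manifests sketched on the sidecar and microservice pages: latency-critical pods carry their own runtime, while batch workloads call a shared Service. The snippet below is illustrative only; the environment variable is a hypothetical convention for the application, not a Spice setting:

```yaml
# Illustrative: latency-critical workload with a Spice sidecar (loopback access).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: router
spec:
  replicas: 2
  selector:
    matchLabels:
      app: router
  template:
    metadata:
      labels:
        app: router
    spec:
      containers:
        - name: router
          image: example/router:latest # placeholder application image
          env:
            - name: SPICE_HTTP_ENDPOINT # hypothetical variable the app reads
              value: http://localhost:8090
        - name: spiced
          image: spiceai/spiceai:latest
---
# Batch and analytics workloads call a shared runtime Service instead.
apiVersion: v1
kind: Service
metadata:
  name: spiced-shared
spec:
  selector:
    app: spiced-shared
  ports:
    - port: 8090
```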
22 changes: 11 additions & 11 deletions website/docs/deployment/docker.md
@@ -2,7 +2,7 @@
title: 'Docker - Kubernetes'
description: 'Running Spice.ai as Docker container'
sidebar_label: 'Docker'
-sidebar_position: 2
+sidebar_position: 3
tags:
- deployment
- docker
@@ -42,15 +42,15 @@ Example Docker Compose configuration to build and start the container:

```yaml
services:
-  spiced:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    container_name: spiced-container
-    ports:
-      - "50051:50051"
-      - "8090:8090"
-      - "9090:9090"
+  spiced:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    container_name: spiced-container
+    ports:
+      - '50051:50051'
+      - '8090:8090'
+      - '9090:9090'
```

```bash
@@ -76,7 +76,7 @@
=> => naming to docker.io/library/accounts-spiced 0.0s
=> [spiced] resolving provenance for metadata file 0.0s
[+] Running 1/0
✔ Container spiced-container Recreated 0.0s
Attaching to spiced-container
spiced-container | 2024-12-19T00:43:13.844091Z INFO runtime::init::dataset: No datasets were configured. If this is unexpected, check the Spicepod configuration.
spiced-container | 2024-12-19T00:43:13.844615Z INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
4 changes: 2 additions & 2 deletions website/docs/deployment/kubernetes/index.md
@@ -1,7 +1,7 @@
---
title: 'Helm - Kubernetes'
sidebar_label: 'Helm - Kubernetes'
-sidebar_position: 1
+sidebar_position: 2
description: 'Deploy Spice.ai in Kubernetes using Helm.'
pagination_prev: 'deployment/index'
pagination_next: null
@@ -118,7 +118,7 @@ livenessProbe:
port: 8090
```

-In Kubernetes, this pod will not be marked as *Healthy* until the `/health` endpoint returns `200`.
+In Kubernetes, this pod will not be marked as _Healthy_ until the `/health` endpoint returns `200`.

#### Readiness probe
