30 changes: 30 additions & 0 deletions website/docs/deployment/architectures/cluster.md
@@ -0,0 +1,30 @@
---
title: 'Cluster-Based Deployment (Spice.ai Enterprise)'
sidebar_label: 'Cluster'
description: 'Deploying Spice as a cluster'
sidebar_position: 6
pagination_prev: null
pagination_next: null
---

A full cluster-based deployment leverages **Spice.ai Enterprise**, which includes advanced services and integrations for Kubernetes. This approach is ideal for organizations that require large-scale or complex deployments, including specialized clustering capabilities.

**Benefits**

- Provides **enterprise-grade features**: advanced security, monitoring, and support.
- Simplifies **managing multiple nodes** for high availability and large workloads.
- Offers **direct integration** with Spice Cloud or on-prem Kubernetes clusters.

**Considerations**

- **Requires a commercial license** or subscription to Spice Enterprise.
- More **complex initial setup**, typically involving specialized DevOps expertise.

**Use This Approach When**

- You operate at **significant scale** or have stringent availability requirements.
- You need **enterprise-level support** and advanced monitoring, security, or compliance features.
- Your team can manage a **robust Kubernetes environment** or you plan to integrate with the Spice Cloud at scale.

**Example Use Case**
A large financial services firm requiring a highly available, secure environment. They run Spice.ai across multiple clusters using Spice Enterprise for advanced monitoring, role-based access control, and dedicated support.
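
As a rough sketch, a cluster deployment of this kind is typically driven by Helm values that enable multiple replicas plus the monitoring and access-control features described above. The key names below are illustrative assumptions, not the actual Spice.ai Enterprise chart schema:

```yaml
# values.yaml — hypothetical Helm values for a multi-node cluster deployment.
# Key names are assumptions; consult the Spice.ai Enterprise documentation
# for the real chart schema.
replicaCount: 3 # multiple runtime nodes for high availability
resources:
  requests:
    cpu: '2'
    memory: 4Gi
monitoring:
  enabled: true # advanced monitoring (enterprise feature)
rbac:
  create: true # role-based access control
```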
32 changes: 32 additions & 0 deletions website/docs/deployment/architectures/hosted.md
@@ -0,0 +1,32 @@
---
title: 'Cloud Hosted'
sidebar_label: 'Hosted'
description: 'Deploying Spice cloud hosted in the Spice Cloud Platform'
sidebar_position: 5
pagination_prev: null
pagination_next: null
---

The Spice Runtime is deployed on a fully managed service within the Spice Cloud Platform, minimizing the operational burden of managing clusters, upgrades, and infrastructure.

**Benefits**

- Reduced overhead for deployment, scaling, and maintenance.
- Access to specialized hosting features and quick setup.
- Helps reduce operational complexity and cost.

**Considerations**

- Reliance on external hosting and associated terms or limits.
- Potential compliance or data residency considerations for certain industries.
- May introduce latency depending on the cloud provider's infrastructure.

**Use This Approach When**

- DevOps resources are limited, or the team prefers to focus on application logic over infrastructure.
- A fully managed environment with minimal setup time is desired.
- A single managed solution is preferred over running your own clusters.
- Minimizing operational complexity and cost is the goal.

**Example Use Case**
A startup or team with limited DevOps support that needs a reliable, managed environment. Quick deployment and minimal in-house infrastructure responsibilities are priorities.
15 changes: 15 additions & 0 deletions website/docs/deployment/architectures/index.md
@@ -0,0 +1,15 @@
---
title: 'Deployment Architectures'
sidebar_label: 'Architectures'
description: 'Spice.ai Open Source Deployment architectures'
sidebar_position: 1
pagination_prev: null
pagination_next: null
---

- [Sidecar Deployment](sidecar.md)
- [Microservice Deployment (Single or Multiple Replicas)](microservice.md)
- [Tiered Deployment](tiered.md)
- [Cloud-Hosted in the Spice Cloud Platform](hosted.md)
- [Sharded Deployment](sharded.md)
- [Cluster Deployment (Spice.ai Enterprise)](cluster.md)
33 changes: 33 additions & 0 deletions website/docs/deployment/architectures/microservice.md
@@ -0,0 +1,33 @@
---
title: 'Microservice Deployment (Single or Multiple Replicas)'
sidebar_label: 'Microservice'
description: 'Deploying Spice as a microservice'
sidebar_position: 2
pagination_prev: null
pagination_next: null
---

The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and handle spikes in demand.

**Benefits**

- Loose coupling between the application and the Spice Runtime.
- Independent scaling and upgrades.
- Can serve multiple applications or services within an organization.
- Helps achieve high availability and redundancy.

**Considerations**

- An additional network hop introduces latency compared to a sidecar deployment.
- More complex infrastructure, requiring service discovery and load balancing.
- Potentially higher cost due to additional infrastructure components.

**Use This Approach When**

- A loosely coupled architecture and the ability to independently scale the AI service are desired.
- Multiple services or teams need to share the same AI engine.
- Heavy or varying traffic is anticipated, requiring independent scaling of the Spice Runtime.
- Resiliency and redundancy are prioritized over simplicity.

**Example Use Case**
A large organization where multiple services (recommendations, analytics, etc.) need to share AI insights. A centralized Spice Runtime microservice cluster lets separate teams consume AI outputs without duplicating effort.
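
A deployment of this shape can be sketched with standard Kubernetes objects. The image tag mirrors the Docker example later in these docs, and the ports match its compose file (8090 for HTTP, 50051 for Arrow Flight); treat the manifest as an illustrative starting point, not a reference configuration:

```yaml
# Illustrative: Spice Runtime as a shared microservice behind a Service
# acting as the load balancer. Names and replica count are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spiced
spec:
  replicas: 3 # multiple replicas for availability and demand spikes
  selector:
    matchLabels:
      app: spiced
  template:
    metadata:
      labels:
        app: spiced
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          ports:
            - containerPort: 8090 # HTTP
            - containerPort: 50051 # Arrow Flight
---
apiVersion: v1
kind: Service
metadata:
  name: spiced
spec:
  selector:
    app: spiced
  ports:
    - name: http
      port: 8090
    - name: flight
      port: 50051
```

Client applications then address the runtime by the Service name (e.g. `http://spiced:8090`) rather than a local endpoint, which is where the extra network hop noted above comes from.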
32 changes: 32 additions & 0 deletions website/docs/deployment/architectures/sharded.md
@@ -0,0 +1,32 @@
---
title: 'Sharded'
sidebar_label: 'Sharded'
description: 'Deploying Spice with shards'
sidebar_position: 4
pagination_prev: null
pagination_next: null
---

Spice Runtime instances can be sharded by specific criteria, such as customer, state, or another logical partition. Each shard operates independently, with a 1:N application-to-Spice-instance ratio.

**Benefits**

- Helps distribute load across multiple instances, improving performance and scalability.
- Isolates failures to specific shards, enhancing resiliency.
- Allows tailored configurations and optimizations for different shards.

**Considerations**

- More complex deployment and management due to multiple instances.
- Requires effective sharding strategy to balance load and avoid hotspots.
- Potentially higher cost due to multiple instances.

**Use This Approach When**

- Distributing load across multiple instances for better performance is needed.
- Isolating failures to specific shards to improve resiliency is desired.
- The application can benefit from tailored configurations for different logical partitions.
- The complexity of managing multiple instances can be handled.

**Example Use Case**
A multi-tenant application where each customer has a dedicated Spice Runtime instance. This helps ensure that heavy usage by one customer does not impact others, and allows for customer-specific optimizations.
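
One way to realize shards in Kubernetes is to run one Spice Runtime Deployment per shard, each mounting its own Spicepod configuration, with the application routing requests by shard key. A minimal sketch for a single shard (resource names and the mount path are assumptions):

```yaml
# Illustrative: one Deployment per shard; each mounts a shard-specific
# Spicepod ConfigMap. Repeat per shard (customer-b, customer-c, ...).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spiced-customer-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spiced
      shard: customer-a
  template:
    metadata:
      labels:
        app: spiced
        shard: customer-a
    spec:
      containers:
        - name: spiced
          image: spiceai/spiceai:latest
          volumeMounts:
            - name: spicepod
              mountPath: /app # assumed working directory for spicepod.yaml
      volumes:
        - name: spicepod
          configMap:
            name: spicepod-customer-a # shard-specific datasets and tuning
```

A matching per-shard Service (e.g. `spiced-customer-a`) then gives the application a stable endpoint per tenant.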
34 changes: 34 additions & 0 deletions website/docs/deployment/architectures/sidecar.md
@@ -0,0 +1,34 @@
---
title: 'Sidecar Deployment'
sidebar_label: 'Sidecar'
description: 'Deploying Spice as an application sidecar'
sidebar_position: 1
pagination_prev: null
pagination_next: null
---

Run the Spice Runtime in a separate container or process on the same machine as the main application. For example, in Kubernetes as a [Sidecar Container](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/). This approach minimizes communication overhead as requests to the Spice Runtime are transported over local loopback.

**Benefits**

- Low-latency communication between the application and the Spice Runtime.
- Simplified lifecycle management (same pod).
- Isolated environment without needing a separate microservice.
- Helps ensure resiliency and redundancy by replicating data across sidecars.

**Considerations**

- Each application pod includes a copy of the Spice Runtime, increasing resource usage.
- Updating the Spice Runtime independently requires updating each pod.
- Accelerated data is replicated to each sidecar, adding resiliency and redundancy but increasing resource usage and requests to data sources.
- May increase overall cost due to resource duplication.

**Use This Approach When**

- Fast, low-latency interactions between the application and the Spice Runtime are needed (e.g., real-time decision-making).
- Scaling needs are small or moderate, making duplication of the Spice Runtime in each pod acceptable.
- Keeping the architecture simple without additional services or load balancers is preferred.
- Performance and latency are prioritized over cost and complexity.

**Example Use Case**
A real-time trading bot or a data-intensive application that relies on immediate feedback, where minimal latency is critical. Both containers in the same pod ensure very fast data exchange.
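
In Kubernetes 1.28+, the linked sidecar-container mechanism runs the runtime as an init container with `restartPolicy: Always`. A minimal sketch, where the application image is a placeholder:

```yaml
# Illustrative: Spice Runtime as a native Kubernetes sidecar container.
apiVersion: v1
kind: Pod
metadata:
  name: trading-bot
spec:
  initContainers:
    - name: spiced
      image: spiceai/spiceai:latest
      restartPolicy: Always # marks this init container as a sidecar
      ports:
        - containerPort: 8090 # HTTP endpoint
  containers:
    - name: app
      image: example/trading-bot:latest # placeholder application image
```

Because both containers share the pod's network namespace, the application reaches the Spice Runtime over local loopback (e.g. `http://localhost:8090`) with no extra network hop.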
33 changes: 33 additions & 0 deletions website/docs/deployment/architectures/tiered.md
@@ -0,0 +1,33 @@
---
title: 'Tiered Deployment'
sidebar_label: 'Tiered'
description: 'Deploying Spice in tiers'
sidebar_position: 3
pagination_prev: null
pagination_next: null
---

A hybrid approach combining sidecar deployments for performance-critical tasks and a shared microservice for batch processing or less time-sensitive workloads.

**Benefits**

- Real-time responsiveness where needed (sidecar).
- Centralized microservice handles broader or shared tasks.
- Balances resource usage by limiting sidecar instances to high-priority operations.
- Helps balance performance and latency with cost and complexity.

**Considerations**

- More complex deployment structure, mixing two patterns.
- Must ensure consistent versioning between sidecar and microservice instances.
- Potentially higher operational complexity and cost.

**Use This Approach When**

- Certain application components require ultra-low-latency responses, while others do not.
- Centralized AI or analytics is needed, but localized real-time decision-making is also required.
- The system can handle the operational complexity of running multiple deployment patterns.
- Balancing performance and latency with cost and complexity is the goal.

**Example Use Case**
A logistics application that calculates routing decisions in real time (sidecar) while a microservice component processes aggregated data for periodic analysis or re-training models.
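
A tiered layout combines the two manifests sketched on the sidecar and microservice pages: latency-critical pods carry their own runtime, while batch workloads call a shared Service. The snippet below is illustrative only; the environment variable is a hypothetical convention for the application, not a Spice setting:

```yaml
# Illustrative: latency-critical workload with a Spice sidecar (loopback access).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: router
spec:
  replicas: 2
  selector:
    matchLabels:
      app: router
  template:
    metadata:
      labels:
        app: router
    spec:
      containers:
        - name: router
          image: example/router:latest # placeholder application image
          env:
            - name: SPICE_HTTP_ENDPOINT # hypothetical variable the app reads
              value: http://localhost:8090
        - name: spiced
          image: spiceai/spiceai:latest
---
# Batch and analytics workloads call a shared runtime Service instead.
apiVersion: v1
kind: Service
metadata:
  name: spiced-shared
spec:
  selector:
    app: spiced-shared
  ports:
    - port: 8090
```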
22 changes: 11 additions & 11 deletions website/docs/deployment/docker.md
@@ -2,7 +2,7 @@
title: 'Docker - Kubernetes'
description: 'Running Spice.ai as Docker container'
sidebar_label: 'Docker'
-sidebar_position: 2
+sidebar_position: 3
tags:
- deployment
- docker
@@ -42,15 +42,15 @@ Example Docker Compose configuration to build and start the container:

```yaml
services:
-  spiced:
-    build:
-      context: .
-      dockerfile: Dockerfile
-    container_name: spiced-container
-    ports:
-      - "50051:50051"
-      - "8090:8090"
-      - "9090:9090"
+  spiced:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    container_name: spiced-container
+    ports:
+      - '50051:50051'
+      - '8090:8090'
+      - '9090:9090'
```

```bash
@@ -76,7 +76,7 @@
=> => naming to docker.io/library/accounts-spiced 0.0s
=> [spiced] resolving provenance for metadata file 0.0s
[+] Running 1/0
✔ Container spiced-container Recreated 0.0s
Attaching to spiced-container
spiced-container | 2024-12-19T00:43:13.844091Z INFO runtime::init::dataset: No datasets were configured. If this is unexpected, check the Spicepod configuration.
spiced-container | 2024-12-19T00:43:13.844615Z INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
4 changes: 2 additions & 2 deletions website/docs/deployment/kubernetes/index.md
@@ -1,7 +1,7 @@
---
title: 'Helm - Kubernetes'
sidebar_label: 'Helm - Kubernetes'
-sidebar_position: 1
+sidebar_position: 2
description: 'Deploy Spice.ai in Kubernetes using Helm.'
pagination_prev: 'deployment/index'
pagination_next: null
@@ -118,7 +118,7 @@ livenessProbe:
port: 8090
```

-In Kubernetes, this pod will not be marked as *Healthy* until the `/health` endpoint returns `200`.
+In Kubernetes, this pod will not be marked as _Healthy_ until the `/health` endpoint returns `200`.

#### Readiness probe
