4 changes: 2 additions & 2 deletions website/docs/cli/reference/refresh.md
Original file line number Diff line number Diff line change
@@ -18,8 +18,8 @@ spice refresh [dataset] [flags]
#### Flags

- `--tls-root-certificate-file` The path to the root certificate file used to verify the Spice.ai runtime server certificate
- `--refresh-sql` SQL used to refresh the dataset, see [Refresh SQL docs](/docs/components/data-accelerators/data-refresh.md#refresh-sql).
- `--refresh-mode` Refresh mode to use, see [Refresh Modes docs](/docs/components/data-accelerators/data-refresh.md#refresh-modes).
- `--refresh-sql` SQL used to refresh the dataset, see [Refresh SQL docs](/docs/features/data-acceleration/data-refresh.md#refresh-sql).
- `--refresh-mode` Refresh mode to use, see [Refresh Modes docs](/docs/features/data-acceleration/data-refresh.md#refresh-modes).
- `-h`, `--help` Print this help message
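
The flags above can be combined in a single invocation. The following is a minimal sketch, assuming the flags take their values as separate arguments; the dataset name `taxi_trips` and the SQL text are hypothetical, and the helper `spice_refresh_command` is illustrative rather than part of the Spice CLI.

```python
import shlex


def spice_refresh_command(dataset, refresh_sql=None, refresh_mode=None):
    """Return the argv list for a `spice refresh [dataset] [flags]` call."""
    argv = ["spice", "refresh", dataset]
    if refresh_sql is not None:
        # --refresh-sql: SQL used to refresh the dataset
        argv += ["--refresh-sql", refresh_sql]
    if refresh_mode is not None:
        # --refresh-mode: refresh mode to use
        argv += ["--refresh-mode", refresh_mode]
    return argv


argv = spice_refresh_command(
    "taxi_trips",  # hypothetical dataset name
    refresh_sql="SELECT * FROM taxi_trips WHERE tip_amount > 10",
    refresh_mode="full",
)
# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(argv))
```

Building the argument list first and quoting it with `shlex.join` keeps the SQL string intact when it contains spaces or shell metacharacters.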

### Examples
2 changes: 1 addition & 1 deletion website/docs/clients/Datadog/index.md
@@ -7,7 +7,7 @@ sidebar_position: 3
pagination_next: null
---

Spice can be monitored with [Datadog](https://www.datadoghq.com/) using the [Spice Metrics Endpoint](/docs/features/monitoring/) and pre-built dashboards available in the [Spice repository](https://github.com/spiceai/spiceai/tree/trunk/monitoring).
Spice can be monitored with [Datadog](https://www.datadoghq.com/) using the [Spice Metrics Endpoint](/docs/features/observability/) and pre-built dashboards available in the [Spice repository](https://github.com/spiceai/spiceai/tree/trunk/monitoring).

## Datadog Agent Configuration

2 changes: 1 addition & 1 deletion website/docs/clients/grafana/index.md
@@ -6,7 +6,7 @@ pagination_prev: 'clients/index'
pagination_next: null
---

Spice can be monitored with [Grafana](https://grafana.com/grafana/) using the [Spice Metrics Endpoint](/docs/features/monitoring/) and pre-built dashboards available in the [Spice repository](https://github.com/spiceai/spiceai/tree/trunk/monitoring).
Spice can be monitored with [Grafana](https://grafana.com/grafana/) using the [Spice Metrics Endpoint](/features/observability/index.md) and pre-built dashboards available in the [Spice repository](https://github.com/spiceai/spiceai/tree/trunk/monitoring).

## Import Grafana Dashboard

2 changes: 1 addition & 1 deletion website/docs/components/data-accelerators/index.md
@@ -9,7 +9,7 @@ pagination_next: null

Data sourced by Data Connectors can be locally materialized and accelerated using a Data Accelerator.

A Data Accelerator will query/fetch data from a connected data source and store/update it locally in an embedded acceleration engine, such as DuckDB or SQLite. To set data refresh behavior, such as refreshing data on an interval see [Data Refresh](./data-refresh.md).
A Data Accelerator will query/fetch data from a connected data source and store/update it locally in an embedded acceleration engine, such as DuckDB or SQLite. To set data refresh behavior, such as refreshing data on an interval see [Data Refresh](/features/data-acceleration/data-refresh.md).

Dataset acceleration is enabled by setting the acceleration configuration. E.g.

2 changes: 1 addition & 1 deletion website/docs/components/data-connectors/file.md
@@ -60,7 +60,7 @@ For CSV-specific parameters, see [CSV Parameters](/docs/reference/file_format.md

## Trigger data refresh on file change

In addition to standard [Data Refresh](/docs/components/data-accelerators/data-refresh), a data refresh can also be triggered when the source file is modified. The File Data Connector uses a file system watcher to be notified the file has changed. The file watcher is disabled by default and can be enabled by setting the `file_watcher` parameter to `enabled` in the acceleration parameters.
In addition to standard [Data Refresh](/features/data-acceleration/data-refresh.md), a data refresh can also be triggered when the source file is modified. The File Data Connector uses a file system watcher to be notified the file has changed. The file watcher is disabled by default and can be enabled by setting the `file_watcher` parameter to `enabled` in the acceleration parameters.

```yaml
datasets:
2 changes: 1 addition & 1 deletion website/docs/components/data-connectors/github.md
@@ -71,7 +71,7 @@ All other filters are supported when `github_query_mode` is set to `search`, but
:::warning[Limitations]

- GitHub has a limitation in the Search API where it may return more stale data than the standard API used in the default query mode.
- GitHub has a limitation in the Search API where it only returns a maximum of 1000 results for a query. Use [append mode acceleration](../data-accelerators/data-refresh.md) to retrieve more results over time. See the [append example](#append-example) for pull requests.
- GitHub has a limitation in the Search API where it only returns a maximum of 1000 results for a query. Use [append mode acceleration](/features/data-acceleration/data-refresh.md) to retrieve more results over time. See the [append example](#append-example) for pull requests.

:::

2 changes: 1 addition & 1 deletion website/docs/features/caching/index.md
@@ -9,7 +9,7 @@ pagination_next: null

Spice supports in-memory caching of query results, which is enabled by default for both the HTTP (`/v1/sql`) and Arrow Flight APIs.

Results caching can help improve performance for bursts of requests and for non-accelerated results such as refresh data returned [on zero results](/docs/components/data-accelerators/data-refresh.md#behavior-on-zero-results).
Results caching can help improve performance for bursts of requests and for non-accelerated results such as refresh data returned [on zero results](/docs/features/data-acceleration/data-refresh.md#behavior-on-zero-results).

Results caching employs a [least-recently-used (LRU)](https://en.wikipedia.org/wiki/Cache_replacement_policies#LRU) cache replacement policy with the ability to specify an item expiry duration, which defaults to 1-second.

Original file line number Diff line number Diff line change
@@ -7,6 +7,18 @@ pagination_prev: null
pagination_next: null
---

Acceleration data can be refreshed (updated) by:

- **API**: POST to `/v1/datasets/:name/acceleration/refresh`. See [Refresh Dataset HTTP API](/docs/api/HTTP/post-dataset-refresh.api.mdx).

- **Interval**: Time-based refresh interval. See [Refresh Interval](#refresh-on-demand).

- **Change Data Capture (CDC)**: CDC from a database using Debezium. See [Change Data Capture](/features/cdc/index.md).

- **Push**: Spice-to-Spice Push over Apache Arrow DoExchange.

![Spice.ai Open Source Acceleration Refresh](/img/features/acceleration-refresh.png).
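
The API refresh path listed above can be sketched as follows. This is a minimal sketch under stated assumptions: the runtime's HTTP endpoint is assumed to listen on `localhost:8090`, the dataset name `taxi_trips` is hypothetical, and the optional `refresh_sql` body field is an assumption named after the CLI's `--refresh-sql` flag. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the Spice runtime HTTP endpoint is reachable at this address.
SPICE_HTTP = "http://localhost:8090"


def build_refresh_request(dataset, refresh_sql=None, base_url=SPICE_HTTP):
    """Build (but do not send) a POST to /v1/datasets/:name/acceleration/refresh."""
    url = f"{base_url}/v1/datasets/{dataset}/acceleration/refresh"
    body = {}
    if refresh_sql is not None:
        # Assumed body field, mirroring the CLI's --refresh-sql flag.
        body["refresh_sql"] = refresh_sql
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_refresh_request("taxi_trips")  # hypothetical dataset name
print(req.get_method(), req.full_url)

# To trigger the refresh against a running runtime:
# urllib.request.urlopen(req)
```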

## Refresh Modes

Spice supports three modes to refresh/update local data from a connected data source. `full` is the default mode.
2 changes: 2 additions & 0 deletions website/docs/features/data-acceleration/index.md
@@ -8,6 +8,8 @@ pagination_prev: null

Datasets can be locally accelerated by the Spice runtime, pulling data from any [Data Connector](/docs/components/data-connectors) and storing it locally in a [Data Accelerator](/docs/components/data-accelerators) for faster access. The data can be kept up-to-date in real-time or on a refresh schedule, ensuring you always have the latest data locally for querying.

![Spice.ai Open Source Query Federation with Acceleration](/img/features/data-acceleration.png)

## Benefits

Local data acceleration stores data alongside your application, providing faster query times by eliminating network latency. This is especially beneficial for large query results, as data transfer over the network is avoided. Depending on the [Acceleration Engine](/docs/components/data-accelerators) used, data can also be stored in-memory, further reducing query times. [Indexes](./indexes.md) can be applied to speed up certain queries.
4 changes: 2 additions & 2 deletions website/docs/features/data-ingestion/index.md
@@ -2,7 +2,7 @@
title: 'Data Ingestion'
sidebar_label: 'Data Ingestion'
description: 'Learn how to ingest data in Spice.'
sidebar_position: 8
sidebar_position: 5
pagination_prev: null
pagination_next: null
---
@@ -46,7 +46,7 @@ datasets:

Start telegraf with the following config:

```
```toml
[[inputs.smart]]
attributes = true
[[outputs.opentelemetry]]
2 changes: 1 addition & 1 deletion website/docs/features/embeddings/index.md
@@ -2,7 +2,7 @@
title: 'Embedding Datasets'
sidebar_label: 'Embedding Datasets'
description: 'Learn how to define, or augment existing datasets with embedding column(s).'
sidebar_position: 11
sidebar_position: 7
pagination_prev: null
pagination_next: null
---
2 changes: 2 additions & 0 deletions website/docs/features/large-language-models/index.md
@@ -11,6 +11,8 @@ tags:

Spice provides a high-performance, OpenAI API-compatible AI Gateway optimized for managing and scaling large language models (LLMs). It offers tools for Enterprise Retrieval-Augmented Generation (RAG), such as SQL query across federated datasets and an advanced search feature (see [Search](/docs/features/search)).

![Spice.ai Large-Language-Model (LLM) AI-Gateway](/img/features/ai-gateway.png).

Spice supports **full OpenTelemetry observability**, helping with detailed tracking of model tool use, recursion, data flows and requests for full transparency and easier debugging.

## Configuring Language Models
Original file line number Diff line number Diff line change
@@ -1,13 +1,17 @@
---
title: 'Monitoring'
sidebar_label: 'Monitoring'
title: 'Observability & Monitoring'
sidebar_label: 'Observability'
description: 'Learn how to use Spice telemetry.'
sidebar_position: 9
sidebar_position: 10
pagination_prev: null
pagination_next: null
---

Spice can be monitored using the [Spice Prometheus-compatible Metrics Endpoint](https://prometheus.io/docs/instrumenting/exposition_formats/#basic-info). Monitoring clients configuration:
Spice can be monitored using the [Spice Prometheus-compatible Metrics Endpoint](https://prometheus.io/docs/instrumenting/exposition_formats/#basic-info).

![Spice.ai Open Source Monitoring & Observability](/img/features/observability.png)

Monitoring clients configuration:

- [Grafana](/docs/clients/grafana/)
- [Datadog](/docs/clients/Datadog/)
Original file line number Diff line number Diff line change
@@ -1,15 +1,19 @@
---
title: 'Federated Queries'
sidebar_label: 'Federated Queries'
description: 'Learn how to use federated queries in Spice.'
title: 'Query Federation'
sidebar_label: 'Query Federation'
description: 'Learn how to use federated SQL queries in Spice.ai Open Source'
sidebar_position: 1
pagination_prev: null
pagination_next: null
---

Spice supports federated queries, enabling you to join and combine data from multiple sources, including databases (PostgreSQL, MySQL), data warehouses (Databricks, Snowflake, BigQuery), and data lakes (S3, MinIO). For a full list of supported sources, see [Data Connectors](/docs/components/data-connectors/index.md).
Spice supports query federation, enabling you to join, combine, and query data using SQL from multiple sources, including databases (PostgreSQL, MySQL), data warehouses (Databricks, Snowflake, BigQuery), and data lakes (S3, MinIO).

### Getting Started
![Spice.ai Open Source Query Federation](/img/features/query-federation.png)

For a full list of supported sources, see [Data Connectors](/docs/components/data-connectors/index.md).

## Getting Started

To start using federated queries in Spice, follow these steps:

2 changes: 1 addition & 1 deletion website/docs/features/search/index.md
@@ -2,7 +2,7 @@
title: 'Search Functionality'
sidebar_label: 'Search'
description: 'Learn how Spice can search across datasets using database-native and vector-search methods.'
sidebar_position: 7
sidebar_position: 8
pagination_prev: null
pagination_next: null
tags:
2 changes: 1 addition & 1 deletion website/docs/features/semantic-model/index.md
@@ -2,7 +2,7 @@
title: 'Semantic Model'
sidebar_label: 'Semantic Model'
description: 'Learn how to define and use semantic data models with Spice.'
sidebar_position: 7
sidebar_position: 9
pagination_prev: null
pagination_next: null
---
8 changes: 3 additions & 5 deletions website/docs/index.mdx
@@ -1,19 +1,17 @@
---
sidebar_position: 0
title: Home
tags:
- home
- overview
- introduction
---

import ReactPlayer from 'react-player';
import ThemeBasedImage from '@site/src/components/ThemeBasedImage';

# Spice.ai OSS
# Spice.ai Open Source

**Spice** is an open-source SQL query and AI compute engine, written in Rust, for data-driven apps and agents.

![Spice.ai Open Source Data Query & AI-Inference Compute Engine](/img/spice.ai-compute-engine.png)

Spice provides three industry standard APIs in a lightweight, portable runtime (single ~140 MB binary):

1. **SQL Query APIs**: HTTP, Arrow Flight, Arrow Flight SQL, ODBC, JDBC, and ADBC.