spiceai · lukekim · Feb 7, 2025
diff --git a/website/docs/components/tools/index.md b/website/docs/components/tools/index.md
@@ -0,0 +1,54 @@
+---
+title: 'LLM Tools (Function Calling)'
+sidebar_label: 'LLM Tools'
+description: 'Overview of supported LLM tools (function calling) and how to define new tools'
+---
+
+A tool is a function or operation that can be called directly or by a [large language model](/docs/features/large-language-models) (LLM). The Spice runtime provides several tools by default, giving LLMs access to various parts of the runtime. Additional tools can be added or configured by declaring them in the `tools` section of `spicepod.yaml`.
+
+For details about providing LLMs tool access, see [Language Model Tools](/docs/features/large-language-models/tools).
+
+**Example**
+```yaml
+tools:
+  - name: arpanet
+    from: websearch
+    description: "Search the web for information."
+    params:
+      engine: perplexity
+      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
+
+```
+
+For details on tool specifications, see the [Tools Spicepod Reference](/docs/reference/spicepod/tools).
+
+### Available Tools
+
+| Name                      | Description                                                       | Default Group |
+| ------------------------- | ----------------------------------------------------------------- | ------------- |
+| `list_datasets`           | List all available datasets in the runtime.                       | `auto`        |
+| `sql`                     | Execute SQL queries on the runtime.                               | `auto`        |
+| `table_schema`            | Get the schema of a specific SQL table.                           | `auto`        |
+| `document_similarity`     | Retrieve documents based on an input query.                       | `auto`        |
+| `sample_distinct_columns` | Generate a synthetic sample of data with distinct values.         | `auto`        |
+| `random_sample`           | Sample random rows from a table.                                  | `auto`        |
+| `top_n_sample`            | Sample the top N rows from a table based on a specified ordering. | `auto`        |
+| `memory:load`             | Retrieve all stored memories from the last time period.           | `memory`      |
+| `memory:store`            | Store information from LLM interaction(s) for future reference.   | `memory`      |
+| [`websearch`][websearch]  | Search the web for information.                                   | -             |
+
+[websearch]: /docs/components/tools/websearch
+
+### Tool Groups
+Tool groups are predefined sets of tools that can be provided to an LLM under a single name. For example, the `auto` tool group provides all default tools to the LLM (see the table above).
+```yaml
+models:
+  - name: full-runtime
+    from: openai:gpt-4o
+    params:
+      tools: auto # Use all default tools
+```
+
+Available tool groups:
+ - `auto`: All default tools (see above table).
+ - `memory`: Memory tools for storing and retrieving information across conversations.
diff --git a/website/docs/components/tools/websearch.md b/website/docs/components/tools/websearch.md
@@ -0,0 +1,57 @@
+---
+title: 'Web Search Tool'
+sidebar_label: 'Websearch'
+---
+
+The Web Search Tool enables Spice models to search the web for information. It is available as the `websearch` tool and is backed by one of the supported search engines.
+
+## Usage
+```yaml
+tools:
+  - name: the_internet
+    from: websearch
+    description: "Search the web for information."
+    params:
+      engine: perplexity
+      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
+```
+
+## Configuration
+### `from`
+
+The `from` field specifies the tool to use. For the Web Search Tool, use `websearch`.
+
+### `name`
+
+The `name` field specifies the name of the tool. This name is used:
+ - To reference the tool in the model's `params.tools` field.
+ - To make HTTP requests to the tool via the [API](/docs/api/HTTP/post), at `v1/tools/{name}`.
+ - As the tool name provided to any language model that uses the tool.
+
+### `description`
+
+The `description` field describes the tool. This description is provided to any language model that uses the tool.
+
+### `params`
+
+The `params` field sets tool-specific parameters. The following parameters are supported:
+ - `engine`: The search engine to use. Possible values:
+   - `perplexity`: Use the Perplexity search engine.
+ - `<engine>_*`: Engine-specific parameters, prefixed with the engine name (for example, `perplexity_auth_token`). See [Search Engines](#search-engines) for the parameters each engine supports.
+
+## Search Engines
+- [Perplexity](https://perplexity.com): Powered by Perplexity [Sonar](https://docs.perplexity.ai/).
+
+### Perplexity
+To use Perplexity as the search engine, set the following parameters:
+ - `perplexity_auth_token` (required): The authentication token for the Perplexity API. Use the [secret replacement syntax](../secret-stores/index.md) to reference a secret, e.g. `${secrets:my_perplexity_auth_token}`. To get an authentication token, see Perplexity's [Getting Started](https://docs.perplexity.ai/guides/getting-started).
+ - `perplexity_return_images` (default: false): Whether a request should return images.
+ - `perplexity_return_related_questions` (default: false): Whether a request should return related questions.
+ - `perplexity_search_domain_filter`: A list of domains that limits the citations used by the online model to URLs from those domains. Currently limited to 3 domains for allowlisting and denylisting. To exclude (denylist) a domain, prefix the domain string with `-`.
+    - Example:
+      ```yaml
+      perplexity_search_domain_filter:
+        - spice.ai
+        - docs.perplexity.ai
+      ```
+ - `perplexity_search_recency_filter`: Returns search results within the specified time interval. Does not apply to images. One of: `month`, `week`, `day`, `hour`.
diff --git a/website/docs/features/large-language-models/memory.md b/website/docs/features/large-language-models/memory.md
@@ -33,7 +33,4 @@ models:
       tools: memory, sql # Can be combined with other tool groups
 ```
 
-## Available Tools
-
-- `store_memory`: Store important information for future reference
-- `load_memory`: Retrieve previously stored memories from the last time period.
+For more information on tools, see [Tool Components](/docs/components/tools).
diff --git a/website/docs/features/large-language-models/tools.md b/website/docs/features/large-language-models/tools.md
@@ -13,6 +13,8 @@ tags:
 
 Spice provides tools that help LLMs interact with the runtime. To specify these tools for a Spice model, include them in its `params.tools`.
 
+For a list of available tools and how to define additional tools, see [Tool Components](/docs/components/tools).
+
 ### Example: Specifying Tools for a Model
 
 ```yaml
@@ -21,21 +23,27 @@ models:
     from: openai:gpt-4o
     params:
       tools: list_datasets, sql, table_schema
+```
 
+### Example: Specifying Tools via a Tool Group
+```yaml
+models:
   - name: full-runtime
     from: openai:gpt-4o
     params:
       tools: auto # Use all default tools
 ```
 
-Additional tools can be appended:
+For details on tool groups, see [Tool Components](/docs/components/tools#tool-groups).
+
+### Example: Specifying Tools and Tool Groups
 
 ```yaml
 models:
   - name: full-runtime
     from: openai:gpt-4o
     params:
-      tools: auto, memory
+      tools: memory, sql
 ```
 
 ### Tool Recursion Limit
@@ -49,13 +56,3 @@ models:
     params:
       tool_recursion_limit: 3
 ```
-
-## Available tools
-
-- `list_datasets`: List all available datasets in the runtime.
-- `sql`: Execute SQL queries on the runtime.
-- `table_schema`: Get the schema of a specific SQL table.
-- `document_similarity`: For datasets with an embedding column, retrieve documents based on an input query. It is equivalent to [/v1/search](/docs/api/HTTP/post-search).
-- `sample_distinct_columns`: For a dataset, generate a synthetic sample of data whereby each column has at least a number of distinct values.
-- `random_sample`: Sample random rows from a table.
-- `top_n_sample`: Sample the top N rows from a table based on a specified ordering.
diff --git a/website/docs/reference/spicepod/tools.md b/website/docs/reference/spicepod/tools.md
@@ -0,0 +1,43 @@
+---
+title: 'Tools (Function Calling)'
+sidebar_label: 'Tools'
+description: 'Tools YAML reference'
+---
+
+Tools define functions that can be invoked within the Spice runtime, either directly or by a [large language model](/docs/features/large-language-models) (LLM). These tools provide access to different functionality and can be customized in the `tools` section of `spicepod.yaml`.
+
+## `tools`
+
+The `tools` section in your configuration specifies one or more tools available for use in the runtime.
+
+Example:
+
+```yaml
+tools:
+  - name: arpanet
+    from: websearch
+    description: "Search the web for information."
+    params:
+      engine: perplexity
+      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
+```
+
+### `name`
+
+A unique identifier for this tool.
+
+### `from`
+
+Defines the source of the tool, or the specific built-in tool to customize. See [Available Tools](/docs/components/tools#available-tools) for a list of available tools.
+
+### `description`
+
+Optional. A textual description of the tool's function.
+
+### `params`
+
+Optional. A map of key-value pairs for additional parameters specific to the tool.
+
+### `dependsOn`
+
+Optional. A list of dependencies that must be available before this tool can be used.