This repository was archived by the owner on Oct 30, 2024. It is now read-only.

feat(breaking): EmbeddingModelProviderConfig #44

Merged 12 commits on Jul 24, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -39,3 +39,6 @@ tests/venv/
tests/__pycache__/

.DS_Store


vendor/
2 changes: 1 addition & 1 deletion Makefile
@@ -12,7 +12,7 @@ endif
GO_TAGS := netgo
LD_FLAGS := -s -w -X github.com/gptscript-ai/knowledge/version.Version=${GIT_TAG}
build:
-	go build -o bin/knowledge -tags "${GO_TAGS}" -ldflags '$(LD_FLAGS) ' .
+	go build -mod=mod -o bin/knowledge -tags "${GO_TAGS}" -ldflags '$(LD_FLAGS) ' .

run: build
bin/knowledge server
53 changes: 53 additions & 0 deletions docs/docs/04-configfile.md
@@ -0,0 +1,53 @@
---
title: Config File
---

# Config File

::: warning

The config file format is subject to change as it's still in development.

:::

::: note

This global configuration file is independent from the [flow configuration files](11-flows/01-overview.md#flow-config-file---flows-file).

:::

## Usage

To use the config file, pass `-c <path>` or `--config-file <path>` to the knowledge CLI on [supported commands](99-cmd/knowledge.md).
You can also set the path to the config file via the `KNOW_CONFIG_FILE` environment variable.

## Configuration Overview

The following example aims to cover all supported configuration items in one place.

::: note

You can write the config in YAML or JSON format.
You can find some example config files in the [GitHub repository](https://github.com/gptscript-ai/knowledge/blob/main/examples/configfiles).

:::

```yaml
embeddings:
  provider: vertex # this selects one of the below providers
  cohere:
    apiKey: "${COHERE_API_KEY}" # environment variables are expanded when reading the config file
    model: "embed-english-v2.0"
  openai:
    apiKey: "${OPENAI_API_KEY}"
    embeddingEndpoint: "/some-custom-endpoint" # anything that's not the default /embeddings
  vertex:
    apiKey: "${GOOGLE_API_KEY}"
    project: "acorn-io"
    model: "text-embedding-004"
```

### Sections

- `embeddings`: See [Embedding Models](05-embedding_models.md) for more details.
  - `provider`: Can also be set using the command line flag `--embedding-model-provider` or the environment variable `KNOW_EMBEDDING_MODEL_PROVIDER` (default: `openai`).
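
The provider can therefore come from the flag, the environment variable, or the config file. The sketch below shows one plausible way to resolve it. The precedence order (flag, then environment, then config file, then the default `openai`) is an assumption made for illustration and is not taken from the tool's source; `resolveProvider` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
)

// defaultProvider is the documented default.
const defaultProvider = "openai"

// resolveProvider returns the first non-empty value among the flag,
// environment, and config file values, falling back to the default.
// The ordering is an assumed precedence, not the tool's confirmed behavior.
func resolveProvider(flagVal, envVal, fileVal string) string {
	for _, v := range []string{flagVal, envVal, fileVal} {
		if v != "" {
			return v
		}
	}
	return defaultProvider
}

func main() {
	env := os.Getenv("KNOW_EMBEDDING_MODEL_PROVIDER")
	// No flag given; the config file (hypothetically) selects "vertex".
	fmt.Println(resolveProvider("", env, "vertex"))
}
```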
53 changes: 0 additions & 53 deletions docs/docs/04-embedding_models.md

This file was deleted.

118 changes: 118 additions & 0 deletions docs/docs/05-embedding_models.md
@@ -0,0 +1,118 @@
---
title: Embedding Models
---

# Embedding Models

## Generate Embeddings

Embeddings are generated automatically when a document is ingested.
Currently, this is handled by the vector store implementation ([chromem-go](https://github.com/philippgille/chromem-go)).


## Choosing an Embedding Model Provider

The knowledge tool supports multiple embedding model providers, which you can configure via the [global config file](04-configfile.md#configuration-overview) or via environment variables.
You can choose which of your configured providers to use by setting the `KNOW_EMBEDDING_MODEL_PROVIDER` environment variable or using the `--embedding-model-provider` flag.

::: note

The default provider is **OpenAI**.

:::

### [OpenAI](https://openai.com/) + [Azure](https://ai.azure.com/)

OpenAI and Azure are configured via a single provider configuration to make the configuration similar to the one used by GPTScript.

| Environment Variable | Config Key | Default | Notes |
|---------------------------|------------------|-----------------------------|---------------------------------------|
| `OPENAI_BASE_URL` | `baseURL` | `https://api.openai.com/v1` | --- |
| `OPENAI_API_KEY` | `apiKey` | `sk-foo` | **required** |
| `OPENAI_EMBEDDING_MODEL` | `embeddingModel` | `text-embedding-ada-002` | --- |
| `OPENAI_API_TYPE` | `apiType` | `OPEN_AI` | one of `OPEN_AI`, `AZURE`, `AZURE_AD` |
| `OPENAI_API_VERSION` | `apiVersion` | `2024-02-01` | for **Azure** |
| `OPENAI_AZURE_DEPLOYMENT` | | --- | **required** for **Azure** |


#### OpenAI Compatible Providers

Some providers that are compatible with the OpenAI API have first-class support and are listed on this page.
If yours is not listed, you can still try to configure it using the OpenAI provider configuration, as shown below for LM-Studio and Ollama.

<details>
<summary id="example-configurations-lm-studio"><strong>LM-Studio</strong></summary>

LM-Studio failed to return any embeddings when requests were sent concurrently, so this example sets the number of parallel threads to 1.
This may change in the future. Tested with LM-Studio v0.2.27.


```dotenv
export OPENAI_BASE_URL=http://localhost:1234/v1
export OPENAI_API_KEY=lm-studio
export OPENAI_EMBEDDING_MODEL="CompendiumLabs/bge-large-en-v1.5-gguf"
export VS_CHROMEM_EMBEDDING_PARALLEL_THREAD="1"
```

::: note

Running with `VS_CHROMEM_EMBEDDING_PARALLEL_THREAD="1"` can be very slow for a large number of files (or for very large files).

:::

</details>
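
The LM-Studio note above limits embedding requests to one at a time. How `VS_CHROMEM_EMBEDDING_PARALLEL_THREAD` is implemented is not shown in this document, so the following is only a generic sketch of bounded concurrency in Go using a buffered channel as a semaphore. `embedAll` and the fake `embed` function are hypothetical stand-ins for real embedding calls.

```go
package main

import (
	"fmt"
	"sync"
)

// embedAll calls embed for every text while running at most `parallel`
// calls at the same time. With parallel == 1 the calls are serialized.
func embedAll(texts []string, parallel int, embed func(string) []float32) [][]float32 {
	if parallel < 1 {
		parallel = 1
	}
	results := make([][]float32, len(texts))
	sem := make(chan struct{}, parallel) // one slot per allowed in-flight call
	var wg sync.WaitGroup

	for i, t := range texts {
		wg.Add(1)
		sem <- struct{}{} // blocks while `parallel` calls are in flight
		go func(i int, t string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done
			results[i] = embed(t)    // each goroutine writes its own index
		}(i, t)
	}
	wg.Wait()
	return results
}

func main() {
	// Fake embedding: a one-element vector holding the text length.
	fake := func(s string) []float32 { return []float32{float32(len(s))} }
	fmt.Println(embedAll([]string{"a", "bb", "ccc"}, 1, fake))
}
```

With a limit of 1, total time grows linearly with the number of chunks, which matches the warning that ingestion can become slow for large inputs.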

<details>
<summary id="example-configurations-ollama"><strong>Ollama</strong></summary>

Tested with Ollama v0.2.6 (pre-release that introduced OpenAI API compatibility).


```dotenv
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_EMBEDDING_MODEL="mxbai-embed-large"
```

</details>

### [Cohere](https://cohere.com/)

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|----------------------|--------------|
| `COHERE_API_KEY` | `apiKey` | --- | **required** |
| `COHERE_MODEL` | `model` | `embed-english-v3.0` | --- |

### [Jina](https://jina.ai/)

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|------------------------------|--------------|
| `JINA_API_KEY` | `apiKey` | --- | **required** |
| `JINA_MODEL` | `model` | `jina-embeddings-v2-base-en` | --- |

### [LocalAI](https://localai.io/)

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|----------------------|-------|
| `LOCALAI_MODEL` | `model` | `bert-cpp-minilm-v6` | --- |

### [Mistral](https://mistral.ai/)

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|---------|--------------|
| `MISTRAL_API_KEY` | `apiKey` | --- | **required** |

### [Mixedbread](https://www.mixedbread.ai/)

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|--------------------|--------------|
| `MIXEDBREAD_API_KEY` | `apiKey` | --- | **required** |
| `MIXEDBREAD_MODEL` | `model` | `all-MiniLM-L6-v2` | --- |

### [Ollama](https://ollama.com/)

Requires Ollama v0.2.6 or later.

| Environment Variable | Config Key | Default | Notes |
|----------------------|------------|-----------------------------|-------|
| `OLLAMA_BASE_URL` | `baseURL` | `http://localhost:11434/v1` | --- |
| `OLLAMA_MODEL` | `model` | `mxbai-embed-large` | --- |
13 changes: 13 additions & 0 deletions examples/configfiles/embedding_provider.yaml
@@ -0,0 +1,13 @@
embeddings:
  provider: vertex # this selects one of the below
  cohere:
    apiKey: "${COHERE_API_KEY}" # environment variables are expanded when reading the config file
    model: "embed-english-v2.0"
  openai:
    apiKey: "${OPENAI_API_KEY}"
    embeddingEndpoint: "/some-custom-endpoint" # anything that's not the default /embeddings
  vertex:
    apiKey: "${GOOGLE_API_KEY}"
    project: "acorn-io"
    # apiEndpoint: https://us-central1-aiplatform.googleapis.com
    model: "text-embedding-004"
17 changes: 16 additions & 1 deletion go.mod
@@ -4,7 +4,10 @@ go 1.22.3

toolchain go1.22.4

-replace github.com/tmc/langchaingo => github.com/StrongMonkey/langchaingo v0.0.0-20240617180437-9af4bee04c8b // Context-Aware Markdown Splitting
+replace (
+	github.com/philippgille/chromem-go => github.com/iwilltry42/chromem-go v0.0.0-20240724101255-75d217fcc704 // Vertex Provider
+	github.com/tmc/langchaingo => github.com/StrongMonkey/langchaingo v0.0.0-20240617180437-9af4bee04c8b // Context-Aware Markdown Splitting
+)

require (
code.sajari.com/docconv/v2 v2.0.0-pre.4
@@ -21,6 +24,12 @@ require (
github.com/glebarez/sqlite v1.11.0
github.com/google/uuid v1.6.0
github.com/hupe1980/golc v0.0.112
github.com/joho/godotenv v1.5.1
github.com/knadh/koanf/parsers/json v0.1.0
github.com/knadh/koanf/parsers/yaml v0.1.0
github.com/knadh/koanf/providers/env v0.1.0
github.com/knadh/koanf/providers/rawbytes v0.1.0
github.com/knadh/koanf/v2 v2.1.1
github.com/lu4p/cat v0.1.5
github.com/mitchellh/mapstructure v1.5.0
github.com/philippgille/chromem-go v0.6.1-0.20240703185604-935ec301a0a4
@@ -83,9 +92,12 @@ require (
github.com/go-playground/universal-translator v0.18.1 // indirect
github.com/go-playground/validator/v10 v10.20.0 // indirect
github.com/go-resty/resty/v2 v2.3.0 // indirect
github.com/go-viper/mapstructure/v2 v2.0.0-alpha.1 // indirect
github.com/gobwas/ws v1.2.1 // indirect
github.com/goccy/go-json v0.10.2 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/google/go-querystring v1.1.0 // indirect
github.com/google/pprof v0.0.0-20230926050212-f7f687d19a98 // indirect
github.com/googleapis/gax-go/v2 v2.12.4 // indirect
github.com/gorilla/css v1.0.0 // indirect
github.com/hupe1980/go-promptlayer v0.0.6 // indirect
@@ -101,13 +113,16 @@ require (
github.com/kevinburke/ssh_config v1.2.0 // indirect
github.com/klauspost/compress v1.17.6 // indirect
github.com/klauspost/cpuid/v2 v2.2.7 // indirect
github.com/knadh/koanf/maps v0.1.1 // indirect
github.com/ledongthuc/pdf v0.0.0-20240201131950-da5b75280b06 // indirect
github.com/leodido/go-urn v1.4.0 // indirect
github.com/levigross/exp-html v0.0.0-20120902181939-8df60c69a8f5 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-runewidth v0.0.15 // indirect
github.com/microcosm-cc/bluemonday v1.0.26 // indirect
github.com/mitchellh/copystructure v1.2.0 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/olekukonko/tablewriter v0.0.5 // indirect