Commit

0001: Service Config (#2)
* #1 Inited the proposal

* #1 Adding the motivation and important context for configurations

* #1 Updated the template to suggest RFC2119 verbs

* #1 Added design detail

* #1 Fixed type

* #1 Added references

* #1 The initial raw config structure proposal

* #1 Fixed typos

* #1 Updated the main config structure example & added an example of the most minimalistic config possible. Elaborated the reqs. Added rejected alternatives & future work

* #1 Added impl link
roma-glushko authored Dec 27, 2023
1 parent d1a614d commit d4d7f89
Showing 4 changed files with 233 additions and 1 deletion.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
.idea
tmp
1 change: 1 addition & 0 deletions README.md
@@ -15,5 +15,6 @@ To start a new proposal, you need to:

| GEP | Title |
|---------------------------------|----------------|
| [0001](./proposals/0001-gep.md) | Service Config |


230 changes: 230 additions & 0 deletions proposals/0001-gep.md
@@ -0,0 +1,230 @@
---
GEP: 1
Title: Service Config
Discussion: N/A
Implementation: https://github.com/modelgateway/glide/pull/37
---

# Service Config

## Abstract

Glide will expose many configuration options that control both functional and non-functional aspects of the service.
This GEP defines how that configuration is set, structured, and maintained across the service's lifetime.

## Motivation

Glide is a gateway that exposes many configuration options to set, manage, and maintain.
These options provide the flexibility needed to meet the varied needs of our end users.

For example:

- **Telemetry Configs**: Log Output Format, Min Log Level, OTEL Configs
- **HTTP Server Configs**: Is enabled?, IP and Port to bind, max body size, CORS, etc.
- **SSL/TLS Configs**: Ciphers, Cert Paths, etc.
- **LLM Provider Configs**: API Keys, Default Parameters, Timeouts, Connection Pool Configs, etc.
- **Routing and Load Balancing Configs**: Balancing Strategy, Routing Weights, Provider Pool Definitions, etc.
- **Caching Configs**: Redis Connection Configs, TTLs, etc.
- **gRPC Server Configs**: Is enabled?, IP and Port to bind, etc.
- etc.

The configuration may be versioned via git to track and audit all changes.
At the same time, some config values are sensitive (notably, LLM API keys)
and must not be provided directly in the config file; otherwise, they would leak into git.

As with any other config, users should be able to apply changes to it easily.
For example, API keys may need to be rotated from time to time.

Finally, the more configs we expose, the harder it may become to understand the overall config
and find the right setting to adjust. A clear config structure is therefore important to reduce this fatigue.

### Requirements

- R1: Config format must be readable
- R2: Config format must allow comments
- R3: Config format must be `git diff`-able and versioned
- R4: Config must be easy to understand and reason about
- R5: Config must allow including sensitive values from environment variables
- R6: Config may allow including sensitive values from other plain files
- R7: Config should be easy to change and operate, for example via "hitless" reloads

## Design

There are a few possible sources of configuration:
- via Glide CLI params & args
- via environment variables
- via a separate config file
- via a combination of CLI, environment variables, and/or config file

Setting complex nested configs via CLI or environment variables is usually a tedious, error-prone process.
So Glide will rely on the config-file approach for as long as possible.

### The Config Source

Glide config is represented as a YAML file (R1, R2, R3).
YAML is a human-friendly superset of JSON that supports comments, and the config loader can layer useful extras on top of it, such as including values from environment variables.
Go has good library support for the YAML format.

### The Config Structure

The config structure follows Glide's internal logical structure, which we are going to communicate in the Glide docs (R4).
So learning more about how Glide works should give enough context to understand the config.
The reverse should hold as well: knowing what the config looks like should give some understanding of Glide's overall design.
This is the route the OTEL Collector config takes.

Glide should allow you to pull config values from environment variables (see the config example below).
This keeps sensitive information out of the config file (R5) and so avoids leaks.

### Config Ops

Glide should be able to automatically detect changes to the config (especially our domain-related configs, such as routing) and
reload its internal representation together with the related state (R7).
In Kubernetes, the config will be represented as a ConfigMap mounted into Glide pods.
When the ConfigMap changes, the mounted file changes as well, so Glide needs to watch that file for changes.
Outside of Kubernetes, we may support config reloading on SIGHUP signals.

### Config Example

The most minimal config would look like this:

```yaml
routes:
  language:
    - name: openai-pool
      providers:
        - name: openai-boring
          provider: openai
          model: gpt-3.5-turbo
          api_key: ${env:OPENAI_API_KEY}
          default_params:
            temperature: 0
```

Here is a richer config example to give a better sense of the config structure:

```yaml
telemetry:
  logging:
    level: info
    encoding: console # console, json
  # other configs

api:
  http:
    listen_addr: 0.0.0.0:7685
    max_body_size: "2Mi"
    tls:
      ca_path:
      cert_path:
    # other configs

  grpc:
    listen_addr: 0.0.0.0:7686
    tls:
      ca_path:
      cert_path:
    # other configs

routes:
  language:
    - name: openai-pool
      balancing: priority # round-robin, weighted-round-robin, priority, least-latency, etc.
      providers:
        - name: openai-boring
          provider: openai # anthropic, azureopenai, gemini, other providers we support
          model: gpt-3.5-turbo
          api_key: ${env:OPENAI_API_KEY}
          default_params:
            temperature: 0
        - name: anthropic-boring
          provider: anthropic
          model: claude-2
          api_key: ${env:ANTHROPIC_API_KEY}
          default_params: # set the default request params
            temperature: 0

    - name: latency-critical-pool
      balancing: least-latency
      providers:
        - name: openai-boring
          provider: openai
          model: gpt-3.5-turbo
          api_key: ${env:OPENAI_API_KEY}
          timeout_ms: 200
        - name: anthropic-boring
          provider: anthropic
          api_key: ${env:ANTHROPIC_API_KEY}
          timeout_ms: 200

    - name: anthropic
      balancing: weighted-round-robin
      providers:
        - name: openai
          provider: openai
          api_key: ${env:OPENAI_API_KEY}
          weight: 30
        - name: anthropic
          provider: anthropic
          api_key: ${env:ANTHROPIC_API_KEY}
          weight: 70

    - name: ab-test1
      balancing: weighted-round-robin
      providers:
        - name: openai-gpt4
          provider: openai
          model: gpt-4
          api_key: ${env:OPENAI_API_KEY}
          weight: 50
          default_params:
            temperature: 0.7

        - name: openai-chatgpt
          provider: openai
          model: gpt-3.5-turbo
          api_key: ${env:OPENAI_API_KEY}
          weight: 50
          default_params:
            temperature: 0.7

        # we want to use OpenAI only in this A/B test, but that's bad resiliency wise
        # so add Anthropic just in case OpenAI is down
        - name: anthropic-fallback
          provider: anthropic
          model: claude-2
          api_key: ${env:ANTHROPIC_API_KEY}
          weight: 0

  # in case of speech-to-text models
  # transcribers:
  # - ...

  # in case of text-to-speech models
  # synthesizers:
  # - ...

  # in case of embeddings API
  # embeddings:
  # - ...

```
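
On the Go side, the `routes` part of the examples above could map onto plain structs along these lines. This is only a sketch: the type names, the `yaml` struct tags (which follow the convention of common Go YAML libraries), and the validation rules in `Validate` are assumptions made for illustration, and the decoding step itself is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider mirrors one entry under `providers` in the example config.
type Provider struct {
	Name          string         `yaml:"name"`
	Provider      string         `yaml:"provider"`
	Model         string         `yaml:"model"`
	APIKey        string         `yaml:"api_key"`
	Weight        int            `yaml:"weight"`
	TimeoutMs     int            `yaml:"timeout_ms"`
	DefaultParams map[string]any `yaml:"default_params"`
}

// Pool mirrors one entry under `routes.language`.
type Pool struct {
	Name      string     `yaml:"name"`
	Balancing string     `yaml:"balancing"`
	Providers []Provider `yaml:"providers"`
}

// Routes and Config mirror the top-level `routes` section.
type Routes struct {
	Language []Pool `yaml:"language"`
}

type Config struct {
	Routes Routes `yaml:"routes"`
}

// Validate reports every structural problem it finds instead of stopping at
// the first one. The rules below are assumptions of this sketch: each pool
// needs a name and at least one provider, and each provider needs a name and
// a provider type.
func (c Config) Validate() error {
	var errs []error

	for i, pool := range c.Routes.Language {
		if pool.Name == "" {
			errs = append(errs, fmt.Errorf("routes.language[%d]: name is required", i))
		}

		if len(pool.Providers) == 0 {
			errs = append(errs, fmt.Errorf("routes.language[%d]: at least one provider is required", i))
		}

		for j, p := range pool.Providers {
			if p.Name == "" || p.Provider == "" {
				errs = append(errs, fmt.Errorf("routes.language[%d].providers[%d]: name and provider are required", i, j))
			}
		}
	}

	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

func main() {
	// The minimal config from above, built directly in Go.
	cfg := Config{Routes: Routes{Language: []Pool{{
		Name: "openai-pool",
		Providers: []Provider{{
			Name:          "openai-boring",
			Provider:      "openai",
			Model:         "gpt-3.5-turbo",
			APIKey:        "${env:OPENAI_API_KEY}",
			DefaultParams: map[string]any{"temperature": 0},
		}},
	}}}}

	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}

	fmt.Println("config is valid")
}
```

Keeping the structs close to the YAML layout supports R4: anyone who has read the config can find the matching type, and vice versa.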

### References

- [OTEL Collector: Config Sample](https://opentelemetry.io/docs/collector/configuration/)
- [MLFlow AI Gateway: Configuration](https://mlflow.org/docs/latest/llms/gateway/index.html#ai-gateway-configuration)
- [etcd: Configuration Options](https://etcd.io/docs/v3.4/op-guide/configuration/)
- [etcd: Config Sample](https://github.com/etcd-io/etcd/blob/release-3.4/etcd.conf.yml.sample)
- [Portkey: Config Sample](https://docs.portkey.ai/portkey-docs/portkey-features/ai-gateway/load-balancing)
- [HashiCorp Vault: Configurations](https://developer.hashicorp.com/vault/docs/configuration)
- [Prometheus AlertManager: Config](https://prometheus.io/docs/alerting/latest/configuration/)
- [Kong: Config](https://github.com/Kong/kong/blob/master/kong.conf.default)

## Alternatives Considered

- **Using JSON as config format**. We could still use JSON as a secondary config format (OTEL Collector does that), but it isn't developer-friendly enough to be our primary config format (it fails R1 and R2, and it is not strong on R3).

## Future Work

- If our users want that, we could consider supporting more than one config format, such as JSON or TOML.
- Pull configs from sources other than the filesystem (for example, fetching them via HTTP). We need some signals from the community to decide whether this is useful.
2 changes: 1 addition & 1 deletion template.md
@@ -17,7 +17,7 @@ Implementation: Link

### Requirements

- [TBU, list all key requirements to keep in mind]
+ [TBU, list all key requirements to keep in mind, use https://datatracker.ietf.org/doc/html/rfc2119 verbs]

## Design
