| 1 | +# Redis Cache Plugin for Bifrost |
| 2 | + |
| 3 | +This plugin provides Redis-based caching for Bifrost requests. It hashes each request body and returns the cached response when an identical request arrives again, which cuts latency and avoids repeated provider API calls. |
| 4 | + |
| 5 | +## Features |
| 6 | + |
| 7 | +- **High-Performance Hashing**: Uses xxhash for ultra-fast request body hashing |
| 8 | +- **Asynchronous Caching**: Non-blocking cache writes for optimal response times |
| 9 | +- **Response Caching**: Stores complete responses in Redis with configurable TTL |
| 10 | +- **Streaming Cache Support**: Caches and retrieves streaming responses chunk by chunk |
| 11 | +- **Cache Hit Detection**: Returns cached responses for identical requests |
| 12 | +- **Intelligent Cache Recovery**: Automatically reconstructs streaming responses from cached chunks |
| 13 | +- **Simple Setup**: Only requires Redis address and cache key - sensible defaults for everything else |
| 14 | +- **Self-Contained**: Creates and manages its own Redis client |
| 15 | + |
| 16 | +## Installation |
| 17 | + |
| 18 | +```bash |
| 19 | +go get github.com/maximhq/bifrost/core |
| 20 | +go get github.com/maximhq/bifrost/plugins/redis |
| 21 | +``` |
| 22 | + |
| 23 | +## Quick Start |
| 24 | + |
| 25 | +### Basic Setup |
| 26 | + |
| 27 | +```go |
| 28 | +import ( |
| 29 | + "github.com/maximhq/bifrost/plugins/redis" |
| 30 | + bifrost "github.com/maximhq/bifrost/core" |
| 31 | +) |
| 32 | + |
| 33 | +// Simple configuration - only Redis address and cache key are required! |
| 34 | +config := redis.RedisPluginConfig{ |
| 35 | + Addr: "localhost:6379", // Your Redis server address |
| 36 | + CacheKey: "x-my-cache-key", // Context key for cache identification |
| 37 | +} |
| 38 | + |
| 39 | +// Create the plugin |
| 40 | +plugin, err := redis.NewRedisPlugin(config, logger) |
| 41 | +if err != nil { |
| 42 | + log.Fatalf("Failed to create Redis plugin: %v", err) |
| 43 | +} |
| 44 | + |
| 45 | +// Use with Bifrost |
| 46 | +bifrostConfig := schemas.BifrostConfig{ |
| 47 | + Account: yourAccount, |
| 48 | + Plugins: []schemas.Plugin{plugin}, |
| 49 | + // ... other config |
| 50 | +} |
| 51 | +``` |
| 52 | + |
| 53 | +That's it! The plugin uses Redis client defaults for connection handling and these defaults for caching: |
| 54 | + |
| 55 | +- **TTL**: 5 minutes |
| 56 | +- **CacheByModel**: true (include model in cache key) |
| 57 | +- **CacheByProvider**: true (include provider in cache key) |
| 58 | + |
| 59 | +**Important**: You must provide the cache key in your request context for caching to work: |
| 60 | + |
| 61 | +```go |
| 62 | +ctx := context.WithValue(context.Background(), redis.ContextKey("x-my-cache-key"), "cache-value") |
| 63 | +response, err := client.ChatCompletionRequest(ctx, request) |
| 64 | +``` |
| 65 | + |
| 66 | +### With Password Authentication |
| 67 | + |
| 68 | +```go |
| 69 | +config := redis.RedisPluginConfig{ |
| 70 | + Addr: "localhost:6379", |
| 71 | + CacheKey: "x-my-cache-key", |
| 72 | + Password: "your-redis-password", |
| 73 | +} |
| 74 | +``` |
| 75 | + |
| 76 | +### With Custom TTL and Prefix |
| 77 | + |
| 78 | +```go |
| 79 | +config := redis.RedisPluginConfig{ |
| 80 | + Addr: "localhost:6379", |
| 81 | + CacheKey: "x-my-cache-key", |
| 82 | + TTL: time.Hour, // Cache for 1 hour |
| 83 | + Prefix: "myapp:cache:", // Custom prefix |
| 84 | +} |
| 85 | +``` |
| 86 | + |
| 87 | +### Per-Request TTL Override (via Context) |
| 88 | + |
| 89 | +You can override the cache TTL for individual requests by providing a TTL in the request context. Configure a `CacheTTLKey` on the plugin, then set a `time.Duration` value at that context key before making the request. |
| 90 | + |
| 91 | +```go |
| 92 | +// Configure plugin with a context key used to read per-request TTLs |
| 93 | +config := redis.RedisPluginConfig{ |
| 94 | + Addr: "localhost:6379", |
| 95 | + CacheKey: "x-my-cache-key", |
| 96 | + CacheTTLKey: "x-my-cache-ttl", // The context key for reading TTL |
| 97 | + TTL: 5 * time.Minute, // Fallback/default TTL |
| 98 | +} |
| 99 | + |
| 100 | +plugin, err := redis.NewRedisPlugin(config, logger) |
| 101 | +// ... init Bifrost client with plugin |
| 102 | + |
| 103 | +// Before making a request, set a per-request TTL |
| 104 | +ctx := context.WithValue(context.Background(), redis.ContextKey("x-my-cache-ttl"), 30*time.Second) |
| 105 | +resp, err := client.ChatCompletionRequest(ctx, request) |
| 106 | +``` |
| 107 | + |
| 108 | +Notes: |
| 109 | + |
| 110 | +- The context value must be of type `time.Duration`. If it is missing or of the wrong type, the plugin falls back to `config.TTL`. |
| 111 | +- This applies to both regular and streaming requests. For streaming, the same per-request TTL applies to all chunks. |
| 112 | + |
| 113 | +### With Different Database |
| 114 | + |
| 115 | +```go |
| 116 | +config := redis.RedisPluginConfig{ |
| 117 | + Addr: "localhost:6379", |
| 118 | + CacheKey: "x-my-cache-key", |
| 119 | + DB: 1, // Use Redis database 1 |
| 120 | +} |
| 121 | +``` |
| 122 | + |
| 123 | +### Streaming Cache Example |
| 124 | + |
| 125 | +```go |
| 126 | +// Configure plugin for streaming cache |
| 127 | +config := redis.RedisPluginConfig{ |
| 128 | + Addr: "localhost:6379", |
| 129 | + CacheKey: "x-stream-cache-key", |
| 130 | + TTL: 30 * time.Minute, // Cache streaming responses for 30 minutes |
| 131 | +} |
| 132 | + |
| 133 | +// Use with streaming requests |
| 134 | +ctx := context.WithValue(context.Background(), redis.ContextKey("x-stream-cache-key"), "stream-session-1") |
| 135 | +stream, err := client.ChatCompletionStreamRequest(ctx, request) |
| 136 | +// Subsequent identical requests will be served from cache as a reconstructed stream |
| 137 | +``` |
| 138 | + |
| 139 | +### Custom Cache Key Configuration |
| 140 | + |
| 141 | +```go |
| 142 | +config := redis.RedisPluginConfig{ |
| 143 | + Addr: "localhost:6379", |
| 144 | + CacheKey: "x-my-cache-key", |
| 145 | + CacheByModel: bifrost.Ptr(false), // Don't include model in cache key |
| 146 | + CacheByProvider: bifrost.Ptr(true), // Include provider in cache key |
| 147 | +} |
| 148 | +``` |
| 149 | + |
| 150 | +### Custom Redis Client Configuration |
| 151 | + |
| 152 | +```go |
| 153 | +config := redis.RedisPluginConfig{ |
| 154 | + Addr: "localhost:6379", |
| 155 | + CacheKey: "x-my-cache-key", |
| 156 | + PoolSize: 20, // Custom connection pool size |
| 157 | + DialTimeout: 5 * time.Second, // Custom connection timeout |
| 158 | + ReadTimeout: 3 * time.Second, // Custom read timeout |
| 159 | + ConnMaxLifetime: time.Hour, // Custom connection lifetime |
| 160 | +} |
| 161 | +``` |
| 162 | + |
| 163 | +## Configuration Options |
| 164 | + |
| 165 | +| Option | Type | Required | Default | Description | |
| 166 | +| ----------------- | --------------- | -------- | ----------------- | ----------------------------------- | |
| 167 | +| `Addr` | `string` | ✅ | - | Redis server address (host:port) | |
| 168 | +| `CacheKey` | `string` | ✅ | - | Context key for cache identification| |
| 169 | +| `Username` | `string` | ❌ | `""` | Username for Redis AUTH (Redis 6+) | |
| 170 | +| `Password` | `string` | ❌ | `""` | Password for Redis AUTH | |
| 171 | +| `DB` | `int` | ❌ | `0` | Redis database number | |
| 172 | +| `TTL` | `time.Duration` | ❌ | `5 * time.Minute` | Time-to-live for cached responses | |
| 173 | +| `Prefix` | `string` | ❌ | `""` | Prefix for cache keys | |
| 174 | +| `CacheByModel` | `*bool` | ❌ | `true` | Include model in cache key | |
| 175 | +| `CacheByProvider` | `*bool` | ❌ | `true` | Include provider in cache key | |
| 176 | + |
| 177 | +**Redis Connection Options** (all optional, Redis client uses its own defaults for zero values): |
| 178 | + |
| 179 | +- `PoolSize`, `MinIdleConns`, `MaxIdleConns` - Connection pool settings |
| 180 | +- `ConnMaxLifetime`, `ConnMaxIdleTime` - Connection lifetime settings |
| 181 | +- `DialTimeout`, `ReadTimeout`, `WriteTimeout` - Timeout settings |
| 182 | + |
| 183 | +All Redis configuration values are passed directly to the Redis client, which handles its own zero-value defaults. You only need to specify values you want to override from Redis client defaults. |
| 184 | + |
| 185 | +## How It Works |
| 186 | + |
| 187 | +The plugin generates an xxhash of the normalized request including: |
| 188 | + |
| 189 | +- Provider (if CacheByProvider is true) |
| 190 | +- Model (if CacheByModel is true) |
| 191 | +- Input (chat completion or text completion) |
| 192 | +- Parameters (includes tool calls) |
| 193 | + |
| 194 | +Identical requests will always produce the same hash, enabling effective caching. |
| 195 | + |
| 196 | +### Caching Flow |
| 197 | + |
| 198 | +#### Regular Requests |
| 199 | + |
| 200 | +1. **PreHook**: Checks Redis for cached response, returns immediately if found |
| 201 | +2. **PostHook**: Stores the response in Redis asynchronously (non-blocking) |
| 202 | +3. **Cleanup**: Clears all cached entries and closes connection on shutdown |
| 203 | + |
| 204 | +#### Streaming Requests |
| 205 | + |
| 206 | +1. **PreHook**: Checks Redis for cached chunks using pattern `{cache_key}_chunk_*` |
| 207 | +2. **Cache Hit**: Reconstructs stream from cached chunks in correct order |
| 208 | +3. **PostHook**: Stores each stream chunk with index: `{cache_key}_chunk_{index}` |
| 209 | +4. **Stream Reconstruction**: Subsequent requests get sorted chunks as a new stream |
| 210 | + |
| 211 | +**Asynchronous Caching**: Cache writes run in background goroutines with a 30-second timeout, so returning a response never waits on a Redis write. |
| 212 | + |
| 213 | +**Streaming Intelligence**: The plugin automatically detects streaming requests and handles chunk-based caching. Each chunk is stored with its index, allowing perfect reconstruction of the original stream order. |
| 214 | + |
| 215 | +### Cache Keys |
| 216 | + |
| 217 | +#### Regular Responses |
| 218 | + |
| 219 | +Cache keys follow the pattern: `{prefix}{cache_value}_{xxhash}` |
| 220 | + |
| 221 | +Example: `bifrost:cache:my-session_a1b2c3d4e5f6...` |
| 222 | + |
| 223 | +#### Streaming Responses |
| 224 | + |
| 225 | +Chunk keys follow the pattern: `{prefix}{cache_value}_{xxhash}_chunk_{index}` |
| 226 | + |
| 227 | +Examples: |
| 228 | + |
| 229 | +- `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_0` |
| 230 | +- `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_1` |
| 231 | +- `bifrost:cache:my-session_a1b2c3d4e5f6..._chunk_2` |
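
The key layout above can be sketched with standard-library tools. The hash here is FNV-1a, used purely as a stand-in for xxhash so the example has no third-party dependency. The field order, the separator bytes, and the hex formatting are assumptions, not the plugin's actual normalization.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashRequest hashes the request parts that take part in the cache key.
// FNV-1a is a standard-library stand-in for xxhash here (assumption).
func hashRequest(provider, model, input string, byProvider, byModel bool) string {
	h := fnv.New64a()
	if byProvider {
		h.Write([]byte(provider))
	}
	h.Write([]byte{0}) // separator so adjacent fields cannot run together
	if byModel {
		h.Write([]byte(model))
	}
	h.Write([]byte{0})
	h.Write([]byte(input))
	return fmt.Sprintf("%016x", h.Sum64())
}

// regularKey builds "{prefix}{cache_value}_{hash}".
func regularKey(prefix, cacheValue, hash string) string {
	return prefix + cacheValue + "_" + hash
}

// chunkKey builds "{base}_chunk_{index}" for one streaming chunk.
func chunkKey(baseKey string, index int) string {
	return fmt.Sprintf("%s_chunk_%d", baseKey, index)
}

func main() {
	hash := hashRequest("openai", "gpt-4o", "hello", true, true)
	base := regularKey("bifrost:cache:", "my-session", hash)
	fmt.Println(base)
	fmt.Println(chunkKey(base, 0))
}
```

With `byModel` set to false, the model name never reaches the hash, so requests that differ only in model share one cache entry. The `CacheByModel` and `CacheByProvider` options control the same trade-off.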
| 232 | + |
| 233 | +## Manual Cache Invalidation |
| 234 | + |
| 235 | +You can invalidate specific cached entries at runtime using the method `ClearCacheForKey(key string)` on the concrete `redis.Plugin` type. This deletes the provided key and, if it corresponds to a streaming response, all of its chunk entries (`<key>_chunk_*`). |
| 236 | + |
| 237 | +### Getting the cache key from responses |
| 238 | + |
| 239 | +- **Regular responses**: When a response is served from cache, the plugin adds metadata to `response.ExtraFields.RawResponse`: |
| 240 | + - `bifrost_cached: true` |
| 241 | + - `bifrost_cache_key: "<prefix><cache_value>_<xxhash>"` |
| 242 | + Use this `bifrost_cache_key` as the argument to `ClearCacheForKey`. |
| 243 | + |
| 244 | +- **Streaming responses**: Cached stream chunks include `bifrost_cache_key` for the specific chunk, in the form `"<base>_chunk_{index}"`. To invalidate the entire stream cache, strip the `"_chunk_{index}"` suffix to obtain the base key and pass that base key to `ClearCacheForKey`. |
| 245 | + |
| 246 | +### Examples |
| 247 | + |
| 248 | +```go |
| 249 | +// Non-streaming: clear the cached response you just used |
| 250 | +resp, err := client.ChatCompletionRequest(ctx, req) |
| 251 | +if err != nil { |
| 252 | + // handle error |
| 253 | +} |
| 254 | + |
| 255 | +if resp != nil && resp.ExtraFields.RawResponse != nil { |
| 256 | + if raw, ok := resp.ExtraFields.RawResponse.(map[string]interface{}); ok { |
| 257 | + if key, ok := raw["bifrost_cache_key"].(string); ok { // checked assertion: no panic on a non-string value |
| 258 | + if pluginImpl, ok := plugin.(*redis.Plugin); ok { |
| 259 | + _ = pluginImpl.ClearCacheForKey(key) |
| 260 | + } |
| 261 | + } |
| 262 | + } |
| 263 | +} |
| 264 | +``` |
| 265 | + |
| 266 | +```go |
| 267 | +// Streaming: clear all chunks for a cached stream |
| 268 | +for msg := range stream { |
| 269 | + if msg.BifrostResponse == nil { |
| 270 | + continue |
| 271 | + } |
| 272 | + raw := msg.BifrostResponse.ExtraFields.RawResponse |
| 273 | + rawMap, ok := raw.(map[string]interface{}) |
| 274 | + if !ok { |
| 275 | + continue |
| 276 | + } |
| 277 | + // Checked assertion: skip chunks whose key is missing or not a string |
| 278 | + chunkKey, ok := rawMap["bifrost_cache_key"].(string) // e.g., "<base>_chunk_3" |
| 279 | + if !ok { |
| 280 | + continue |
| 281 | + } |
| 282 | + |
| 283 | + // Derive base key by removing the trailing "_chunk_{index}" part |
| 284 | + baseKey := chunkKey |
| 285 | + if idx := strings.LastIndex(chunkKey, "_chunk_"); idx != -1 { |
| 286 | + baseKey = chunkKey[:idx] |
| 287 | + } |
| 288 | + |
| 289 | + if pluginImpl, ok := plugin.(*redis.Plugin); ok { |
| 290 | + _ = pluginImpl.ClearCacheForKey(baseKey) |
| 291 | + } |
| 292 | + break // we only need one chunk to compute the base key |
| 293 | +} |
| 294 | +``` |
| 295 | + |
| 296 | +To clear all entries managed by this plugin (by prefix), call `Cleanup()` during shutdown: |
| 297 | + |
| 298 | +```go |
| 299 | +_ = plugin.(*redis.Plugin).Cleanup() |
| 300 | +``` |
| 301 | + |
| 302 | +## Testing |
| 303 | + |
| 304 | +The plugin includes comprehensive tests for both regular and streaming cache functionality. |
| 305 | + |
| 306 | +Run the tests with a Redis instance running: |
| 307 | + |
| 308 | +```bash |
| 309 | +# Start Redis (using Docker) |
| 310 | +docker run -d -p 6379:6379 redis:latest |
| 311 | + |
| 312 | +# Run all tests |
| 313 | +go test -v |
| 314 | + |
| 315 | +# Run specific tests |
| 316 | +go test -run TestRedisPlugin -v # Test regular caching |
| 317 | +go test -run TestRedisPluginStreaming -v # Test streaming cache |
| 318 | +``` |
| 319 | + |
| 320 | +Tests will be skipped if Redis is not available. The tests validate: |
| 321 | + |
| 322 | +- Cache hit/miss behavior |
| 323 | +- Performance improvements (cache should be significantly faster) |
| 324 | +- Content integrity (cached responses match originals) |
| 325 | +- Streaming chunk ordering and reconstruction |
| 326 | +- Provider information preservation |
| 327 | + |
| 328 | +## Performance Benefits |
| 329 | + |
| 330 | +- **Reduced API Calls**: Identical requests are served from cache |
| 331 | +- **Ultra-Low Latency**: Cache hits return immediately, cache writes are non-blocking |
| 332 | +- **Streaming Efficiency**: Cached streams are reconstructed and delivered faster than original API calls |
| 333 | +- **Cost Savings**: Fewer API calls to expensive LLM providers |
| 334 | +- **Improved Reliability**: Cached responses stay available for identical requests, until their TTL expires, even if the provider is down |
| 335 | +- **High Throughput**: Asynchronous caching doesn't impact response times |
| 336 | +- **Perfect Stream Fidelity**: Cached streams maintain exact chunk ordering and content |
| 337 | + |
| 338 | +## Error Handling |
| 339 | + |
| 340 | +The plugin is designed to fail gracefully: |
| 341 | + |
| 342 | +- If Redis is unavailable during startup, plugin creation fails with a clear error |
| 343 | +- If Redis becomes unavailable during operation, requests continue without caching |
| 344 | +- If cache retrieval fails, requests proceed normally |
| 345 | +- If cache storage fails asynchronously, responses are unaffected (already returned) |
| 346 | +- Malformed cached data is ignored and requests proceed normally |
| 347 | +- Cache operations have timeouts to prevent resource leaks |
| 348 | + |
| 349 | +## Best Practices |
| 350 | + |
| 351 | +1. **Start Simple**: Use only `Addr` and `CacheKey` - let defaults handle the rest |
| 352 | +2. **Choose meaningful cache keys**: Use descriptive context keys that identify cache sessions |
| 353 | +3. **Set appropriate TTL**: Balance between cache efficiency and data freshness |
| 354 | +4. **Use meaningful prefixes**: Helps organize cache keys in shared Redis instances |
| 355 | +5. **Monitor Redis memory**: Track cache usage, especially for streaming responses (more chunks = more storage) |
| 356 | +6. **Context management**: Always provide cache key in request context for caching to work |
| 357 | +7. **Use `bifrost.Ptr()`**: For boolean pointer configuration options |
| 358 | +8. **Streaming considerations**: Longer streams create more cache entries, adjust TTL accordingly |
| 359 | + |
| 360 | +## Security Considerations |
| 361 | + |
| 362 | +- **Sensitive Data**: Be cautious about caching responses containing sensitive information |
| 363 | +- **Redis Security**: Use authentication and network security for Redis |
| 364 | +- **Data Isolation**: Use different Redis databases or prefixes for different environments |