
Commit 02b877d

Songbird99, claude, and tduhamel42 authored
Feature/litellm proxy (#27)
* feat: seed governance config and responses routing
* Add env-configurable timeout for proxy providers
* Integrate LiteLLM OTEL collector and update docs
* Make .env.litellm optional for LiteLLM proxy
* Add LiteLLM proxy integration with model-agnostic virtual keys

  Changes:
  - Bootstrap generates 3 virtual keys with individual budgets (CLI: $100, Task-Agent: $25, Cognee: $50)
  - Task-agent loads config at runtime via an entrypoint script that waits for bootstrap completion
  - All keys are model-agnostic by default (no LITELLM_DEFAULT_MODELS restrictions)
  - Bootstrap handles database/env mismatch after docker prune by deleting stale aliases
  - CLI and Cognee configured to use the LiteLLM proxy with virtual keys
  - Added comprehensive documentation in volumes/env/README.md

  Technical details:
  - task-agent entrypoint waits for keys in the .env file before starting uvicorn
  - Bootstrap creates/updates TASK_AGENT_API_KEY, COGNEE_API_KEY, and OPENAI_API_KEY
  - Removed hardcoded API keys from docker-compose.yml
  - All services route through the http://localhost:10999 proxy

* Fix CLI not loading virtual keys from global .env

  Project .env files with empty OPENAI_API_KEY values were overriding the global virtual keys. Updated _load_env_file_if_exists to only override with non-empty values.

* Fix agent executor not passing API key to LiteLLM

  The agent was initializing LiteLlm without api_key or api_base, causing authentication errors when using the LiteLLM proxy. It now reads the OPENAI_API_KEY/LLM_API_KEY and LLM_ENDPOINT environment variables and passes them to the LiteLlm constructor.

* Auto-populate project .env with virtual key from global config

  When running `ff init`, the command now checks for a global volumes/env/.env file and automatically uses the OPENAI_API_KEY virtual key if found. This ensures projects work with the LiteLLM proxy out of the box, without manual key configuration.

* docs: Update README with LiteLLM configuration instructions

  Add a note about LITELLM_GEMINI_API_KEY configuration and clarify that the OPENAI_API_KEY default value should not be changed, as it is used for the LLM proxy.

* Refactor workflow parameters to use JSON Schema defaults

  Consolidates parameter defaults into the JSON Schema format, removing the separate default_parameters field. Adds an extract_defaults_from_json_schema() helper to extract defaults from the standard schema structure. Updates the LiteLLM proxy config to use the LITELLM_OPENAI_API_KEY environment variable.

* Remove .env.example from task_agent
* Fix MDX syntax error in llm-proxy.md
* fix: apply default parameters from metadata.yaml automatically

  Fixed TemporalManager.run_workflow() to correctly apply default parameter values from workflow metadata.yaml files when parameters are not provided by the caller.

  Previous behavior:
  - When workflow_params was empty ({}), the condition `if workflow_params and 'parameters' in metadata` would fail
  - Parameters would not be extracted from the schema, so workflows received only target_id with no other parameters

  New behavior:
  - Removed the `workflow_params and` requirement from the condition
  - Now explicitly checks for defaults in each parameter spec
  - Applies defaults from metadata.yaml automatically when a parameter is not provided
  - Workflows receive all parameters with the proper fallback: provided value > metadata default > None

  This makes metadata.yaml the single source of truth for parameter defaults, removing the need for workflows to implement defensive default handling.

  Affected workflows:
  - llm_secret_detection (was failing with KeyError)
  - All other workflows now benefit from automatic default application

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: tduhamel42 <tduhamel@fuzzinglabs.com>
1 parent bd94d19 commit 02b877d
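
The final fix in the message above (default parameters from metadata.yaml) is the behavioral core of the change. Below is a minimal sketch of the described fallback, provided value > metadata default > None; the helper name comes from the commit message, but the bodies are illustrative and are not the actual TemporalManager code:

```python
from typing import Any


def extract_defaults_from_json_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Collect `default` values from a JSON Schema `properties` block."""
    properties = schema.get("properties", {}) if isinstance(schema, dict) else {}
    return {
        name: spec["default"]
        for name, spec in properties.items()
        if isinstance(spec, dict) and "default" in spec
    }


def resolve_workflow_params(
    workflow_params: dict[str, Any], metadata: dict[str, Any]
) -> dict[str, Any]:
    """Fallback order: provided value > metadata default > None."""
    schema = metadata.get("parameters", {})
    defaults = extract_defaults_from_json_schema(schema)
    return {
        name: workflow_params.get(name, defaults.get(name))  # None when no default exists
        for name in schema.get("properties", {})
    }


# Even with an empty parameter dict, defaults from metadata.yaml are applied.
print(resolve_workflow_params({}, {
    "parameters": {
        "properties": {
            "model": {"type": "string", "default": "openai/gpt-4o-mini"},
            "max_findings": {"type": "integer"},
        }
    }
}))
# -> {'model': 'openai/gpt-4o-mini', 'max_findings': None}
```

With this ordering, an empty parameter dict still picks up the metadata.yaml defaults, which is why llm_secret_detection no longer fails with a KeyError.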

File tree

29 files changed: +1870 −107 lines


.gitignore

Lines changed: 5 additions & 1 deletion
@@ -188,6 +188,10 @@ logs/
 # Docker volume configs (keep .env.example but ignore actual .env)
 volumes/env/.env
 
+# Vendored proxy sources (kept locally for reference)
+ai/proxy/bifrost/
+ai/proxy/litellm/
+
 # Test project databases and configurations
 test_projects/*/.fuzzforge/
 test_projects/*/findings.db*
@@ -304,4 +308,4 @@ test_projects/*/.npmrc
 test_projects/*/.git-credentials
 test_projects/*/credentials.*
 test_projects/*/api_keys.*
-test_projects/*/ci-*.sh
+test_projects/*/ci-*.sh

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### 🐛 Bug Fixes
 
+- Fixed default parameters from metadata.yaml not being applied to workflows when no parameters provided
 - Fixed gitleaks workflow failing on uploaded directories without Git history
 - Fixed worker startup command suggestions (now uses `docker compose up -d` with service names)
 - Fixed missing `cognify_text` method in CogneeProjectIntegration

README.md

Lines changed: 2 additions & 0 deletions
@@ -117,7 +117,9 @@ For AI-powered workflows, configure your LLM API keys:
 ```bash
 cp volumes/env/.env.example volumes/env/.env
 # Edit volumes/env/.env and add your API keys (OpenAI, Anthropic, Google, etc.)
+# Add your key to LITELLM_GEMINI_API_KEY
 ```
+> Dont change the OPENAI_API_KEY default value, as it is used for the LLM proxy.
 
 This is required for:
 - `llm_secret_detection` workflow

ai/agents/task_agent/.env.example

Lines changed: 0 additions & 10 deletions
This file was deleted.

ai/agents/task_agent/Dockerfile

Lines changed: 5 additions & 0 deletions
@@ -16,4 +16,9 @@ COPY . /app/agent_with_adk_format
 WORKDIR /app/agent_with_adk_format
 ENV PYTHONPATH=/app
 
+# Copy and set up entrypoint
+COPY docker-entrypoint.sh /docker-entrypoint.sh
+RUN chmod +x /docker-entrypoint.sh
+
+ENTRYPOINT ["/docker-entrypoint.sh"]
 CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

ai/agents/task_agent/README.md

Lines changed: 25 additions & 9 deletions
@@ -43,18 +43,34 @@ cd task_agent
 # cp .env.example .env
 ```
 
-Edit `.env` (or `.env.example`) and add your API keys. The agent must be restarted after changes so the values are picked up:
+Edit `.env` (or `.env.example`) and add your proxy + API keys. The agent must be restarted after changes so the values are picked up:
 ```bash
-# Set default model
-LITELLM_MODEL=gemini/gemini-2.0-flash-001
-
-# Add API keys for providers you want to use
-GOOGLE_API_KEY=your_google_api_key
-OPENAI_API_KEY=your_openai_api_key
-ANTHROPIC_API_KEY=your_anthropic_api_key
-OPENROUTER_API_KEY=your_openrouter_api_key
+# Route every request through the proxy container (use http://localhost:10999 from the host)
+FF_LLM_PROXY_BASE_URL=http://llm-proxy:4000
+
+# Default model + provider the agent boots with
+LITELLM_MODEL=openai/gpt-4o-mini
+LITELLM_PROVIDER=openai
+
+# Virtual key issued by the proxy to the task agent (bootstrap replaces the placeholder)
+OPENAI_API_KEY=sk-proxy-default
+
+# Upstream keys stay inside the proxy. Store real secrets under the LiteLLM
+# aliases and the bootstrapper mirrors them into .env.litellm for the proxy container.
+LITELLM_OPENAI_API_KEY=your_real_openai_api_key
+LITELLM_ANTHROPIC_API_KEY=your_real_anthropic_key
+LITELLM_GEMINI_API_KEY=your_real_gemini_key
+LITELLM_MISTRAL_API_KEY=your_real_mistral_key
+LITELLM_OPENROUTER_API_KEY=your_real_openrouter_key
 ```
 
+> When running the agent outside of Docker, swap `FF_LLM_PROXY_BASE_URL` to the host port (default `http://localhost:10999`).
+
+The bootstrap container provisions LiteLLM, copies provider secrets into
+`volumes/env/.env.litellm`, and rewrites `volumes/env/.env` with the virtual key.
+Populate the `LITELLM_*_API_KEY` values before the first launch so the proxy can
+reach your upstream providers as soon as the bootstrap script runs.
+
 ### 2. Install Dependencies
 
 ```bash
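
Every service now reaches its providers through the proxy with a virtual key, so a quick way to check the wiring is to call the proxy's OpenAI-compatible endpoint directly. A minimal smoke-test sketch, assuming the endpoint and key names from `volumes/env/.env` shown above (this script is not part of the commit):

```python
import os

import httpx

# The LiteLLM proxy speaks the OpenAI-compatible API, so the bootstrap-issued
# virtual key can be exercised with a plain chat-completions request.
base_url = os.environ.get("FF_LLM_PROXY_BASE_URL", "http://localhost:10999").rstrip("/")
virtual_key = os.environ["OPENAI_API_KEY"]  # placeholder replaced by the bootstrap

response = httpx.post(
    f"{base_url}/v1/chat/completions",
    headers={"Authorization": f"Bearer {virtual_key}"},
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```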
ai/agents/task_agent/docker-entrypoint.sh

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+#!/bin/bash
+set -e
+
+# Wait for .env file to have keys (max 30 seconds)
+echo "[task-agent] Waiting for virtual keys to be provisioned..."
+for i in $(seq 1 30); do
+    if [ -f /app/config/.env ]; then
+        # Check if TASK_AGENT_API_KEY has a value (not empty)
+        KEY=$(grep -E '^TASK_AGENT_API_KEY=' /app/config/.env | cut -d'=' -f2)
+        if [ -n "$KEY" ] && [ "$KEY" != "" ]; then
+            echo "[task-agent] Virtual keys found, loading environment..."
+            # Export keys from .env file
+            export TASK_AGENT_API_KEY="$KEY"
+            export OPENAI_API_KEY=$(grep -E '^OPENAI_API_KEY=' /app/config/.env | cut -d'=' -f2)
+            export FF_LLM_PROXY_BASE_URL=$(grep -E '^FF_LLM_PROXY_BASE_URL=' /app/config/.env | cut -d'=' -f2)
+            echo "[task-agent] Loaded TASK_AGENT_API_KEY: ${TASK_AGENT_API_KEY:0:15}..."
+            echo "[task-agent] Loaded FF_LLM_PROXY_BASE_URL: $FF_LLM_PROXY_BASE_URL"
+            break
+        fi
+    fi
+    echo "[task-agent] Keys not ready yet, waiting... ($i/30)"
+    sleep 1
+done
+
+if [ -z "$TASK_AGENT_API_KEY" ]; then
+    echo "[task-agent] ERROR: Virtual keys were not provisioned within 30 seconds!"
+    exit 1
+fi
+
+echo "[task-agent] Starting uvicorn..."
+exec "$@"

ai/agents/task_agent/litellm_agent/config.py

Lines changed: 17 additions & 2 deletions
@@ -4,13 +4,28 @@
 
 import os
 
+
+def _normalize_proxy_base_url(raw_value: str | None) -> str | None:
+    if not raw_value:
+        return None
+    cleaned = raw_value.strip()
+    if not cleaned:
+        return None
+    # Avoid double slashes in downstream requests
+    return cleaned.rstrip("/")
+
 AGENT_NAME = "litellm_agent"
 AGENT_DESCRIPTION = (
     "A LiteLLM-backed shell that exposes hot-swappable model and prompt controls."
 )
 
-DEFAULT_MODEL = os.getenv("LITELLM_MODEL", "gemini-2.0-flash-001")
-DEFAULT_PROVIDER = os.getenv("LITELLM_PROVIDER")
+DEFAULT_MODEL = os.getenv("LITELLM_MODEL", "openai/gpt-4o-mini")
+DEFAULT_PROVIDER = os.getenv("LITELLM_PROVIDER") or None
+PROXY_BASE_URL = _normalize_proxy_base_url(
+    os.getenv("FF_LLM_PROXY_BASE_URL")
+    or os.getenv("LITELLM_API_BASE")
+    or os.getenv("LITELLM_BASE_URL")
+)
 
 STATE_PREFIX = "app:litellm_agent/"
 STATE_MODEL_KEY = f"{STATE_PREFIX}model"
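
For reference, the new helper's behavior on a few representative inputs; the function body is copied from the diff above so the snippet stands alone:

```python
def _normalize_proxy_base_url(raw_value: str | None) -> str | None:
    if not raw_value:
        return None
    cleaned = raw_value.strip()
    if not cleaned:
        return None
    # Avoid double slashes in downstream requests
    return cleaned.rstrip("/")


assert _normalize_proxy_base_url(None) is None
assert _normalize_proxy_base_url("   ") is None
assert _normalize_proxy_base_url("http://llm-proxy:4000/") == "http://llm-proxy:4000"
assert _normalize_proxy_base_url("http://localhost:10999") == "http://localhost:10999"
```

The first non-empty of FF_LLM_PROXY_BASE_URL, LITELLM_API_BASE, and LITELLM_BASE_URL wins, and the trailing slash is stripped so later joins such as `f"{PROXY_BASE_URL}/{provider}"` stay clean.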

ai/agents/task_agent/litellm_agent/state.py

Lines changed: 169 additions & 1 deletion
@@ -3,11 +3,15 @@
 from __future__ import annotations
 
 from dataclasses import dataclass
+import os
 from typing import Any, Mapping, MutableMapping, Optional
 
+import httpx
+
 from .config import (
     DEFAULT_MODEL,
     DEFAULT_PROVIDER,
+    PROXY_BASE_URL,
     STATE_MODEL_KEY,
     STATE_PROMPT_KEY,
     STATE_PROVIDER_KEY,
@@ -66,11 +70,109 @@ def instantiate_llm(self):
         """Create a LiteLlm instance for the current state."""
 
         from google.adk.models.lite_llm import LiteLlm  # Lazy import to avoid cycle
+        from google.adk.models.lite_llm import LiteLLMClient
+        from litellm.types.utils import Choices, Message, ModelResponse, Usage
 
         kwargs = {"model": self.model}
         if self.provider:
             kwargs["custom_llm_provider"] = self.provider
-        return LiteLlm(**kwargs)
+        if PROXY_BASE_URL:
+            provider = (self.provider or DEFAULT_PROVIDER or "").lower()
+            if provider and provider != "openai":
+                kwargs["api_base"] = f"{PROXY_BASE_URL.rstrip('/')}/{provider}"
+            else:
+                kwargs["api_base"] = PROXY_BASE_URL
+            kwargs.setdefault("api_key", os.environ.get("TASK_AGENT_API_KEY") or os.environ.get("OPENAI_API_KEY"))
+
+        provider = (self.provider or DEFAULT_PROVIDER or "").lower()
+        model_suffix = self.model.split("/", 1)[-1]
+        use_responses = provider == "openai" and (
+            model_suffix.startswith("gpt-5") or model_suffix.startswith("o1")
+        )
+        if use_responses:
+            kwargs.setdefault("use_responses_api", True)
+
+        llm = LiteLlm(**kwargs)
+
+        if use_responses and PROXY_BASE_URL:
+
+            class _ResponsesAwareClient(LiteLLMClient):
+                def __init__(self, base_client: LiteLLMClient, api_base: str, api_key: str):
+                    self._base_client = base_client
+                    self._api_base = api_base.rstrip("/")
+                    self._api_key = api_key
+
+                async def acompletion(self, model, messages, tools, **kwargs):  # type: ignore[override]
+                    use_responses_api = kwargs.pop("use_responses_api", False)
+                    if not use_responses_api:
+                        return await self._base_client.acompletion(
+                            model=model,
+                            messages=messages,
+                            tools=tools,
+                            **kwargs,
+                        )
+
+                    resolved_model = model
+                    if "/" not in resolved_model:
+                        resolved_model = f"openai/{resolved_model}"
+
+                    payload = {
+                        "model": resolved_model,
+                        "input": _messages_to_responses_input(messages),
+                    }
+
+                    timeout = kwargs.get("timeout", 60)
+                    headers = {
+                        "Authorization": f"Bearer {self._api_key}",
+                        "Content-Type": "application/json",
+                    }
+
+                    async with httpx.AsyncClient(timeout=timeout) as client:
+                        response = await client.post(
+                            f"{self._api_base}/v1/responses",
+                            json=payload,
+                            headers=headers,
+                        )
+                    try:
+                        response.raise_for_status()
+                    except httpx.HTTPStatusError as exc:
+                        text = exc.response.text
+                        raise RuntimeError(
+                            f"LiteLLM responses request failed: {text}"
+                        ) from exc
+                    data = response.json()
+
+                    text_output = _extract_output_text(data)
+                    usage = data.get("usage", {})
+
+                    return ModelResponse(
+                        id=data.get("id"),
+                        model=model,
+                        choices=[
+                            Choices(
+                                finish_reason="stop",
+                                index=0,
+                                message=Message(role="assistant", content=text_output),
+                                provider_specific_fields={"bifrost_response": data},
+                            )
+                        ],
+                        usage=Usage(
+                            prompt_tokens=usage.get("input_tokens"),
+                            completion_tokens=usage.get("output_tokens"),
+                            reasoning_tokens=usage.get("output_tokens_details", {}).get(
+                                "reasoning_tokens"
+                            ),
+                            total_tokens=usage.get("total_tokens"),
+                        ),
+                    )
+
+            llm.llm_client = _ResponsesAwareClient(
+                llm.llm_client,
+                PROXY_BASE_URL,
+                os.environ.get("TASK_AGENT_API_KEY") or os.environ.get("OPENAI_API_KEY", ""),
+            )
+
+        return llm
 
     @property
     def display_model(self) -> str:
@@ -84,3 +186,69 @@ def apply_state_to_agent(invocation_context, state: HotSwapState) -> None:
 
     agent = invocation_context.agent
     agent.model = state.instantiate_llm()
+
+
+def _messages_to_responses_input(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    inputs: list[dict[str, Any]] = []
+    for message in messages:
+        role = message.get("role", "user")
+        content = message.get("content", "")
+        text_segments: list[str] = []
+
+        if isinstance(content, list):
+            for item in content:
+                if isinstance(item, dict):
+                    text = item.get("text") or item.get("content")
+                    if text:
+                        text_segments.append(str(text))
+                elif isinstance(item, str):
+                    text_segments.append(item)
+        elif isinstance(content, str):
+            text_segments.append(content)
+
+        text = "\n".join(segment.strip() for segment in text_segments if segment)
+        if not text:
+            continue
+
+        entry_type = "input_text"
+        if role == "assistant":
+            entry_type = "output_text"
+
+        inputs.append(
+            {
+                "role": role,
+                "content": [
+                    {
+                        "type": entry_type,
+                        "text": text,
+                    }
+                ],
+            }
+        )
+
+    if not inputs:
+        inputs.append(
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "input_text",
+                        "text": "",
+                    }
+                ],
+            }
+        )
+    return inputs
+
+
+def _extract_output_text(response_json: dict[str, Any]) -> str:
+    outputs = response_json.get("output", [])
+    collected: list[str] = []
+    for item in outputs:
+        if isinstance(item, dict) and item.get("type") == "message":
+            for part in item.get("content", []):
+                if isinstance(part, dict) and part.get("type") == "output_text":
+                    text = part.get("text", "")
+                    if text:
+                        collected.append(str(text))
+    return "\n\n".join(collected).strip()
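
To make the Responses-API bridge concrete, here is roughly what `_messages_to_responses_input` turns a short chat history into. The expected output follows the function as added above; the import path is an assumption about how the module is laid out:

```python
from litellm_agent.state import _messages_to_responses_input  # assumed import path

messages = [
    {"role": "system", "content": "You are a security analyst."},
    {"role": "user", "content": [{"type": "text", "text": "Scan this repo."}]},
    {"role": "assistant", "content": "Starting the scan now."},
]

expected = [
    {"role": "system",
     "content": [{"type": "input_text", "text": "You are a security analyst."}]},
    {"role": "user",
     "content": [{"type": "input_text", "text": "Scan this repo."}]},
    {"role": "assistant",
     "content": [{"type": "output_text", "text": "Starting the scan now."}]},
]

assert _messages_to_responses_input(messages) == expected
```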

ai/proxy/README.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# LLM Proxy Integrations
+
+This directory contains vendor source trees that were vendored only for reference when integrating LLM gateways. The actual FuzzForge deployment uses the official Docker images for each project.
+
+See `docs/docs/how-to/llm-proxy.md` for up-to-date instructions on running the proxy services and issuing keys for the agents.
