Configuration
Flamehaven edited this page Nov 14, 2025 · 1 revision
Flamehaven FileSearch loads settings from the Config dataclass, environment
variables, and CLI flags. Use this document as the single source of truth.
flamehaven_filesearch.config.Config
| Field | Type | Default | Description |
|---|---|---|---|
| `api_key` | `Optional[str]` | `None` | Gemini API key. Loaded from `GEMINI_API_KEY` or `GOOGLE_API_KEY` if omitted. |
| `max_file_size_mb` | `int` | `50` | Hard limit per file upload. Applies to REST + SDK. |
| `upload_timeout_sec` | `int` | `60` | Maximum time to wait for Gemini ingest operations. |
| `default_model` | `str` | `gemini-2.5-flash` | Model passed to `google-genai`. |
| `max_output_tokens` | `int` | `1024` | Upper bound for generated answers. |
| `temperature` | `float` | `0.5` | Creativity knob. 0.0 = deterministic. |
| `max_sources` | `int` | `5` | Number of citations returned. |
| `cache_ttl_sec` | `int` | `600` | TTL for search result cache. |
| `cache_max_size` | `int` | `1024` | Number of cached entries before eviction. |
| **Driftlock** | | | |
| `min_answer_length` | `int` | `10` | Log warnings if answer shorter than this. |
| `max_answer_length` | `int` | `4096` | Truncate longer outputs. |
| `banned_terms` | `list[str]` | `["PII-leak"]` | Case-insensitive forbidden strings. |
- `api_key` must be non-empty when `require_api_key=True`.
- `max_file_size_mb` > 0.
- 0.0 ≤ `temperature` ≤ 1.0.
- Strings are stripped of whitespace during `__post_init__`.
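
The rules above can be sketched as a small dataclass. This is an illustration only, not the library's actual `Config`: the class name `ConfigSketch` is hypothetical, only the validated fields are shown, and treating `require_api_key` as a dataclass field is an assumption (the source only names the flag).

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for Config; only the validated fields are shown."""

    api_key: Optional[str] = None
    default_model: str = "gemini-2.5-flash"
    max_file_size_mb: int = 50
    temperature: float = 0.5
    require_api_key: bool = True  # assumption: modelled as a field here

    def __post_init__(self) -> None:
        # Strip surrounding whitespace from every string-valued field first,
        # so a key made only of spaces counts as empty below.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str):
                setattr(self, f.name, value.strip())

        if self.require_api_key and not self.api_key:
            raise ValueError("api_key must be non-empty when require_api_key=True")
        if self.max_file_size_mb <= 0:
            raise ValueError("max_file_size_mb must be > 0")
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must satisfy 0.0 <= temperature <= 1.0")
```

Stripping before validating is a deliberate choice in this sketch: it means `api_key="   "` is rejected rather than silently accepted.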
| Variable | Purpose | Example |
|---|---|---|
| `GEMINI_API_KEY` / `GOOGLE_API_KEY` | Primary authentication | `export GEMINI_API_KEY="sk-..."` |
| `DEFAULT_MODEL` | Override `Config.default_model` | `export DEFAULT_MODEL="gemini-2.0-pro"` |
| `MAX_FILE_SIZE_MB` | Increase upload limit | `export MAX_FILE_SIZE_MB=200` |
| `UPLOAD_TIMEOUT_SEC` | Slow network support | `export UPLOAD_TIMEOUT_SEC=180` |
| `MAX_OUTPUT_TOKENS` | Larger answers | `export MAX_OUTPUT_TOKENS=2048` |
| `TEMPERATURE` | Model sampling | `export TEMPERATURE=0.2` |
| `MAX_SOURCES` | Number of citations | `export MAX_SOURCES=3` |
| `CACHE_TTL_SEC` / `CACHE_MAX_SIZE` | Search cache tuning | |
| `ENVIRONMENT` | Logging mode (`production` / `development`) | `export ENVIRONMENT=development` |
| `UPLOAD_RATE_LIMIT` | e.g. `30/minute` | `export UPLOAD_RATE_LIMIT="30/minute"` |
| `SEARCH_RATE_LIMIT` | e.g. `200/minute` | |
| `HOST`, `PORT`, `WORKERS`, `RELOAD` | CLI runtime options | |
All numeric values accept strings (parsed via `int()`/`float()`).
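
As a rough illustration of that parsing step, the sketch below converts the numeric variables from the table with `int()` or `float()`, choosing the parser from the types in the `Config` table. The helper name `read_numeric_overrides` and the error message are hypothetical; the library's own loader may behave differently, for example in how it reports bad values.

```python
import os
from typing import Callable, Mapping, Optional

# Variable name -> parser, following the Config field types.
NUMERIC_ENV_VARS: dict[str, Callable[[str], object]] = {
    "MAX_FILE_SIZE_MB": int,
    "UPLOAD_TIMEOUT_SEC": int,
    "MAX_OUTPUT_TOKENS": int,
    "TEMPERATURE": float,
    "MAX_SOURCES": int,
    "CACHE_TTL_SEC": int,
    "CACHE_MAX_SIZE": int,
}


def read_numeric_overrides(env: Optional[Mapping[str, str]] = None) -> dict[str, object]:
    """Return parsed values for the numeric variables that are set and non-empty."""
    env = os.environ if env is None else env
    overrides: dict[str, object] = {}
    for name, parse in NUMERIC_ENV_VARS.items():
        raw = env.get(name)
        if raw is None or not raw.strip():
            continue  # unset or blank: keep the Config default
        try:
            overrides[name] = parse(raw.strip())
        except ValueError as exc:
            raise ValueError(f"{name}={raw!r} is not a valid {parse.__name__}") from exc
    return overrides
```

Passing a plain mapping instead of `os.environ` keeps the function easy to exercise in tests.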
`flamehaven-api` honours environment variables first, but you can override with flags:

```bash
HOST=0.0.0.0 PORT=9000 WORKERS=4 RELOAD=false flamehaven-api
```

- `HOST`: Bind address.
- `PORT`: HTTP port.
- `WORKERS`: Uvicorn workers (ignored when `RELOAD=true`).
- `RELOAD`: Hot reload during development (`true`/`false`).
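
The interaction between `WORKERS` and `RELOAD` can be sketched as follows. This is not the CLI's actual code: the function name, the fallback host, port, and worker count, and the accepted spellings of "true" beyond `true`/`false` are all assumptions made for illustration.

```python
from typing import Mapping

# Assumed spellings; the source only documents true/false.
TRUE_VALUES = {"1", "true", "yes", "on"}


def resolve_runtime_options(env: Mapping[str, str]) -> dict:
    """Turn HOST/PORT/WORKERS/RELOAD strings into typed server options."""
    reload_enabled = env.get("RELOAD", "false").strip().lower() in TRUE_VALUES
    workers = int(env.get("WORKERS", "1"))  # placeholder default
    return {
        "host": env.get("HOST", "127.0.0.1"),  # placeholder default
        "port": int(env.get("PORT", "8000")),  # placeholder default
        "reload": reload_enabled,
        # WORKERS is ignored when hot reload is on, so fall back to one worker.
        "workers": 1 if reload_enabled else workers,
    }
```

With the values from the command above, this returns four workers on port 9000; setting `RELOAD=true` drops the result back to a single worker.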
| Endpoint | Default | Env Override |
|---|---|---|
| `/api/upload/single`, `/upload` | 10/minute | `UPLOAD_RATE_LIMIT` |
| `/api/upload/multiple`, `/upload-multiple` | 5/minute | `MULTI_UPLOAD_RATE_LIMIT` |
| `/api/search` (POST/GET) | 100/minute | `SEARCH_RATE_LIMIT` |
| `/metrics`, `/prometheus` | 100/minute | `METRICS_RATE_LIMIT` |
Rate limits follow SlowAPI syntax (N/period). Supported units: second,
minute, hour, day.
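
To check a limit string before exporting it, a simplified validator for the `N/period` form might look like the sketch below. It is not SlowAPI's own parser, which accepts more variants; `parse_rate_limit` and `RateLimit` are hypothetical names, and only the four units listed above are recognised.

```python
import re
from typing import NamedTuple

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_LIMIT_RE = re.compile(r"^\s*(\d+)\s*/\s*(second|minute|hour|day)\s*$", re.IGNORECASE)


class RateLimit(NamedTuple):
    count: int           # allowed requests per period
    period_seconds: int  # length of the period in seconds


def parse_rate_limit(spec: str) -> RateLimit:
    """Parse a string such as "30/minute" into a (count, period_seconds) pair."""
    match = _LIMIT_RE.match(spec)
    if match is None:
        raise ValueError(f"expected N/period with period in {sorted(PERIOD_SECONDS)}, got {spec!r}")
    count = int(match.group(1))
    if count <= 0:
        raise ValueError("rate limit count must be positive")
    return RateLimit(count, PERIOD_SECONDS[match.group(2).lower()])
```

Running this over `UPLOAD_RATE_LIMIT` and `SEARCH_RATE_LIMIT` at deploy time catches typos such as `30/min` before the service starts.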
Search results use `cachetools.TTLCache`. Tune via:

```python
from flamehaven_filesearch.cache import get_search_cache

cache = get_search_cache(maxsize=5000, ttl=1800)
```

You can also reset caches programmatically:

```python
from flamehaven_filesearch.cache import reset_all_caches

reset_all_caches()
```

- Uploaded files are streamed to a temporary directory (`tempfile.mkdtemp()`).
- When the `google-genai` SDK is missing, the fallback in-memory store `_local_store_docs` keeps contents in-process. Use `allow_offline=True` for unit tests.
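
For readers unfamiliar with the temporary-directory pattern, here is a minimal sketch of streaming an upload into a directory created with `tempfile.mkdtemp()`. The function name, chunk size, and directory prefix are hypothetical, and enforcing `max_file_size_mb` while streaming is an assumption about where the limit is applied, not a description of the library's code.

```python
import shutil
import tempfile
from pathlib import Path
from typing import BinaryIO


def stream_to_temp_dir(source: BinaryIO, filename: str, max_file_size_mb: int = 50) -> Path:
    """Copy an upload stream into a fresh temp directory, chunk by chunk."""
    safe_name = Path(filename).name  # drop any directory components
    if not safe_name:
        raise ValueError("filename must not be empty")

    limit = max_file_size_mb * 1024 * 1024
    temp_dir = Path(tempfile.mkdtemp(prefix="upload-"))
    target = temp_dir / safe_name
    written = 0
    try:
        with target.open("wb") as out:
            while chunk := source.read(64 * 1024):
                written += len(chunk)
                if written > limit:
                    raise ValueError(f"file exceeds {max_file_size_mb} MB")
                out.write(chunk)
    except Exception:
        # Do not leave partial uploads behind.
        shutil.rmtree(temp_dir, ignore_errors=True)
        raise
    return target
```

The caller owns the returned path and should remove its parent directory once the file has been ingested.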
- Per-environment `.env`: Create `.env.development`, `.env.production`, and load them via process manager.
- Secrets: Limit API key scope at Google AI Studio. Rotate quarterly.
- Autoscaling: When running multiple instances, point them to a shared `CACHE_BACKEND` (Redis) if you require cross-node caching. The current cache is in-memory; an adapter can be added via the factory functions.
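
If your process manager cannot load env files itself, the per-environment pattern can be approximated in Python. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the parser handles only simple `KEY=value` lines (optionally prefixed with `export`), and falling back to `production` when `ENVIRONMENT` is unset is a choice made here, not documented behaviour.

```python
import os
from pathlib import Path
from typing import MutableMapping


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value pair
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_environment_file(directory: Path, environ: MutableMapping[str, str] = os.environ) -> Path:
    """Load .env.<ENVIRONMENT> without overriding variables that are already set."""
    name = environ.get("ENVIRONMENT", "production")  # assumed fallback
    path = directory / f".env.{name}"
    for key, value in parse_env_file(path.read_text()).items():
        environ.setdefault(key, value)
    return path
```

Using `setdefault` means values exported in the shell still win over the file, which keeps one-off overrides simple.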
For additional examples, see examples/api_example.py and
tests/test_security.py (config validation tests).