feat: propagate bank_id as user field in LLM calls#425

Open
AlexanderZaytsev wants to merge 1 commit into vectorize-io:main from AlexanderZaytsev:feat/bank-id-in-llm-calls

Conversation

@AlexanderZaytsev

Problem

When Hindsight is deployed behind an LLM proxy (e.g. for cost tracking or rate limiting), the proxy has no way to know which memory bank triggered each LLM call. This makes it impossible to attribute LLM costs to specific banks/tenants.

Solution

Add an llm_user context variable (contextvars.ContextVar) that propagates into the OpenAI-compatible LLM provider as the standard user field in API requests.

Changes (2 files)

engine/providers/openai_compatible_llm.py

  • Define llm_user context variable
  • Inject as user field in both call() and call_with_tools() when set

engine/memory_engine.py

  • Set llm_user in retain_batch_async() — covers extraction LLM calls
  • Set llm_user in reflect_async() — covers reflection LLM calls
  • Set llm_user in execute_task() — covers all background worker tasks (consolidation, mental model refresh, etc.)
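A minimal sketch of both sides of the change (module and function names here are illustrative assumptions, not the PR's actual code):

```python
import contextvars
from typing import Any, Optional

# Context variable read by the LLM provider; defaults to None so that
# nothing is injected when no entry point has set it.
llm_user: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "llm_user", default=None
)

def build_request_kwargs(messages: list[dict[str, Any]], model: str) -> dict[str, Any]:
    """Assemble kwargs for an OpenAI-compatible chat completion call."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    user = llm_user.get()
    if user is not None:
        # Standard OpenAI "user" field; a proxy can read it for attribution.
        kwargs["user"] = user
    return kwargs

# Entry-point side: set the context var from the bank id before any LLM
# calls, and reset it on the way out so the value never leaks.
def retain_batch(bank_id: str) -> dict[str, Any]:
    token = llm_user.set(bank_id)
    try:
        return build_request_kwargs([{"role": "user", "content": "hi"}], "gpt-4o-mini")
    finally:
        llm_user.reset(token)
```

The `try`/`finally` with `reset(token)` mirrors the entry-point pattern: the variable is scoped to one request rather than left set on the context.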

How it works

HTTP request (retain/reflect)    → MemoryEngine method sets llm_user from bank_id
Background task (consolidation)  → execute_task() sets llm_user from task dict
                                   ↓
                          OpenAICompatibleLLM reads llm_user
                                   ↓
                          Injects as "user" field in API request
                                   ↓
                          Proxy reads "user" → attributes cost

Why contextvars?

bank_id is available at the entry points but would need to be threaded through many intermediate functions (_extract_facts_from_chunk, _consolidate_batch, run_reflect_agent, etc.) to reach the LLM provider. contextvars provides clean propagation without touching any function signatures, and is async-safe — concurrent requests for different banks don't interfere.
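The async-safety claim is checkable directly: each asyncio task runs in its own copy of the context, so concurrent requests for different banks each see their own `llm_user` value (variable names here are illustrative):

```python
import asyncio
import contextvars
from typing import Optional

llm_user: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "llm_user", default=None
)

async def handle_request(bank_id: str) -> Optional[str]:
    # Each task created by gather() gets its own context copy,
    # so this set() is invisible to the other tasks.
    llm_user.set(bank_id)
    await asyncio.sleep(0.01)  # yield so the tasks interleave
    return llm_user.get()      # still this task's bank_id

async def main() -> list:
    return await asyncio.gather(*(handle_request(f"bank-{i}") for i in range(3)))

results = asyncio.run(main())  # each task reads back its own value
```

A plain module-level global would fail this test: the last writer would win and all three tasks would report the same bank.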

No behavior change when not using a proxy

The user field is simply ignored by providers when no proxy is involved. The feature is zero-cost when not used — llm_user defaults to None and no field is injected.

… attribution

When Hindsight is deployed behind an LLM proxy, the proxy needs to know
which caller triggered each LLM call for cost tracking and attribution.

This adds an `llm_user` context variable (Python contextvars) that gets
injected as the standard `user` field in OpenAI API requests.

The context is set in three MemoryEngine entry points:
- retain_batch_async: covers extraction LLM calls
- reflect_async: covers reflection LLM calls
- execute_task: covers all background worker tasks (consolidation, mental
  model refresh, etc.)

The `user` field is part of the OpenAI API spec and is passed through by
all major providers. Proxies can read this field to attribute costs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator

@nicoloboschi nicoloboschi left a comment


`user` is deprecated in the OpenAI chat completions API:

https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create

we should use `safety_identifier`

what gateway are you using?

@AlexanderZaytsev
Author

> user is deprecated in OpenAI Completion
>
> https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create
>
> we should use safety_identifier
>
> what gateway are you using?

I'm using my own gateway (sort of like LiteLLM) where each agent has its own API key. Another idea is to make Hindsight also accept API keys via headers for each request (X-LLM-API-Key, X-Embeddings-API-Key, etc.). But there has to be a way to map all requests to each agent.

@nicoloboschi
Collaborator

> > user is deprecated in OpenAI Completion
> > https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create
> > we should use safety_identifier
> > what gateway are you using?
>
> I'm using my own gateway (sort of like LiteLLM) where each agent has its own api key. Another idea is to make hindsight also accept api keys via headers for each request (X-LLM-API-Key, X-Embeddings-API-Key, etc). But there has to be a way to map all requests to each agent.

I'm fine adding this header, as long as it doesn't cause any problems on other LLM providers. Since `user` is deprecated, I was wondering if `safety_identifier` was a better fit.
Looks like the LiteLLM gateway supports both: https://docs.litellm.ai/docs/completion/input#optional-fields
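One low-risk way to accommodate `safety_identifier` without breaking providers that only understand `user` would be to make the injected field name configurable, defaulting to the current behavior. A sketch (the `ATTRIBUTION_FIELD` setting and `inject_attribution` helper are hypothetical, not part of the PR):

```python
import contextvars
from typing import Any, Optional

llm_user: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "llm_user", default=None
)

# Hypothetical setting: which request field carries the caller identity.
# "user" keeps today's behavior; "safety_identifier" matches the newer
# OpenAI chat-completions parameter.
ATTRIBUTION_FIELD = "user"

def inject_attribution(kwargs: dict[str, Any],
                       field: str = ATTRIBUTION_FIELD) -> dict[str, Any]:
    """Add the caller identity to request kwargs under the configured field."""
    value = llm_user.get()
    if value is not None:
        kwargs[field] = value
    return kwargs
```

With this shape, switching a deployment to `safety_identifier` is a config change rather than a code change, and gateways that accept both (e.g. LiteLLM, per the link above) work either way.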
