
[feature request] Add token usage tracking for Haystack embedder components #2322

@Hansehart

Is your feature request related to a problem? Please describe.
I'm using HaystackInstrumentor to trace my document indexing pipeline, which uses MistralDocumentEmbedder. Although the embedder returns token usage in its response metadata, this information is not captured in the OpenTelemetry spans sent to Langfuse.

Describe the solution you'd like
Add token usage tracking for embedder components, similar to the existing implementation for LLM generators.

Describe alternatives you've considered
Manually tracking usage with the Langfuse SDK, but that defeats the purpose of auto-instrumentation.

Additional context
I've reviewed the code in openinference/instrumentation/haystack/_wrappers.py and the implementation would follow the same pattern as _get_llm_token_count_attributes() (lines 492-519), but for embedder responses.
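
To make the idea concrete, here is a rough sketch of what such a helper might look like. It is not the existing code in _wrappers.py. The function name `get_embedder_token_count_attributes` is hypothetical, and the sketch assumes the embedder response carries an OpenAI-style `meta["usage"]` dict with `prompt_tokens` and `total_tokens`. It also assumes that reusing the `llm.token_count.*` attribute keys on embedder spans is acceptable, which is for the maintainers to decide.

```python
from typing import Any, Iterator, Mapping, Tuple

# Attribute keys from the OpenInference semantic conventions for LLM spans.
# Reusing them on embedder spans is an assumption of this sketch.
LLM_TOKEN_COUNT_PROMPT = "llm.token_count.prompt"
LLM_TOKEN_COUNT_TOTAL = "llm.token_count.total"


def get_embedder_token_count_attributes(
    response: Mapping[str, Any],
) -> Iterator[Tuple[str, int]]:
    """Hypothetical helper: yield token-count attributes from an embedder response.

    Assumes a response shaped like {"documents": [...], "meta": {"usage": {...}}}.
    Yields nothing if the usage information is missing or malformed.
    """
    meta = response.get("meta")
    if not isinstance(meta, Mapping):
        return
    usage = meta.get("usage")
    if not isinstance(usage, Mapping):
        return
    prompt_tokens = usage.get("prompt_tokens")
    if isinstance(prompt_tokens, int):
        yield LLM_TOKEN_COUNT_PROMPT, prompt_tokens
    total_tokens = usage.get("total_tokens")
    if isinstance(total_tokens, int):
        yield LLM_TOKEN_COUNT_TOTAL, total_tokens


# Illustrative response only; the values are made up for the example.
example_response = {
    "documents": [],
    "meta": {
        "model": "mistral-embed",
        "usage": {"prompt_tokens": 12, "total_tokens": 12},
    },
}
attributes = dict(get_embedder_token_count_attributes(example_response))
```

The wrapper around the embedder's `run` call could then set these key/value pairs on the span, and a response without usage data would simply add no attributes.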

I'm willing to contribute a PR for this feature if the maintainers are interested. Is this feature aligned with the project's goals?
