Conversation

@andrewm4894 andrewm4894 commented Jan 14, 2026

Summary

Adds PostHog tracing integration for the OpenAI Agents SDK.

Implements PostHogTracingProcessor that captures agent traces, spans, and LLM generations to PostHog LLM Analytics.

Changes

  • posthog/ai/openai_agents/processor.py - TracingProcessor implementation
  • posthog/ai/openai_agents/__init__.py - exports and instrument() helper
  • pyproject.toml - added posthog.ai.openai_agents to setuptools packages
  • 37 unit tests
  • Version bump to 7.7.0

Event Mapping

| Agents SDK Span | PostHog Event |
| --- | --- |
| GenerationSpanData | $ai_generation |
| ResponseSpanData | $ai_generation |
| FunctionSpanData | $ai_span (type=tool) |
| AgentSpanData | $ai_span (type=agent) |
| HandoffSpanData | $ai_span (type=handoff) |
| GuardrailSpanData | $ai_span (type=guardrail) |
| CustomSpanData | $ai_span (type=custom) |
| TranscriptionSpanData | $ai_span (type=transcription) |
| SpeechSpanData | $ai_span (type=speech) |
| SpeechGroupSpanData | $ai_span (type=speech_group) |
| MCPListToolsSpanData | $ai_span (type=mcp_tools) |

Properties Captured

Core Properties

| Property | Description |
| --- | --- |
| $ai_provider | "openai" (the underlying LLM provider) |
| $ai_framework | "openai-agents" (identifies the framework) |
| $ai_total_tokens | Sum of input + output tokens |
| $ai_error_type | Error categorization (model_behavior_error, user_error, input_guardrail_triggered, output_guardrail_triggered, max_turns_exceeded, unknown) |
| $ai_input_tokens / $ai_output_tokens | Token counts |
| $ai_model | Model name |
| $ai_input / $ai_output_choices | Input/output content (respects privacy mode) |
| $ai_latency | Operation latency in seconds |
| $ai_group_id | Links related traces (conversation threads) |
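
To make the table concrete, here is an illustrative property payload for a single $ai_generation event. Only the keys come from the table above; every value is made up:

```python
# Illustrative only: an example property payload for one $ai_generation event.
# Keys come from the table above; all values are made up.
example_generation_event = {
    "$ai_provider": "openai",
    "$ai_framework": "openai-agents",
    "$ai_model": "gpt-4o-mini",
    "$ai_input_tokens": 120,
    "$ai_output_tokens": 45,
    "$ai_total_tokens": 165,  # always input + output
    "$ai_latency": 1.3,       # seconds
    "$ai_error_type": None,   # set only on failure, e.g. "user_error"
    "$ai_group_id": "conversation-42",  # links traces in one thread
}

assert (
    example_generation_event["$ai_total_tokens"]
    == example_generation_event["$ai_input_tokens"]
    + example_generation_event["$ai_output_tokens"]
)
```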

Audio Pass-Through Properties

| Property | Description |
| --- | --- |
| first_content_at | Time to first audio byte |
| audio_input_format | Input audio format |
| audio_output_format | Output audio format |
| model_config | Model configuration dict |

Usage

```python
from posthog.ai.openai_agents import instrument

instrument(distinct_id="user@example.com")

# Run agents as normal
from agents import Agent, Runner

agent = Agent(name="Assistant", instructions="You are helpful.")
result = Runner.run_sync(agent, "Hello!")
```

Test plan

  • Unit tests pass (37 tests)
  • Manual test with real agent (llm-analytics-apps)

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b06f2a9fc7

@greptile-apps bot left a comment
Additional Comments (1)

  1. pyproject.toml, line 81-91 (link)

    logic: Missing package in setuptools config - the posthog.ai.openai_agents package needs to be added to be included in the distribution

4 files reviewed, 3 comments


Add PostHogTracingProcessor that implements the OpenAI Agents SDK
TracingProcessor interface to capture agent traces in PostHog.

- Maps GenerationSpanData to $ai_generation events
- Maps FunctionSpanData, AgentSpanData, HandoffSpanData, GuardrailSpanData
  to $ai_span events with appropriate types
- Supports privacy mode, groups, and custom properties
- Includes instrument() helper for one-liner setup
- 22 unit tests covering all span types
…n traces

- Capture group_id from trace and include as $ai_group_id on all events
- Add _get_group_id() helper to retrieve group_id from trace metadata
- Pass group_id through all span handlers (generation, function, agent, handoff, guardrail, response, custom, audio, mcp, generic)
- Enables linking multiple traces in the same conversation thread
- Add $ai_total_tokens to generation and response spans (required by PostHog cost reporting)
- Add $ai_error_type for cross-provider error categorization (model_behavior_error, user_error, input_guardrail_triggered, output_guardrail_triggered, max_turns_exceeded)
- Add $ai_output_choices to response spans for output content capture
- Add audio pass-through properties for voice spans:
  - first_content_at (time to first audio byte)
  - audio_input_format / audio_output_format
  - model_config
  - $ai_input for TTS text input
- Add comprehensive tests for all new properties
…ents

- Add $ai_framework="openai-agents" to all events for framework identification
- Standardize $ai_provider="openai" on all events (previously some used "openai_agents")
- Follows pattern from posthog-js where $ai_provider is the underlying LLM provider
@andrewm4894 andrewm4894 force-pushed the feat/llma-add-openai-agents-sdk branch from 4cfcc78 to 6193698 Compare January 27, 2026 12:53
Without this, the module is not included in the distribution
and users get an ImportError after pip install.
Add max entry limit and eviction for _span_start_times and
_trace_metadata dicts. If on_span_end or on_trace_end is never
called (e.g., due to an SDK exception), these dicts could grow
indefinitely in long-running processes.
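
The eviction scheme described in that commit can be sketched as follows (BoundedDict and the 1,000-entry cap are illustrative; the actual implementation may differ):

```python
from collections import OrderedDict

# Sketch of the bounded-dict eviction described above; the real
# processor's field names and limit may differ.
_MAX_ENTRIES = 1000  # hypothetical cap


class BoundedDict(OrderedDict):
    """Dict that evicts its oldest entry once a size cap is exceeded."""

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if len(self) > _MAX_ENTRIES:
            # Oldest-inserted entry comes first; drop it.
            self.popitem(last=False)


span_start_times = BoundedDict()
for i in range(1500):
    span_start_times[f"span_{i}"] = i

# The cap holds even if on_span_end is never called for some spans.
assert len(span_start_times) == _MAX_ENTRIES
assert "span_0" not in span_start_times
```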
Previously on_span_end always called _get_distinct_id(None), which
meant callable distinct_id resolvers never received the trace object
for spans. Now the resolved distinct_id is stored at trace start and
looked up by trace_id during span end.
All span handlers repeated the same 6 base fields (trace_id, span_id,
parent_id, provider, framework, latency) plus the group_id conditional.
Extract into a shared helper to reduce ~100 lines of boilerplate.
- test_generation_span_with_no_usage: zero tokens when usage is None
- test_generation_span_with_partial_usage: only input_tokens present
- test_error_type_categorization_by_type_field_only: type field without
  matching message content
- test_distinct_id_resolved_from_trace_for_spans: callable resolver
  uses trace context for span events
- test_eviction_of_stale_entries: memory leak prevention works
If span.error is a string instead of a dict, calling .get() would
raise AttributeError. Now falls back to str() for non-dict errors.
@andrewm4894 andrewm4894 force-pushed the feat/llma-add-openai-agents-sdk branch from 4ab9da3 to 789be8d Compare January 27, 2026 13:33
The rebase conflict resolution accidentally truncated the changelog
to only the most recent entries. Restored all historical entries.
When no distinct_id is provided, _get_distinct_id falls back to
trace_id or "unknown". Since these are non-None strings, the
$process_person_profile=False check in _capture_event never fired,
creating unwanted person profiles keyed by trace IDs.

Track whether the user explicitly provided a distinct_id and use
that flag to control personless mode, matching the pattern used
by the langchain and openai integrations.
Two fixes from bot review:

1. CHANGELOG.md was accidentally truncated to 38 lines during rebase
   conflict resolution. Restored all 767 lines of history.

2. Personless mode now follows the same pattern as langchain/openai
   integrations: _get_distinct_id returns None when no user-provided
   ID is available, and callers set $process_person_profile=False
   before falling back to trace_id. This covers the edge case where
   a callable distinct_id returns None.
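
The pattern described above can be sketched as follows (resolve_distinct_id is a hypothetical standalone version of the real _get_distinct_id logic):

```python
def resolve_distinct_id(user_distinct_id, trace_id):
    """Sketch of the personless-mode pattern described above: only a
    user-provided ID creates a person profile; otherwise fall back to
    trace_id and mark the event personless."""
    distinct_id = user_distinct_id() if callable(user_distinct_id) else user_distinct_id
    properties = {}
    if distinct_id is None:
        # No user-provided ID (including a callable that returned None):
        # capture without creating a person profile.
        properties["$process_person_profile"] = False
        distinct_id = trace_id or "unknown"
    return distinct_id, properties


did, props = resolve_distinct_id(None, "trace_123")
assert did == "trace_123" and props == {"$process_person_profile": False}

did, props = resolve_distinct_id("user@example.com", "trace_123")
assert did == "user@example.com" and props == {}
```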
```python
    "$ai_output_choices": self._with_privacy_mode(_safe_json(span_data.output)),
    "$ai_input_tokens": input_tokens,
    "$ai_output_tokens": output_tokens,
    "$ai_total_tokens": input_tokens + output_tokens,
```
Member

You should handle input_tokens or output_tokens being None.

Member Author

Fixed in 61d43e3 — added defensive or 0 guards so None values won't cause a TypeError when summing tokens.
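
The guard amounts to the following (a minimal sketch, not the actual diff):

```python
# Sketch of the fix: treat missing token counts as zero so the
# sum never raises TypeError.
input_tokens = None   # e.g. usage reported no input count
output_tokens = 45

properties = {
    "$ai_input_tokens": input_tokens or 0,
    "$ai_output_tokens": output_tokens or 0,
    "$ai_total_tokens": (input_tokens or 0) + (output_tokens or 0),
}
assert properties["$ai_total_tokens"] == 45
```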

```python
    error_type = "model_behavior_error"
elif "UserError" in error_type_raw or "UserError" in error_message:
    error_type = "user_error"
elif "InputGuardrailTripwireTriggered" in error_message:
```
Member

Why don't you also check in error_type_raw for the rest of the errors?

Member Author

Fixed in b626a16 — now checking both error_type_raw and error_message for all error categories, consistent with ModelBehaviorError/UserError.
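
The post-fix categorization logic can be sketched as follows (categorize_error is a hypothetical standalone version; the real code lives inside the processor):

```python
def categorize_error(error_type_raw: str, error_message: str) -> str:
    """Sketch of the categorization after the fix: every category is
    checked against both the raw type and the message."""
    for needle, category in [
        ("ModelBehaviorError", "model_behavior_error"),
        ("UserError", "user_error"),
        ("InputGuardrailTripwireTriggered", "input_guardrail_triggered"),
        ("OutputGuardrailTripwireTriggered", "output_guardrail_triggered"),
        ("MaxTurnsExceeded", "max_turns_exceeded"),
    ]:
        if needle in error_type_raw or needle in error_message:
            return category
    return "unknown"


assert categorize_error("InputGuardrailTripwireTriggered", "") == "input_guardrail_triggered"
assert categorize_error("", "MaxTurnsExceeded: limit reached") == "max_turns_exceeded"
assert categorize_error("SomethingElse", "") == "unknown"
```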

```python
__all__ = ["PostHogTracingProcessor", "instrument"]


def instrument(
```
Member

Let's add type hints here too.

Member Author

Fixed in b4a2d8b — added full type hints to instrument() matching the PostHogTracingProcessor.__init__ signature.

```python
log = logging.getLogger("posthog")


def _safe_json(obj: Any) -> Any:
```
Member

I'm a bit confused by this. Are you trying to serialize it? Because you are either returning str(obj) or obj.

Member Author

Fixed in d4f4a3a — renamed to _ensure_serializable with a clearer docstring. The purpose is to validate that an object is JSON-serializable (returning it as-is if so), falling back to str(obj) for non-serializable types so downstream json.dumps() won't fail.

```python
        except Exception as e:
            log.debug(f"Error in on_trace_start: {e}")

    def on_trace_end(self, trace: Trace) -> None:
```
Member

The LangChain implementation emits the $ai_trace at the end, to capture all the metadata. Any reason we're doing the opposite for this?

Member Author

Fixed in 7a534de — moved $ai_trace emission from on_trace_start to on_trace_end, matching the LangChain approach. The trace now includes $ai_latency and all metadata is captured after the trace completes. on_trace_start now only stores metadata for use by child spans.
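
The restructured flow can be sketched as follows (a simplified standalone model; the real processor captures events to PostHog rather than appending to a list):

```python
import time


class TraceLatencySketch:
    """Sketch of the change described above: store the start time at
    trace start, emit the $ai_trace event with latency at trace end."""

    def __init__(self):
        self._start_times = {}
        self.events = []

    def on_trace_start(self, trace_id):
        # Only store metadata; no event is emitted here anymore.
        self._start_times[trace_id] = time.monotonic()

    def on_trace_end(self, trace_id):
        started = self._start_times.pop(trace_id, None)
        latency = time.monotonic() - started if started is not None else None
        self.events.append(("$ai_trace", {"$ai_latency": latency}))


p = TraceLatencySketch()
p.on_trace_start("t1")
p.on_trace_end("t1")
assert p.events[0][0] == "$ai_trace"
assert p.events[0][1]["$ai_latency"] >= 0
```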

@Radu-Raicea
Member

I left a few comments to address before merging this.

Guard against input_tokens or output_tokens being None when computing
$ai_total_tokens to avoid TypeError.
Check both error_type_raw and error_message for guardrail and
max_turns errors, consistent with how ModelBehaviorError and
UserError are already checked.
The function validates JSON serializability and falls back to str(); it does not serialize. Rename it and update the docstring to make the contract clear.
Move the $ai_trace event from on_trace_start to on_trace_end to
capture full metadata including latency, matching the LangChain
integration approach. on_trace_start now only stores metadata for
use by spans.
@andrewm4894 andrewm4894 merged commit 1875b71 into master Jan 27, 2026
20 checks passed
@andrewm4894 andrewm4894 deleted the feat/llma-add-openai-agents-sdk branch January 27, 2026 21:15