feat(ai): add OpenAI Agents SDK integration #408
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b06f2a9fc7
Additional Comments (1)

pyproject.toml, lines 81-91 (logic): Missing package in setuptools config. The posthog.ai.openai_agents package needs to be added so it is included in the distribution.
4 files reviewed, 3 comments
Add PostHogTracingProcessor that implements the OpenAI Agents SDK TracingProcessor interface to capture agent traces in PostHog.
- Maps GenerationSpanData to $ai_generation events
- Maps FunctionSpanData, AgentSpanData, HandoffSpanData, GuardrailSpanData to $ai_span events with appropriate types
- Supports privacy mode, groups, and custom properties
- Includes instrument() helper for one-liner setup
- 22 unit tests covering all span types
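The span-to-event mapping described above could be sketched as a lookup from span-data class name to a PostHog event name and span type. This is a hypothetical sketch: the class names come from the Agents SDK, but the real processor may well branch on isinstance checks rather than a table, and the generic fallback is an assumption.

```python
# Hypothetical dispatch table; event names and types follow the mapping in the
# commit message above, but the actual processor may be structured differently.
SPAN_EVENT_MAP = {
    "GenerationSpanData": ("$ai_generation", None),
    "FunctionSpanData": ("$ai_span", "tool"),
    "AgentSpanData": ("$ai_span", "agent"),
    "HandoffSpanData": ("$ai_span", "handoff"),
    "GuardrailSpanData": ("$ai_span", "guardrail"),
}


def event_for_span(span_data) -> tuple:
    """Return (event_name, span_type) for a span-data object, with a generic fallback."""
    return SPAN_EVENT_MAP.get(type(span_data).__name__, ("$ai_span", "generic"))
```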
…n traces
- Capture group_id from trace and include as $ai_group_id on all events
- Add _get_group_id() helper to retrieve group_id from trace metadata
- Pass group_id through all span handlers (generation, function, agent, handoff, guardrail, response, custom, audio, mcp, generic)
- Enables linking multiple traces in the same conversation thread
- Add $ai_total_tokens to generation and response spans (required by PostHog cost reporting)
- Add $ai_error_type for cross-provider error categorization (model_behavior_error, user_error, input_guardrail_triggered, output_guardrail_triggered, max_turns_exceeded)
- Add $ai_output_choices to response spans for output content capture
- Add audio pass-through properties for voice spans:
  - first_content_at (time to first audio byte)
  - audio_input_format / audio_output_format
  - model_config
  - $ai_input for TTS text input
- Add comprehensive tests for all new properties
…ents
- Add $ai_framework="openai-agents" to all events for framework identification
- Standardize $ai_provider="openai" on all events (previously some used "openai_agents")
- Follows pattern from posthog-js where $ai_provider is the underlying LLM provider
Force-pushed from 4cfcc78 to 6193698.
Without this, the module is not included in the distribution and users get an ImportError after pip install.
Add max entry limit and eviction for _span_start_times and _trace_metadata dicts. If on_span_end or on_trace_end is never called (e.g., due to an SDK exception), these dicts could grow indefinitely in long-running processes.
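One way to implement the bounded dicts described above is FIFO eviction on insert. This is an illustrative sketch, not the PR's actual implementation; the class name and the cap value are assumptions.

```python
from collections import OrderedDict


class BoundedDict(OrderedDict):
    """Dict that evicts its oldest entry once it exceeds a maximum size.

    Guards against unbounded growth when on_span_end / on_trace_end is never
    called for an entry (e.g. due to an SDK exception).
    """

    def __init__(self, max_entries: int = 10_000):  # cap is an assumed default
        super().__init__()
        self.max_entries = max_entries

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if len(self) > self.max_entries:
            # Evict the oldest inserted entry (FIFO order).
            self.popitem(last=False)
```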
Previously on_span_end always called _get_distinct_id(None), which meant callable distinct_id resolvers never received the trace object for spans. Now the resolved distinct_id is stored at trace start and looked up by trace_id during span end.
All span handlers repeated the same 6 base fields (trace_id, span_id, parent_id, provider, framework, latency) plus the group_id conditional. Extract into a shared helper to reduce ~100 lines of boilerplate.
- test_generation_span_with_no_usage: zero tokens when usage is None
- test_generation_span_with_partial_usage: only input_tokens present
- test_error_type_categorization_by_type_field_only: type field without matching message content
- test_distinct_id_resolved_from_trace_for_spans: callable resolver uses trace context for span events
- test_eviction_of_stale_entries: memory leak prevention works
If span.error is a string instead of a dict, calling .get() would raise AttributeError. Now falls back to str() for non-dict errors.
Force-pushed from 4ab9da3 to 789be8d.
The rebase conflict resolution accidentally truncated the changelog to only the most recent entries. Restored all historical entries.
When no distinct_id is provided, _get_distinct_id falls back to trace_id or "unknown". Since these are non-None strings, the $process_person_profile=False check in _capture_event never fired, creating unwanted person profiles keyed by trace IDs. Track whether the user explicitly provided a distinct_id and use that flag to control personless mode, matching the pattern used by the langchain and openai integrations.
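The pattern described above can be sketched as follows. The function name and return shape are illustrative; the PR's actual code splits this across _get_distinct_id and _capture_event.

```python
def resolve_distinct_id(user_distinct_id, trace_id):
    """Return (distinct_id, process_person_profile), per the langchain/openai pattern.

    Only a user-provided distinct_id creates a person profile; falling back to
    the trace_id keeps the event personless so no profile is keyed by a trace ID.
    """
    if user_distinct_id is not None:
        return user_distinct_id, True
    # No explicit ID: use trace_id (or "unknown") but mark the event personless.
    return trace_id or "unknown", False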
Two fixes from bot review: 1. CHANGELOG.md was accidentally truncated to 38 lines during rebase conflict resolution. Restored all 767 lines of history. 2. Personless mode now follows the same pattern as langchain/openai integrations: _get_distinct_id returns None when no user-provided ID is available, and callers set $process_person_profile=False before falling back to trace_id. This covers the edge case where a callable distinct_id returns None.
```python
"$ai_output_choices": self._with_privacy_mode(_safe_json(span_data.output)),
"$ai_input_tokens": input_tokens,
"$ai_output_tokens": output_tokens,
"$ai_total_tokens": input_tokens + output_tokens,
```
You should handle input_tokens or output_tokens being None.
Fixed in 61d43e3 — added defensive or 0 guards so None values won't cause a TypeError when summing tokens.
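The guard amounts to coalescing None to 0 before summing; a minimal sketch of the fix (helper name is illustrative):

```python
def total_tokens(input_tokens, output_tokens):
    """Sum token counts, treating None as 0 so the addition cannot raise TypeError."""
    return (input_tokens or 0) + (output_tokens or 0)
```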
```python
    error_type = "model_behavior_error"
elif "UserError" in error_type_raw or "UserError" in error_message:
    error_type = "user_error"
elif "InputGuardrailTripwireTriggered" in error_message:
```
Why don't you also check in error_type_raw for the rest of the errors?
Fixed in b626a16 — now checking both error_type_raw and error_message for all error categories, consistent with ModelBehaviorError/UserError.
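The combined check could be sketched as below. The error-class names and category strings appear earlier in this PR; the function shape and the final "error" fallback are assumptions, not the PR's actual code.

```python
def categorize_error(error_type_raw: str, error_message: str) -> str:
    """Map SDK error info to an $ai_error_type, checking both the type field
    and the message for every category (not just the first two)."""
    combined = (error_type_raw or "") + " " + (error_message or "")
    if "ModelBehaviorError" in combined:
        return "model_behavior_error"
    if "UserError" in combined:
        return "user_error"
    if "InputGuardrailTripwireTriggered" in combined:
        return "input_guardrail_triggered"
    if "OutputGuardrailTripwireTriggered" in combined:
        return "output_guardrail_triggered"
    if "MaxTurnsExceeded" in combined:
        return "max_turns_exceeded"
    return "error"  # assumed fallback
```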
```python
__all__ = ["PostHogTracingProcessor", "instrument"]


def instrument(
```
Let's add type hints here too
Fixed in b4a2d8b — added full type hints to instrument() matching the PostHogTracingProcessor.__init__ signature.
```python
log = logging.getLogger("posthog")


def _safe_json(obj: Any) -> Any:
```
I'm a bit confused by this. Are you trying to serialize it? Because you are either returning str(obj) or obj.
Fixed in d4f4a3a — renamed to _ensure_serializable with a clearer docstring. The purpose is to validate that an object is JSON-serializable (returning it as-is if so), falling back to str(obj) for non-serializable types so downstream json.dumps() won't fail.
```python
        except Exception as e:
            log.debug(f"Error in on_trace_start: {e}")

    def on_trace_end(self, trace: Trace) -> None:
```
The LangChain implementation emits the $ai_trace at the end, to capture all the metadata. Any reason we're doing the opposite for this?
Fixed in 7a534de — moved $ai_trace emission from on_trace_start to on_trace_end, matching the LangChain approach. The trace now includes $ai_latency and all metadata is captured after the trace completes. on_trace_start now only stores metadata for use by child spans.
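The start/end split described above can be sketched with a small timer keyed by trace_id. Class and method names here are illustrative; the real processor stores more metadata than just the start time.

```python
import time


class TraceTimer:
    """Sketch: record trace start times so on_trace_end can emit $ai_latency."""

    def __init__(self):
        self._starts = {}

    def on_trace_start(self, trace_id):
        # Only store metadata at start; the $ai_trace event is emitted at the end.
        self._starts[trace_id] = time.monotonic()

    def on_trace_end(self, trace_id):
        start = self._starts.pop(trace_id, None)
        latency = time.monotonic() - start if start is not None else None
        return {"$ai_latency": latency}
```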
I left a few comments to address before merging this.
Guard against input_tokens or output_tokens being None when computing $ai_total_tokens to avoid TypeError.
Check both error_type_raw and error_message for guardrail and max_turns errors, consistent with how ModelBehaviorError and UserError are already checked.
The function validates JSON serializability and falls back to str(); it does not serialize. Rename it and update the docstring to make the contract clear.
Move the $ai_trace event from on_trace_start to on_trace_end to capture full metadata including latency, matching the LangChain integration approach. on_trace_start now only stores metadata for use by spans.
Summary
Adds PostHog tracing integration for the OpenAI Agents SDK.
Implements PostHogTracingProcessor that captures agent traces, spans, and LLM generations to PostHog LLM Analytics.

Changes
- posthog/ai/openai_agents/processor.py - TracingProcessor implementation
- posthog/ai/openai_agents/__init__.py - exports and instrument() helper
- pyproject.toml - added posthog.ai.openai_agents to setuptools packages

Event Mapping
| Agents SDK span | PostHog event |
| --- | --- |
| Generation | $ai_generation |
| Response | $ai_generation |
| Function | $ai_span (type=tool) |
| Agent | $ai_span (type=agent) |
| Handoff | $ai_span (type=handoff) |
| Guardrail | $ai_span (type=guardrail) |
| Custom | $ai_span (type=custom) |
| Transcription | $ai_span (type=transcription) |
| Speech | $ai_span (type=speech) |
| Speech group | $ai_span (type=speech_group) |
| MCP list tools | $ai_span (type=mcp_tools) |

Properties Captured
Core Properties
- $ai_provider
- $ai_framework
- $ai_total_tokens
- $ai_error_type
- $ai_input_tokens / $ai_output_tokens
- $ai_model
- $ai_input / $ai_output_choices
- $ai_latency
- $ai_group_id

Audio Pass-Through Properties
- first_content_at
- audio_input_format
- audio_output_format
- model_config

Usage
Related
Test plan