feat(ai): add OpenAI Agents SDK integration #408
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b06f2a9fc7
Additional Comments (1)

pyproject.toml, lines 81-91 (logic): Missing package in setuptools config. The posthog.ai.openai_agents package needs to be added so it is included in the distribution.
4 files reviewed, 3 comments
Add PostHogTracingProcessor that implements the OpenAI Agents SDK TracingProcessor interface to capture agent traces in PostHog.
- Maps GenerationSpanData to $ai_generation events
- Maps FunctionSpanData, AgentSpanData, HandoffSpanData, GuardrailSpanData to $ai_span events with appropriate types
- Supports privacy mode, groups, and custom properties
- Includes instrument() helper for one-liner setup
- 22 unit tests covering all span types
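The span-to-event mapping described above could be sketched as a lookup from span-data class name to a PostHog event name and span type. This is a hypothetical sketch: the class names come from the Agents SDK, but the real processor may well branch on isinstance checks rather than a table, and the generic fallback is an assumption.

```python
# Hypothetical dispatch table; event names and types follow the mapping in the
# commit message above, but the actual processor may be structured differently.
SPAN_EVENT_MAP = {
    "GenerationSpanData": ("$ai_generation", None),
    "FunctionSpanData": ("$ai_span", "tool"),
    "AgentSpanData": ("$ai_span", "agent"),
    "HandoffSpanData": ("$ai_span", "handoff"),
    "GuardrailSpanData": ("$ai_span", "guardrail"),
}


def event_for_span(span_data) -> tuple:
    """Return (event_name, span_type) for a span-data object, with a generic fallback."""
    return SPAN_EVENT_MAP.get(type(span_data).__name__, ("$ai_span", "generic"))
```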
…n traces
- Capture group_id from trace and include as $ai_group_id on all events
- Add _get_group_id() helper to retrieve group_id from trace metadata
- Pass group_id through all span handlers (generation, function, agent, handoff, guardrail, response, custom, audio, mcp, generic)
- Enables linking multiple traces in the same conversation thread
- Add $ai_total_tokens to generation and response spans (required by PostHog cost reporting)
- Add $ai_error_type for cross-provider error categorization (model_behavior_error, user_error, input_guardrail_triggered, output_guardrail_triggered, max_turns_exceeded)
- Add $ai_output_choices to response spans for output content capture
- Add audio pass-through properties for voice spans:
  - first_content_at (time to first audio byte)
  - audio_input_format / audio_output_format
  - model_config
  - $ai_input for TTS text input
- Add comprehensive tests for all new properties
…ents
- Add $ai_framework="openai-agents" to all events for framework identification
- Standardize $ai_provider="openai" on all events (previously some used "openai_agents")
- Follows pattern from posthog-js where $ai_provider is the underlying LLM provider
Force-pushed from 4cfcc78 to 6193698.
Without this, the module is not included in the distribution and users get an ImportError after pip install.
Add max entry limit and eviction for _span_start_times and _trace_metadata dicts. If on_span_end or on_trace_end is never called (e.g., due to an SDK exception), these dicts could grow indefinitely in long-running processes.
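One way to implement the bounded dicts described above is FIFO eviction on insert. This is an illustrative sketch, not the PR's actual implementation; the class name and the cap value are assumptions.

```python
from collections import OrderedDict


class BoundedDict(OrderedDict):
    """Dict that evicts its oldest entry once it exceeds a maximum size.

    Guards against unbounded growth when on_span_end / on_trace_end is never
    called for an entry (e.g. due to an SDK exception).
    """

    def __init__(self, max_entries: int = 10_000):  # cap is an assumed default
        super().__init__()
        self.max_entries = max_entries

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if len(self) > self.max_entries:
            # Evict the oldest inserted entry (FIFO order).
            self.popitem(last=False)
```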
Previously on_span_end always called _get_distinct_id(None), which meant callable distinct_id resolvers never received the trace object for spans. Now the resolved distinct_id is stored at trace start and looked up by trace_id during span end.
All span handlers repeated the same 6 base fields (trace_id, span_id, parent_id, provider, framework, latency) plus the group_id conditional. Extract into a shared helper to reduce ~100 lines of boilerplate.
- test_generation_span_with_no_usage: zero tokens when usage is None
- test_generation_span_with_partial_usage: only input_tokens present
- test_error_type_categorization_by_type_field_only: type field without matching message content
- test_distinct_id_resolved_from_trace_for_spans: callable resolver uses trace context for span events
- test_eviction_of_stale_entries: memory leak prevention works
If span.error is a string instead of a dict, calling .get() would raise AttributeError. Now falls back to str() for non-dict errors.
Force-pushed from 4ab9da3 to 789be8d.
The rebase conflict resolution accidentally truncated the changelog to only the most recent entries. Restored all historical entries.
When no distinct_id is provided, _get_distinct_id falls back to trace_id or "unknown". Since these are non-None strings, the $process_person_profile=False check in _capture_event never fired, creating unwanted person profiles keyed by trace IDs. Track whether the user explicitly provided a distinct_id and use that flag to control personless mode, matching the pattern used by the langchain and openai integrations.
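The pattern described above can be sketched as follows. The function name and return shape are illustrative; the PR's actual code splits this across _get_distinct_id and _capture_event.

```python
def resolve_distinct_id(user_distinct_id, trace_id):
    """Return (distinct_id, process_person_profile), per the langchain/openai pattern.

    Only a user-provided distinct_id creates a person profile; falling back to
    the trace_id keeps the event personless so no profile is keyed by a trace ID.
    """
    if user_distinct_id is not None:
        return user_distinct_id, True
    # No explicit ID: use trace_id (or "unknown") but mark the event personless.
    return trace_id or "unknown", False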
Two fixes from bot review: 1. CHANGELOG.md was accidentally truncated to 38 lines during rebase conflict resolution. Restored all 767 lines of history. 2. Personless mode now follows the same pattern as langchain/openai integrations: _get_distinct_id returns None when no user-provided ID is available, and callers set $process_person_profile=False before falling back to trace_id. This covers the edge case where a callable distinct_id returns None.
```python
"$ai_output_choices": self._with_privacy_mode(_safe_json(span_data.output)),
"$ai_input_tokens": input_tokens,
"$ai_output_tokens": output_tokens,
"$ai_total_tokens": input_tokens + output_tokens,
```
You should handle input_tokens or output_tokens being None.
Fixed in 61d43e3 — added defensive or 0 guards so None values won't cause a TypeError when summing tokens.
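The guard amounts to coalescing None to 0 before summing; a minimal sketch of the fix (helper name is illustrative):

```python
def total_tokens(input_tokens, output_tokens):
    """Sum token counts, treating None as 0 so the addition cannot raise TypeError."""
    return (input_tokens or 0) + (output_tokens or 0)
```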
```python
    error_type = "model_behavior_error"
elif "UserError" in error_type_raw or "UserError" in error_message:
    error_type = "user_error"
elif "InputGuardrailTripwireTriggered" in error_message:
```
Why don't you also check in error_type_raw for the rest of the errors?
Fixed in b626a16 — now checking both error_type_raw and error_message for all error categories, consistent with ModelBehaviorError/UserError.
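The combined check could be sketched as below. The error-class names and category strings appear earlier in this PR; the function shape and the final "error" fallback are assumptions, not the PR's actual code.

```python
def categorize_error(error_type_raw: str, error_message: str) -> str:
    """Map SDK error info to an $ai_error_type, checking both the type field
    and the message for every category (not just the first two)."""
    combined = (error_type_raw or "") + " " + (error_message or "")
    if "ModelBehaviorError" in combined:
        return "model_behavior_error"
    if "UserError" in combined:
        return "user_error"
    if "InputGuardrailTripwireTriggered" in combined:
        return "input_guardrail_triggered"
    if "OutputGuardrailTripwireTriggered" in combined:
        return "output_guardrail_triggered"
    if "MaxTurnsExceeded" in combined:
        return "max_turns_exceeded"
    return "error"  # assumed fallback
```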
```python
__all__ = ["PostHogTracingProcessor", "instrument"]


def instrument(
```
Let's add type hints here too
Fixed in b4a2d8b — added full type hints to instrument() matching the PostHogTracingProcessor.__init__ signature.
```python
log = logging.getLogger("posthog")


def _safe_json(obj: Any) -> Any:
```
I'm a bit confused by this. Are you trying to serialize it? Because you are either returning str(obj) or obj.
Fixed in d4f4a3a — renamed to _ensure_serializable with a clearer docstring. The purpose is to validate that an object is JSON-serializable (returning it as-is if so), falling back to str(obj) for non-serializable types so downstream json.dumps() won't fail.
```python
        except Exception as e:
            log.debug(f"Error in on_trace_start: {e}")

    def on_trace_end(self, trace: Trace) -> None:
```
The LangChain implementation emits the $ai_trace at the end, to capture all the metadata. Any reason we're doing the opposite for this?
Fixed in 7a534de — moved $ai_trace emission from on_trace_start to on_trace_end, matching the LangChain approach. The trace now includes $ai_latency and all metadata is captured after the trace completes. on_trace_start now only stores metadata for use by child spans.
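The start/end split described above can be sketched with a small timer keyed by trace_id. Class and method names here are illustrative; the real processor stores more metadata than just the start time.

```python
import time


class TraceTimer:
    """Sketch: record trace start times so on_trace_end can emit $ai_latency."""

    def __init__(self):
        self._starts = {}

    def on_trace_start(self, trace_id):
        # Only store metadata at start; the $ai_trace event is emitted at the end.
        self._starts[trace_id] = time.monotonic()

    def on_trace_end(self, trace_id):
        start = self._starts.pop(trace_id, None)
        latency = time.monotonic() - start if start is not None else None
        return {"$ai_latency": latency}
```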
I left a few comments to address before merging this.
Guard against input_tokens or output_tokens being None when computing $ai_total_tokens to avoid TypeError.
Check both error_type_raw and error_message for guardrail and max_turns errors, consistent with how ModelBehaviorError and UserError are already checked.
The function validates JSON serializability and falls back to str(); it does not serialize. Rename it and update the docstring to make the contract clear.
Move the $ai_trace event from on_trace_start to on_trace_end to capture full metadata including latency, matching the LangChain integration approach. on_trace_start now only stores metadata for use by spans.
Summary
Adds PostHog tracing integration for the OpenAI Agents SDK.
Implements PostHogTracingProcessor that captures agent traces, spans, and LLM generations to PostHog LLM Analytics.

Changes
- posthog/ai/openai_agents/processor.py - TracingProcessor implementation
- posthog/ai/openai_agents/__init__.py - exports and instrument() helper
- pyproject.toml - added posthog.ai.openai_agents to setuptools packages

Event Mapping
| Agents SDK span | PostHog event |
| --- | --- |
| Generation | $ai_generation |
| Response | $ai_generation |
| Function | $ai_span (type=tool) |
| Agent | $ai_span (type=agent) |
| Handoff | $ai_span (type=handoff) |
| Guardrail | $ai_span (type=guardrail) |
| Custom | $ai_span (type=custom) |
| Transcription | $ai_span (type=transcription) |
| Speech | $ai_span (type=speech) |
| Speech group | $ai_span (type=speech_group) |
| MCP list tools | $ai_span (type=mcp_tools) |

Properties Captured
Core Properties
- $ai_provider
- $ai_framework
- $ai_total_tokens
- $ai_error_type
- $ai_input_tokens / $ai_output_tokens
- $ai_model
- $ai_input / $ai_output_choices
- $ai_latency
- $ai_group_id

Audio Pass-Through Properties
- first_content_at
- audio_input_format
- audio_output_format
- model_config

Usage
Related
Test plan