
@harsh543 harsh543 commented Feb 1, 2026

Problem

PR #1197 attempts to fix OpenAI Agents tracing but fails with OTel providers (Langfuse, Logfire) due to context detachment errors:

ValueError: <Token> was created in a different Context

The issue: detaching an OTel context token resets a contextvars token, and that reset must happen in the exact same contextvars.Context instance where the token was created. Even copy_context() creates a new instance with equivalent values, so the check fails.
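
A minimal sketch of that failure using only the standard library, with no OTel or Temporal code involved. The variable name `current_span` and the function `demo` are illustrative assumptions, not names from the SDK:

```python
import contextvars

# Hypothetical stand-in for the variable a tracing library keeps its
# active span in.
current_span = contextvars.ContextVar("current_span", default=None)


def demo():
    # "Attach": set the variable and keep the token for later detachment.
    token = current_span.set("span-1")

    # A copied context holds the same values but is a different instance.
    copied = contextvars.copy_context()
    assert copied[current_span] == "span-1"

    # "Detach" inside the copy: the token belongs to another Context.
    try:
        copied.run(current_span.reset, token)
        error = None
    except ValueError as exc:
        error = exc

    # Detaching in the context that created the token succeeds.
    current_span.reset(token)
    return error, current_span.get()


error, value_after = demo()
print(type(error).__name__, "-", value_after)
```

The failed reset does not consume the token, which is why the final reset in the original context still works.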

Root Cause

Activities and child workflows complete in:

  • Different machines (distributed execution)
  • Different workflow tasks (different context instances)
  • Different callback invocations (new context per task)

Callbacks registered with copied context still execute in a new context instance, breaking OTel's validation.
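
The callback case can be reproduced with plain asyncio, as a rough sketch. This uses the stock event loop rather than Temporal's workflow loop, and `active_span`, `finish_span`, and `run_demo` are hypothetical names chosen for illustration:

```python
import asyncio
import contextvars

# Hypothetical stand-in for a tracing library's "active span" variable.
active_span = contextvars.ContextVar("active_span", default=None)


async def run_demo():
    loop = asyncio.get_running_loop()
    work = loop.create_future()  # stands in for an activity handle
    outcome = {}

    # Span "starts" in the task's own context.
    token = active_span.set("temporal:startActivity")

    def finish_span(_future):
        # Done callbacks run in a copy of the registering context,
        # so the token does not belong to this Context instance.
        try:
            active_span.reset(token)
            outcome["error"] = None
        except ValueError as exc:
            outcome["error"] = exc

    work.add_done_callback(finish_span)
    work.set_result("done")

    # Yield to the loop until the callback has run.
    while "error" not in outcome:
        await asyncio.sleep(0)

    # Clean up in the context that owns the token.
    active_span.reset(token)
    return outcome["error"]


error = asyncio.run(run_demo())
print(type(error).__name__)
```

Closing the span before returning control to the event loop avoids the callback entirely, which is the idea behind the synchronous approach described next.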

Solution

Follow the pattern used in temporalio/contrib/opentelemetry.py and complete spans synchronously at the orchestration point:

def start_activity(self, input: StartActivityInput) -> ActivityHandle:
    trace = get_trace_provider().get_current_trace()
    if trace:
        with custom_span(name="temporal:startActivity", data={"activity": input.activity}):
            pass  # Completes immediately in same context
    
    set_header_from_context(input, temporalio.workflow.payload_converter())
    return self.next.start_activity(input)

Benefits

  1. OTel Compatible: No context detachment errors - span finishes in same context it started
  2. Distributed Execution: No callback needed = works across machines
  3. Simpler: Removes add_done_callback machinery, cleaner code (-26 lines, +19 lines)
  4. Correct Semantics: Zero-duration orchestration span (CLIENT) + real-duration execution span (SERVER)
  5. Replay Safe: No long-running spans across workflow tasks

Span Hierarchy

temporal:startActivity (0ns, workflow) ← Orchestration decision
  └─ temporal:executeActivity (2.5s, activity worker) ← Actual work

The zero-duration span is correct - it represents when the orchestration decision was made, not when the distributed work completed.

Testing

  • All existing tests should pass
  • No ValueError with default provider
  • Follows existing pattern from temporalio/contrib/opentelemetry.py


cc @cretz - This implements the synchronous span approach you suggested in the PR #1197 review. Let me know if you'd like any changes!

🤖 Generated with Claude Code

@tconley1428
Contributor

I don't think this is the approach we want to take. I'm working on a PR at #1286, which makes more substantial changes to fix the scenario. The problem statement here

PR #1197 attempts to fix OpenAI Agents tracing but fails with OTel providers (Langfuse, Logfire) due to context detachment errors:

is also very incomplete given our later investigations. There are a number of additional problems with running OTel at the same time as the Agents SDK logging in Temporal, beyond the context warning.

- Add try/finally to restore full_workflow_info_on_extra after test
- Add clear phase comments for first execution vs replay path
- Add explanatory note for replay suppression validation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@harsh543
Author

harsh543 commented Feb 3, 2026

Good catch. I overcomplicated this. Reverted to a single test with clearer structure:

# --- First execution: logs should appear ---
...
# --- Clear logs and continue execution (replay path) ---
# When the new worker starts, it replays the workflow history (signals 1 & 2).
# Replay suppression should prevent those logs from appearing again.
capturer.log_queue.queue.clear()
...
# --- Replay execution: no duplicate logs ---
assert not capturer.find_log("Signal: signal 1")

Also wrapped the global state mutation in try/finally:

original = workflow.logger.full_workflow_info_on_extra
workflow.logger.full_workflow_info_on_extra = True
try:
    ...
finally:
    workflow.logger.full_workflow_info_on_extra = original

No new test, just better visibility into what the existing one validates.
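
For reference, the same restore-on-exit behavior can be packaged as a small context manager. This is only a sketch: `temporary_attr` and `FakeLogger` are hypothetical names, and `FakeLogger` merely stands in for `workflow.logger` so the example runs without Temporal installed:

```python
from contextlib import contextmanager


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value, restoring the original on exit."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, like the try/finally above.
        setattr(obj, name, original)


class FakeLogger:
    # Hypothetical stand-in for workflow.logger.
    full_workflow_info_on_extra = False


logger = FakeLogger()
try:
    with temporary_attr(logger, "full_workflow_info_on_extra", True):
        inside = logger.full_workflow_info_on_extra
        raise AssertionError("simulated test failure")
except AssertionError:
    pass

after = logger.full_workflow_info_on_extra
print(inside, after)
```

The explicit try/finally in the test is equivalent; a helper like this mainly pays off when several tests toggle the same flag.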

@harsh543
Author

harsh543 commented Feb 3, 2026

Changes to test_workflow_logging

Addressed feedback: kept a single test with improved structure:

1. Added try/finally for test hygiene

original_full_workflow_info_on_extra = workflow.logger.full_workflow_info_on_extra
workflow.logger.full_workflow_info_on_extra = True
try:
    ...
finally:
    workflow.logger.full_workflow_info_on_extra = original_full_workflow_info_on_extra

2. Added clear phase markers

  • # --- First execution: logs should appear ---
  • # --- Clear logs and continue execution (replay path) ---
  • # --- Replay execution: no duplicate logs ---

3. Added explanatory note for replay

# When the new worker starts, it replays the workflow history (signals 1 & 2).
# Replay suppression should prevent those logs from appearing again.

No separate replay test: better visibility into the existing test instead of duplicating it.

Development

Successfully merging this pull request may close these issues.

[Bug] Langfuse Tracing Not Working with Temporal OpenAI Agents Plugin

2 participants