Skip to content

feat(watsonx): implement emitting events in addition to current behavior #2896

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 5, 2025

Conversation

LuizDMM
Copy link
Contributor

@LuizDMM LuizDMM commented May 5, 2025

✅ PR Requirements

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

📌 Issue Requirements

  • Make sure to keep the current way of emitting events and the new (event-based) way with the official semantic
  • Add a config to each instrumentation called use_legacy_attributes which defaults to true, and if set to false emit events instead of the prompt / completion attributes.
  • Propagate a new initialization parameter of the SDK called use_legacy_attributes that sets this config in all instrumentations.

Important

Adds event emission to Watsonx instrumentation with a config option to toggle legacy attributes, updating tests accordingly.

  • Behavior:
    • Adds event emission functionality to Watsonx instrumentation, using emit_event() in event_handler.py.
    • Introduces use_legacy_attributes config in Config class to toggle between legacy attributes and event-based approach.
    • Implements _emit_input_events() and _emit_response_events() in __init__.py to handle input and response events.
  • Configuration:
    • Adds use_legacy_attributes to Config in config.py, defaulting to True.
    • Uses should_emit_events() and is_content_enabled() in utils.py to control event emission.
  • Testing:
    • Updates tests in test_watsonx_metrics.py and test_generate.py to validate event emission and legacy behavior.
    • Adds fixtures in conftest.py for setting up test contexts with and without content capture.

This description was created by Ellipsis for c6de57e. You can customize this summary. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to c6de57e in 3 minutes and 34 seconds. Click for details.
  • Reviewed 932 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py:506
  • Draft comment:
    Consider checking if 'args' is non-empty before accessing args[0] in _emit_input_events to avoid potential index errors for calls without positional arguments.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment raises a valid concern about potential index errors. However, looking at the broader context, this function is only called as a wrapper around specific Watsonx API methods that are documented to require a prompt parameter. The prompt must come either from kwargs or args[0]. If neither exists, that would indicate incorrect API usage rather than a bug in this code. The function is following the expected contract of the API it's wrapping. I could be wrong about the API contract - there might be valid cases where neither kwargs nor args contain a prompt. The code would be more defensive with the suggested check. While being more defensive is generally good, in this case we're wrapping a specific API that has a known contract. Adding checks for invalid API usage would obscure the normal flow. If the API is used incorrectly, failing fast with an index error is appropriate. The comment should be deleted. The current code appropriately handles the expected API contract, and adding defensive checks for invalid API usage would not be helpful.
2. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py:324
  • Draft comment:
    Potential IndexError: The _emit_input_events function assumes that if 'prompt' is not in kwargs then args[0] exists. Consider verifying that args is not empty before accessing args[0].
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The function is called from _wrap() which wraps Watsonx API methods like generate() and generate_text_stream(). These methods require a prompt parameter either as first positional arg or as kwarg. So in practice, either kwargs will have "prompt" or args[0] will exist. The code assumes this contract but doesn't validate it defensively. I could be wrong about the API contract - there could be some wrapped methods that don't require a prompt. Without seeing the full Watsonx API docs, I can't be 100% certain. However, this is an internal helper function used only by the wrapper code, not a public API. The wrapper code controls how it's called and ensures the contract is met. While defensive programming is good, this is an internal helper with a clear contract. The suggested fix would mask real errors if the contract is violated.
3. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py:506
  • Draft comment:
    Event emission is unconditionally invoked in _wrap via _emit_input_events and _emit_response_events. Since these methods internally check should_emit_events(), ensure that this design (always calling them regardless of legacy mode) is intentional.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment raises a valid point about architecture design, but looking deeper: 1. The event emission is actually controlled by Config settings 2. The internal checks in emit_event handle when events should be skipped 3. This is a reasonable pattern - have high level code be simple and push complexity down 4. No actual bug or issue is identified The comment identifies a potential design consideration that could be worth discussing. Maybe there is unnecessary overhead from always calling these methods. The current design is clean and maintainable - having the high level code be simple and pushing complexity down into the event emission layer is a good pattern. Any performance impact would be negligible. This comment should be deleted as it questions a reasonable design choice without identifying any actual problems that need fixing.
4. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/event_handler.py:90
  • Draft comment:
    Ensure that the logic removing 'role' and 'tool_calls' in _emit_message_event matches the intended semantic conventions. The test expects the final logged body to be nested (e.g. {"content": {"content": "..."}}) – please verify that the wrapping provided by the Event logger produces this structure as expected.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the author to ensure that the logic matches the intended semantic conventions and to verify the structure produced by the Event logger. This falls under asking the author to double-check things and ensure behavior, which violates the rules.
5. packages/opentelemetry-instrumentation-watsonx/tests/traces/test_generate.py:84
  • Draft comment:
    Test name contains a duplicated 'with' in 'test_generate_with_with_events_no_content'. Consider renaming it to 'test_generate_with_events_no_content' for clarity.
  • Reason this comment was not posted:
    Marked as duplicate.
6. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/event_handler.py:96
  • Draft comment:
    In the comment starting at line 95, the sentence "So, remove the role from the body if it is the same as the in the event name." contains a typographical error. Consider revising it to something like "So, remove the role from the body if it matches the role indicated in the event name." to improve clarity.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment is technically correct about the typo, it's an extremely minor issue in a code comment that doesn't affect functionality. The meaning is still clear despite the typo. According to the rules, we should not make purely informative comments or comments about unimportant issues. The typo could potentially confuse future developers reading the documentation. Documentation quality is important for maintainability. While documentation quality matters, this is such a minor typo that it doesn't significantly impact understanding. The meaning remains clear despite the awkward wording. This comment should be deleted as it addresses an extremely minor documentation issue that doesn't impact code functionality or significantly affect understanding.
7. packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/utils.py:28
  • Draft comment:
    Typographical Error: The log message on line 28 uses 'OpenLLMetry', which appears to be a typo. It should be corrected to 'OpenTelemetry' to ensure clarity and consistency in error messages.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. packages/opentelemetry-instrumentation-watsonx/tests/metrics/conftest.py:31
  • Draft comment:
    Typographical error in comment: change 'skip test if could not found' to 'skip test if it could not be found' or 'skip test if not found'.
  • Reason this comment was not posted:
    Comment was on unchanged code.
9. packages/opentelemetry-instrumentation-watsonx/tests/metrics/conftest.py:71
  • Draft comment:
    Typographical error in comment: consider revising 'skip test if could not found' to 'skip test if it could not be found' or 'skip test if not found'.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the grammar suggestion is valid, we need to consider if this is an important enough issue to flag. Comments are for documentation, so clarity is important, but this is a minor grammatical issue in test code. The meaning is still clear despite the awkward phrasing. The rules say to only keep comments that clearly require code changes and to avoid unimportant issues. The grammar error could potentially confuse non-native English speakers. Also, since it appears in multiple places, fixing it would improve consistency. While valid points, the meaning is still clear enough for test code, and this is an extremely minor issue that doesn't impact functionality or code quality in any meaningful way. This comment should be deleted as it addresses a minor grammatical issue that doesn't meaningfully impact code quality or functionality.
10. packages/opentelemetry-instrumentation-watsonx/tests/metrics/conftest.py:97
  • Draft comment:
    Typographical error in comment: 'skip test if could not found' should be changed to 'skip test if it could not be found' or 'skip test if not found'.
  • Reason this comment was not posted:
    Comment was on unchanged code.
11. packages/opentelemetry-instrumentation-watsonx/tests/traces/conftest.py:31
  • Draft comment:
    Typographical error: The comment 'skip test if could not found' is grammatically incorrect. Consider changing it to 'skip test if package not found' or a similar phrase for clarity.
  • Reason this comment was not posted:
    Comment was on unchanged code.
12. packages/opentelemetry-instrumentation-watsonx/tests/traces/test_generate.py:80
  • Draft comment:
    Typo in function name: 'test_generate_with_with_events_no_content' contains an extra 'with'. Consider renaming it to 'test_generate_with_events_no_content' for clarity.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment is correct - there is a duplicate 'with' in the function name. However, this is a test file and the function name, while having a typo, is still clear in its intent. The duplicate word doesn't impact functionality or readability significantly. According to the rules, we should not make comments that are purely informative or unimportant. The duplicate word could make the codebase slightly less professional looking. It might also make it harder to search for this test function if someone only searches for the correct name format. While true, these downsides are minor for a test file. The function is still easily findable through partial matches, and the duplicate word doesn't impact understanding or maintenance. Delete the comment. While technically correct, it's a minor naming issue in a test file that doesn't impact functionality or maintainability enough to warrant a comment.

Workflow ID: wflow_UMFrw3XdAZqxvPRi

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

LuizDMM added 3 commits May 11, 2025 21:22
…o current behavior

* Add "use_legacy_attributes" to Config and the Instrumentor constructor, defaulting to True;
* emit events for user prompts and AI responses, following [OpenTelemetry semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/\\\);
* introduce a privacy safeguard by checking the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT environment variable to enable or disable content capture in events;
* implement comprehensive tests to verify the new functionality and ensure that, when use_legacy_attributes == True, the existing behavior remains unchanged (tests aren't working due to incorrect package instrumentation);
- Introduce TypedDicts and dataclasses to standardize inputs for event handling;
- Move `event_logger` to Config to centralize configuration and reduce complexity;
- Add `emit_event` method to encapsulate all event emission logic via `Event` instances;
- Refactor instrumentation to use `emit_event`, improving clarity and maintainability.
@LuizDMM LuizDMM force-pushed the implement-events-watsonx branch from 8212a09 to ed7801b Compare May 12, 2025 00:22
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Function Parameter Mismatch Causes Errors

The _emit_response_events function is defined with only one parameter (response: dict) but is called with two parameters (responses, event_logger) in _handle_response and _handle_stream_response, causing a TypeError. Additionally, _emit_response_events calls emit_event() internally, which requires an event_logger parameter, but this parameter is not available within _emit_response_events's scope, leading to another TypeError.

packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py#L348-L357

def _emit_response_events(response: dict):
for i, message in enumerate(response.get("results", [])):
emit_event(
ChoiceEvent(
index=i,
message={"content": message.get("generated_text"), "role": "assistant"},
finish_reason=message.get("stop_reason", "unknown"),
)
)

packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py#L503-L504

if should_emit_events() and event_logger:
_emit_response_events(responses, event_logger)

packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py#L519-L532

):
_set_model_stream_response_attributes(span, stream_response)
if should_emit_events() and event_logger:
_emit_response_events(
{
"results": [
{
"stop_reason": stream_stop_reason,
"generated_text": stream_generated_text,
}
]
},
)

Fix in CursorFix in Web


Bug: Missing Argument in `_handle_input()` Call

The call to _handle_input() is missing the response_counter argument, which is required by its function definition.

packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py#L567-L568

_handle_input(span, event_logger, name, instance, args, kwargs)

Fix in CursorFix in Web


BugBot free trial expires on July 22, 2025
You have used $0.00 of your $50.00 spend limit so far. Manage your spend limit in the Cursor dashboard.

Was this report helpful? Give feedback by reacting with 👍 or 👎

@nirga nirga merged commit 865c4c9 into traceloop:main Jul 5, 2025
10 checks passed
amitalokbera pushed a commit to amitalokbera/openllmetry that referenced this pull request Jul 15, 2025
…ior (traceloop#2896)

Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants