test(agentic-ai): add event handling regression e2e test#7072
Open
ℹ️ Note: the e2e tests will fail until Camunda 8.9.2 is used.
Pull request overview
Adds an e2e regression test suite in the Agentic-AI connector E2E module to validate the AHSP event-handling contract and detect the Camunda 8.9.1 variable-scope leak regression (inner-instance toolCall leaking to root). It also adjusts the shared connectors BPMN models to ensure the regression path is actually exercised (via <zeebe:output> mappings).
Changes:
- Add `L4JAiAgentJobWorkerEventsTests` covering buffered events, events during tool execution (wait vs interrupt), empty payload behavior, and multi-event ordering.
- Update the AHSP connectors BPMN models to include an output mapping on `SuperfluxProduct` (and revise the event BPMN to include a “Pending Tool” job + correlation-key-driven message subscription).
- Add a shared `assertNoToolCallVariableLeak(...)` helper and wire it into existing tool-calling coverage.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/resources/agentic-ai-ahsp-connectors.bpmn | Adds <zeebe:output> mapping for SuperfluxProduct to trigger the regression code path under test. |
| connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/resources/agentic-ai-ahsp-connectors-event.bpmn | Reworks the event-focused BPMN to support deterministic event publication/correlation and “in-flight tool” scenarios. |
| connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/.../L4JAiAgentJobWorkerToolCallingTests.java | Ensures existing tool-calling regression test asserts the variable leak invariant. |
| connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/.../L4JAiAgentJobWorkerEventsTests.java | Introduces the new event-handling regression E2E test suite. |
| connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/.../BaseL4JAiAgentJobWorkerTest.java | Adds assertNoToolCallVariableLeak(...) helper using variable search to detect root-scope leaks. |
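
The leak invariant behind `assertNoToolCallVariableLeak(...)` can be illustrated with a small, dependency-free sketch. This is not the PR's helper: `VariableRecord`, `findRootScopeLeaks`, and `assertNoRootScopeLeak` are hypothetical names, the list stands in for the result of a variable search, and the check assumes that a root-scope variable is one whose scope key equals the process instance key.

```java
import java.util.List;

final class ToolCallLeakCheck {

  /** Hypothetical stand-in for one item returned by a variable search. */
  record VariableRecord(String name, long scopeKey, long processInstanceKey) {}

  /**
   * Returns all variables with the given name that live at the root scope of
   * the given process instance. Assumption: root scope means
   * scopeKey == processInstanceKey.
   */
  static List<VariableRecord> findRootScopeLeaks(
      List<VariableRecord> variables, long processInstanceKey, String variableName) {
    return variables.stream()
        .filter(v -> v.processInstanceKey() == processInstanceKey)
        .filter(v -> v.name().equals(variableName))
        .filter(v -> v.scopeKey() == processInstanceKey)
        .toList();
  }

  /** Fails if the named variable exists at the process-instance root scope. */
  static void assertNoRootScopeLeak(
      List<VariableRecord> variables, long processInstanceKey, String variableName) {
    List<VariableRecord> leaks =
        findRootScopeLeaks(variables, processInstanceKey, variableName);
    if (!leaks.isEmpty()) {
      throw new AssertionError(
          "Variable '" + variableName + "' leaked to root scope: " + leaks);
    }
  }

  public static void main(String[] args) {
    long processInstanceKey = 100L;
    // toolCall exists only in an inner-instance scope (205), so no leak is reported.
    List<VariableRecord> variables =
        List.of(
            new VariableRecord("toolCall", 205L, processInstanceKey),
            new VariableRecord("agentContext", processInstanceKey, processInstanceKey));
    assertNoRootScopeLeak(variables, processInstanceKey, "toolCall");
    System.out.println("no root-scope leak detected");
  }
}
```

Filtering by name and scope rather than asserting on the full variable set keeps the check independent of whatever other variables a scenario legitimately writes to the root scope.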
b9ac9be to 3e87c47
New `L4JAiAgentJobWorkerEventSubprocessTests` covers the agentic-AI sub-process event-handling contract (event before activation, during tool execution with WAIT_FOR_TOOL_CALL_RESULTS / INTERRUPT_TOOL_CALLS / empty payload, plus a no-event control). Each scenario asserts the expected chat conversation and that `toolCall` does not leak to the root scope — the regression tracked by camunda/camunda#51939. Tests pass on 8.9.0 and the engine fix image, fail on 8.9.1.
Merges `L4JAiAgentJobWorkerEventSubprocessTests` (regression-focused, minimal BPMN) and the now-deleted `L4JAiAgentJobWorkerEventsTests` (realistic, connectors BPMN) into a single `L4JAiAgentJobWorkerEventsTests` running on the unified `agentic-ai-ahsp-connectors-event.bpmn`. Adds a `Pending_Tool` service task as a deterministic in-AHSP park point, gives the message subscription a variable-driven correlation key for per-test isolation, and adds an output mapping to `SuperfluxProduct` (in both `connectors.bpmn` and `connectors-event.bpmn`) so tool execution always exercises the regression-sensitive code path. Coverage: event before activation, event during execution (WAIT_FOR_TOOL_CALL_RESULTS / INTERRUPT_TOOL_CALLS, with payload and empty), and multi-event ordering. Each scenario asserts the chat conversation, agent metrics and response text, the user-feedback worker firing once, and that `toolCall` does not leak to the root scope — camunda/camunda#51939. The defense-in-depth leak check in `L4JAiAgentJobWorkerToolCallingTests` is now a real regression detector thanks to the SuperfluxProduct output mapping. All tests pass on the engine fix image and fail on 8.9.1.
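
As a rough illustration of the per-test isolation described in this commit, the sketch below generates a fresh key per test and uses it both as a process start variable and as the correlation key of the published event. Only the variable name `eventCorrelationKey` comes from the PR description; `EventCorrelationSketch`, `EventMessage`, and the message name `agent-event` are hypothetical, and the real tests publish through the Camunda client rather than through this record.

```java
import java.util.Map;
import java.util.UUID;

final class EventCorrelationSketch {

  /** Hypothetical description of a message a test would publish. */
  record EventMessage(String messageName, String correlationKey, Map<String, Object> payload) {}

  /** Fresh key per test, so subscriptions opened by other tests cannot match. */
  static String newCorrelationKey() {
    return UUID.randomUUID().toString();
  }

  /**
   * Start variables for the process instance. Assumption: the BPMN message
   * subscription reads its correlation key from eventCorrelationKey.
   */
  static Map<String, Object> startVariables(String correlationKey) {
    return Map.of("eventCorrelationKey", correlationKey);
  }

  /** Builds the event to publish; "agent-event" is a placeholder message name. */
  static EventMessage eventFor(String correlationKey, Map<String, Object> payload) {
    return new EventMessage("agent-event", correlationKey, payload);
  }

  public static void main(String[] args) {
    String key = newCorrelationKey();
    System.out.println(startVariables(key));
    System.out.println(eventFor(key, Map.of("note", "example payload")));
  }
}
```

Because the key is random per test, a message published by one test can only correlate to the instance started with the same key, even if instances from earlier tests are still active.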
Drop the hand-rolled `awaitPendingToolJobCreated` / `awaitEventSubprocessCompletions` search-request loops in favor of `CamundaAssert.assertThat(...).hasActiveElements(...)` and `.hasCompletedElement(elementId, times)`. Same semantics (the latter waits and fails if not exactly the given count), shorter helpers, no Awaitility dependency in the test class.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3e87c47 to 75f63f2
Description
Adds an e2e regression test suite for the agentic-AI ad-hoc sub-process (AHSP) event-handling contract, designed to catch a bug introduced in Camunda 8.9.1 where inner-instance variables (notably `toolCall` / `toolCallResult` set by the AI Agent on tool activations) leak out of the inner-instance scope into the surrounding AHSP/root scope.

`L4JAiAgentJobWorkerEventsTests` runs on a single BPMN, `agentic-ai-ahsp-connectors-event.bpmn`, that mirrors the realistic connectors tool setup (`SuperfluxProduct`, `Search_The_Web`, etc.) and the user-feedback satisfaction loop, plus a `Pending_Tool` service task whose job the tests intentionally hold to keep the AHSP open while events are published. Per-test message correlation is isolated via a UUID-driven `eventCorrelationKey`.

Scenarios:
- `WAIT_FOR_TOOL_CALL_RESULTS` (default).
- `INTERRUPT_TOOL_CALLS` ("Cancel tool calls").

Each scenario asserts the chat conversation, agent metrics and response text, the user-feedback worker firing exactly once, and the leak invariant (`toolCall` must not exist at the process-instance root scope).

In addition:
- `agentic-ai-ahsp-connectors.bpmn`: `SuperfluxProduct` writes its script result to an intermediate variable, and a `<zeebe:output>` mapping projects it to `toolCallResult`. The presence of the output mapping is what triggers the regression code path on 8.9.1: without it, the buggy parent+local merge in `BpmnVariableMappingBehavior` is never reached. The same change is mirrored in `agentic-ai-ahsp-connectors-event.bpmn`.
- `L4JAiAgentJobWorkerToolCallingTests.executesAgentWithToolCallingAndUserFeedback` already had a defense-in-depth leak check; it was previously a no-op (the connectors BPMN's tools had no output mappings), and is now a real regression detector thanks to the `SuperfluxProduct` output mapping.

Verification:
- `L4JAiAgentJobWorkerEventsTests` (6 tests)
- `L4JAiAgentJobWorkerToolCallingTests.executesAgentWithToolCallingAndUserFeedback`

Note on 8.9.0
8.9.0 is not a stable verification target for this suite when run as a class. Individual tests pass, but in suite mode 3/6 tests time out due to a known CPT/gateway interaction (camunda/camunda#45177, camunda/camunda#45667): when CPT recreates the `CamundaClient` between tests, REST long-poll job-activation requests aren't cleanly cancelled. The gateway delivers the next test's jobs to the dead connection, the new test's worker polls and gets nothing, and the job sits locked until the activation timeout (~60s) expires.

8.9.0 shipped a partial workaround (camunda/camunda#49836: force gRPC + long-polling). The proper fix landed in 8.9.1 (camunda/camunda#49424: cancel pending REST long-polls on cluster purge) along with related cleanup of HTTP connection-pool shutdown. The engine fix image (built on 8.9.1+) doesn't exhibit the lag, so it remains the meaningful verification target alongside 8.9.1 (which fails fast on the AHSP regression assertion).
Related issues
Adds e2e coverage for the regression tracked in camunda/camunda#51939.
Checklist
release, as this branch will be rebased onto main before the next release. Example backport labels:
- `backport stable/8.8`: for changes that should be included in the next 8.8.x release.
- `backport release-8.8.7`: for changes that should be included in the specific release 8.8.7, and this release has already been created. The release branch will be merged back into stable/8.8 later, so the change will be included in future 8.8.x releases as well.