feat(rest): add task signal endpoints#1205
Conversation
Adds the task signal endpoints the Go SDK / CLI already call but OSS never implemented (#1197): POST /api/tasks/{workflowId}/{status}/signal (async) POST /api/tasks/{workflowId}/{status}/signal/sync (sync, returns SignalResponse) Before this, the SDK's SignalAsync hit POST /tasks/{wfId}/{status}/signal, which had no matching route, so Spring fell through to the all-variable update route /{workflowId}/{taskRefName}/{status} and tried to coerce the literal "signal" into TaskResult.Status -> MethodArgumentTypeMismatchException. Adding the literal /signal segment makes that pattern more specific, so it now wins; a MockMvc routing test (with PathPatternParser, mirroring production) locks this in. "Signal" finds the first non-terminal WAIT task in the workflow (descending into running sub-workflows) and applies the given status + output to it, via the new TaskService.signalTask. The sync variant then waits for the workflow to settle into its next blocking/terminal state and renders a SignalResponse per the returnStrategy, reusing the same poll-to-response logic as executeWorkflow -- extracted into WorkflowSignalResponder so both controllers share it. Tests: TaskServiceTest (signalTask found / not-found), TaskResourceTest (async, sync, not-found, route resolution). Full conductor-rest suite green.
There was a problem hiding this comment.
- How do we know this works end-to-end? The tests are almost a tautology. Completely mocked like
testSignalTask. - Orkes (the source-of-truth per the description) has E2E signal tests. We need equivalent coverage here.
signalTimeoutis missing fromSignalResponse. Orkes sets it on timeout; without it, a timed-out signal looks identical to a successful one.
Thanks for calling that out!
Agreed.
Thanks, will fix |
Two issues raised in PR #1205 review: 1. `SignalResponse` was missing the `signalTimeout` boolean that Orkes sets when the sync-signal poll times out. Without it a timed-out response looks identical to a successful one. Added `signalTimeout` to `SignalResponse` (absorbed by `WorkflowRun` and `TaskRun` via inheritance), propagated it through `NotificationResult.toResponse()`, and set it to `true` in `WorkflowSignalResponder`'s timeout fallback path. 2. Added `SignalTaskSpec` to the test-harness — a Spring Boot integration test backed by a real Redis testcontainer that exercises `TaskService.signalTask()` without any mocking: direct-WAIT-task signal, signal-with-no-blocker, signal-on-nonexistent-workflow, and sub-workflow descent.
|
Thanks for the review, @mp-orkes. Two changes in the latest commit:
E2E integration tests — Added
These cover the four paths in |
…flow ExecutionDAOFacade.getWorkflow() throws NotFoundException for unknown IDs (does not return null). The test was asserting null — corrected to thrown(). The e2e WorkflowRerunTests failure in the same CI run is pre-existing and unrelated to signal changes (it was already failing on the prior commit).
What
Adds the two task signal endpoints that the Go SDK /
conductorCLI already call but OSS never implemented:"Signal" finds the first non-terminal
WAITtask in the workflow (descending into running sub-workflows) and applies the given status + output to it — letting a caller complete a workflow's currently-blocked task without knowing its task reference name.Why
Fixes #1197 (item 2). The Go SDK's
SignalAsyncposts to/api/tasks/{workflowId}/{status}/signal. OSS had no such route, so Spring fell through to the all-variable update route/{workflowId}/{taskRefName}/{status}and tried to coerce the literal"signal"intoTaskResult.Status:Adding the literal
/signalsegment makes that pattern strictly more specific than the all-variable one, so Spring routes to it first. A MockMvc test usingPathPatternParser(the production matcher) assertsPOST /api/tasks/wf123/COMPLETED/signalresolves to the signal handler and not the update handler, locking the fix in.User impact: A first-time user running
conductor task signal --workflow-id <id> --status COMPLETED ...against an OSS server got an opaque 500 (MethodArgumentTypeMismatchException) because the request was silently matching the wrong endpoint. Now the signal reaches the workflow's waiting task as intended.Design
TaskService.signalTask): the find-the-blocked-task-and-update logic lives in the service, mirroring Orkes'OrkesTaskServiceImpl.updatePendingTask/getPendingTasksForWorkflow.SignalResponseperreturnStrategy— the same poll-to-response mechanism the existingexecuteWorkflowendpoint uses. That logic (the Flux poll + recursive blocking-task discovery +NotificationResultassembly) was duplicated inline inWorkflowResource; this PR extracts it into a sharedWorkflowSignalResponderand points bothexecuteWorkflowand the new sync-signal endpoint at it.Orkes parity
Source-of-truth Orkes files mirrored here:
io.orkes.conductor.server.rest.TaskResource#signalWorkflowTaskASync/#signalWorkflowTaskSync— endpoint paths/{workflowId}/{status}/signal[/sync],returnStrategy(defaultTARGET_WORKFLOW) andtimeoutMillis(default 5000) params.io.orkes.conductor.server.service.OrkesTaskServiceImpl#updatePendingTask/#getPendingTasksForWorkflow— "signal = update the first non-terminal blocking task, recursing into sub-workflows."OSS-side additions (no Orkes parallel)
WorkflowSignalResponder— OSS reaches the post-signal blocking state by polling (Flux.interval), whereas Orkes uses an enterpriseWorkflowExecutionMonitornotification mechanism. OSS already chose polling forexecuteWorkflow; this just shares it. The Orkes-only collaborators (OrkesRequestContext/orgId,OrkesAuthenticationpermission checks,OrkesWorkflowExecutor) are multi-tenancy/enterprise and intentionally omitted.Tests
TaskServiceTest:signalTaskfound / not-found.TaskResourceTest: async, sync (StepVerifier), sync not-found (404), and the route-resolution regression test.executeWorkflow/WorkflowResourceTestcases stay green; fullconductor-restsuite passes.