Skip to content

feat(rest): add task signal endpoints#1205

Open
nthmost-orkes wants to merge 8 commits into
mainfrom
feature/task-signal-endpoints
Open

feat(rest): add task signal endpoints#1205
nthmost-orkes wants to merge 8 commits into
mainfrom
feature/task-signal-endpoints

Conversation

@nthmost-orkes

Copy link
Copy Markdown
Contributor

What

Adds the two task signal endpoints that the Go SDK / conductor CLI already call but OSS never implemented:

POST /api/tasks/{workflowId}/{status}/signal        # async, returns 200
POST /api/tasks/{workflowId}/{status}/signal/sync   # sync, returns SignalResponse

"Signal" finds the first non-terminal WAIT task in the workflow (descending into running sub-workflows) and applies the given status + output to it — letting a caller complete a workflow's currently-blocked task without knowing its task reference name.

Why

Fixes #1197 (item 2). The Go SDK's SignalAsync posts to /api/tasks/{workflowId}/{status}/signal. OSS had no such route, so Spring fell through to the all-variable update route /{workflowId}/{taskRefName}/{status} and tried to coerce the literal "signal" into TaskResult.Status:

workflowId = <id>, taskRefName = COMPLETED, status = "signal"   →  MethodArgumentTypeMismatchException

Adding the literal /signal segment makes that pattern strictly more specific than the all-variable one, so Spring routes to it first. A MockMvc test using PathPatternParser (the production matcher) asserts POST /api/tasks/wf123/COMPLETED/signal resolves to the signal handler and not the update handler, locking the fix in.

User impact: A first-time user running conductor task signal --workflow-id <id> --status COMPLETED ... against an OSS server got an opaque 500 (MethodArgumentTypeMismatchException) because the request was silently matching the wrong endpoint. Now the signal reaches the workflow's waiting task as intended.

Design

  • Service layer (TaskService.signalTask): the find-the-blocked-task-and-update logic lives in the service, mirroring Orkes' OrkesTaskServiceImpl.updatePendingTask / getPendingTasksForWorkflow.
  • Sync response: the sync endpoint waits for the workflow to settle into its next blocking/terminal state and renders a SignalResponse per returnStrategy — the same poll-to-response mechanism the existing executeWorkflow endpoint uses. That logic (the Flux poll + recursive blocking-task discovery + NotificationResult assembly) was duplicated inline in WorkflowResource; this PR extracts it into a shared WorkflowSignalResponder and points both executeWorkflow and the new sync-signal endpoint at it.

Orkes parity

Source-of-truth Orkes files mirrored here:

  • io.orkes.conductor.server.rest.TaskResource#signalWorkflowTaskASync / #signalWorkflowTaskSync — endpoint paths /{workflowId}/{status}/signal[/sync], returnStrategy (default TARGET_WORKFLOW) and timeoutMillis (default 5000) params.
  • io.orkes.conductor.server.service.OrkesTaskServiceImpl#updatePendingTask / #getPendingTasksForWorkflow — "signal = update the first non-terminal blocking task, recursing into sub-workflows."

OSS-side additions (no Orkes parallel)

  • WorkflowSignalResponder — OSS reaches the post-signal blocking state by polling (Flux.interval), whereas Orkes uses an enterprise WorkflowExecutionMonitor notification mechanism. OSS already chose polling for executeWorkflow; this just shares it. The Orkes-only collaborators (OrkesRequestContext/orgId, OrkesAuthentication permission checks, OrkesWorkflowExecutor) are multi-tenancy/enterprise and intentionally omitted.

Tests

  • TaskServiceTest: signalTask found / not-found.
  • TaskResourceTest: async, sync (StepVerifier), sync not-found (404), and the route-resolution regression test.
  • Refactor safety: all 37 existing executeWorkflow/WorkflowResourceTest cases stay green; full conductor-rest suite passes.

Adds the task signal endpoints the Go SDK / CLI already call but OSS never
implemented (#1197):

  POST /api/tasks/{workflowId}/{status}/signal        (async)
  POST /api/tasks/{workflowId}/{status}/signal/sync   (sync, returns SignalResponse)

Before this, the SDK's SignalAsync hit POST /tasks/{wfId}/{status}/signal, which
had no matching route, so Spring fell through to the all-variable update route
/{workflowId}/{taskRefName}/{status} and tried to coerce the literal "signal"
into TaskResult.Status -> MethodArgumentTypeMismatchException. Adding the literal
/signal segment makes that pattern more specific, so it now wins; a MockMvc
routing test (with PathPatternParser, mirroring production) locks this in.

"Signal" finds the first non-terminal WAIT task in the workflow (descending into
running sub-workflows) and applies the given status + output to it, via the new
TaskService.signalTask. The sync variant then waits for the workflow to settle
into its next blocking/terminal state and renders a SignalResponse per the
returnStrategy, reusing the same poll-to-response logic as executeWorkflow --
extracted into WorkflowSignalResponder so both controllers share it.

Tests: TaskServiceTest (signalTask found / not-found), TaskResourceTest (async,
sync, not-found, route resolution). Full conductor-rest suite green.
@v1r3n v1r3n requested a review from mp-orkes June 24, 2026 06:01
@nthmost-orkes nthmost-orkes requested review from v1r3n and removed request for v1r3n June 24, 2026 07:01
@nthmost-orkes nthmost-orkes requested a review from v1r3n June 24, 2026 07:36

@mp-orkes mp-orkes left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • How do we know this works end-to-end? The tests are almost a tautology. Completely mocked like testSignalTask.
  • Orkes (the source-of-truth per the description) has E2E signal tests. We need equivalent coverage here.
  • signalTimeout is missing from SignalResponse. Orkes sets it on timeout; without it, a timed-out signal looks identical to a successful one.

@nthmost-orkes

Copy link
Copy Markdown
Contributor Author
  • How do we know this works end-to-end? The tests are almost a tautology. Completely mocked like testSignalTask.

Thanks for calling that out!

  • Orkes (the source-of-truth per the description) has E2E signal tests. We need equivalent coverage here.

Agreed.

  • signalTimeout is missing from SignalResponse. Orkes sets it on timeout; without it, a timed-out signal looks identical to a successful one.

Thanks, will fix

Two issues raised in PR #1205 review:

1. `SignalResponse` was missing the `signalTimeout` boolean that Orkes sets
   when the sync-signal poll times out. Without it a timed-out response looks
   identical to a successful one. Added `signalTimeout` to `SignalResponse`
   (absorbed by `WorkflowRun` and `TaskRun` via inheritance), propagated it
   through `NotificationResult.toResponse()`, and set it to `true` in
   `WorkflowSignalResponder`'s timeout fallback path.

2. Added `SignalTaskSpec` to the test-harness — a Spring Boot integration test
   backed by a real Redis testcontainer that exercises `TaskService.signalTask()`
   without any mocking: direct-WAIT-task signal, signal-with-no-blocker,
   signal-on-nonexistent-workflow, and sub-workflow descent.
@nthmost-orkes

Copy link
Copy Markdown
Contributor Author

Thanks for the review, @mp-orkes. Two changes in the latest commit:

signalTimeout field — Added to SignalResponse (matching Orkes), propagated through NotificationResult.toResponse() to both WorkflowRun and TaskRun, and set to true in WorkflowSignalResponder's timeout fallback path. Three new unit tests in NotificationResultTest cover propagation to WorkflowRun, propagation to TaskRun, and the false-by-default case.

E2E integration tests — Added SignalTaskSpec to the test-harness. It runs against a real Redis testcontainer (same setup as every other spec in that module) and exercises TaskService.signalTask() without mocking:

  1. Signal completes a WAIT task and the workflow advances
  2. Signal with no blocking task returns null
  3. Signal on a non-existent workflow returns null
  4. Signal descends into a sub-workflow to complete its blocked WAIT task

These cover the four paths in TaskServiceImpl.findPendingBlockingTask() against a real backing store.

…flow

ExecutionDAOFacade.getWorkflow() throws NotFoundException for unknown IDs
(does not return null). The test was asserting null — corrected to thrown().

The e2e WorkflowRerunTests failure in the same CI run is pre-existing and
unrelated to signal changes (it was already failing on the prior commit).
@nthmost-orkes nthmost-orkes requested a review from mp-orkes July 3, 2026 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistency between CLI/Go SDK and server REST APIs for workflow status and task signal operations

2 participants