Skip to content

Conversation

@yuandrew
Copy link
Contributor

@yuandrew yuandrew commented Nov 13, 2025

What was changed

Add a WorkerCapabilities field to allow activity/workflow/nexus only workers to be registered on the same client/NS/TQ

Why?

Even though there isn't really a valid/useful scenario where this should be used, we want to support this to minimize the breakage for users, since previously this scenario did not return an error. This is likely primarily used in tests or maybe code organization.

Checklist

  1. Closes

  2. How was this tested:

Added new tests

  1. Any docs updates needed?

Note

Allow registering workers with non-overlapping capabilities (workflow/activity/nexus) on the same client/namespace/task queue and reserve WFT slots only from workflow-capable workers.

  • Client worker registry (crates/client/src/worker_registry):
    • Introduce WorkerCapabilities and require non-empty capabilities; expose via ClientWorker::worker_capabilities.
    • Change registration to allow multiple workers on same namespace/task_queue/build_id if capabilities do not overlap; update error messaging.
    • Store providers as RegisteredWorkerInfo { worker_id, build_id, capabilities } instead of (Uuid, Option<String>).
    • WFT reservation filters to workflow-capable workers; selection order logic updated accordingly.
  • SDK core worker (crates/sdk-core/src/worker/mod.rs):
    • Derive capabilities from WorkerConfig (workflow/activities/nexus enabled) and pass through ClientWorkerRegistrator.
    • ClientWorkerRegistrator implements worker_capabilities().
  • Client exports/tests:
    • Re-export WorkerCapabilities in crates/client/src/lib.rs.
    • Update raw client/tests to include capabilities; add tests for coexistence, overlap rejection, non-workflow filtering, and validation.

Written by Cursor Bugbot for commit 2b33733. This will update automatically on new commits. Configure here.

@yuandrew yuandrew requested a review from a team as a code owner November 13, 2025 01:55
});

// Determine worker capabilities based on configuration
let capabilities = temporalio_client::WorkerCapabilities {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% confident this is correct in how we detect workflow/nexus only workers

Copy link
Member

@cretz cretz Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, I don't think langs tell Core today whether it will ever ask for workflow or Nexus work. This may be new (required) worker options. Never before has Core cared ahead of time which poll calls were going to be made.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually kinda want to switch to that model anyway. The whole implicit polling thing complicates shutdown and switching to an up-front model would make some nasty code go away (and actually save some resources too).

The only complication would be if we ever expect allowing people to dynamically start polling for a certain kind when they weren't before... but... I'm not sure that's a scenario I think matters at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only complication would be if we ever expect allowing people to dynamically start polling for a certain kind when they weren't before... but... I'm not sure that's a scenario I think matters at all.

That could potentially be something we allow users to configure from server commands?

But would it be that be that complicated? I guess that would mean more code is kept in callbacks and more mutex/atomic variables to track state for worker commands and shutdown and such

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. We can but I'm honestly not sure if we should. More knobs and complexity for unclear benefit. Though, maybe, it's kind of the same mechanism as worker "pause" which people have asked for.

I'm kinda worried about pausing/resuming polling (mostly for workflows) because shutdown is already really hard to get right and this is like extra-special shutdown.

But, regardless, I don't want you to deal with any of this right now. IMO it makes sense to add info about whether or not the worker polls for the 3 task kinds. For now, I'm perfectly fine with requiring lang to say up-front if the worker should start life polling for which task types. It can easily do that by just checking what is registered, like it does now to decide if it should make poll calls at all. If it calls poll on something it didn't say should be turned on, just immediately return PollError::Shutdown

All the other stuff we can address later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants