Safe Deploys #3598

Open
wants to merge 76 commits into
base: main
76 commits
7318cdb
go versioning revisions
axfelix Jun 2, 2025
91fff10
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 2, 2025
42c552b
Update docs/develop/go/versioning.mdx
axfelix Jun 2, 2025
12eb933
add more detail to sdk versioning page intro
axfelix Jun 2, 2025
67897d3
add revisions to ts and python
axfelix Jun 2, 2025
b22a083
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 2, 2025
65c5207
rewrite java, copyedit others
axfelix Jun 4, 2025
d748412
add dotnet revisions
axfelix Jun 4, 2025
1913777
surgery on other parts of docs mentioning worker versioning
axfelix Jun 4, 2025
8a9e4c2
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 4, 2025
5588408
add new worker deployments pages
axfelix Jun 4, 2025
393877c
Merge branch 'safe-deploys' of github.com:temporalio/documentation in…
axfelix Jun 4, 2025
ae9f083
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 4, 2025
5155062
fix broken links
axfelix Jun 4, 2025
33c2a97
Merge branch 'safe-deploys' of github.com:temporalio/documentation in…
axfelix Jun 4, 2025
15ea33e
a few more broken links
axfelix Jun 4, 2025
d0eec66
surgery is fun
axfelix Jun 4, 2025
3bd5312
add top level worker deployments page
axfelix Jun 4, 2025
16f75c1
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 4, 2025
98a9e41
add bullets for worker deployments index page
axfelix Jun 4, 2025
21fbcfb
Merge branch 'safe-deploys' of github.com:temporalio/documentation in…
axfelix Jun 4, 2025
c905bc0
add safe deploys page and sdktabs component
axfelix Jun 4, 2025
7107d23
clarity edits to safe deploys
axfelix Jun 4, 2025
4e08623
commit yarn lock to see if it messes up the dprint CI
axfelix Jun 5, 2025
e04b48c
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 5, 2025
eb32ef2
Update docs/develop/determinism.mdx
axfelix Jun 5, 2025
adcc269
Update docs/develop/determinism.mdx
axfelix Jun 5, 2025
58a957b
Update docs/develop/go/versioning.mdx
axfelix Jun 5, 2025
e01e50a
Update docs/encyclopedia/worker-versioning-legacy.mdx
axfelix Jun 5, 2025
8c174f2
Update docs/encyclopedia/worker-versioning-legacy.mdx
axfelix Jun 5, 2025
71fb38a
Update docs/production-deployment/worker-deployments/kubernetes-contr…
axfelix Jun 5, 2025
5228cad
Update docs/develop/go/versioning.mdx
axfelix Jun 5, 2025
98d4997
edits based on feedback
axfelix Jun 5, 2025
8520afe
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 5, 2025
5670e62
put php versioning back in tree, fell out by accident
axfelix Jun 5, 2025
fd32dfc
Merge branch 'safe-deploys' of github.com:temporalio/documentation in…
axfelix Jun 5, 2025
e87be93
more feedback edits
axfelix Jun 5, 2025
ec16723
update runtime checking anchor
axfelix Jun 6, 2025
9843055
remove TKs from safe deploys for now
axfelix Jun 6, 2025
8b502c4
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 6, 2025
432a68c
handle feedback, rename features
axfelix Jun 16, 2025
dee8ae5
update slug
axfelix Jun 16, 2025
79a5fd5
update legacy links
axfelix Jun 16, 2025
e669eef
Merge branch 'main' into safe-deploys
axfelix Jun 16, 2025
fb1fb1f
further deemphasize workflow cutovers
axfelix Jun 18, 2025
6ee8f90
add yarn lock for build system
axfelix Jun 18, 2025
e5c0cce
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 18, 2025
4170487
better transition language
axfelix Jun 18, 2025
14fcaf3
Merge branch 'safe-deploys' of github.com:temporalio/documentation in…
axfelix Jun 18, 2025
087f00e
additional copyedits
axfelix Jun 23, 2025
e2ffaa0
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 23, 2025
46b5c6b
Update docs/develop/determinism.mdx
axfelix Jun 23, 2025
36c6178
Update docs/develop/dotnet/versioning.mdx
axfelix Jun 23, 2025
39779c3
Update docs/develop/determinism.mdx
axfelix Jun 23, 2025
09b002b
Update docs/develop/dotnet/versioning.mdx
axfelix Jun 23, 2025
39e36c6
Update docs/develop/dotnet/versioning.mdx
axfelix Jun 23, 2025
9665a71
Update docs/develop/go/versioning.mdx
axfelix Jun 23, 2025
5810586
Update docs/develop/go/versioning.mdx
axfelix Jun 23, 2025
9e15e46
Update docs/develop/go/versioning.mdx
axfelix Jun 23, 2025
8c7294b
propagate inline changes across SDKs
axfelix Jun 23, 2025
4adf833
migrate -> move
axfelix Jun 24, 2025
fcb6364
pin your workflows
axfelix Jun 25, 2025
b63d552
gradual ramping
axfelix Jun 25, 2025
21dc65e
clarity edits
axfelix Jun 25, 2025
6ad0665
update CLI syntax
axfelix Jun 25, 2025
0883cef
add limits and search attributes
axfelix Jun 26, 2025
cf7a629
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 26, 2025
e003e3a
fix markdown table
axfelix Jun 26, 2025
28c5c36
CI: Automatic .md and .mdx formatting
github-actions[bot] Jun 26, 2025
ececb74
Add back safe-deployments doc (for now), scope changes to just Worker…
drewhoskins-temporal Jun 26, 2025
032302b
Use house style for generic docs for pinned/auto-upgrade
drewhoskins-temporal Jun 26, 2025
471a867
Fix Default Versioning Behavior docs
drewhoskins-temporal Jun 26, 2025
ca2f89a
Fix search attributes technical docs
drewhoskins-temporal Jun 27, 2025
ac3f488
Improve overview doc flow into patching, present options upfront, har…
drewhoskins-temporal Jun 27, 2025
34fd41c
Per-lang docs: move testing section under patching; remove extraneous…
drewhoskins-temporal Jun 27, 2025
3ccd616
Capitalize Deployment Version per house style
drewhoskins-temporal Jun 27, 2025
2 changes: 1 addition & 1 deletion docs/cli/worker.mdx
Expand Up @@ -480,7 +480,7 @@ metadata, or its creation/modification time.
temporal worker deployment describe-version [options]
```

For example, to describe a deployment version in a deployment
For example, to describe a Deployment Version in a deployment
`YourDeploymentName`, with Build ID `YourBuildID`, and in the default
namespace:

Expand Down
2 changes: 1 addition & 1 deletion docs/cli/workflow.mdx
Expand Up @@ -3928,7 +3928,7 @@ temporal workflow update-options \
```

or to pin the workflow execution to a Worker Deployment, set behavior
to `pinned`:
to Pinned:

```
temporal workflow update-options \
Expand Down
2 changes: 1 addition & 1 deletion docs/develop/dotnet/debugging.mdx
Expand Up @@ -38,7 +38,7 @@ You can debug production Workflows using:

- [Web UI](/web-ui)
- [Temporal CLI](/cli)
- [Replay](/develop/dotnet/testing-suite#replay-test)
- [Replay](/develop/dotnet/testing-suite#replay)
- [Tracing](/develop/dotnet/observability#tracing)
- [Logging](/develop/dotnet/observability#logging)

Expand Down
4 changes: 2 additions & 2 deletions docs/develop/dotnet/index.mdx
Expand Up @@ -61,7 +61,7 @@ Set up the testing suite and test Workflows and Activities.
- [Test frameworks](/develop/dotnet/testing-suite#test-frameworks): Testing provides a framework to facilitate Workflow and integration testing.
- [Testing Workflows](/develop/dotnet/testing-suite#testing-workflows): Ensure the functionality and reliability of your Workflows.
- [Testing Activities](/develop/dotnet/testing-suite#test-activities): Validate the execution and outcomes of your Activities.
- [Replay test](/develop/dotnet/testing-suite#replay-test): Replay recreates the exact state of a Workflow Execution.
- [Replay test](/develop/dotnet/testing-suite#replay): Replay recreates the exact state of a Workflow Execution.

## [Failure detection](/develop/dotnet/failure-detection)

Expand Down Expand Up @@ -115,7 +115,7 @@ Complete Activities asynchronously.

Change Workflow Definitions without causing non-deterministic behavior in running Workflows.

- [Use the .NET SDK Patching API](/develop/dotnet/versioning#dotnet-sdk-patching-api): Patching Workflows using the .NET SDK.
- [Use the .NET SDK Patching API](/develop/dotnet/versioning#patching): Patching Workflows using the .NET SDK.

## [Observability](/develop/dotnet/observability)

Expand Down
2 changes: 1 addition & 1 deletion docs/develop/dotnet/testing-suite.mdx
Expand Up @@ -263,7 +263,7 @@ The following important members are available on the environment to affect the a
- `WorkerShutdownTokenSource` - Token source for issuing Worker shutdown.
- `PayloadConverter` - Defaulted to default payload converter.

## Replay test {#replay-test}
## Replay test {#replay}

**How to do a Replay test using the Temporal .NET SDK**

Expand Down
233 changes: 60 additions & 173 deletions docs/develop/dotnet/versioning.mdx
Expand Up @@ -27,40 +27,63 @@ tags:
- Patching
---

This page shows how to do the following:
import { CaptionedImage } from '@site/src/components';

- [Use the .NET SDK Patching API](#dotnet-sdk-patching-api)
- [Patching in new code](#using-patched-for-workflow-history-markers)
- [Understanding deprecated Patches in the .NET SDK](#deprecated-patches)
- [Safe Deployment of PostPatchActivity](#deploy-postpatchactivity)
Since Workflow Executions in Temporal can run for long periods — sometimes months or even years — it's common to need to make changes to a Workflow Definition, even while a particular Workflow Execution is in progress.

## Introduction to Versioning
The Temporal Platform requires that Workflow code is [deterministic](/workflow-definition#deterministic-constraints).
If you make a change to your Workflow code that would cause non-deterministic behavior on Replay, you'll need to use one of our Versioning methods to gracefully update your running Workflows.
With Versioning, you can modify your Workflow Definition so that new executions use the updated code, while existing ones continue running the original version.
There are two primary Versioning methods that you can use:

Because we design for potentially long running Workflows at scale, versioning with Temporal works differently. We explain more in this optional 30 minute introduction:
- [Versioning with Patching](#patching). This method works by adding branches to your code tied to specific revisions. It can be used to revise in-progress Workflows.
- [Worker Versioning](/production-deployment/worker-deployments/worker-versioning). The Worker Versioning feature allows you to tag your Workers and programmatically roll them out in versioned deployments, so that old Workers can run old code paths and new Workers can run new code paths.

<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}>
<iframe
src="https://www.youtube.com/embed/kkP899WxgzY?autoplay=0"
style={{ position: "absolute", top: 0, left: 0, width: "100%", height: "100%" }}
frameborder="0"
allow="accelerometer; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen>
</iframe>
</div>
## Versioning with Patching {#patching}

## Use the .NET SDK Patching API {#dotnet-sdk-patching-api}
To understand why Patching is useful, it's helpful to first demonstrate cutting over an entire Workflow.

**How to use the .NET SDK Patching API using the Temporal .NET SDK**
### Workflow cutovers

In principle, the .NET SDK's patching mechanism operates similarly to other SDKs in a "feature-flag" fashion. However, the "versioning" API now uses the concept of "patching in" code.
Since incompatible changes only affect open Workflow Executions of the same type, you can avoid determinism errors by creating a whole new Workflow when making changes.
To do this, you can copy the Workflow Definition function, giving it a different name, and make sure that both names are registered with your Workers.

To understand this, you can break it down into three steps, which reflect three stages of migration:
For example, you would duplicate `SayHelloWorkflow` as `SayHelloWorkflowV2`:

- Running `PrePatchActivity` code while concurrently patching in `PostPatchActivity`.
- Running `PostPatchActivity` code with deprecation markers for `my-patch` patches.
- Running only the `PostPatchActivity` code.
```csharp
[Workflow]
public class SayHelloWorkflow
{
    [WorkflowRun]
    public async Task RunAsync()
    {
        // This method contains the original code.
        // (The signature here is a placeholder.)
    }
}

[Workflow]
public class SayHelloWorkflowV2
{
    [WorkflowRun]
    public async Task RunAsync()
    {
        // This method contains the updated code.
    }
}
```

You would then need to update the Worker configuration, and any other identifier strings, to register both Workflow Types:

```csharp
using var worker = new TemporalWorker(
    client,
    new TemporalWorkerOptions("greeting-tasks")
        .AddWorkflow<SayHelloWorkflow>()
        .AddWorkflow<SayHelloWorkflowV2>());
```

The downside of this method is that it requires you to duplicate code and to update any commands used to start the Workflow.
This can become impractical over time.
This method also does not provide a way to version any still-running Workflows -- it is essentially just a cutover, unlike Patching, which we will now demonstrate.
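As a rough sketch of what "updating any commands used to start the Workflow" can look like, the client call below targets the new Workflow Type by name. The `RunAsync(string name)` signature, the server address, and the Workflow ID are assumptions made for illustration; none of them come from the page itself.

```csharp
using System.Threading.Tasks;
using Temporalio.Client;
using Temporalio.Workflows;

[Workflow]
public class SayHelloWorkflowV2
{
    // Hypothetical signature and body; the real Workflow may differ.
    [WorkflowRun]
    public Task<string> RunAsync(string name) =>
        Task.FromResult($"Hello, {name}!");
}

public static class Starter
{
    public static async Task<string> StartV2Async()
    {
        // Assumed address of a local development server.
        var client = await TemporalClient.ConnectAsync(new("localhost:7233"));

        // Every caller has to reference SayHelloWorkflowV2 explicitly;
        // callers that still name SayHelloWorkflow keep starting the old code.
        var handle = await client.StartWorkflowAsync(
            (SayHelloWorkflowV2 wf) => wf.RunAsync("Temporal"),
            new(id: "say-hello-v2-example", taskQueue: "greeting-tasks"));

        return await handle.GetResultAsync();
    }
}
```

Each place that starts the Workflow needs the same change, which is part of why this approach tends to become impractical as versions accumulate.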

Let's walk through this process in sequence.
### Adding a patch

Patching essentially defines a logical branch for a specific change in the Workflow.
If your Workflow is not [pinned to a specific Worker Deployment Version](/production-deployment/worker-deployments/worker-versioning) or if you need to fix a bug in a running Workflow, you can patch it.

Suppose you have an initial Workflow version called `PrePatchActivity`:

Expand Down Expand Up @@ -98,11 +121,10 @@ public class MyWorkflow
}
```

**Problem: You cannot deploy `PostPatchActivity` directly until you're certain there are no more running Workflows created using the `PrePatchActivity` code, otherwise you are likely to cause a nondeterminism error.**

The problem is that you cannot deploy `PostPatchActivity` directly until you're certain there are no more running Workflows created using the `PrePatchActivity` code, otherwise you are likely to cause a nondeterminism error.
Instead, you'll need to deploy `PostPatchActivity` and use the [Patched](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_Patched_System_String_) method to determine which version of the code to execute.

Implementing patching involves three steps:
Patching is a three-step process:

1. Use [Patched](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_Patched_System_String_) to patch in new code and run it alongside the old code.
2. Remove the old code and apply [DeprecatePatch](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_DeprecatePatch_System_String_).
Expand Down Expand Up @@ -139,7 +161,7 @@ public class MyWorkflow
}
```

### Understanding deprecated Patches in the .NET SDK {#deprecated-patches}
### Deprecating patches {#deprecated-patches}

After ensuring that all Workflows started with `PrePatchActivity` code have finished, you can [deprecate the patch](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_DeprecatePatch_System_String_).

Expand All @@ -164,7 +186,7 @@ public class MyWorkflow
}
```

### Safe Deployment of PostPatchActivity {#deploy-postpatchactivity}
### Removing a patch {#deploy-postpatchactivity}

Once all Workflows marked with `my-patch` or earlier have finished, you can safely deploy the version that contains only `PostPatchActivity`.

Expand All @@ -184,9 +206,12 @@ public class MyWorkflow
}
```

### Detailed Description of the Patched Function
Patching allows you to make changes to currently running Workflows.
It is a powerful method for introducing compatible changes without introducing non-determinism errors.
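The three stages described above can be sketched in one place as follows. This is a minimal illustration, not the page's own sample: the Activity bodies, the five-minute timeout, and the `Stage…` helper names are assumptions, and only one stage would be deployed at any given time.

```csharp
using System;
using System.Threading.Tasks;
using Temporalio.Activities;
using Temporalio.Workflows;

public class MyActivities
{
    // Placeholder Activity bodies (assumptions for illustration).
    [Activity]
    public string PrePatchActivity() => "pre-patch";

    [Activity]
    public string PostPatchActivity() => "post-patch";
}

[Workflow]
public class MyWorkflow
{
    // Assumed timeout; a fresh options object is created per call.
    private static ActivityOptions Options =>
        new() { StartToCloseTimeout = TimeSpan.FromMinutes(5) };

    // Only one stage is deployed at a time; swap the call as you progress.
    [WorkflowRun]
    public Task<string> RunAsync() => Stage1PatchedAsync();

    // Stage 1: new executions record a "my-patch" marker and run the new
    // code; executions replaying pre-patch history keep the old code.
    private static Task<string> Stage1PatchedAsync()
    {
        if (Workflow.Patched("my-patch"))
        {
            return Workflow.ExecuteActivityAsync(
                (MyActivities a) => a.PostPatchActivity(), Options);
        }
        return Workflow.ExecuteActivityAsync(
            (MyActivities a) => a.PrePatchActivity(), Options);
    }

    // Stage 2: once no pre-patch executions remain, drop the old branch
    // and mark the patch as deprecated.
    private static Task<string> Stage2DeprecatedAsync()
    {
        Workflow.DeprecatePatch("my-patch");
        return Workflow.ExecuteActivityAsync(
            (MyActivities a) => a.PostPatchActivity(), Options);
    }

    // Stage 3: once executions carrying the marker have finished,
    // remove the patch calls entirely.
    private static Task<string> Stage3RemovedAsync() =>
        Workflow.ExecuteActivityAsync(
            (MyActivities a) => a.PostPatchActivity(), Options);
}
```

In a real codebase each stage would replace the body of `RunAsync` in a separate deployment rather than live side by side as helper methods; they are grouped here only so the progression is visible at a glance.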

This video series examines the behavior of the `patched()` function:
### Detailed Overview of the Patched Function

This video provides an in-depth overview of how the `patched()` function works:

<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}>
<iframe
Expand All @@ -198,148 +223,10 @@ This video series examines into the behavior of the `patched()` function:
</iframe>
</div>

#### Behavior When Not Replaying

If the execution is not replaying, when it encounters a call to `patched()`, it first checks the event history.

- If the patch ID is not in the event history, the execution adds a marker to the event history, upserts a search attribute, and returns `true`.
This happens in the first block of the patch ID.
- If the patch ID is in the event history, the execution doesn't modify the history, and returns `true`.
This happens in a patch ID's subsequent blocks, because the event history was updated in the first block.

There is a caveat to this behavior, which we will cover below.

#### Behavior When Replaying With Marker Before-Or-At Current Location

If the execution is replaying and has a call to `patched()`, and if the event history has a marker from a call to `patched()` in the same place
(which means it will match the original event history), then it writes a marker to the replay event history and returns `true`.

This is similar to the behavior of the non-replay case, and
also happens in a given patch ID's first block.

If the code has a call to `patched()`, and the event history
has a marker with that Patch ID earlier in the history,
it will return `true` and will not modify the
replay event history.

This is also similar to the behavior of the non-replay case, and
also happens in a given patch ID's subsequent blocks.

#### Behavior When Replaying With Marker After Current Location

If the Event History's Marker Event is after the current execution point,
that means the new patch is too early.
The execution will encounter the new patch before the original.
The execution will
attempt to write the marker to the replay event
history, but it will throw a non-deterministic
exception because the replay and original event
histories don't match.

#### Behavior when replaying with no marker for that patch ID

During a Replay, if there is no marker for a given patch ID, the execution will return `false` and will not add a marker to
the event history. In addition, all future calls to `patched()`
with that ID will return `false` -- even after it is done replaying
and is running new code.

The [preceding section](#behavior-when-not-replaying) states that if the execution is not replaying,
the `patched()` function will always return `true`. If
the marker doesn't exist, it will be added, and if
the marker already exists, it won't be re-added.

However, this behavior doesn't occur if there was already
a call to `patched()` with that ID in the replay code, but not
in the event history. In this situation, the function won't return
`true`.

#### Potentially Unexpected Behaviors

Recapping the potentially unexpected behaviors that may occur during a Replay:

If the execution hits a call to `patched()`, but that patch ID isn't _at or before
that point_ in the event history, you may not realize that
the event history _after_ the current execution location matters.
This behavior occurs because:

- If that patch ID exists later, you get a non-determinism error
- If the patch doesn't exist later, you don't get a non-determinism error, and the call returns `false`

If the execution hits a call to `patched()` with an ID that
doesn't exist in the history, then not only will it return
`false` in that occurrence, but it will also return `false` if
the execution surpasses the Replay threshold and is running new code.

#### Implications of these Behaviors

If you deploy new code while Workflows are executing,
any Workflows that were in the middle of executing will Replay
up to the point they were at when the Worker was shut down.
When they do this Replay, they will not follow the `patched()` branches in the code.
For the rest of the execution after they have replayed to the point
before the deployment and worker restart, they will either:

- Use new code if there was no call to `patched()` in the replay code
- If there was a call to `patched()` in the replay code, they will
run the non-patched code during and after replay

This might sound odd, but it's actually exactly what's needed because
that means that if the future patched code depends on earlier patched code,
then it won't use the new code -- it will use the old code instead.

But if
there's new code in the future, and there was no code earlier in the
body that required the new patch, then it can switch over to the new code,
which it will do.

Note that this behavior means that the Workflow _does not always run
the newest code_. It only does that if not replaying or if replay is
surpassed and there hasn't been a call to `patched()` (with that ID) throughout
the replay.

#### Recommendations

Based on this behavior and the implications, when patching in new code, always put the newest code at the top of an if-patched-block.

<!--SNIPSTART dotnet-patching-example-->

```csharp
if (Workflow.Patched("v3"))
{
    // This is the newest version of the code.
    // Put this at the top, so that when a fresh
    // execution runs (not replaying), this call
    // returns true and the new code runs.
}
else if (Workflow.Patched("v2"))
{
    // v2 code.
}
else
{
    // Original code.
}
```

<!--SNIPEND-->

The following sample shows how `patched()` will behave in a conditional block that's arranged differently.
In this case, the code's conditional block doesn't have the newest code at the top.
Because `patched()` will return `true` when not Replaying (except with the preceding caveats), this snippet will run the `v2` branch instead of `v3` in new executions.

<!--SNIPSTART dotnet-patching-anti-example-->

```csharp
if (Workflow.Patched("v2"))
{
    // This is bad because when doing a new execution (i.e. not replaying),
    // patched calls evaluate to true (and put a marker
    // in the event history), which means that new executions
    // will use v2, and miss v3 below.
}
else if (Workflow.Patched("v3"))
{
    // Never reached by new executions.
}
else
{
    // Original code.
}
```

<!--SNIPEND-->
### Testing a Workflow for replay safety

### Best Practice of Using Classes as Arguments and Returns
To make sure your Workflow doesn't need a patch, or that you've patched it successfully, you should incorporate [Replay Testing](/develop/dotnet/testing-suite#replay).
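A replay check along these lines might look like the sketch below. It assumes a Workflow history exported as JSON (for example from the Web UI or CLI); the file path parameter, the `ReplayCheck` helper, the placeholder `MyWorkflow` body, and the Workflow ID string are hypothetical.

```csharp
using System.IO;
using System.Threading.Tasks;
using Temporalio.Common;
using Temporalio.Worker;
using Temporalio.Workflows;

[Workflow]
public class MyWorkflow
{
    // Placeholder body; substitute the Workflow code you intend to deploy.
    [WorkflowRun]
    public Task RunAsync() => Task.CompletedTask;
}

public static class ReplayCheck
{
    // Replays a previously exported history against the current code.
    public static async Task ReplayFromFileAsync(string historyJsonPath)
    {
        var historyJson = await File.ReadAllTextAsync(historyJsonPath);

        var replayer = new WorkflowReplayer(
            new WorkflowReplayerOptions().AddWorkflow<MyWorkflow>());

        // The Workflow ID is only a label for the loaded history here.
        // A history the current code cannot follow is expected to
        // surface as an exception from this call.
        await replayer.ReplayWorkflowAsync(
            WorkflowHistory.FromJson("my-workflow-id", historyJson));
    }
}
```

Running a check like this in CI against a handful of representative histories is one way to catch an incompatible change before it reaches production Workers.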
Contributor

"To make sure your Workflow doesn't need a patch" sounds like the default assumption is that it wouldn't. Does it make sense to say "To determine whether your Workflow needs a patch, or that..." ?

Contributor Author

I would suggest this is an example of why further qualifying replay testing beyond a general recommendation does not add clarity.


As a side note on the Patching API, its behavior is why Temporal recommends using a single object as arguments and returns from Signals, Queries, Updates, and Activities, rather than using multiple arguments/returns.
The Patching API's main use case is to support branching in an `if` block of a method body.
It is not designed to be used to set different methods or method signatures for different Workflow Versions.
## Worker Versioning

Because of this, Temporal recommends that each Signal, Activity, etc, accepts a single object and returns a single object, so the method signature can stay constant, and you can do your versioning logic using `patched()` within the method body.
Temporal's [Worker Versioning](/production-deployment/worker-deployments/worker-versioning) feature allows you to tag your Workers and programmatically roll them out in Deployment Versions, so that old Workers can run old code paths and new Workers can run new code paths. This way, you can pin your Workflows to specific revisions, avoiding the need for patching.
5 changes: 2 additions & 3 deletions docs/develop/go/index.mdx
Expand Up @@ -98,9 +98,8 @@ Complete Activities asynchronously.

Change Workflow Definitions without causing non-deterministic behavior in running Workflows.

- [Temporal Go SDK Patching APIs](/develop/go/versioning#patching)
- [Sanity checking](/develop/go/versioning#sanity-checking)
- [How to use Worker Versioning in Go](/develop/go/versioning#worker-versioning)
- [Temporal Go SDK Versioning APIs](/develop/go/versioning#patching)
- [Runtime checking](/develop/go/versioning#runtime-checking)

## [Observability](/develop/go/observability)

Expand Down