Safe Deploys #3598
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,40 +27,63 @@ tags: | |
- Patching | ||
--- | ||
|
||
This page shows how to do the following: | ||
import { CaptionedImage } from '@site/src/components'; | ||
|
||
- [Use the .NET SDK Patching API](#dotnet-sdk-patching-api) | ||
- [Patching in new code](#using-patched-for-workflow-history-markers) | ||
- [Understanding deprecated Patches in the .NET SDK](#deprecated-patches) | ||
- [Safe Deployment of PostPatchActivity](#deploy-postpatchactivity) | ||
Since Workflow Executions in Temporal can run for long periods — sometimes months or even years — it's common to need to make changes to a Workflow Definition, even while a particular Workflow Execution is in progress. | ||
|
||
## Introduction to Versioning | ||
The Temporal Platform requires that Workflow code is [deterministic](/workflow-definition#deterministic-constraints). | ||
If you make a change to your Workflow code that would cause non-deterministic behavior on Replay, you'll need to use one of our Versioning methods to gracefully update your running Workflows. | ||
With Versioning, you can modify your Workflow Definition so that new executions use the updated code, while existing ones continue running the original version. | ||
There are two primary Versioning methods that you can use: | ||
|
||
Because we design for potentially long-running Workflows at scale, versioning with Temporal works differently. We explain more in this optional 30-minute introduction: | ||
- [Versioning with Patching](#patching). This method works by adding branches to your code tied to specific revisions. It can be used to revise in-progress Workflows. | ||
|
||
- [Worker Versioning](/production-deployment/worker-deployments/worker-versioning). The Worker Versioning feature allows you to tag your Workers and programmatically roll them out in versioned deployments, so that old Workers can run old code paths and new Workers can run new code paths. | ||
|
||
<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}> | ||
<iframe | ||
src="https://www.youtube.com/embed/kkP899WxgzY?autoplay=0" | ||
style={{ position: "absolute", top: 0, left: 0, width: "100%", height: "100%" }} | ||
frameborder="0" | ||
allow="accelerometer; clipboard-write; encrypted-media; gyroscope; picture-in-picture" | ||
allowfullscreen> | ||
</iframe> | ||
</div> | ||
## Versioning with Patching {#patching} | ||
|
||
## Use the .NET SDK Patching API {#dotnet-sdk-patching-api} | ||
To understand why Patching is useful, it's helpful to first demonstrate cutting over an entire Workflow. | ||
|
||
|
||
**How to use the .NET SDK Patching API using the Temporal .NET SDK** | ||
### Workflow cutovers | ||
|
||
In principle, the .NET SDK's patching mechanism operates similarly to other SDKs in a "feature-flag" fashion. However, the "versioning" API now uses the concept of "patching in" code. | ||
Since incompatible changes only affect open Workflow Executions of the same type, you can avoid determinism errors by creating a whole new Workflow when making changes. | ||
|
||
To do this, you can copy the Workflow Definition function, giving it a different name, and make sure that both names are registered with your Workers. | ||
|
||
To understand this, you can break it down into three steps, which reflect three stages of migration: | ||
For example, you would duplicate `SayHelloWorkflow` as `SayHelloWorkflowV2`: | ||
|
||
- Running `PrePatchActivity` code while concurrently patching in `PostPatchActivity`. | ||
- Running `PostPatchActivity` code with deprecation markers for `my-patch` patches. | ||
- Running only the `PostPatchActivity` code. | ||
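```csharp | ||
using System.Threading.Tasks; | ||
using Temporalio.Workflows; | ||
|
||
[Workflow] | ||
public class SayHelloWorkflow | ||
{ | ||
    [WorkflowRun] | ||
    public async Task RunAsync() | ||
    { | ||
        // This method contains the original code. | ||
        // (Placeholder signature and body; yours will differ.) | ||
        await Task.CompletedTask; | ||
    } | ||
} | ||
|
||
[Workflow] | ||
public class SayHelloWorkflowV2 | ||
{ | ||
    [WorkflowRun] | ||
    public async Task RunAsync() | ||
    { | ||
        // This method contains the updated code. | ||
        // (Placeholder signature and body; yours will differ.) | ||
        await Task.CompletedTask; | ||
    } | ||
} | ||
``` | ||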
|
||
You would then need to update the Worker configuration, and any other identifier strings, to register both Workflow Types: | ||
|
||
```csharp | ||
using var worker = new TemporalWorker( | ||
client, | ||
new TemporalWorkerOptions("greeting-tasks") | ||
.AddWorkflow<SayHelloWorkflow>() | ||
.AddWorkflow<SayHelloWorkflowV2>()); | ||
``` | ||
|
||
The downside of this method is that it requires you to duplicate code and to update any commands used to start the Workflow. | ||
This can become impractical over time. | ||
This method also does not provide a way to version any still-running Workflows -- it is essentially just a cutover, unlike Patching, which we will now demonstrate. | ||
|
||
Let's walk through this process in sequence. | ||
### Adding a patch | ||
|
||
Patching essentially defines a logical branch for a specific change in the Workflow. | ||
If your Workflow is not [pinned to a specific Worker Deployment Version](/production-deployment/worker-deployments/worker-versioning) or if you need to fix a bug in a running Workflow, you can patch it. | ||
|
||
Suppose you have an initial Workflow version called `PrePatchActivity`: | ||
|
||
|
@@ -98,11 +121,10 @@ public class MyWorkflow | |
} | ||
``` | ||
|
||
**Problem: You cannot deploy `PostPatchActivity` directly until you're certain there are no more running Workflows created using the `PrePatchActivity` code, otherwise you are likely to cause a nondeterminism error.** | ||
|
||
The problem is that you cannot deploy `PostPatchActivity` directly until you're certain there are no more running Workflows created using the `PrePatchActivity` code; otherwise, you are likely to cause a non-determinism error. | ||
Instead, you'll need to deploy `PostPatchActivity` and use the [Patched](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_Patched_System_String_) method to determine which version of the code to execute. | ||
|
||
Implementing patching involves three steps: | ||
Patching is a three-step process: | ||
|
||
1. Use [Patched](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_Patched_System_String_) to patch in new code and run it alongside the old code. | ||
2. Remove the old code and apply [DeprecatePatch](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_DeprecatePatch_System_String_). | ||
|
@@ -139,7 +161,7 @@ public class MyWorkflow | |
} | ||
``` | ||
|
||
### Understanding deprecated Patches in the .NET SDK {#deprecated-patches} | ||
### Deprecating patches {#deprecated-patches} | ||
|
||
After ensuring that all Workflows started with `PrePatchActivity` code have finished, you can [deprecate the patch](https://dotnet.temporal.io/api/Temporalio.Workflows.Workflow.html#Temporalio_Workflows_Workflow_DeprecatePatch_System_String_). | ||
|
||
|
@@ -164,7 +186,7 @@ public class MyWorkflow | |
} | ||
``` | ||
|
||
### Safe Deployment of PostPatchActivity {#deploy-postpatchactivity} | ||
### Removing a patch {#deploy-postpatchactivity} | ||
|
||
You can safely deploy `PostPatchActivity` once all Workflows labeled `my-patch` or earlier are finished, based on the previously mentioned assertion. | ||
|
||
|
@@ -184,9 +206,12 @@ public class MyWorkflow | |
} | ||
``` | ||
|
||
### Detailed Description of the Patched Function | ||
Patching allows you to make changes to currently running Workflows. | ||
It is a powerful method for introducing compatible changes without introducing non-determinism errors. | ||
|
||
This video series examines the behavior of the `patched()` function: | ||
### Detailed Overview of the Patched Function | ||
|
||
This video provides an in-depth overview of how the `patched()` function works: | ||
|
||
<div style={{ position: "relative", paddingBottom: "56.25%", height: 0 }}> | ||
<iframe | ||
|
@@ -198,148 +223,10 @@ This video series examines into the behavior of the `patched()` function: | |
</iframe> | ||
</div> | ||
|
||
#### Behavior When Not Replaying | ||
|
||
If the execution is not replaying, when it encounters a call to `patched()`, it first checks the event history. | ||
|
||
- If the patch ID is not in the event history, the execution adds a marker to the event history, upserts a search attribute, and returns `true`. | ||
This happens in the first block of the patch ID. | ||
- If the patch ID is in the event history, the execution doesn't modify the history, and returns `true`. | ||
This happens in a patch ID's subsequent blocks, because the event history was updated in the first block. | ||
|
||
There is a caveat to this behavior, which we will cover below. | ||
|
||
#### Behavior When Replaying With Marker Before-Or-At Current Location | ||
|
||
If the execution is replaying and has a call to `patched()`, and if the event history has a marker from a call to `patched()` in the same place | ||
(which means it will match the original event history), then it writes a marker to the replay event history and returns `true`. | ||
|
||
This is similar to the behavior of the non-replay case, and | ||
also happens in a given patch ID's first block. | ||
|
||
If the code has a call to `patched()`, and the event history | ||
has a marker with that Patch ID earlier in the history, | ||
it will return `true` and will not modify the | ||
replay event history. | ||
|
||
This is also similar to the behavior of the non-replay case, and | ||
also happens in a given patch ID's subsequent blocks. | ||
|
||
#### Behavior When Replaying With Marker After Current Location | ||
|
||
If the Event History's Marker Event is after the current execution point, | ||
that means the new patch is too early. | ||
The execution will encounter the new patch before the original. | ||
The execution will | ||
attempt to write the marker to the replay event | ||
history, but it will throw a non-deterministic | ||
exception because the replay and original event | ||
histories don't match. | ||
|
||
#### Behavior when replaying with no marker for that patch ID | ||
|
||
During a Replay, if there is no marker for a given patch ID, the execution will return `false` and will not add a marker to | ||
the event history. In addition, all future calls to `patched()` | ||
with that ID will return `false` -- even after it is done replaying | ||
and is running new code. | ||
|
||
The [preceding section](#behavior-when-not-replaying) states that if the execution is not replaying, | ||
the `patched()` function will always return `true`. If | ||
the marker doesn't exist, it will be added, and if | ||
the marker already exists, it won't be re-added. | ||
|
||
However, this behavior doesn't occur if there was already | ||
a call to `patched()` with that ID in the replay code, but not | ||
in the event history. In this situation, the function won't return | ||
`true`. | ||
|
||
#### Potentially Unexpected Behaviors | ||
|
||
Recapping the potentially unexpected behaviors that may occur during a Replay: | ||
|
||
If the execution hits a call to `patched()`, but that patch ID isn't _at or before | ||
that point_ in the event history, you may not realize that | ||
the event history _after_ the current execution location matters. | ||
This behavior occurs because: | ||
|
||
- If that patch ID exists later, you get a non-determinism error | ||
- If the patch doesn't exist later, you don't get a non-determinism error, and the call returns `false` | ||
|
||
If the execution hits a call to `patched()` with an ID that | ||
doesn't exist in the history, then not only will it return | ||
`false` in that occurrence, but it will also return `false` if | ||
the execution surpasses the Replay threshold and is running new code. | ||
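The replay rules recapped above can be explored with a small toy model. The sketch below is not the Temporal SDK implementation: `PatchSimulator`, `NonDeterminismException`, and the `marker:<id>` string format are hypothetical names made up for illustration, and the event history is reduced to a list of strings. It only mirrors the four cases described in this section.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: hypothetical types, not part of the Temporal .NET SDK.
public sealed class NonDeterminismException : Exception
{
    public NonDeterminismException(string message) : base(message) { }
}

public sealed class PatchSimulator
{
    private readonly List<string> history;   // recorded events, followed by new ones
    private readonly int recordedCount;      // number of events that existed before this run
    private readonly Dictionary<string, bool> memo = new Dictionary<string, bool>();
    private int cursor;                      // index of the next recorded event to match

    public PatchSimulator(IEnumerable<string> recordedHistory)
    {
        history = new List<string>(recordedHistory);
        recordedCount = history.Count;
    }

    public bool IsReplaying => cursor < recordedCount;

    public IReadOnlyList<string> History => history;

    public bool Patched(string id)
    {
        // Subsequent calls with the same ID repeat the first answer
        // and never touch the history.
        if (memo.TryGetValue(id, out var known))
        {
            return known;
        }

        var marker = "marker:" + id;
        bool result;
        if (!IsReplaying)
        {
            // Not replaying: record a marker and take the new branch.
            history.Add(marker);
            result = true;
        }
        else if (history[cursor] == marker)
        {
            // Replaying with the marker at the current location.
            cursor++;
            result = true;
        }
        else if (history.IndexOf(marker, cursor, recordedCount - cursor) >= 0)
        {
            // Replaying with the marker after the current location.
            throw new NonDeterminismException("Marker for '" + id + "' appears later in history.");
        }
        else
        {
            // Replaying with no marker at all: false now and for the rest of the run.
            result = false;
        }

        memo[id] = result;
        return result;
    }

    // Models any other command (for example, scheduling an Activity).
    public void Command(string name)
    {
        if (IsReplaying)
        {
            if (history[cursor] != name)
            {
                throw new NonDeterminismException("Expected '" + history[cursor] + "' but got '" + name + "'.");
            }
            cursor++;
        }
        else
        {
            history.Add(name);
        }
    }
}
```

In this model, a simulator created with an empty history returns `true` from `Patched("my-patch")` and records one marker. A simulator created with a history that has no marker returns `false`, and keeps returning `false` for that ID after the replay has caught up.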
|
||
#### Implications of these Behaviors | ||
|
||
If you deploy new code while Workflows are executing, | ||
any Workflows that were in the middle of executing will Replay | ||
up to the point they were at when the Worker was shut down. | ||
When they do this Replay, they will not follow the `patched()` branches in the code. | ||
For the rest of the execution after they have replayed to the point | ||
before the deployment and worker restart, they will either: | ||
|
||
- Use new code if there was no call to `patched()` in the replay code | ||
- If there was a call to `patched()` in the replay code, they will | ||
run the non-patched code during and after replay | ||
|
||
This might sound odd, but it's actually exactly what's needed because | ||
that means that if the future patched code depends on earlier patched code, | ||
then it won't use the new code -- it will use the old code instead. | ||
|
||
But if | ||
there's new code in the future, and there was no code earlier in the | ||
body that required the new patch, then it can switch over to the new code, | ||
which it will do. | ||
|
||
Note that this behavior means that the Workflow _does not always run | ||
the newest code_. It only does that if not replaying or if replay is | ||
surpassed and there hasn't been a call to `patched()` (with that ID) throughout | ||
the replay. | ||
|
||
#### Recommendations | ||
|
||
Based on this behavior and the implications, when patching in new code, always put the newest code at the top of an if-patched-block. | ||
|
||
<!--SNIPSTART dotnet-patching-example--> | ||
|
||
```csharp | ||
if (Workflow.Patched("v3")) | ||
{ | ||
    // This is the newest version of the code. | ||
    // Put it at the top so that a fresh execution, | ||
    // which is not replaying, gets true from this | ||
    // call and runs the new code. | ||
} | ||
else if (Workflow.Patched("v2")) | ||
{ | ||
    // Code for the v2 patch. | ||
} | ||
else | ||
{ | ||
    // Original code. | ||
} | ||
``` | ||
|
||
<!--SNIPEND--> | ||
|
||
The following sample shows how `patched()` will behave in a conditional block that's arranged differently. | ||
In this case, the code's conditional block doesn't have the newest code at the top. | ||
Because `patched()` will return `true` when not Replaying (except with the preceding caveats), this snippet will run the `v2` branch instead of `v3` in new executions. | ||
|
||
<!--SNIPSTART dotnet-patching-anti-example--> | ||
|
||
```csharp | ||
if (Workflow.Patched("v2")) | ||
{ | ||
    // This is bad: in a new execution (that is, when not replaying), | ||
    // Patched calls evaluate to true (and put a marker | ||
    // in the event history), which means that new executions | ||
    // will use v2 and miss v3 below. | ||
} | ||
else if (Workflow.Patched("v3")) { } | ||
else { } | ||
``` | ||
|
||
<!--SNIPEND--> | ||
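The effect of branch order can be checked without a running Workflow. In the sketch below, the `patched` delegate is a stand-in for the SDK call, and `PatchOrdering`, `NewestFirst`, and `OldestFirst` are hypothetical names used only for this illustration. Passing a delegate that always returns `true` approximates a fresh execution that is not replaying.

```csharp
using System;

// Illustration only: `patched` stands in for the SDK's Patched call so that
// the branch order can be examined in isolation. All names are hypothetical.
public static class PatchOrdering
{
    // Recommended layout: the newest patch is checked first.
    public static string NewestFirst(Func<string, bool> patched)
    {
        if (patched("v3")) return "v3";
        if (patched("v2")) return "v2";
        return "original";
    }

    // Anti-pattern: an older patch is checked before the newest one.
    public static string OldestFirst(Func<string, bool> patched)
    {
        if (patched("v2")) return "v2";
        if (patched("v3")) return "v3";
        return "original";
    }
}
```

With `_ => true` as the delegate, `NewestFirst` selects the `v3` branch while `OldestFirst` stops at `v2`, which is the problem the anti-example describes.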
### Testing a Workflow for replay safety | ||
|
||
### Best Practice of Using Classes as Arguments and Returns | ||
To make sure your Workflow doesn't need a patch, or that you've patched it successfully, you should incorporate [Replay Testing](/develop/dotnet/testing-suite#replay). | ||
"To make sure your Workflow doesn't need a patch" sounds like the default assumption is that it wouldn't. Does it make sense to say "To determine whether your Workflow needs a patch, or that..."?
I would suggest this is an example of why further qualifying replay testing beyond a general recommendation does not add clarity.
||
|
||
As a side note on the Patching API, its behavior is why Temporal recommends using a single object as arguments and returns from Signals, Queries, Updates, and Activities, rather than using multiple arguments/returns. | ||
The Patching API's main use case is to support branching in an `if` block of a method body. | ||
It is not designed to be used to set different methods or method signatures for different Workflow Versions. | ||
## Worker Versioning | ||
|
||
Because of this, Temporal recommends that each Signal, Activity, etc, accepts a single object and returns a single object, so the method signature can stay constant, and you can do your versioning logic using `patched()` within the method body. | ||
Temporal's [Worker Versioning](/production-deployment/worker-deployments/worker-versioning) feature allows you to tag your Workers and programmatically roll them out in Deployment Versions, so that old Workers can run old code paths and new Workers can run new code paths. This way, you can pin your Workflows to specific revisions, avoiding the need for patching. |