Description
Describe the bug
When shutting down the host with WorkflowHost.Stop() / .StopAsync(), the CancellationToken supplied to all operations is cancelled immediately. As a result, the state of workflows or steps that have just completed may not be persisted before the persistence call is itself cancelled.
To Reproduce
Stop the host on the LifeCycleEvent WorkflowCompleted, e.g.:
    [Fact]
    public async Task Scenario()
    {
        var tcs = new TaskCompletionSource<object>();
        Host.OnLifeCycleEvent += (evt) => OnLifeCycleEvent(evt, tcs);
        var workflowId = StartWorkflow(null);
        await tcs.Task;
        GetStatus(workflowId).Should().Be(WorkflowStatus.Complete);
    }

    private async void OnLifeCycleEvent(LifeCycleEvent evt, TaskCompletionSource<object> tcs)
    {
        if (evt is WorkflowCompleted)
        {
            await Host.StopAsync(CancellationToken.None);
            tcs.SetResult(new());
        }
    }
Expected behavior
The workflow's Completed state is persisted. For several persistence providers it is not, because IPersistenceProvider.PersistWorkflow() is cancelled here.
Additional context
I noticed this for the concrete example above but could imagine that several other persistence operations are affected as well. Generally, I would question if persistence operations should be cancellable at all.
I have created some tests to reproduce the issue here: https://github.com/mamidenn/workflow-core/blob/fix-race-condition-on-stop/test/WorkflowCore.IntegrationTests/Scenarios/StopScenario.cs
I will gladly open a PR to fix this issue but would like to get your feedback on what kind of solution you would prefer. I can generally think of two options:
- Not passing the WorkflowHost's CancellationToken to the persistence operations
- Removing the CancellationToken parameter from all write operations in IPersistenceProvider
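
To make the first option more concrete, here is a minimal, self-contained sketch of the race and of how decoupling the write from the host's token avoids it. It does not use Workflow Core's actual types: FakePersistence, PersistAsync, and Demo are hypothetical names invented for illustration, and the 50 ms delay merely stands in for a database round trip that honours cancellation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a persistence provider; not Workflow Core's real interface.
public sealed class FakePersistence
{
    public string Saved { get; private set; } = "Runnable";

    public async Task PersistAsync(string status, CancellationToken token)
    {
        // Simulated I/O that honours cancellation, as many database drivers do.
        await Task.Delay(50, token);
        Saved = status;
    }
}

public static class Demo
{
    // Returns the status that ends up stored when shutdown races with the final write.
    public static async Task<string> RunAsync(bool passHostToken)
    {
        using var hostCts = new CancellationTokenSource();
        var store = new FakePersistence();

        // First option from the list above: writes get a token that shutdown does not cancel.
        var writeToken = passHostToken ? hostCts.Token : CancellationToken.None;
        var write = store.PersistAsync("Complete", writeToken);

        // Shutdown is requested while the write is still in flight.
        hostCts.Cancel();

        try
        {
            await write;
        }
        catch (OperationCanceledException)
        {
            // The write was abandoned; the stored status keeps its old value.
        }

        return store.Saved;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(passHostToken: true));
        Console.WriteLine(await RunAsync(passHostToken: false));
    }
}
```

With the host token passed through, the stored status is expected to stay at its initial value; with CancellationToken.None the write runs to completion. The trade-off is that shutdown then has to wait for in-flight writes, so a real fix would probably still want some upper bound on how long that may take.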