NewEvents table cluttered with TimerFired events #93

Closed
moldovangeorge opened this issue May 6, 2022 · 2 comments · Fixed by #97

@moldovangeorge
Contributor

Our application relies heavily on timers. We have multiple async flows with a deadline, and we use a timer to enforce that deadline while waiting for a certain service to call back and finish an action.
We follow the best practices for working with timers and cancel a timer once we are no longer waiting for it (that is, when the callback event arrives before the deadline).
A typical use of timers looks something like this:


var cts = new CancellationTokenSource();
var timeoutTask = context.CreateTimer(context.CurrentUtcDateTime.AddHours(someTime), new TEventType(), cts.Token);
var winner = await Task.WhenAny(_callBackEvent.Task, timeoutTask);
if (winner == _callBackEvent.Task)
{
    cts.Cancel();
    var eventResult = _callBackEvent.Task.Result;
    return eventResult;
}

throw new TimeoutException();

After a workflow run, the timer events remain in the NewEvents table even though the instances they belong to have finished (Completed, Failed, etc.). Is this by design, or is something in the way we use timers causing this behavior?

@cgillum
Member

cgillum commented May 6, 2022

I’ll need to investigate, but I think this might be expected for canceled timers. Canceling a timer doesn’t actually delete anything from the Durable store, it just tells the orchestration to not wait for it when transitioning into a completed state. I’ll be interested to know if they stay there forever or if they get deleted after their scheduled fire-time…
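
One way to check this is a read-only query along the following lines. It is only a sketch: the table and column names (dt.NewEvents, dt.Instances, TaskHub, InstanceID, VisibleTime, RuntimeStatus) are the ones that appear in the stored procedures quoted in this thread, while the task hub name, the varchar(50) type, and the assumption that VisibleTime is stored in UTC are placeholders to adjust for your environment.

```sql
-- Sketch only: counts events whose scheduled fire time has already passed,
-- grouped by the runtime status of the instance they belong to.
DECLARE @TaskHub varchar(50) = 'MyTaskHub';  -- hypothetical task hub name; type assumed
DECLARE @now datetime2 = SYSUTCDATETIME();   -- assumes VisibleTime is stored in UTC

SELECT
    I.[RuntimeStatus],
    COUNT(*) AS PastDueEvents,
    MIN(E.[VisibleTime]) AS OldestVisibleTime
FROM dt.NewEvents E
    INNER JOIN dt.Instances I ON
        I.[TaskHub] = E.[TaskHub] AND
        I.[InstanceID] = E.[InstanceID]
WHERE
    E.[TaskHub] = @TaskHub AND
    E.[VisibleTime] < @now
GROUP BY I.[RuntimeStatus];
```

If rows show up for statuses other than Pending and Running, and OldestVisibleTime keeps falling further behind the current time, the events are not being cleaned up after their fire time.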

@moldovangeorge
Contributor Author

moldovangeorge commented May 9, 2022

Looking at the code, I don't see how these events would ever be deleted (which matches our experience so far):

  • The only place where a delete is performed against the NewEvents table is in the _CheckpointOrchestration procedure:
  -- We return the list of deleted messages so that the caller can issue a
  -- warning about missing messages
  DELETE E
  OUTPUT DELETED.InstanceID, DELETED.SequenceNumber
  FROM dt.NewEvents E WITH (FORCESEEK(PK_NewEvents(TaskHub, InstanceID, SequenceNumber)))
      INNER JOIN @DeletedEvents D ON 
          D.InstanceID = E.InstanceID AND
          D.SequenceNumber = E.SequenceNumber AND
          E.TaskHub = @TaskHub
  • But to perform an orchestration checkpoint, the orchestration first has to be picked up by a worker, and the filter in the _LockNextOrchestration procedure prevents finished orchestrations from being picked up (which is correct IMO):
    -- Lock the first active instance that has pending messages.
    -- Delayed events from durable timers will have a non-null VisibleTime value.
    -- Non-active instances will never have their messages or history read.
    UPDATE TOP (1) Instances WITH (READPAST)
    SET
        [LockedBy] = @LockedBy,
        [LockExpiration] = @LockExpiration,
        @instanceID = I.[InstanceID],
        @parentInstanceID = I.[ParentInstanceID],
        @version = I.[Version]
    FROM 
        dt.Instances I WITH (READPAST) INNER JOIN NewEvents E WITH (READPAST) ON
            E.[TaskHub] = @TaskHub AND
            E.[InstanceID] = I.[InstanceID]
    WHERE
        I.TaskHub = @TaskHub AND
        I.[RuntimeStatus] IN ('Pending', 'Running') AND
        (I.[LockExpiration] IS NULL OR I.[LockExpiration] < @now) AND
        (E.[VisibleTime] IS NULL OR E.[VisibleTime] < @now)

So I think any events that remain linked to a finished orchestration will stay in the table forever. Would a new purge procedure that cleans up these orphaned events at deployment time be a good addition?
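
To make the idea concrete, a one-off cleanup could look roughly like the sketch below. This is not code from the project: the task hub name and its varchar(50) type are placeholders, and treating every instance whose RuntimeStatus is not 'Pending' or 'Running' as finished is an assumption that simply mirrors the filter in _LockNextOrchestration. It also assumes that no new events will legitimately arrive for those instances.

```sql
-- Sketch only: removes events that belong to instances the lock filter
-- will never pick up again. Wrapped in a transaction so the result can be
-- inspected before anything is committed.
DECLARE @TaskHub varchar(50) = 'MyTaskHub';  -- hypothetical task hub name; type assumed

BEGIN TRANSACTION;

DELETE E
OUTPUT DELETED.InstanceID, DELETED.SequenceNumber
FROM dt.NewEvents E
    INNER JOIN dt.Instances I ON
        I.[TaskHub] = E.[TaskHub] AND
        I.[InstanceID] = E.[InstanceID]
WHERE
    E.[TaskHub] = @TaskHub AND
    I.[RuntimeStatus] NOT IN ('Pending', 'Running');  -- assumed to mean "finished"

-- Review the OUTPUT rows, then replace ROLLBACK with COMMIT to keep the change.
ROLLBACK TRANSACTION;
```

The OUTPUT clause follows the same pattern as the delete in _CheckpointOrchestration, so the removed InstanceID and SequenceNumber pairs can be logged or reviewed.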
