Description
Describe the solution you'd like
It was originally thought that if nothing in user code was referencing a suspended fiber anymore, it would get garbage collected (this is how tasks work in .NET). However, threads actually keep a strong reference to suspended fibers and we reuse threads. On workflow eviction, any number of fibers may be suspended, including the primary one.
The only way to remove a strong reference to a fiber on a thread is to complete the fiber, and the only way to complete the fiber is to resume it until it completes (potentially raising an exception inside it to force it to resume). So we should go over known fibers and raise an exception inside them that is not a StandardError. This needs to take an approach similar to temporalio/sdk-python#499, where we ignore any side effects that could be caused by raising (e.g. don't make an activity command if the user did it inside ensure). Make sure there is a test that tries to create uncollected fibers in any way it can and confirms they are collected. The test_confirm_garbage_collect test (that we had to skip pending this issue) has some utilities/designs for this.
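As a rough illustration of the approach described above, here is a minimal sketch in plain Ruby. It is not the SDK's implementation: `EvictionSignal`, `FiberRegistry`, `add_command`, and `MAX_ATTEMPTS` are hypothetical names invented for this example. It only relies on the core `Fiber#raise` and `Fiber#alive?` methods (Ruby 2.7+), and it assumes every tracked fiber has been resumed at least once, since `Fiber#raise` cannot be used on a fiber that has not started.

```ruby
# Hypothetical exception type. It inherits from Exception rather than
# StandardError so that a bare `rescue` in user code does not swallow it.
class EvictionSignal < Exception; end

# Hypothetical registry standing in for whatever tracks "known fibers".
class FiberRegistry
  MAX_ATTEMPTS = 10 # assumption: bound on re-raising into a stubborn fiber

  attr_reader :commands

  def initialize
    @fibers = []
    @commands = []
    @evicting = false
  end

  # Create a fiber, track it, and run it until it first suspends or finishes.
  def spawn(&block)
    fiber = Fiber.new(&block)
    @fibers << fiber
    fiber.resume
    fiber
  end

  # Stand-in for a side effect such as scheduling an activity.
  # Side effects requested while evicting are silently dropped.
  def add_command(command)
    return if @evicting

    @commands << command
  end

  # Force every suspended fiber to completion by raising inside it.
  def evict
    @evicting = true
    @fibers.each do |fiber|
      attempts = 0
      while fiber.alive? && attempts < MAX_ATTEMPTS
        attempts += 1
        begin
          fiber.raise(EvictionSignal, 'workflow evicted')
        rescue EvictionSignal, StandardError
          # Expected: the exception (or one raised from user cleanup code)
          # propagates out once the fiber terminates.
        end
      end
    end
    @fibers.clear
  end
end

# Usage: a fiber suspended mid-workflow with cleanup in `ensure`.
registry = FiberRegistry.new
ensure_ran = false
fiber = registry.spawn do
  begin
    registry.add_command(:start_timer)
    Fiber.yield # suspended here when eviction happens
  rescue StandardError
    # Not reached: EvictionSignal is not a StandardError.
  ensure
    ensure_ran = true
    registry.add_command(:schedule_activity) # ignored during eviction
  end
end
registry.evict
```

After `evict`, the fiber has run its `ensure` block and terminated, but the command requested from inside `ensure` was not recorded. The retry loop covers fibers that yield again from cleanup code; the bound is there only so a fiber that rescues `Exception` in a loop cannot hang eviction.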
EDIT: These statements about threads holding strong references to fibers are no longer deemed accurate, see first comment.