Fix scheduler affinity #405

ispeters · 2022-03-12T04:01:22Z

We have been storing a task<>'s scheduler as an any_scheduler_ref,
which has proven to be a source of use-after-free bugs. This change
switches all the any_scheduler_refs to any_schedulers, fixing the
lifetime issues.

We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues.

janondrusek

Tested internally.

* Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues.

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (facebookexperimental#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (facebookexperimental#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (facebookexperimental#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (facebookexperimental#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind add `unifex::v2::async_scope` (facebookexperimental#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com>

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (facebookexperimental#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (facebookexperimental#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (facebookexperimental#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (facebookexperimental#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind (facebookexperimental#474) add `unifex::v2::async_scope` (facebookexperimental#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error (facebookexperimental#414) move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com> Co-authored-by: Ondrej Lehecka <lehecka@fb.com>

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (facebookexperimental#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (facebookexperimental#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (facebookexperimental#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (facebookexperimental#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind (facebookexperimental#474) add `unifex::v2::async_scope` (facebookexperimental#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error (facebookexperimental#414) move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs avoid warning about missing braces in initializer Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com> Co-authored-by: Ondrej Lehecka <lehecka@fb.com>

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (facebookexperimental#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (facebookexperimental#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (facebookexperimental#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (facebookexperimental#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind (facebookexperimental#474) add `unifex::v2::async_scope` (facebookexperimental#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error (facebookexperimental#414) move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs avoid warning about missing braces in initializer back out change to awaiter_type_t Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com> Co-authored-by: Ondrej Lehecka <lehecka@fb.com>

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind (#474) add `unifex::v2::async_scope` (#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error (#414) move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs avoid warning about missing braces in initializer back out change to awaiter_type_t Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com> Co-authored-by: Ondrej Lehecka <lehecka@fb.com>

This commit implements scheduler affinity -- aka "sticky" scheduling -- in `unifex::task<>`. The idea is that it is impossible for a child operation to cause the current coroutine to resume on the wrong execution context. * `task<>`-based coroutines track and propagate the current scheduler * `at_coroutine_exit` remembers current scheduler from when the cleanup action is scheduled * `schedule` always returns an instance of `sender_for<schedule, the_sender>`, which is also a `scheduler_provider` * scheduler affinity when co_await-ing senders in a `task<>`-returning coroutine * scheduler affinity when co_await-ing awaitables in a `task<>`-returning coroutine * awaitables and senders that are `blocking_kind::always_inline` don't get a thunk * More senders and awaitables support compile-time blocking queries * `co_await schedule(sched)` is magic in a `task<>`-returning coroutine: it changes execution context and schedules a cleanup action to transition back to the original scheduler Move implementation of special co_await behavior of scheduler senders out of task.hpp Hoist untyped RAII containers for coroutine_handle<> out of task<> and its awaiter (facebookexperimental#329) While looking at the binary size impact of adopting coroutines with `unifex::task<>`, I noticed that a number of operations on `coroutine_handle<T>` are expressed in `unifex::task<>` as if they depend on `T` when they don't. The consequence is extra code. This diff creates a `coro_holder` class that uniquely owns a `coroutine_handle<>` and makes `unifex::task<>` inherit from it. We technically lose some type safety, but it's still correct by construction. This change saves about 1.5 kilobytes in one of our apps. Similar to the above, I noticed binary duplication due to false template parameter dependencies in `unifex::task<>`'s awaiter type. This diff hoists a non-type-specific RAII container for a `coroutine_handle<>` that stores the handle as a `std::uintptr_t` so that `task`'s awaiter can use the low bit as a dirty flag. This change saves another ~1.5 kilobytes in one of our apps. Fix scheduler affinity (facebookexperimental#405) * Fix scheduler affinity We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues. Make task<>'s thunk-on-resume unstoppable (facebookexperimental#495) * Make task<>'s thunk-on-resume unstoppable When awaiting an async Sender that swallows done signals (such as let_done(never_sender{}, just)), the user-level code looks like it swallows done signals: ``` // never cancels co_await let_done(never_sender{}, just); ``` However, `task<>`'s Scheduler affinity implementation transforms the above code into this: ``` co_await typed_via(let_done(never_sender{}, just), <current scheduler>); ``` The `schedule()` operation inside the injected `typed_via` can emit done if the current stop token has had stop requested, leading to very non-obvious cancellation behaviour that can't be worked around. This diff introduces a pair of regression tests that capture the above scenario, and the analogous scenario of awaiting an async Awaitable that completes with done. The next diff will fix these failing tests. * Change task<>'s thunk-on-resume to be unstoppable This diff fixes the broken tests in the previous diff. Respect blocking_kind in `let_value()` (facebookexperimental#381) * `let_value()` would always assume `blocking_kind::maybe`, which results in potentially unnecessary reschedule on resumption * replicate `blocking_kind` customization from `finally()` fix `variant_sender` blocking kind (facebookexperimental#474) add `unifex::v2::async_scope` (facebookexperimental#463) * simpler than `unifex::v1::async_scope` (`nest()` and `join()`) * does not support cancellation fixing linter error (facebookexperimental#414) move deduction guide to namespace scope for gcc-10 in scheduler concept, check copy_constructability after requiring call to schedule() work around gcc-10 bugs avoid warning about missing braces in initializer back out change to awaiter_type_t Co-authored-by: Eric Niebler <eniebler@boost.org> Co-authored-by: Ian Petersen <ispeters@gmail.com> Co-authored-by: Ondrej Lehecka <lehecka@fb.com>

Fix scheduler affinity

17b6ece

We have been storing a `task<>`'s scheduler as an `any_scheduler_ref`, which has proven to be a source of use-after-free bugs. This change switches all the `any_scheduler_ref`s to `any_scheduler`s, fixing the lifetime issues.

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 12, 2022

Add missing test change

fec06d9

ispeters merged commit 9b4f273 into broken-stdlib Mar 12, 2022

ispeters deleted the fix_scheduler_affinity branch March 12, 2022 04:37

janondrusek approved these changes Mar 12, 2022

View reviewed changes

janondrusek mentioned this pull request Apr 28, 2023

task<> scheduler affinity (#290) #506

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix scheduler affinity #405

Fix scheduler affinity #405

ispeters commented Mar 12, 2022

janondrusek left a comment

Fix scheduler affinity #405

Fix scheduler affinity #405

Conversation

ispeters commented Mar 12, 2022

janondrusek left a comment

Choose a reason for hiding this comment