-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ABI-stabilize core::task::Waker
#3404
base: master
Are you sure you want to change the base?
Conversation
A library break in types is, generally, strictly unacceptable, and has already been foreclosed by previous RFCs, unless the matter concerns a soundness violation or it is the only path forward for a language evolution. Editions are not permitted to change the standard library API in the way you propose, because of the Epoch RFC that defined them:
|
Then it's a good thing that's not the one I consider best (or even good). The proposed implementation PR opts for the stable union implementation |
I'll edit the RFC to remove the API break option, since it really isn't one :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the idea to freeze Waker fns to extern "C" fn
.
- Enforcing
extern "C"
in vtable benefits only FFI scenario. More usual non-FFI code doesn't really benefits from it. - Instead, non-FFI code needs to write extra
extern "C"
, and are deoptimized due to implementation limitation: extra indirection call, missed extern-Rust specific future optimization and etc. - It also introduces future compatibility concerns, since we would never change the ABI anymore even after edition bumps (ABI-stabilize
core::task::Waker
#3404 (comment)).
|
||
Futures have now become a core feature of Rust, and returning boxed futures from trait-methods has become a very common pattern when working with `async`. | ||
|
||
However, due to Rust's unstable default ABI, trait-objects are never FFI-safe (note that this doesn't stop users trying). Multiple crates (such as [`abi_stable`](https://crates.io/crates/abi_stable) and [`stabby`](https://crates.io/crates/stabby)) attempt to solve this problem by providing proc-macros that generate stable v-tables for the traits they annotate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth mentioning async-ffi which covers exactly the FFI-compatible waker scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I wasn't aware of this one, I shall look it up this instant :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After checking it out, it uses the first technique I envisioned for stabby
: allocating a new Waker on poll
. stabby
now only allocates on the first call to waker.clone()
unsafe extern "C" fn clone_adapter(clone: unsafe fn(*const ()) -> RawWaker, data: *const ()) -> RawWaker { | ||
clone(data) | ||
} | ||
unsafe extern "C" fn other_adapter(other: unsafe fn(*const ()), data: *const ()) { | ||
other(data) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces indirection penalty to all existing ordinary async code. It sounds unacceptable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I see no other way of guaranteeing that calling unsafe fn
is safe to call as soon as futures can come from the FFI (even if they themselves are ABI-stable).
I think the penalty is negligible considering that the adapters will likely stay in cache, and wakers typically have work that involves either locking or lock-less data structures, both being likely to strongly outweigh the cost of that indirection. I arguably don't have any measure to back this assumption up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the penalty is negligible considering that the adapters will likely stay in cache, and wakers typically have work that involves either locking or lock-less data structures, both being likely to strongly outweigh the cost of that indirection. I arguably don't have any measure to back this assumption up.
It's still not zero-cost for majority non-FFI users.
Also if you think extra indirection is acceptable, then so does the indirection penalties of current vtable wrapping technique. But that is one of reasons why you want this RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will study async-ffi
further to see if their technique bypasses that, but my issue with the current waker isn't the indirection so much as the need to allocate upon cloning the waker... I'll come back to you once I've explored this again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of async-ffi
, the allocation happens as soon as poll
is called, regardless of whether poll
would've clone the waker or not.
I'm not particularly attached to
Don't all constructors on structures that may at some point require new fields that can't be defaulted have the same future compatibility concern? Arguably this one is more subtle due to the fact that fields shouldn't be reordered once done, but that's a minor concern with fields this uniform. Alternative proposal (throwing spaghetti at the wall here):Wakers could be made generic over the vtable layout trait WakerVTable {
type CloneFn: Fn(*const ()) -> RawWaker<Self>;
type OtherFn: Fn(*const ());
unsafe fn clone_fn(&self) -> Self::CloneFn;
unsafe fn wake_fn(&self) -> Self::OtherFn;
unsafe fn wake_by_ref_fn(&self) -> Self::OtherFn;
unsafe fn drop_fn(&self) -> Self::OtherFn;
}
// RawWaker => RawWaker<VTable: WakerVTable = RawWakerVTable>
// Waker => Waker<VTable: WakerVTable = RawWakerVTable>
// Context => Context<VTable: WakerVTable = RawWakerVTable>
// Future => Future<VTable: WakerVTable = RawWakerVTable>
impl WakerVTable for RawWakerVTable {
type CloneOutput = RawWaker;
type CloneFn = fn(*const ()) -> Self::CloneOutput; // not sure how to have the unsafe here...
type OtherFn = fn(*const ());
unsafe fn clone_fn(&self) -> Self::CloneFn {unsafe {crate::mem::transmute(self.clone)}
// ...
} This way, it's possible to design new vtable layouts, the state machine generation can stay entirely identical, code that uses But that way, new code can be written to support other v-table types, such as Support for FFI-safe wakers would be incremental: executors would need to have new methods to spawn futures, which would have to either be stored separately or in an enum; hand-written futures would need to add the new generic to their impl of Future to be compatible with the FFI-safe wakers. But overall, I don't think this breaks anything, and doesn't cost anything to projects that are OK with not being FFI-safe. |
@oxalica could you weigh in on that "alternative proposal" just above? This would even let alternative vtable layout such as the stable one be provided by external crates, removing the need for Combined with I might need mentoring to implement a prototype though, because the (also, sorry for pinging you, I'm not sure if you'd rather I not in the future, do let me know) |
This is an extremely out-there suggestion, but perhaps there could longer-term be some mechanism for Rust to interpret certain Rust-ABI functions as actually being another ABI, and modifying compilation accordingly? Basically, since Rust's ABI is explicitly unstable (and probably will be forever, instead relying on explicit ABI annotation for stable ABIs), we essentially have free reign to choose the ABI depending on the circumstances. So, we could have the ability to say "actually, the functions inside Waker should use the C ABI internally" so that functions without the C ABI are the ones which incur the branch penalty, and C-ABI functions are free. Of course, this is probably not a very useful optimisation, but I figured I'd ponder here if that would be a potential solution. |
@clarfonthey Doing this automatically sounds weird to me, but I'm currently writing another, much more ambitious (to ambitious for me to stand any chance of implementing it without a lot of mentoring) RFC that's similar: what if when running rustc, you could ask it to use a given layout/ABI combination for all things that don't have an explicitly specified one? This would let people instantly make all of their library ffi-safe, at the cost of losing out on the latest optimizations. As better layouts and ABI are found, and provided the compiler team is willing to support them, this would let people use the most recently stabilized efficient layouts globally, instead of needing to mark all types down to core like they currently have to |
We currently have
|
@workingjubilee Thanks for kinda confirming my suspicion, since most of the calling conventions listed in the nomicon are Microsoft calling conventions, I was wondering which ones could really be treated as stable, and how the Microsoft ones were translated on other systems. I see you've reacted to the There's a papercut: assuming you have a Future that's completely generic over the Waker (be it handwritten or generated by an fn returns_impl_future<Waker>() -> impl Future<W> {...}
fn returns_boxed_future<Waker>() -> Box<dyn Future<W>> {...} Presumably, Still, that seems to me like an actually much better proposal than this current one, as it actually expands the scope of how executors would be allowed to work. Should this be a new RFC? |
This is kinda off-topic for this RFC, but just throwing another idea out there: right now, the documentation states that ABI is unstable including between two runs of the same compiler on the same machine:
The point I'm trying to reach is: would the compiler be able to provide a form of hash which, barring hash collision, would identify the way it would represent types as well as what ABI it would use for If such a hash were to exist, we could at least make the "blessed compiler" approach truly safe: for example, Rust could (provided it was asked to) add an undefined symbol How to have Rust consistently use the same ABI is possibly a whole other can of worms, but this could allow safe usage of FFI-unsafe Rust across the FFI-boundary, even if producing artifacts that match may be complex/unfeasible depending on the version/flags used. |
that would imho improperly limit existing cdylibs that do stuff like so: #[derive(Debug, Default)]
pub struct MyState { // type acts like extern type to all users of this cdylib
state: String, // something that triggers improper_cdecl_type
}
#[no_mangle]
pub extern "C" fn alloc_my_state() -> Box<MyState> {
Box::new(MyState::default())
}
#[no_mangle]
pub extern "C" fn free_my_state(_v: Option<Box<MyState>>) {}
#[no_mangle]
pub extern "C" fn modify_my_state(this: &mut MyState, a: u32) {
this.state += char::from_u32(a).unwrap();
}
#[no_mangle]
pub unsafe extern "C" fn debug_my_state(this: &MyState, f: unsafe extern "C" fn(*const u8, usize, *mut ()), f_state: *mut ()) {
let s = format!("{this:?}");
unsafe { f(s.as_ptr(), s.len(), f_state) }
} |
Indeed, |
Optimization levels can impact effective ABI. |
As far as I know, for a given architecture, specifying one of the Windows ABIs either uses the specified Windows calling convention, if there is a version of that Windows CC for the architecture, or it just... doesn't make any sense at all and may produce... unexpected results. |
That's actually exactly what I expected, thanks for confirming that, imma gonna add that to our canary for our projects (that currently use the "blessed compiler" approach: the fact that our user only complain when they use a different version of the compiler leads me to think we got lucky a lot with how they set opt-level). Any other things that spring to mind (excluding the layout randomization flag)? If not, do we know that's all of it? Because if we know compiler-version + opt-level are the only things beside explicit randomization that could cause ABI changes with the same signatures, the "perfect linkage canaries" (those that would give 100% guarantee of ABI-agreement) are actually doable today as a crate 😄 Request for mentorshipAbout the generic future thing: I'd love to give it a try, but the future lang-feature requires that there be exactly 0 generics on the trait. I tried changing that requirement in
|
generally what's done is by using e.g. here a new intrinsic is only used when |
...uh. Whether the function is publicly visible or not. ( I think not in the Rust sense but in the "exported from the binary" sense? ) There's an optimization pass that LLVM runs that can change the calling convention of functions that are known to not be publicly exposed. |
Makes sense, I wouldn't even rely on a non-publicly visible function existing in the binary (since it could also have been inlined on every call-site). Sorry for pestering you so much. Hopefully I'll pay you back for your time by making the Rust-to-Rust-dynlink experience better by trying to regroup all of this :) |
Implementing FFI-safe futures for
stabby
has given me the opportunity to find out just how hardWaker
s are to deal with when attempting to pass futures across the FFI boundary.This RFC aims to make dealing with them not only much easier, but also much more performant than currently possible, removing a tree-sized splinter from the thumb of projects that need to pass futures across the FFI boundary.
Disclaimer: this RFC proposes breaking API for completeness' sake only: my firm opinion is that Rust's backward compatibility guarantees should be protected, and the proposed non-breaking solution's runtime cost, while existent, is likely to become negligible as soon as the common executors switch to the newly proposed constructor for waker vtables.
Rendered