Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow owned data in MappedMutexGuard #290

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

eira-fransham
Copy link

Closes #289

{
let raw = &s.mutex.raw;
let data = f(unsafe { &mut *s.mutex.data.get() });
let data = unsafe { &mut *s.mutex.data.get() };
mem::forget(s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will invalidate the data reference.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is true, forget doesn't invalidate anything. For example, Box::leak leaks the value and returns a borrow of any length.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This doesn't actually invalidate the borrow because s.mutex is a reference.

let data = match f(unsafe { &mut *s.mutex.data.get() }) {
let data = unsafe { &mut *s.mutex.data.get() };

let data = match f(data) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will leak the mutex guard if f(data) panics.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to work out what the correct behaviour here would actually be - I've tried to work out a way to cause an invalid reference using the f closure but as far as I see it the lifetime bounds make it fine. I think just using mem::forget(s) after the match instead of using mem::ManuallyDrop is the correct behaviour.

I would have preferred to use catch_unwind/resume_unwind to make the control flow more explicit but unfortunately that's not available with no_std right now.


let data = unsafe { ptr::read(&mut s.data) };

let data = f(data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential double-drop here: if f panics then data will be dropped again despite having already been read from s.data.

})
let data = unsafe { ptr::read(&mut s.data) };

f(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@eira-fransham
Copy link
Author

@bjorn3 Thank you for your review. I've also noticed that I accidentally based this PR on my other PR implementing the bytemuck traits instead of on master, so I'll fix that too. Would you say that you're in favour of the general idea of the PR, even if there are issues with the implementation? I believe that it's a minor breaking change since it will affect anyone explicitly specifying the type and lifetime parameters of MappedMutexGuard::map/try_map.

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 23, 2021

I think it would be fine, but don't know enough about parking-lot's internals to judge.

@Amanieu
Copy link
Owner

Amanieu commented Jun 23, 2021

I'm very much in favor of this change. In fact I would like to make this a breaking change and directly change MappedMutexGuard instead of adding OwnedMappedMutexGuard.

Could you also extend this to support other lock types (rwlock, reentrant mutex, etc)?

@yume-chan
Copy link

I also want this. My mapper f returns a Box<T> so I need the guard to own the data. However, instead of Mutex I'm using RwLock.

@Amanieu
Copy link
Owner

Amanieu commented Aug 9, 2021

@Vurich Are you still working on this PR?

@eira-fransham
Copy link
Author

I’m sorry, I was working on this as part of a project at work but we went with a solution that no longer needed this change anyway, so I forgot about it. I can finish up this PR and we can get it merged.

@Amanieu
Copy link
Owner

Amanieu commented Aug 9, 2021

Thanks, that would be great!

@Amanieu
Copy link
Owner

Amanieu commented Aug 23, 2021

Hi, are you still updating this PR? Otherwise I can take over, finish it and get it merged.

where
F: FnOnce(&mut T) -> &mut U,
F: FnOnce(&'a mut T) -> U,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost certainly wrong: 'a is the lifetime of the Mutex object, which can be 'static.

Suggested change
F: FnOnce(&'a mut T) -> U,
F: FnOnce(&mut T) -> U,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'a is the lifetime of the MutexGuard's lock of the Mutex, which isn't the same and to my knowledge it is valid to get an &'a mut T from a MutexGuard<'a, _, T> since for the duration of 'a the only reachable path to the T is via the MutexGuard, and the MutexGuard is consumed in this method. Can you come up with an example of a piece of invalid code that would be enabled by this?

Copy link
Author

@eira-fransham eira-fransham Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I gave it some thought and I think I'll remove the lifetime, it's the more-conservative and more-obviously-correct option and it still allows for 99% of valid usecases, most importantly MutexGuard::map(foo, |a| &mut a.some_field). We can generalise f to be a 'a-taking function later if it proves to be a pain point, but it's not worth the possibility of causing UB.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    #[test]
    fn test_map() {
        let mut outer = &mut 0;
        static M: Mutex<i32> = Mutex::const_new(super::RawMutex::INIT, 0);
        let guard = super::MutexGuard::map(M.lock(), |inner: &'static mut i32| {
            outer = inner;
        });
        drop(guard);
        *outer = 1;
    }

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'a is the lifetime of the Mutex. The lifetime of the lock is the lifetime of self.

Copy link
Author

@eira-fransham eira-fransham Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach appears to work and disallows the code we want to disallow while allowing the code we want to allow, but the ergonomics are quite bad (I'll explain what the problem is below):

pub trait FnOnceShim<'a, T: 'a> {
    type Output: 'a;

    fn call(self, input: T) -> Self::Output;
}

impl<'a, F, In, Out> FnOnceShim<'a, In> for F
where
    F: FnOnce(In) -> Out,
    In: 'a,
    Out: 'a,
{
    type Output = Out;

    fn call(self, input: In) -> Self::Output {
        self(input)
    }
}

impl<'a, R: RawMutex + 'a, T: ?Sized + 'a> MutexGuard<'a, R, T> {
    // ...

    pub fn map<F>(
        s: Self,
        f: F,
    ) -> MappedMutexGuard<'a, R, <F as FnOnceShim<'a, &'a mut T>>::Output>
    where
        for<'any> F: FnOnceShim<'any, &'any mut T>,
    {
        // ...
    }
    
    // ...
}

impl<'a, R: RawMutex + 'a, T: 'a> MappedMutexGuard<'a, R, T> {
    // ...
    
    pub fn map<F>(s: Self, f: F) -> MappedMutexGuard<'a, R, <F as FnOnceShim<'a, T>>::Output>
    where
        for<'any> F: FnOnceShim<'any, T>,
    {
        // ...
    }

    // ...
}

The issue is that Rust will no longer automatically convert the closure to the most-general version of it like it will when being passed as an F: FnOnce(..) -> ... Therefore, you need to do something like this:

MutexGuard::map(M.lock(), { |inner| inner } as fn(&mut usize) -> &mut usize)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it help if you changed the bounds to contain both FnOnce and FnOnceShim? Even though it is redundant, it may help with type inference.

Copy link
Author

@eira-fransham eira-fransham Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately that doesn't work - you would have to bound for<'any> F: FnOnce<(&'any mut T,)> because using the stable F: FnOnce(..) -> .. syntax you need to specify the return type, which means the caller must specify a concrete lifetime. The only bound that would work is:

pub fn map<U, F>(
    s: Self,
    f: F,
) -> MappedMutexGuard<'a, R, <F as FnOnceShim<'a, &'a mut T>>::Output>
where
    for<'any> F: FnOnceShim<'any, &'any mut T>,
    F: FnOnce(&'a mut T) -> U,
{
    // ..
}

But then we're back to square one since that will make Rust infer the smaller bound, which is the problem we were having in the first place. I'll see if the shim fn traits in the fn_ops crate fixes the issue.

Copy link
Author

@eira-fransham eira-fransham Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: you can also work around this with impl Trait, but it's only slightly less clunky

let m = Mutex::new((0, 0));

fn make_mapper() -> impl for<'any> FnOnce(&'any mut (usize, usize)) -> &'any mut usize {
    |inner: &mut (_, _)| &mut inner.0
}

let mapper = make_mapper();

let guard = MutexGuard::map(m.lock(), mapper);

And you can also use a local function with manually-specified trait bounds, but client code needs to write this manually since if you're returning e.g. a Ref<'_, T> then you will need to specify that (discovered in rust-lang/rust#58052)

fn annotate<T, U, F>(f: F) -> F
where
    F: FnOnce(&mut T) -> &mut U,
{
    f
}

let m = Mutex::new((0, 0));

let mapper = annotate(|inner: &mut (_, _)| &mut inner.0);

let guard = MutexGuard::map(m.lock(), mapper);

Copy link
Author

@eira-fransham eira-fransham Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I hit a new problem - in order to support try_map, we essentially either need a new trait specialised to try_map or we need to use the try_trait_v2 and control_flow_enum features. Here's what those options look like:

New trait

pub trait FnOnceOptionShim<'a, T: 'a> {
    type Output: 'a;

    fn call(self, input: T) -> Option<Self::Output>;
}

impl<'a, F, In, Out> FnOnceOptionShim<'a, In> for F
where
    F: FnOnce(In) -> Option<Out>,
    In: 'a,
    Out: 'a,
{
    type Output = Out;

    fn call(self, input: In) -> Option<Self::Output> {
        self(input)
    }
}

pub fn try_map<F>(
    s: Self,
    f: F,
) -> Result<
    MappedRwLockReadGuard<'a, R, <F as FnOnceOptionShim<'a, &'a T>>::Output>,
    Self,
>
where
    for<'any> F: FnOnceOptionShim<'any, &'any T>,
{
    let raw = &s.rwlock.raw;
    let data = match f.call(unsafe { &*s.rwlock.data.get() }) {
        Some(data) => data,
        None => return Err(s),
    };
    mem::forget(s);

    Ok(MappedRwLockReadGuard {
        raw,
        data,
        marker: PhantomData,
    })
}

try_trait_v2 + control_flow_enum

pub fn try_map<F>(
    s: Self,
    f: F,
) -> Result<
    MappedRwLockReadGuard<
        'a,
        R,
        <<F as FnOnceShim<'a, &'a T>>::Output as core::ops::Try>::Output,
    >,
    Self,
>
where
    for<'any> F: FnOnceShim<'any, &'any T>,
    for<'any> <F as FnOnceShim<'any, &'any T>>::Output: core::ops::Try,
{
    use core::ops::{ControlFlow, Try};

    let raw = &s.rwlock.raw;
    let data = match Try::branch(FnOnceShim::call(f, unsafe { &*s.rwlock.data.get() })) {
        ControlFlow::Continue(data) => data,
        ControlFlow::Break(_) => return Err(s),
    };
    mem::forget(s);

    Ok(MappedRwLockReadGuard {
        raw,
        data,
        marker: PhantomData,
    })
}

where
F: FnOnce(&mut T) -> Option<&mut U>,
F: FnOnce(&'a mut T) -> Option<U>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

data: *mut T,
marker: PhantomData<&'a mut T>,
// We use `&'a mut` to make this type invariant over `'a`
marker: PhantomData<&'a mut ()>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it actually need to be invariant here? I think we can safely remove this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it invariant because that was how it was in the original code, and it's the most conservative option. Without it this type would be variant, which is probably correct but there's a chance of it being subtly broken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this made T invariant, not 'a (see https://doc.rust-lang.org/nomicon/subtyping.html#variance).

The proper way to get invariance on 'a is PhantomData<fn(&'a ()) -> &'a ()>

@@ -780,10 +788,12 @@ impl<R: RawMutex, T: ?Sized> Drop for ArcMutexGuard<R, T> {
/// could introduce soundness issues if the locked object is modified by another
/// thread.
#[must_use = "if unused the Mutex will immediately unlock"]
pub struct MappedMutexGuard<'a, R: RawMutex, T: ?Sized> {
pub struct MappedMutexGuard<'a, R: RawMutex, T: ?Sized + 'a> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think T has to live as long as 'a, we can remove the 'a bound on T.

Only R needs the bound because we have a &'a R.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It definitely doesn't need the 'a bound on the struct itself, and I think you're right that it isn't needed elsewhere either.

{
use core::ptr;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the top of the file.


let data = unsafe { ptr::read(&mut s.data) };

let data = f(data);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential double-drop here: if f panics then data will be dropped again despite having already been read from s.data.

let raw = s.raw;
let data = f(unsafe { &mut *s.data });

let data = unsafe { ptr::read(&mut s.data) };
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let data = unsafe { ptr::read(&mut s.data) };
let data = unsafe { ptr::read(&s.data) };

@Amanieu
Copy link
Owner

Amanieu commented Aug 30, 2021

Unfortunately the need to cast the closure to a fn is a dealbreaker: it makes the interface unergonomic and doesn't work with closures that capture state.

@eira-fransham
Copy link
Author

eira-fransham commented Aug 31, 2021

Unfortunately the need to cast the closure to a fn is a dealbreaker: it makes the interface unergonomic and doesn't work with closures that capture state.

You don't need to cast to fn, you just need to force Rust to infer the most-general lifetime for the argument. There are a couple of ways to do this, but they all work in essentially the same way:

// Option 1
fn annotate<F, T, U>(f: F) -> F
where
    F: FnOnce(&mut T) -> &mut U,
{
    f
}

let func = annotate(|foo: &mut (_, _)| &mut foo.0);

// Option 2
fn create_mapping_func<A, B>() -> impl FnOnce(&mut (A, B)) -> A {
    |foo| &mut foo.0
}

let func = create_mapping_func();

// Option 3
let func: &mut dyn FnMut(&mut (_, _)) -> &mut _ = &mut |foo| &mut foo.0;

// Option 4
let func: fn(&mut (_, _)) -> &mut _ = (|foo| &mut foo.0) as _;

All of these are pretty bad though, sadly. Rust has a long-standing issue open to fix this, but it has yet to be addressed rust-lang/rust#70263

If it was only required to do this for simple cases, then there would be no problem. The issue is that it's required for all cases. Maybe we could add map_mut and map_ref methods which specialise for FnOnce(&mut T) -> &mut U and FnOnce(&T) -> &U respectively? Again, not great, but it would mean that simple cases still function correctly. You would probably need to add the same for Mapped*Guard, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider allowing owned data in MappedMutexGuard
5 participants