Conversation


@lthls lthls commented May 22, 2025

Here is a proposal for achieving the goals of concurrency-safe lazy values while still providing a guarantee that the function will be called at most once.

The third example from the documentation would look like this:

let entropy =
  (* we use a mebibyte of random data from /dev/urandom *)
  let init_mutex = Mutex.create () in
  let race_behaviour : Lazy.Atomic_repeating.race_behaviour =
    Synchronise {
      (* losers block until the winner releases the mutex *)
      wait = (fun () -> Mutex.protect init_mutex Fun.id);
      (* called once the computation is done, releasing the waiters *)
      broadcast = (fun () -> Mutex.unlock init_mutex)
    }
  in
  Lazy.Atomic_repeating.from_fun ~race_behaviour (fun () ->
    (* the winner holds the mutex for the duration of the computation *)
    Mutex.lock init_mutex;
    In_channel.with_open_bin "/dev/urandom" (fun chan ->
      In_channel.really_input_string chan (1024 * 1024)
    )
  )

I also included a non-blocking version of force (which returns an option), as I think it could be useful.


gasche commented May 24, 2025

I have mixed feelings about this.

It asks users to explain how they want to wait on an ongoing computation, and then how they want to signal that the computation is done to re-activate the waiters. I think that this is a reasonable interface for an agnostic approach to blocking. This is a positive point about your PR, but it also means that your implementation is trying to solve a different problem from mine and should be considered independently: we could have a separate Lazy.Atomic_blocking module with something like your proposal.

I am also not sure that your proposal is better than other proposed designs for explaining how to wait, for example the one proposed by Leo on Discuss. To make an informed decision, your proposal should be compared with those alternative agnostic-blocking proposals, rather than just with mine.

I'm not sure that your API is quite right. For example, there is no way to write a function Mutex.t -> race_behaviour that encapsulates the idea of using a given Mutex to provide the blocking behaviour: a race_behaviour alone is not enough, because if we want to use a Mutex we also need to change the forcing thunk. Maybe your interface should be enriched with a third callback, to be called by the forcing winner. (So maybe block, unblock (instead of broadcast), and wait?)
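A rough sketch of what such an enriched interface could look like (the type name blocking_behaviour, the field names, and the of_mutex helper are all illustrative assumptions here, not part of either proposal):

```ocaml
(* Hypothetical sketch: with three callbacks, a helper
   of_mutex : Mutex.t -> blocking_behaviour can fully encapsulate the
   use of a given mutex, without touching the forcing thunk. *)
type blocking_behaviour = {
  block : unit -> unit;    (* called by the forcing winner before running the thunk *)
  unblock : unit -> unit;  (* called by the winner once the result is available *)
  wait : unit -> unit;     (* called by losers to wait for the winner to finish *)
}

let of_mutex (m : Mutex.t) : blocking_behaviour = {
  block = (fun () -> Mutex.lock m);
  unblock = (fun () -> Mutex.unlock m);
  wait = (fun () -> Mutex.protect m Fun.id);  (* lock then immediately unlock *)
}
```

With this shape, the asymmetric locking in the example above would move out of the user-written thunk and into the block callback.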

I have mixed feelings when contrasting our two examples with mutexes:

  • The interface of your approach sounds natural, but its implementation using mutexes does not look natural to me, because it does not have syntactically apparent critical sections. Locking in the winner is done in an asymmetric way (the thunk only locks), and the use of locking to wait is a bit of a trick.
  • On the other hand, I like that my approach uses a standard exclusion-in-a-critical section mechanism for the mutex, but the fact that the final result must be stored out-of-band is also a bit tricky. (Arguably users have to reimplement more of the thunk mechanism themselves in that approach.)
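That second style could be sketched as follows. This is a reconstruction for illustration only (the helper name make_once is made up), not the code under review: the whole computation runs inside an ordinary critical section, and the result is stored out-of-band in a reference that every forcer re-checks under the lock.

```ocaml
(* Sketch of the critical-section style: a standard mutex-protected
   region, with the result stored out-of-band in a ref. Users
   effectively reimplement part of the thunk mechanism by hand. *)
let make_once (f : unit -> 'a) : unit -> 'a =
  let m = Mutex.create () in
  let result = ref None in
  fun () ->
    Mutex.protect m (fun () ->
      match !result with
      | Some v -> v            (* already computed: reuse the stored value *)
      | None ->
        let v = f () in        (* first forcer computes under the lock *)
        result := Some v;
        v)
```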

I'm confident that my recipe can be ported by users to other concurrency libraries in a simple way, by replacing mutexes with another, scheduler-specific exclusion mechanism. Yours can also be ported in a simple way, maybe even simpler, for libraries that have clear and pleasant waiting-oriented data structures, but its implementation using locks is meh. (Less painful to use than pthreads condition variables.)

Regarding a non-blocking variant of force: I also started my API design with one, and then I realized that I was unsure about the use-cases, and I decided to get rid of it to have something simpler. This can easily be added later if we agree on the implementation without it. I think you could/should also drop it for simplicity.

In summary, my conclusion: this is a proposal for an alternative Lazy.Atomic_blocking module which has merit, but should be understood as a proposal to do scheduler-agnostic blocking and be compared with other proposals in that space. You could/should open your own upstream PR.


gasche commented May 24, 2025

(Nitpick: I find the name "race_behaviour" rather uncomfortable due to the other meaning of "race" it brings to mind. I would recommend calling this "blocking_behaviour" instead or "sync_ops" or whatever.)


lthls commented May 24, 2025

> In summary, my conclusion: this is a proposal for an alternative Lazy.Atomic_blocking module which has merit, but should be understood as a proposal to do scheduler-agnostic blocking and be compared with other proposals in that space. You could/should open your own upstream PR.

My point when submitting this PR was to show that repeating was not necessary to provide concurrency-safe lazy thunks. You might also notice that the Fail case is just the regular Lazy.t behaviour (i.e. our regular lazy values are concurrency-safe, but the behaviour for forcing races is usually not the one we want).
So you're completely right that my API is not good, that the mutex example is not satisfying, and so on, but my aim was just to show that you don't have to give up on your PR: Pierre and I are opposed to repeating computations, but other solutions are fine with us.
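For context, the race_behaviour type in this PR presumably has roughly the following shape (a reconstruction from the example above and this comment, not the exact definition from the diff):

```ocaml
(* Reconstructed shape of the proposed type: Fail gives the regular
   Lazy.t behaviour on a forcing race, while Synchronise lets the
   user plug in scheduler-specific waiting and signalling. *)
type race_behaviour =
  | Fail
  | Synchronise of {
      wait : unit -> unit;       (* losers wait for the winner *)
      broadcast : unit -> unit;  (* winner wakes up the waiters *)
    }
```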

I actually wish we could have discussed this in person (or by video call); it's hard to properly discuss it using only GitHub comments.

> (Nitpick: I find the name "race_behaviour" rather uncomfortable due to the other meaning of "race" it brings to mind. I would recommend calling this "blocking_behaviour" instead or "sync_ops" or whatever.)

Yes, I agree. I left it as it is because I wasn't really planning to push this through, but if I had to actually make a PR upstream I would change it.

@gasche
Copy link
Owner

gasche commented May 24, 2025

I'm happy to let this topic wait for our next in-person conversation. Thanks!
