Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Order-only dependencies #5514

Open
snowleopard opened this issue Mar 10, 2022 · 8 comments
Open

[RFC] Order-only dependencies #5514

snowleopard opened this issue Mar 10, 2022 · 8 comments
Assignees
Labels
proposal RFC's that are awaiting discussion to be accepted or rejected

Comments

@snowleopard
Copy link
Collaborator

It is sometimes useful to specify dependencies of a build action in two steps:

  • Before executing the action, specify the set of its possible dependencies P. The set P is used only for scheduling, i.e., everything in P must be built before running the action. Such dependencies are often called order-only dependencies.

  • After executing the action, specify the set of its actual dependencies A, where A must be a subset of P. These dependencies are used to determine when the action will need to be rebuilt.

Here are a couple of use-cases:

  • It is common to depend on all C headers in a project (i.e. headers/*.h), to make sure they are up-to-date, and then compile C sources using gcc with the flags -MMD -MF to obtain the set of actual headers that were needed for compilation.

  • A recently discussed use-case in OCaml is to improve build incrementality with respect to library dependencies. For example, if a library B depends on a library A, we can use order-only dependencies to ensure that all modules of A have been built before compiling modules of B, and then use ocamldep to find actual dependencies of each module of B, to avoid recompiling all B's modules whenever any module of A changes.

Order-only dependencies are not new, for example, the Shake build system provides orderOnly and needed primitives to specify order-only and actual dependencies, respectively. See Section 3.5 of this paper for more details.

For now, this RFC only describes the idea but I'll update it with more details after we figure out how to add this feature to Dune.

@snowleopard snowleopard self-assigned this Mar 10, 2022
@snowleopard snowleopard added the proposal RFC's that are awaiting discussion to be accepted or rejected label Mar 10, 2022
@bobot
Copy link
Collaborator

bobot commented Mar 13, 2022

Do you have an idea of the rough API that would give?

@snowleopard
Copy link
Collaborator Author

Do you have an idea of the rough API that would give?

At the level of the build engine, we already have the goal primitive that allows expressing order-only dependencies in build rules:

(** [goal t] ignores all facts that have been accumulated about the dependencies
of [t]. For example, [goal (path p)] declares that a path [p] contributes to
the "goal" of the resulting action builder, which means [p] must be built,
but the contents of [p] is irrelevant. *)
val goal : 'a t -> 'a t

This API in principle seems sufficient, although one could imagine packaging it up into a more explicit combinator, with some associated documentation/examples. I haven't yet had time to think about how that combinator would look like -- any suggestions are welcome!

We could also consider exposing order-only dependencies at the level of the dune language, e.g. adding order_only_deps to the rule stanza, but I'm not sure it's really worth it.

@nojb
Copy link
Collaborator

nojb commented Apr 2, 2022

I tried to make this work a bit but got stuck, perhaps we can have a look during the next dev meeting.

@bobot
Copy link
Collaborator

bobot commented Apr 4, 2022

The order only part is only the first part, we need a way to add the actual dependency as dependency of the 'a t which is computing it. Perhaps something like : ('a * Deps.t) t -> 'a t

@snowleopard
Copy link
Collaborator Author

@bobot We already have this one :)

val dyn_deps : ('a * Dep.Set.t) t -> 'a t

@bobot
Copy link
Collaborator

bobot commented Apr 4, 2022

I'm not sure it is enough, because dyn_deps adds dependency "at the end" instead of "at the start". Perhaps the type I gave is not correct, (unit -> ('a * Deps.t) t) -> 'a t could be better? Or perhaps if I put everything in one function:

deps_after_computation: Deps.t -> (unit -> ('a * Deps.t) t) -> 'a t

deps_after_computation deps0 f: deps0 are order only dependencies. f () returns (v,deps1) with deps1 included in deps0. If deps1 change then f () should be recomputed.

@snowleopard
Copy link
Collaborator Author

snowleopard commented Apr 4, 2022

I think we can use goal "at the start" and any other dependency-creating primitive "at the end" to achieve what we want.

For example, consider this snippet of code:

let if_then_else path_bool path_true path_false : string Action_builder.t =
  let open Action_builder.O in
  (* Declare some order-only dependencies at first *)
  let* () =
    Action_builder.goal
      (Action_builder.paths [ path_bool; path_true; path_false ])
  in
  (* Now start depending on actual paths as needed. *)
  let* bool = Action_builder.contents path_bool in
  match bool with
  | "true" -> Action_builder.contents path_true
  | _ -> Action_builder.contents path_false

I think this will create order-only dependencies for all three paths path_bool, path_true and path_false but only two actual dependencies: on path_bool and one of path_true and path_false, depending on the contents of path_bool, and will also return the corresponding string.

Does this make sense? (It's been a while since I looked at Action_builder, so perhaps I'm missing some subtlety.)

@bobot
Copy link
Collaborator

bobot commented Apr 5, 2022

I think this example is too much simplified because only path_bool change the dependencies. In the use cases the content of path_true could also make the reading of path_false needed. More importantly the use cases are black box (calling an external program) which disallow an explicit description of the dependencies of each reads, indeed in the if_then_else example the order-only dependencies are not needed.

The code would be more similar to:

let call_blackbox call paths : 'a Action_builder.t =
  let open Action_builder.O in
  (* Declare some order-only dependencies at first *)
  let* () = Action_builder.goal  (Action_builder.paths paths)  in
  (** forward declaration of dependencies *)
  let magic_deps, change_magic_deps = mk_magic_deps () in
  (** if the real dependencies, defined later, change we should restart here *)
  let+ () = magic_deps in
  let* v, deps = call () in
  assert ( List.included deps paths);
  (** define the dependencies only now *)
  change_magic_deps deps;
  v

Perhaps my understand of dyn_deps or other currently existing dependency-creating primitive is wrong. My understanding is whatever such primitive we use after let* v, deps = call () in will not restart call ().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal RFC's that are awaiting discussion to be accepted or rejected
Projects
None yet
Development

No branches or pull requests

3 participants