Description
This is related to the discussion in JuliaLang/Pkg.jl#1285 but the plan I propose doesn't touch Pkg at all, so I'm writing it up here. @Wimmerer contacted me about wanting to work on this feature and we had a back and forth about what needs to be done. Here's what I propose as a concrete solution to the glue code problem, aka conditional dependencies.
The problem we want to address occurs when there are packages, say A and B, which don't depend on each other, but when both are loaded at the same time, there are some additional definitions (usually of methods) that are needed to make them work well together. For example, suppose A provides a function A.f and B defines a type B.T and there's a method A.f(::B.T) that would be useful to have. This method cannot be defined in A since B.T isn't available there, and it can't be defined in B since A.f isn't available there. We need a way to provide that definition at the point when both A and B have been loaded.
The solution comes in two parts.
Part 1: post-load hooks for sets of packages
The first part is to implement and expose a mechanism whereby you can register a callback to be called when a set of packages have been loaded. The packages should be identified by UUID and it should be an arbitrary set of packages, not just one or two packages. This logic goes right after a new package has been loaded and should check for each registered hook, whether the set of loaded package UUIDs is a superset of the set of required package UUIDs, and if it is, then call the hook callback and then delete it.
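A minimal sketch of such a hook registry, assuming hypothetical names (LoadHook, LOAD_HOOKS, LOADED, register_load_hook, package_loaded!) that are not part of Julia's loader. One deliberate deviation from the description above: the sketch removes a hook from the registry just before calling it, so that a hook which itself triggers another package load cannot fire twice.

```julia
using UUIDs

# Hypothetical registry entry: a required set of package UUIDs plus a callback.
struct LoadHook
    required::Set{UUID}
    callback::Function
end

const LOAD_HOOKS = LoadHook[]
const LOADED = Set{UUID}()  # stand-in for the loader's record of loaded packages

# Callback comes first so that do-block syntax works.
register_load_hook(callback, uuids) =
    push!(LOAD_HOOKS, LoadHook(Set{UUID}(uuids), callback))

# Would run right after a package has finished loading.
function package_loaded!(uuid::UUID)
    push!(LOADED, uuid)
    # A hook is ready once the loaded set is a superset of its required set.
    ready = filter(h -> issubset(h.required, LOADED), LOAD_HOOKS)
    # Drop ready hooks before running them so each fires at most once.
    filter!(h -> !issubset(h.required, LOADED), LOAD_HOOKS)
    foreach(h -> h.callback(), ready)
    return nothing
end
```

Because hooks are keyed on a set of UUIDs, the same code handles one, two, or any number of required packages.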
Part 2: glue/*.jl entry-points
The second part leverages the first mechanism and uses it to load glue code that ships inside the packages being loaded. The idea is that package A will have a top-level directory called glue and a file glue/B.jl which contains the code that implements the methods that are useful when both A and B are loaded. In more detail:
- When Julia loads a new package, say A, it should look at all glue/*.jl files
- For each one, split the file name (minus the .jl extension) on commas; those are the names of the glue dependencies
  - e.g. glue/B.jl in the package A would be the file providing glue definitions for A and B
  - If any part of a name isn't a valid package name, ignore the file and continue
- To resolve each glue dependency name to a package UUID, look it up in a new section of A's project file entitled [glue] with the same format as the [deps] and [extras] sections, i.e. mapping names to UUIDs
  - If a name doesn't exist there, error or maybe ignore?
- Use the contents of the glue file to define a hook that will trigger when the set of glue dependencies have all been loaded, using the mechanism created in part 1.
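The name-splitting and UUID-resolution steps above could look roughly like this. The function names (glue_names, glue_uuids, is_valid_pkg_name) are hypothetical, "valid package name" is assumed to mean a plain Julia identifier, and the [glue] section is modeled as a Dict of name to UUID string.

```julia
using UUIDs

# Assumption: a valid package name is a plain Julia identifier.
is_valid_pkg_name(name) = Base.isidentifier(name)

# Returns the glue dependency names for a file such as "B,C.jl",
# or `nothing` if the file should be ignored.
function glue_names(filename::AbstractString)
    endswith(filename, ".jl") || return nothing
    names = split(chop(filename; tail = 3), ',')
    all(is_valid_pkg_name, names) || return nothing
    return String.(names)
end

# Resolves names through the [glue] table of the project file.
# This sketch throws a KeyError on a missing name; the proposal
# leaves the choice between erroring and ignoring open.
function glue_uuids(names, glue_table::AbstractDict)
    return Set(UUID(glue_table[name]) for name in names)
end
```

The resulting UUID set is exactly what a hook from part 1 would be registered against.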
So, glue/B.jl in the package A would be the glue for A and B, and glue/B,C.jl would be glue for the packages B and C together, i.e. definitions that depend on all three of A, B and C.
The next question is how to use the glue file to define a hook. We want the behavior of the glue files to be fairly constrained, so I propose that the hook generated look something like this:
()->@eval Module() begin
import A, B
include($(abspath(glue_file)))
end
This evaluates the glue code in an anonymous module that imports A and B. I'm not sure how easy it would be to rig it up so that this is done in a context where only A and B can be loaded, or whether that's really necessary or desirable, but otherwise the code can load anything that can be loaded in Main, which probably isn't ideal or sensible for glue hooks.
This hook could probably be expressed generically and be written as glue_hook(glue_path, A, B) or something like that, which might avoid some code generation. Or in the other direction, we could insert the contents of the glue file into the above template and generate that function. But I suspect that's not what we want. Probably better to be late binding and do less code generation.
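One way the generic, late-binding variant might look, as a sketch only. It takes the already-loaded modules as arguments, binds each one under its own name in a fresh module (approximating the import line in the template), and includes the glue file there with the two-argument Base.include. The signature is an assumption, not a settled design.

```julia
# Hypothetical generic hook: `mods` are the already-loaded modules the glue needs.
function glue_hook(glue_path::AbstractString, mods::Module...)
    sandbox = Module(:Glue)
    for m in mods
        # Bind each module under its own name, approximating `import A, B`.
        Core.eval(sandbox, :(const $(nameof(m)) = $m))
    end
    # Evaluate the glue file's top-level code inside the sandbox module.
    Base.include(sandbox, abspath(glue_path))
    return sandbox
end
```

Nothing here is generated per glue file; the path and modules are ordinary runtime values, which is the late-binding property argued for above.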
The actual contents of a glue file would be pretty simple. For example, the glue file for defining the method A.f(::B.T) would simply contain that method definition:
A.f(::B.T) = definition
This would either be glue/B.jl in package A or glue/A.jl in package B.
Questions
What happens if glue/B.jl in package A and glue/A.jl in package B both exist? Load them both.
What about the order? Eh, whatever order they happen in is probably fine, but we could maybe have a defined ordering based on UUIDs or something.
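If a defined ordering were wanted, one possibility is to sort the ready glue hooks by the UUID of the package that ships them. The PendingGlue type and run_in_uuid_order function below are hypothetical, purely to illustrate the idea.

```julia
using UUIDs

# Hypothetical record: which package ships the glue file, and the hook to run.
struct PendingGlue
    provider::UUID
    run::Function
end

# Run all ready glue hooks in ascending order of the providing package's UUID.
# Sorting on the canonical lowercase string form gives a stable, platform-independent order.
function run_in_uuid_order(pending::Vector{PendingGlue})
    for g in sort(pending; by = g -> string(g.provider))
        g.run()
    end
    return nothing
end
```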
What if they have conflicting definitions and it breaks? That's a normal package incompatibility between those versions of A and B, although it's a slightly weird one because neither A nor B depends on the other but their versions are incompatible. I'm not certain if we can express that in [compat]; it's possible that we can't. At the very least, [compat] probably needs to know about the [glue] section, which may end up being a reason to put glue dependencies in the [extras] section since I think [compat] may already know about that.
Why not make glue packages real registered external packages that live outside of both A and B? Because that complicates things massively. In that design, Pkg needs to know about glue packages and needs to make sure that whenever A and B are both in a manifest, Glue_A_B is also included in the manifest. Also, registering glue packages and versioning them separately seems like a lot of overhead.
What if the glue behavior depends on what features the particular version of B that gets loaded has? The glue code can do whatever metaprogramming it needs to, reflecting on B, its version, and features.
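As an illustration of feature-dependent glue, here is a sketch in which inline modules stand in for the two packages and fast_path is a made-up feature that only some versions of B provide. The glue code checks for it with isdefined; on Julia 1.9 and later, a check on pkgversion(B) would be another option.

```julia
# Stand-ins for the two packages (in practice these are real, separately loaded packages).
module A
f(x) = :fallback
end

module B
struct T end
fast_path(::T) = :fast  # hypothetical feature, present only in newer versions of B
end

# What a hypothetical glue/B.jl in package A might contain:
if isdefined(B, :fast_path)
    A.f(t::B.T) = B.fast_path(t)
else
    A.f(::B.T) = :slow
end
```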
Edit: couple of added questions
Why does the glue code go in A/glue/ instead of somewhere in A/src/? Isn't it part of the A package? No, not really. If you just load A then you load what's in A/src and you don't load anything from A/glue at all. The glue stuff is really external lightweight packages that depend on A, which is why it makes sense to put them outside of the main codebase of A.
What's really so bad about making glue packages explicit separate packages? For one thing, we don't have a way of expressing that to Julia's version resolver and it's unclear how we would even do this. The resolver currently understands two things: (1) that a version of a package depends on some other set of packages and (2) that some versions of packages are incompatible with each other. The conditional dependency pattern can't be expressed in terms of these two features; it's a new kind of thing entirely. You need to express that for some subset of pairs of versions of two packages, you need to load yet another package if you've chosen both those versions. What about other pairs of versions of those two packages? Are those incompatible or are they just pairs for which the glue isn't necessary or doesn't work? Both are valid concepts. So now you need to express several things:
- That there are certain n-tuples of packages, which, if all present in a resolver solution, require that yet another package be present in the solution.
- Which versions of those packages that reverse dependency applies to. Do you express this as a Cartesian n-product of version specs? There are sets of version tuples that can't be expressed that way. Do you allow a union of Cartesian n-products of version specs? Or allow cutting out n-products in a nested fashion? Note that this is just for specifying which versions of the n-tuple of packages the reverse dependency applies to.
- You still need to express compatibility: which versions of each dependency is the glue package compatible with? The good news is that this would just be a normal compatibility constraint that we already know how to express and solve for. The bad news is that this is a different and separate concept from (2): you have to know when you need the glue package at all before you can decide which versions will work.
This is a lot of complicated new crap to cram into the registry and teach the resolver about. How does this proposal avoid this problem? It locks the glue code to the version of one or both reverse dependencies and then it's just the usual problem of picking those such that they're compatible.