Description
As far as I can see, the only reason to do linking is because samplers need it. Samplers moreover interface with a vector of floats, not a VarInfo. So really the only place when we need linked variables is when we call getindex_internal
or something like vi[:]
. If I'm mistaken about this, and there's a different need, all of the below may be moot, so do say so.
This lead me to wonder why we bother storing linked variables at all, rather than converting at getindex
/setindex
time. Currently our philosophy seems to be that samplers give and take vectors of linked floats, and when interfacing with a sampler we typically call vi[:]
and unflatten(vi, vec_of_floats)
to read/write those vectors of floats from/to the internal storage of the VarInfo, where the variables are linked if link
has been called on vi
at some point in its history. Then whenever we need the unlinked versions of variables, such as when calling logpdf
, we convert the internal, linked storage to the unlinked one.
An alternative approach would be to say that VarInfo only ever stores unlinked variables, and rather than calling link
on vi
once at the beginning of sampling and using getindex
/setindex
, if you would want linked variables you would need to call getindex_linked
and setindex_linked
or unflatten_linked
, and the linking would happen on the fly in that call.
Needs linked Needs unlinked
┌──────────┐ ┌───────────┐ ┌─────────────┐
│ │ │ │ │ │
│ Sampler │ ◄─────► │ VarInfo │ ◄─────► │ logpdf call │
│ │ │ │ │ │
└──────────┘ ▲ └───────────┘ ▲ └─────────────┘
│ │
│ │
Alternative: Current:
(Un)link here (Un)link here
I'm not sure if the alternative is better, but I do see some benefits to it:
- VarInfos would have less state that they carry. Currently when you call
getindex_internal
orunflatten
you need to know/remember whether this VarInfo has been linked at some point in its history (or check for it) to know what you'll get. - Sampler interface code would be more explicit, at the call-site, as to whether it wants linked variables or not. This would make the interface code a bit easier to understand.
- VarInfo would be simpler. We could drop the
istrans
flags, and never worry about things like having some variables linked and others not. Note that we would still need to carry the linking transformations in the VarInfo, to know how to transform when someone calls e.g.getindex_linked
.
This could also help us move towards something like SimpleVarInfo
or the @parameters
struct of TuringLang/Turing.jl#2492, where the storage in VarInfo would more closely match how the model sees the variables. I haven't thought about this carefully, but I could imagine this leading to some performance gains and maybe code simplification.
This is currently more an idea to be discussed than a proposal to implement. Any thoughts? Downsides to linking at getindex
/setindex
time?