Do we need to store linked variables in VarInfo?

As far as I can see, the only reason to do linking is because samplers need it. Samplers moreover interface with a vector of floats, not a VarInfo. So really the only place when we need linked variables is when we call `getindex_internal` or something like `vi[:]`. If I'm mistaken about this, and there's a different need, all of the below may be moot, so do say so.

This lead me to wonder why we bother storing linked variables at all, rather than converting at `getindex`/`setindex` time. Currently our philosophy seems to be that samplers give and take vectors of linked floats, and when interfacing with a sampler we typically call `vi[:]` and `unflatten(vi, vec_of_floats)` to read/write those vectors of floats from/to the internal storage of the VarInfo, where the variables are linked if `link` has been called on `vi` at some point in its history. Then whenever we need the unlinked versions of variables, such as when calling `logpdf`, we convert the internal, linked storage to the unlinked one.

An alternative approach would be to say that VarInfo only ever stores unlinked variables, and rather than calling `link` on `vi` once at the beginning of sampling and using `getindex`/`setindex`, if you would want linked variables you would need to call `getindex_linked` and `setindex_linked` or `unflatten_linked`, and the linking would happen on the fly in that call.

```
Needs linked                                Needs unlinked 
                                                           
┌──────────┐         ┌───────────┐          ┌─────────────┐
│          │         │           │          │             │
│ Sampler  │ ◄─────► │  VarInfo  │  ◄─────► │ logpdf call │
│          │         │           │          │             │
└──────────┘   ▲     └───────────┘     ▲    └─────────────┘
               │                       │                   
               │                       │                   
                                                           
          Alternative:             Current:                
          (Un)link here            (Un)link here           
```

I'm not sure if the alternative is better, but I do see some benefits to it:
* VarInfos would have less state that they carry. Currently when you call `getindex_internal` or `unflatten` you need to know/remember whether this VarInfo has been linked at some point in its history (or check for it) to know what you'll get.
* Sampler interface code would be more explicit, at the call-site, as to whether it wants linked variables or not. This would make the interface code a bit easier to understand.
* VarInfo would be simpler. We could drop the `istrans` flags, and never worry about things like having some variables linked and others not. Note that we would still need to carry the linking transformations in the VarInfo, to know how to transform when someone calls e.g. `getindex_linked`.

This could also help us move towards something like `SimpleVarInfo` or the `@parameters` struct of https://github.com/TuringLang/Turing.jl/issues/2492, where the storage in VarInfo would more closely match how the model sees the variables. I haven't thought about this carefully, but I could imagine this leading to some performance gains and maybe code simplification.

This is currently more an idea to be discussed than a proposal to implement. Any thoughts? Downsides to linking at `getindex`/`setindex` time?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do we need to store linked variables in VarInfo? #836

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Do we need to store linked variables in VarInfo? #836

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions