Skip to content

Conversation

@phipsgabler
Copy link
Member

Currently, a VarInfo prints to a huge blob showing all kinds of stuff. Which is useful on its own in the REPL, but not as part of another printed structure.

I moved that version to show(::IO, ::MIME"text/plain", ::UntypedVarInfo), and added a dummy replacement

Base.show(io::IO, vi::UntypedVarInfo) = print(io, "VarInfo of ", vi.metadata.vns)

Please suggest what things should actually printed in the short version.

@devmotion
Copy link
Member

Maybe just print the number of variables and possibly the log probability in the short version?

@codecov
Copy link

codecov bot commented Apr 10, 2020

Codecov Report

Merging #66 into master will decrease coverage by 2.99%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #66      +/-   ##
==========================================
- Coverage   79.42%   76.43%   -3.00%     
==========================================
  Files          13       13              
  Lines         841      853      +12     
==========================================
- Hits          668      652      -16     
- Misses        173      201      +28     
Impacted Files Coverage Δ
src/varinfo.jl 84.42% <0.00%> (-3.85%) ⬇️
src/compiler.jl 79.86% <0.00%> (-9.40%) ⬇️
src/utils.jl 53.06% <0.00%> (-3.55%) ⬇️
src/varname.jl 76.47% <0.00%> (+0.96%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9fbe09a...6903e3e. Read the comment docs.

@phipsgabler
Copy link
Member Author

VarInfo (2 variables, logp = -3.304)?

@devmotion
Copy link
Member

Yes, something like this. I guess that should be the most important information, shouldn't it? But maybe it's still to verbose?

@phipsgabler
Copy link
Member Author

For me everything that short is fine -- it shouldn't clutter printing of nested structures.

I'll leave this open some time to wait for feedback.

@phipsgabler phipsgabler changed the title Add shorter non-display printing Add shorter non-display printing of VarInfo Apr 10, 2020
@yebai
Copy link
Member

yebai commented Apr 13, 2020

VarInfo (2 variables, logp = -3.304)?

Maybe add:

  • model dimensionality
  • whether variables are in unconstrained space or not
  • observed variable names (just the symbol is fine, we don't need indexing here).

@devmotion
Copy link
Member

Hmm I'm not sure, the intention here is to have a really short condensed way of printing a VarInfo that will be used, e.g., when printing arrays of VarInfo objects. If an expression returns a single VarInfo object, the REPL will use the longer version implemented by show(io, ::MIME"plain/text", ::VarInfo) to display it.

@phipsgabler
Copy link
Member Author

phipsgabler commented Apr 13, 2020

I concur with David, although I think that adding the variables names is not a bad idea.

@yebai
Copy link
Member

yebai commented Apr 13, 2020

I see David's point, but model dimensionality is something valuable and can be packed into a few words.

@phipsgabler
Copy link
Member Author

phipsgabler commented Apr 13, 2020

@yebai What exactly do you mean by that? The sum of the sizes of all variables? Is this currently used anywhere in the VarInfo code?

@yebai
Copy link
Member

yebai commented Apr 13, 2020

Yes, exactly.

@phipsgabler
Copy link
Member Author

What about this:

VarInfo (λ (size 1), m (size 1); logp: -3.612)

where size = length(md.ranges[md.idcs[vn]])?

@yebai
Copy link
Member

yebai commented Apr 13, 2020

Looks good, maybe reformat the dimensionality message

VarInfo (λ: scalar, m: scalar, x: (3,4); logp: -3.612)

Also, this might gets messy when we have a lot of parameters. That's where a single model dimensionality might be less verbose.

@phipsgabler
Copy link
Member Author

phipsgabler commented Apr 13, 2020

Ok, I tried to extract the size:

function _vname_info(vn, vi)
    md = vi.metadata
    s = size(md.vals[md.ranges[md.idcs[vn]]])
    if s == ()
        return "$(getsym(vn))"
    else
        return "$(getsym(vn)) $s"
    end
end

But since the values are only given through ranges, there seems to be no way to distinguish scalars from size 1 vectors? E.g. for a gdemo-like model:

VarInfo (λ (1,), m (1,); logp: -3.612)

Or is there?

@yebai
Copy link
Member

yebai commented Apr 13, 2020

But since the values are only given through ranges, there seems to be no way to distinguish scalars from size 1 vectors?
VarInfo (λ (1,), m (1,); logp: -3.612)

That works.

@devmotion
Copy link
Member

But since the values are only given through ranges, there seems to be no way to distinguish scalars from size 1 vectors?

No, AFAIK one can only distinguish between scalars and vectors with one element by checking if the corresponding distribution is uni- or multivariate. IMO that's one of the inconveniences caused by vectorizing everything in VarInfo, and leads to some heuristics (and ugly code) in the implementation of the ESS and MH sampler in Turing.

@phipsgabler
Copy link
Member Author

phipsgabler commented Apr 14, 2020

Ok, based on this feedback I think it's best to show just the overall dimension of the model, plus a limited number of symbols of variable names (given by a constant _MAX_VARS_SHOWN = 4). The latter makes the VarInfo more easily to recognize, but prevents clutter for large models.

The current way of implementating this does not distinguish between one @varname x of size 3 and [@varname(x[1]), @varname(x[2]), @varname(x[3])] of 3 scalars, see test1 and test2 below.

julia> @model function test3()
           x ~ Bernoulli()
           y ~ Bernoulli()
           z ~ Bernoulli()
           v ~ Bernoulli()
           w ~ MvNormal(zeros(3), 1.2)
       end
ModelGen{var"###generator#628",(),(),Tuple{}}(##generator#628, NamedTuple())

julia> vi = VarInfo(); test3()(vi); show(vi)
VarInfo (5 variables (w, y, v, z, ...), dimension 7; logp: -11.837)

julia> @model function test2()
           s ~ Gamma(1.0, 1.0)
           x = zeros(3)
           for i in 1:3
               x[i] ~ Normal(0.0, s)
           end
       end
ModelGen{var"###generator#644",(),(),Tuple{}}(##generator#644, NamedTuple())

julia> vi = VarInfo(); test2()(vi); show(vi)
VarInfo (2 variables (s, x), dimension 4; logp: 3.142)

julia> @model function test1()
           s ~ Gamma(1.0, 1.0)
           x ~ MvNormal(zeros(3), s)
       end
ModelGen{var"###generator#660",(),(),Tuple{}}(##generator#660, NamedTuple())

julia> vi = VarInfo(); test1()(vi); show(vi)
VarInfo (2 variables (s, x), dimension 4; logp: -3.651)

@yebai
Copy link
Member

yebai commented Apr 14, 2020

This looks nice. To distinguish parameters and data, maybe group them and annotate using a semicolon, e.g.

VarInfo (6 variables (w, y, v, z, ...; d), dimension 7; logp: -11.837)

where w,y,v,z... are model parameters, and d is observed data.

@phipsgabler
Copy link
Member Author

But the observed data is not contained in the VarInfo at all, so where would I get it from -- or am I mistaken?

@devmotion
Copy link
Member

No, I think you're right, it's not available (and hence probably should not be printed?).

@yebai
Copy link
Member

yebai commented Apr 14, 2020

But the observed data is not contained in the VarInfo at all, so where would I get it from -- or am I mistaken?

Hmm, I didn't release this - sorry for the noise. We might want to store the observed variable information in VarInfo in the future, but it is out of this PR's scope.

@yebai yebai merged commit 9be5881 into TuringLang:master Apr 14, 2020
@phipsgabler phipsgabler deleted the phg/varinfo_show branch April 14, 2020 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants