Improvements to VarNamedVector #1098
Conversation
Benchmark Report for Commit d30eca8 (collapsed sections: Computer Information, Benchmark Results)
Codecov Report: ❌ Patch coverage is …

```diff
@@                       Coverage Diff                       @@
##           mhauru/varnamedvector-speed    #1098      +/-   ##
===============================================================
- Coverage                        81.39%   81.17%   -0.23%
===============================================================
  Files                               40       40
  Lines                             3763     3793      +30
===============================================================
+ Hits                              3063     3079      +16
- Misses                             700      714      +14
```

View full report in Codecov by Sentry.
DynamicPPL.jl documentation for PR #1098 is available at: …
More benchmarks. These are structured so that the VNV version is always right after the Metadata version. Two of the Metadata ones crashed and are marked as such. Note that for Metadata calling …

Cases where VNV is notably slower are the very simplest model (probably not worth our attention), submodels (same thing; the model is just so tiny that overheads dominate), and loop univariate 10k, where there's a 10% performance regression between Metadata and VNV. I'll look into that a bit more. However, I just checked and the loop univariate 10k case is 570% slower on …
```
DynamicPPL.loosen_types!!
DynamicPPL.tighten_types
```
These two don't actually ever mutate their inputs, so the `!!` is questionable. However, they sometimes return the original object and sometimes an object that shares memory with the original, so you should use them like you use other `!!` functions: always catch the return value, and never assume that the return value is independent of the input.
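For comparison, this is the general `!!` convention, shown here with BangBang's `push!!` (a runnable illustration, not these functions' actual signatures):

```julia
using BangBang

xs = [1, 2, 3]
ys = push!!(xs, 4)  # may mutate xs and return it, or may return a new object
# Always use the returned value; never assume xs is unchanged
# or that ys is independent of xs.
```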
```julia
if isnothing(params)
    params = varinfo[:]
end
params = map(identity, params) # Concretise
```
I did a little benchmark of which of these two ways is faster for turning e.g. a `Vector{Any}` holding `[1, 2]` into a `Vector{Int}`. The answer is that the comprehension is a tiny bit faster. Since I was changing this in VNV, I also changed it everywhere else where it's used.
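A sketch of the comparison (using BenchmarkTools; exact timings will vary by machine):

```julia
using BenchmarkTools

xs = Any[1, 2]  # a Vector{Any} holding Ints

@btime map(identity, $xs)  # concretise via map; returns a Vector{Int}
@btime [x for x in $xs]    # concretise via comprehension; same result, slightly faster here
```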
```julia
Create a new sub-metadata for an NTVarInfo. The type is chosen by the types of existing
SubMetas.
"""
@generated function _new_submetadata(
```
The generated function is needed because we have to check the types in the NamedTuple to see whether any of the other sub-metadatas are VNVs, to know whether a new one should be a VNV or not. This should always have been done; you can see how Metadata used to be hardcoded as the sub-metadata type in push!! before this PR. I hope to get rid of this later, either by having only one possible sub-metadata type, or by giving VarInfo a field that specifies this (like what make_leaf does in #1074).
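A simplified sketch of the pattern (names and logic here are illustrative, not the actual implementation; it assumes `VarNamedVector` and `Metadata` are in scope):

```julia
# Pick a sub-metadata type by inspecting the NamedTuple's field types at
# compile time; the branch is resolved during code generation, so the
# result is fully inferred.
@generated function _pick_submeta_type(metadata::NamedTuple{names,Ts}) where {names,Ts}
    if any(T -> T <: VarNamedVector, Ts.parameters)
        return :(VarNamedVector)
    else
        return :(Metadata)
    end
end
```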
```diff
 # making that change here opens some other cans of worms related to how VarInfo uses
 # BangBang, that I don't want to deal with right now.
-VarNamedVector() = VarNamedVector{VarName,Real}()
+VarNamedVector() = VarNamedVector{Union{},Union{},Union{}}()
```
This is probably the most important and interesting change. By default, VNVs are now created with element types of `Union{}`, which can hold no values at all; push!! and setindex!! then expand the element types as needed. You should be able to push!! a value of any type into any VNV, and the container types will just flex. This paves the way towards not having "untyped" VarInfos at all, but rather having them all be as typed as possible.
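A hedged illustration of the intended behaviour (the exact construction and argument order are assumptions, not the confirmed API):

```julia
using DynamicPPL: VarNamedVector, @varname
using BangBang: setindex!!

vnv = VarNamedVector()                      # element types start as Union{}
vnv = setindex!!(vnv, 1.0, @varname(x))     # container widens to hold a Float64
vnv = setindex!!(vnv, [1, 2], @varname(y))  # and widens again for a Vector{Int}
```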
````julia
 2.0 4.0
```
"""
function loosen_types!!(
````
I've checked, and both this and tighten_types are compiled to no-ops if the types are already loose enough / as tight as possible.

I should probably write a test to check this; will need to think about how to do that.
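One possible shape for such a test (the constructor and argument list here are assumptions, not the real API): a no-op loosen should return the identical object.

```julia
using Test

# If vnv's types are already loose enough, loosen_types!! should be a
# no-op and return the very same object, not a copy.
vnv = VarNamedVector{VarName,Real}()                         # hypothetical construction
@test DynamicPPL.loosen_types!!(vnv, VarName, Real) === vnv  # hypothetical arguments
```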
Some questions, not really holdups.
Out of curiosity, do you know if there's one change that was responsible for most of the performance improvement, or is it just a bit of everything?
```diff
 See the extended help with `??VarNamedVector` for more details.
 """
-num_inactive::OrderedDict{Int,Int}
+num_inactive::Dict{Int,Int}
```
Maybe premature optimisation, but a Dict{Int,Int} is isomorphic to a sparse Vector{Int}. I wonder if a SparseVector (or, actually, a dense Vector{Int}, if we don't expect the vector to be too long) might be faster than a Dict.
Hmm, should maybe try that out; this is performance-sensitive code. The length of the Vector is not a problem, since we already store multiple Vectors of that length. I think this will come down to which operations we do and how often. If there are often inactive entries, then a Vector is probably just fastest. However, I think most often there are none, and then the isempty(vnv.num_inactive) check may be faster with a Dict.

I think the fastest solution, and I wonder if it would be worth it, would be to have num_inactive::VI with VI<:Union{Nothing,Vector{Int}}, so that we can compile away all the checks when we know there are no inactive entries (num_inactive === nothing). That adds some code complexity, though.
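A sketch of that idea (struct and field names are illustrative only, not the real VNV):

```julia
# Encode "no inactive entries" in the type so the check compiles away
# for the Nothing case.
struct VNVSketch{VI<:Union{Nothing,Vector{Int}}}
    num_inactive::VI
end

has_inactive(::VNVSketch{Nothing}) = false  # constant-folds to false
has_inactive(v::VNVSketch{Vector{Int}}) = !isempty(v.num_inactive)
```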
Opened an issue: #1101
src/varnamedvector.jl (Outdated)
```diff
 julia> getindex_internal(vnv, :)
-8-element Vector{Real}:
+8-element Vector{Float64}:
  46.0
  48.0
- 1
- 2
- 3
- 4
- 5
- 6
+ 1.0
+ 2.0
+ 3.0
+ 4.0
+ 5.0
+ 6.0
```
This probably makes things faster, but is this really 'correct'? The values that were inserted with setindex!! were Ints.
That's a very good question, and I was wondering as well. The question comes down to whether to use this

```julia
julia> promote_type(Int, Float64)
Float64
```

like we do now, or this

```julia
julia> Base.promote_typejoin(Int, Float64)
Real
```

For an untyped VNV, converting Ints to Float64s probably makes sense, because it makes any model that mixes the two types a lot faster. However, if we move to always using typed VarInfos, maybe we wouldn't need that, assuming that people don't mix Ints and Floats in a single vector. From a correctness point of view I agree that Int would be better, although you could argue that we should respect Julia's existing choice that promote_type(Int, Float64) === Float64.
So this is the 'classic' situation where over-tightening the eltype makes a difference:

```julia
using DynamicPPL, Distributions, Random

struct RandDie <: Distributions.DiscreteUnivariateDistribution end
Base.rand(rng::Random.AbstractRNG, ::RandDie) = rand(rng, 1:6)
Distributions.logpdf(::RandDie, x::Int) = (1 <= x <= 6) ? log(1/6) : -Inf

@model function roll_die()
    die_roll ~ RandDie()
    x ~ Normal()
end

model = roll_die()

# Works
md = last(DynamicPPL.init!!(model, VarInfo(DynamicPPL.Metadata()), InitFromPrior()))
DynamicPPL.evaluate!!(model, md)

# Fails
vnv = last(DynamicPPL.init!!(model, VarInfo(DynamicPPL.VarNamedVector()), InitFromPrior()))
DynamicPPL.evaluate!!(model, vnv)
#=
ERROR: MethodError: no method matching logpdf(::RandDie, ::Float64)
The function `logpdf` exists, but no method is defined for this combination of argument types.
Closest candidates are:
  logpdf(::Chernoff, ::Real)
   @ Distributions ~/.julia/packages/Distributions/psM3H/src/univariate/continuous/chernoff.jl:155
  logpdf(::Hypergeometric, ::Real)
   @ Distributions ~/.julia/packages/Distributions/psM3H/src/univariates.jl:645
  logpdf(::RandDie, ::Int64)
   @ Main REPL[14]:1
  ...
=#
```
(See also https://pysm.dev/FlexiChains.jl/stable/why/#Heterogeneous-parameter-types, which describes the same problem when MCMCChains over-concretises types. I checked all existing Distributions subtypes of DiscreteUnivariateDistribution, and all of them define a method for logpdf(dist, ::Real), so this wouldn't happen with, e.g., Poisson, but in principle it can cause issues.)
Okay, that looks like a bug. Good spot. I'll change to promote_typejoin.
A note for the future if this comes up again: the difference between promote_typejoin and typejoin is that the former treats Nothing and Missing as exceptions, and returns things like

```julia
julia> Base.promote_typejoin(Missing, Float64)
Union{Missing, Float64}
```

rather than

```julia
julia> typejoin(Missing, Float64)
Any
```

I think Unions might be desirable, but it probably doesn't matter much, and promote_typejoin isn't exported, so typejoin it is.
Done
```diff
 function set_transformed!!(vi::VarInfo, val::Bool)
     for vn in keys(vi)
-        set_transformed!!(vi, val, vn)
+        vi = set_transformed!!(vi, val, vn)
     end
```
I don't know how Julia works on this front, but is it ever possible that changing what vi is bound to might invalidate the iteration over keys(vi)? I think probably not, but ...??
I guess it's not so much rebinding vi that's the issue, but if set_transformed!! were to mutate vi, then that might be problematic. I suppose it can't mutate it in a way such that keys(vi) would result in something different, so this probably can't error. But I'm a bit paranoid about this kind of code structure. Obviously, this predates this PR.
I think this is fine: the keys(vi) call is evaluated once when the loop is started, and it'll keep its reference to the original vi even if the name vi is then bound to something else. Like in

```julia
julia> d = Dict(:a => 1)
Dict{Symbol, Int64} with 1 entry:
  :a => 1

julia> ks = keys(d)
KeySet for a Dict{Symbol, Int64} with 1 entry. Keys:
  :a

julia> d = Dict(:b => 1)
Dict{Symbol, Int64} with 1 entry:
  :b => 1

julia> collect(ks)
1-element Vector{Symbol}:
 :a
```

keys(vi) is a lazy iterator though, so what I would feel dicey about is changing the set of keys in vi mid-iteration. This is a thing:

```julia
julia> d = Dict(:a => 1)
Dict{Symbol, Int64} with 1 entry:
  :a => 1

julia> ks = keys(d)
KeySet for a Dict{Symbol, Int64} with 1 entry. Keys:
  :a

julia> d[:b] = 2
2

julia> collect(ks)
2-element Vector{Symbol}:
 :a
 :b
```

which can result in weirdness:

```julia
julia> d = Dict{Any,Int}(:a => 1, :b => 2)
Dict{Any, Int64} with 2 entries:
  :a => 1
  :b => 2

julia> for k in keys(d)
           d[repr(k)] = d[k]
       end

julia> d
Dict{Any, Int64} with 7 entries:
  :a               => 1
  :b               => 2
  "\":b\""         => 2
  ":b"             => 2
  "\"\\\":a\\\"\"" => 1
  ":a"             => 1
  "\":a\""         => 1
```
I think, from a programming perspective, the safest way would be to use an eagerly evaluated version of keys(vi), i.e. collect(keys(vi)), as in the sketch below. But from a human perspective it's quite easy to see that the keys can't change here, so I'm also happy to let it slide.
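For illustration, the eager variant of the quoted loop would look like this (a sketch of the suggestion, not a required change):

```julia
# Snapshot the keys up front so that later rebinding or mutation of vi
# cannot interact with the iterator.
for vn in collect(keys(vi))
    vi = set_transformed!!(vi, val, vn)
end
```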
I would keep it as is, because I'm pretty convinced it's okay (and not due to an implementation detail, but because set_transformed!! has no business touching the keys), and lazy iterators are nice. Could add the call to collect if it bothers you.
It doesn't bother me enough to want to do it.
The …

```julia
julia> d = Dict((i % 2 == 0 ? repr(i) : Symbol(i)) => i for i in 1:100);

julia> typeof(d)
Dict{Any, Int64}

julia> od = OrderedDict(d...);

julia> @btime Dict(d...);
  18.666 μs (210 allocations: 14.59 KiB)

julia> @btime OrderedDict(od...);
  78.458 μs (534 allocations: 182.11 KiB)
```

Keeping element types more tight was also significant in some cases for untyped VarInfo. For some other models I'm not sure what the important change was.
Happy to merge, just curious as to what effects …
Same benchmarks as before, except I made Smorgasbord consistently unlinked and added a couple that were previously missing. I can't see any significant changes compared to before, except that the LDA one now crashes with

```
ERROR: LoadError: MethodError: (::ReverseDiff.ForwardOptimize{typeof(+)})(::ReverseDiff.TrackedReal{Real, Real, ReverseDiff.TrackedArray{…}}, ::ReverseDiff.TrackedReal{Float64, Float64, Nothing}) is ambiguous.

Candidates:
  (self::ReverseDiff.ForwardOptimize{F})(x::Real, t::ReverseDiff.TrackedReal{V, D}) where {F, V, D}
    @ ReverseDiff ~/.julia/packages/ReverseDiff/rKZaG/src/macros.jl:109
  (self::ReverseDiff.ForwardOptimize{F})(t::ReverseDiff.TrackedReal{V, D}, x::Real) where {F, V, D}
    @ ReverseDiff ~/.julia/packages/ReverseDiff/rKZaG/src/macros.jl:121

Possible fix, define
  (::ReverseDiff.ForwardOptimize{F})(::ReverseDiff.TrackedReal{V, D}, ::ReverseDiff.TrackedReal{V, D}) where {F, V, D, V, D}
```

For future reference, mostly for myself, here's the selection of benchmarks:

```julia
chosen_combinations = [
    (
        "Simple assume observe",
        Models.simple_assume_observe(randn(rng)),
        :typed,
        :mooncake,
        false,
    ),
    (
        "Simple assume observe",
        Models.simple_assume_observe(randn(rng)),
        :typed_vector,
        :mooncake,
        false,
    ),
    (
        "Simple assume observe",
        Models.simple_assume_observe(randn(rng)),
        :untyped,
        :mooncake,
        false,
    ),
    (
        "Simple assume observe",
        Models.simple_assume_observe(randn(rng)),
        :untyped_vector,
        :mooncake,
        false,
    ),
    ("Smorgasbord", smorgasbord_instance, :typed, :reversediff, false),
    ("Smorgasbord", smorgasbord_instance, :typed_vector, :reversediff, false),
    ("Smorgasbord", smorgasbord_instance, :untyped, :reversediff, false),
    ("Smorgasbord", smorgasbord_instance, :untyped_vector, :reversediff, false),
    ("Loop univariate 1k", loop_univariate1k, :typed, :mooncake, true),
    ("Loop univariate 1k", loop_univariate1k, :typed_vector, :mooncake, true),
    ("Loop univariate 1k", loop_univariate1k, :untyped, :mooncake, true),
    ("Loop univariate 1k", loop_univariate1k, :untyped_vector, :mooncake, true),
    ("Multivariate 1k", multivariate1k, :typed, :mooncake, true),
    ("Multivariate 1k", multivariate1k, :typed_vector, :mooncake, true),
    ("Multivariate 1k", multivariate1k, :untyped, :mooncake, true),
    ("Multivariate 1k", multivariate1k, :untyped_vector, :mooncake, true),
    ("Loop univariate 10k", loop_univariate10k, :typed, :mooncake, true),
    ("Loop univariate 10k", loop_univariate10k, :typed_vector, :mooncake, true),
    ("Loop univariate 10k", loop_univariate10k, :untyped, :mooncake, true),
    ("Loop univariate 10k", loop_univariate10k, :untyped_vector, :mooncake, true),
    ("Multivariate 10k", multivariate10k, :typed, :mooncake, true),
    ("Multivariate 10k", multivariate10k, :typed_vector, :mooncake, true),
    ("Multivariate 10k", multivariate10k, :untyped, :mooncake, true),
    ("Multivariate 10k", multivariate10k, :untyped_vector, :mooncake, true),
    ("Dynamic", Models.dynamic(), :typed, :mooncake, true),
    ("Dynamic", Models.dynamic(), :typed_vector, :mooncake, true),
    # ("Dynamic", Models.dynamic(), :untyped, :mooncake, true),
    ("Dynamic", Models.dynamic(), :untyped_vector, :mooncake, true),
    ("Submodel", Models.parent(randn(rng)), :typed, :mooncake, true),
    ("Submodel", Models.parent(randn(rng)), :typed_vector, :mooncake, true),
    ("Submodel", Models.parent(randn(rng)), :untyped, :mooncake, true),
    ("Submodel", Models.parent(randn(rng)), :untyped_vector, :mooncake, true),
    ("LDA", lda_instance, :typed, :reversediff, true),
    ("LDA", lda_instance, :typed_vector, :reversediff, true),
    ("LDA", lda_instance, :untyped, :reversediff, true),
    ("LDA", lda_instance, :untyped_vector, :reversediff, true),
]
```
Happy to merge, although it looks to me like Dynamic also has maybe 20% worse performance on typed VNV compared to typed Metadata.
I think our benchmark models just don't hit a case where this would make a difference. Here's one that does:

```julia
julia> @model function f()
           x ~ Normal()
           y ~ Poisson()
           0.0 ~ Normal(x, float(y))
       end
       m = f()
       vi_md = DynamicPPL.untyped_varinfo(m)
       vi_vnv = DynamicPPL.untyped_vector_varinfo(m)
       @btime DynamicPPL.evaluate!!(m, vi_md)
       @btime DynamicPPL.evaluate!!(m, vi_vnv);
  2.625 μs (49 allocations: 2.03 KiB)
  2.500 μs (49 allocations: 2.19 KiB)
```

Whereas using …:

```
  2.606 μs (49 allocations: 2.03 KiB)
  369.362 ns (7 allocations: 416 bytes)
```
If it makes you feel better, on …
* Remove unnecessary consistency checks for VarNamedVector
* Fix benchmark setting
  Co-authored-by: Penelope Yong <penelopeysm@gmail.com>
* Fix typo
* Add two benchmarks
* Improvements to VarNamedVector (#1098)
  * Change VNV to use Dict rather than OrderedDict
  * Change concretisation from map(identity, x) to a comprehension
  * Improve tighten_types!! and loosen_types!!
  * Fix use of set_transformed!!
  * Fix push!! for VarInfos
  * Change the default element types in VNV to be Union{}
  * In untyped_vector_varinfo, don't rely on Metadata
  * Code style
  * Run formatter
  * In VNV, use typejoin rather than promote_type
* Bump patch version to 0.38.4

Co-authored-by: Penelope Yong <penelopeysm@gmail.com>
This PR is an assortment of somewhat unrelated improvements to VarNamedVector. I've broken the changes up into sensible, roughly atomic units in the commits, so I recommend reviewing this PR commit by commit; it'll make more sense that way.

The total outcome is that VarNamedVector is now on par with Metadata for performance, at least for the Smorgasbord model.

I'll do more benchmarking tomorrow.