
Commit a7ba0b5

Benchmarking (#248)
Changes to DPPL can often have quite significant effects on compilation time and performance, both of DPPL itself and of downstream packages. These performance regressions can also be difficult to discover: e.g. in #221 we made a small simplification to the compiler, and it took quite a while to figure out what was going wrong; we had to test several models to identify the issue. So, this is a WIP PR for including a small set of models which we can `weave` into a document where we can look at the changes. It's unclear to me whether this should go in DPPL itself or in a separate package. I found it useful myself and figured I'd put it here so we can maybe start building up some "standard" benchmarks to run for testing purposes. IMO we don't need many of them, as we will add more as we go along.

For each model, the following will be included in the document:

- Benchmarked evaluation of the model on untyped and typed `VarInfo`.
- Timing of the compilation of the model with the typed `VarInfo`.
- Lowered code for the model.
- If `:prefix` is provided to `weave`, the string representation of `code_typed` for the evaluation of the model will be saved to a file `$(prefix)_(model.name)`. Furthermore, if `:prefix_old` is provided, pointing to the `:prefix` used for a previous run (likely using a different version of DPPL), we will `diff` the `code_typed` for the two models by loading the saved files.
1 parent 892b971 · commit a7ba0b5

5 files changed: +349 −0 lines changed

benchmarks/Project.toml

Lines changed: 13 additions & 0 deletions
name = "DynamicPPLBenchmarks"
uuid = "d94a1522-c11e-44a7-981a-42bf5dc1a001"
version = "0.1.0"

[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
DiffUtils = "8294860b-85a6-42f8-8c35-d911f667b5f6"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
DynamicPPL = "366bfd00-2699-11ea-058f-f148b4cae6d8"
LibGit2 = "76f85450-5226-5b5a-8eaa-529ad045b433"
Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
Weave = "44d3d7a6-8a23-5bf8-98c5-b353f8df5ec9"

benchmarks/README.md

Lines changed: 30 additions & 0 deletions
To run the benchmarks, simply do:

```sh
julia --project -e 'using DynamicPPLBenchmarks; weave_benchmarks();'
```

```julia
help?> weave_benchmarks
search: weave_benchmarks

  weave_benchmarks(input="benchmarks.jmd"; kwargs...)

  Weave benchmarks present in benchmarks.jmd into a single file.

  Keyword arguments
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  • benchmarkbody: JMD-file to be rendered for each model.

  • include_commit_id=false: specify whether to include the commit id in the default name.

  • name: the name of the directory in results/ to use as the output directory.

  • name_old=nothing: if specified, comparisons of the current run vs. the run pointed to
    by name_old will be included in the generated document.

  • include_typed_code=false: if true, the output of code_typed for the evaluator of the
    model will be included in the weaved document.

  • The rest of the passed kwargs will be passed on to Weave.weave.
```
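
For example, to benchmark the current checkout against an earlier run (a hypothetical invocation; the run names are placeholders):

```julia
using DynamicPPLBenchmarks

# First run, e.g. on `master`; store `code_typed` output so later runs can diff against it.
weave_benchmarks(; name="master", include_typed_code=true)

# After switching to another DynamicPPL version: benchmark again and diff against "master".
weave_benchmarks(; name="my-feature-branch", name_old="master", include_typed_code=true)
```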

benchmarks/benchmark_body.jmd

Lines changed: 39 additions & 0 deletions
```julia
@time model_def(data)();
```

```julia
m = time_model_def(model_def, data);
```

```julia
suite = make_suite(m);
results = run(suite)
results
```

```julia; echo=false; results="hidden";
BenchmarkTools.save(joinpath("results", WEAVE_ARGS[:name], "$(m.name)_benchmarks.json"), results)
```

```julia; wrap=false
if WEAVE_ARGS[:include_typed_code]
    typed = typed_code(m)
end
```

```julia; echo=false; results="hidden"
if WEAVE_ARGS[:include_typed_code]
    # Serialize the output of `typed_code` so we can compare later.
    haskey(WEAVE_ARGS, :name) && serialize(joinpath("results", WEAVE_ARGS[:name], "$(m.name).jls"), string(typed));
end
```

```julia; wrap=false; echo=false;
if haskey(WEAVE_ARGS, :name_old)
    # We want to compare the generated code to the previous version.
    import DiffUtils
    typed_old = deserialize(joinpath("results", WEAVE_ARGS[:name_old], "$(m.name).jls"));
    DiffUtils.diff(typed_old, string(typed), width=130)
end
```

benchmarks/benchmarks.jmd

Lines changed: 96 additions & 0 deletions
# Benchmarks

## Setup

```julia
using BenchmarkTools, DynamicPPL, Distributions, Serialization
```

```julia
import DynamicPPLBenchmarks: time_model_def, make_suite, typed_code, weave_child
```

## Models

### `demo1`

```julia
@model function demo1(x)
    m ~ Normal()
    x ~ Normal(m, 1)

    return (m = m, x = x)
end

model_def = demo1;
data = 1.0;
```

```julia; results="markup"; echo=false
weave_child(WEAVE_ARGS[:benchmarkbody], mod = @__MODULE__, args = WEAVE_ARGS)
```

### `demo2`

```julia
@model function demo2(y)
    # Our prior belief about the probability of heads in a coin.
    p ~ Beta(1, 1)

    # The number of observations.
    N = length(y)
    for n in 1:N
        # Heads or tails of a coin are drawn from a Bernoulli distribution.
        y[n] ~ Bernoulli(p)
    end
end

model_def = demo2;
data = rand(0:1, 10);
```

```julia; results="markup"; echo=false
weave_child(WEAVE_ARGS[:benchmarkbody], mod = @__MODULE__, args = WEAVE_ARGS)
```

### `demo3`

```julia
@model function demo3(x)
    D, N = size(x)

    # Draw the parameters for cluster 1.
    μ1 ~ Normal()

    # Draw the parameters for cluster 2.
    μ2 ~ Normal()

    μ = [μ1, μ2]

    # Comment out this line if you instead want to draw the weights.
    w = [0.5, 0.5]

    # Draw assignments for each datum and generate it from a multivariate normal.
    k = Vector{Int}(undef, N)
    for i in 1:N
        k[i] ~ Categorical(w)
        x[:,i] ~ MvNormal([μ[k[i]], μ[k[i]]], 1.)
    end
    return k
end

model_def = demo3

# Construct 30 data points for each cluster.
N = 30

# Parameters for each cluster; we assume that each cluster is Gaussian distributed in this example.
μs = [-3.5, 0.0]

# Construct the data points.
data = mapreduce(c -> rand(MvNormal([μs[c], μs[c]], 1.), N), hcat, 1:2);
```

```julia; echo=false
weave_child(WEAVE_ARGS[:benchmarkbody], mod = @__MODULE__, args = WEAVE_ARGS)
```
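
Adding a model to the suite follows the same pattern: a `### demoN` section defines the model with `@model`, binds `model_def` and `data`, and is followed by the same `weave_child` chunk as above. A minimal sketch (`demo4` is hypothetical and not part of this commit):

```julia
@model function demo4(x)
    σ ~ truncated(Normal(), 0, Inf)
    x ~ Normal(0, σ)
end

model_def = demo4;
data = 1.0;
```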

benchmarks/src/DynamicPPLBenchmarks.jl

Lines changed: 171 additions & 0 deletions
module DynamicPPLBenchmarks

using DynamicPPL
using BenchmarkTools

using Weave: Weave
using Markdown: Markdown

using LibGit2: LibGit2
using Pkg: Pkg

export weave_benchmarks

# Construct the model from its definition while timing the construction.
function time_model_def(model_def, args...)
    return @time model_def(args...)
end

function benchmark_untyped_varinfo!(suite, m)
    vi = VarInfo()
    # Populate.
    m(vi)
    # Evaluate.
    suite["evaluation_untyped"] = @benchmarkable $m($vi, $(DefaultContext()))
    return suite
end

function benchmark_typed_varinfo!(suite, m)
    # Populate.
    vi = VarInfo(m)
    # Evaluate.
    suite["evaluation_typed"] = @benchmarkable $m($vi, $(DefaultContext()))
    return suite
end

# Return the result of `code_typed` for the model evaluator, i.e. the typed IR
# generated for evaluating `m` with `vi` under a sampling context.
function typed_code(m, vi=VarInfo(m))
    rng = DynamicPPL.Random.MersenneTwister(42)
    spl = DynamicPPL.SampleFromPrior()
    ctx = DynamicPPL.SamplingContext(rng, spl, DynamicPPL.DefaultContext())

    results = code_typed(m.f, Base.typesof(m, vi, ctx, m.args...))
    return first(results)
end

"""
    make_suite(model)

Create the default benchmark suite for `model`.
"""
function make_suite(model)
    suite = BenchmarkGroup()
    benchmark_untyped_varinfo!(suite, model)
    benchmark_typed_varinfo!(suite, model)

    return suite
end
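
# Hypothetical usage sketch (not part of the weaved document): given a model
# instance such as `m = demo1(1.0)` from `benchmarks.jmd`, the suite can be
# built and run with BenchmarkTools:
#
#     suite = make_suite(m)
#     results = run(suite)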

"""
    weave_child(indoc; mod, args, kwargs...)

Weave `indoc` within the scope of `mod` into markdown.

Useful for weaving within weaving, e.g.
```julia
weave_child(child_jmd_path, mod = @__MODULE__, args = WEAVE_ARGS)
```
together with `results="markup"` and `echo=false` will simply insert
the weaved version of `indoc`.

# Notes
- Currently only supports `doctype == "github"`. Other outputs are "supported"
  in the sense that they work, but you might lose niceties such as syntax highlighting.
"""
function weave_child(indoc; mod, args, kwargs...)
    # FIXME: Make this work for output formats other than `github`.
    doc = Weave.WeaveDoc(indoc, nothing)
    doc = Weave.run_doc(doc; doctype="github", mod=mod, args=args, kwargs...)
    rendered = Weave.render_doc(doc)
    return display(Markdown.parse(rendered))
end

"""
    pkgversion(m::Module)

Return the version of module `m` as listed in its Project.toml.
"""
function pkgversion(m::Module)
    projecttoml_path = joinpath(dirname(pathof(m)), "..", "Project.toml")
    return Pkg.TOML.parsefile(projecttoml_path)["version"]
end

"""
    default_name(; include_commit_id=false)

Construct a name from either repo information or the package version
of `DynamicPPL`.

If the path of `DynamicPPL` is a git repo, return the name of the current branch,
joined with the commit id if `include_commit_id` is `true`.

If the path of `DynamicPPL` is _not_ a git repo, it is assumed to be a release,
resulting in a name of the form `release-VERSION`.
"""
function default_name(; include_commit_id=false)
    dppl_path = abspath(joinpath(dirname(pathof(DynamicPPL)), ".."))

    # Extract the branch name and commit id.
    local name
    try
        githead = LibGit2.head(LibGit2.GitRepo(dppl_path))
        branchname = LibGit2.shortname(githead)

        name = replace(branchname, "/" => "_")
        if include_commit_id
            gitcommit = LibGit2.peel(LibGit2.GitCommit, githead)
            commitid = string(LibGit2.GitHash(gitcommit))
            name *= "-$(commitid)"
        end
    catch e
        if e isa LibGit2.GitError
            @info "No git repo found for $(dppl_path); extracting name from package version."
            name = "release-$(pkgversion(DynamicPPL))"
        else
            rethrow(e)
        end
    end

    return name
end
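
# Illustrative return values (hypothetical; actual names depend on the local checkout):
#
#     default_name()                          # e.g. "my-feature-branch"
#     default_name(; include_commit_id=true)  # e.g. "my-feature-branch-0123abc..."
#     default_name()                          # e.g. "release-0.1.0" (outside a git repo)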

"""
    weave_benchmarks(input="benchmarks.jmd"; kwargs...)

Weave benchmarks present in `benchmarks.jmd` into a single file.

# Keyword arguments
- `benchmarkbody`: JMD-file to be rendered for each model.
- `include_commit_id=false`: specify whether to include the commit id in the default name.
- `name`: the name of the directory in `results/` to use as the output directory.
- `name_old=nothing`: if specified, comparisons of the current run vs. the run pointed to
  by `name_old` will be included in the generated document.
- `include_typed_code=false`: if `true`, the output of `code_typed` for the evaluator
  of the model will be included in the weaved document.
- The rest of the passed `kwargs` will be passed on to `Weave.weave`.
"""
function weave_benchmarks(
    input=joinpath(dirname(pathof(DynamicPPLBenchmarks)), "..", "benchmarks.jmd");
    benchmarkbody=joinpath(
        dirname(pathof(DynamicPPLBenchmarks)), "..", "benchmark_body.jmd"
    ),
    include_commit_id=false,
    name=default_name(; include_commit_id=include_commit_id),
    name_old=nothing,
    include_typed_code=false,
    doctype="github",
    outpath="results/$(name)/",
    kwargs...,
)
    args = Dict(
        :benchmarkbody => benchmarkbody,
        :name => name,
        :include_typed_code => include_typed_code,
    )
    if !isnothing(name_old)
        args[:name_old] = name_old
    end
    @info "Storing output in $(outpath)"
    mkpath(outpath)
    return Weave.weave(input, doctype; out_path=outpath, args=args, kwargs...)
end

end # module
