4 changes: 1 addition & 3 deletions Project.toml
@@ -25,15 +25,14 @@ SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
Tracker = "9f7883ad-71c0-57eb-9f7f-b5c9e6d3789c"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

[compat]
AbstractMCMC = "~0.1"
AdvancedHMC = "0.2.4"
Bijectors = "0.4.0"
BinaryProvider = "0.5.6"
Distributions = "0.21.11"
DistributionsAD = "0.1.0"
DistributionsAD = "0.1.2"
FiniteDifferences = "0.9"
ForwardDiff = "0.10.3"
Libtask = "0.3.1"
@@ -47,7 +46,6 @@ Requires = "1.0"
SpecialFunctions = "0.7.2, 0.8, 0.9"
StatsFuns = "0.8, 0.9"
Tracker = "0.2.3"
Zygote = "~0.3.4, 0.4.1"
julia = "1"

[extras]
22 changes: 14 additions & 8 deletions docs/src/using-turing/advanced.md
@@ -105,19 +105,25 @@ mf(vi, sampler, ctx, model) = begin
    x = model.args.x

    # Assume s has an InverseGamma distribution.
-    s, lp = Turing.assume(sampler,
-                          InverseGamma(2, 3),
-                          Turing.@varname(s), vi)
+    s, lp = Turing.Inference.tilde(
+        ctx,
+        sampler,
+        InverseGamma(2, 3),
+        Turing.@varname(s),
+        (),
+        vi,
+    )

    # Add the lp to the accumulated logp.
    vi.logp += lp

    # Assume m has a Normal distribution.
    m, lp = Turing.Inference.tilde(
        ctx,
        sampler,
        Normal(0, sqrt(s)),
        Turing.@varname(m),
+        (),
        vi,
    )

@@ -136,7 +142,7 @@ mf(vi, sampler, ctx, model) = begin
end

# Instantiate a Model object.
-model = Turing.Model(mf, data)
+model = Turing.Model(mf, data, Turing.Core.ModelGen{()}(nothing, nothing))

# Sample the model.
chain = sample(model, HMC(0.1, 5), 1000)
84 changes: 62 additions & 22 deletions docs/src/using-turing/compiler.md
@@ -11,19 +11,19 @@ The following terminology will be used in this section:
- `D`: observed data variables conditioned upon in the posterior,
- `P`: parameter variables distributed according to the prior distributions, these will also be referred to as random variables,
- `Model`: a fully defined probabilistic model with input data, and
-- `Model` generator: a function that can be used to instantiate a `Model` instance by inputting data `D`.
+- `ModelGen`: a model generator function that can be used to instantiate a `Model` instance by inputting data `D`.

-`Turing`'s `@model` macro defines a `Model` generator that can be used to instantiate a `Model` by passing in the observed data `D`.
+`Turing`'s `@model` macro defines a `ModelGen` that can be used to instantiate a `Model` by passing in the observed data `D`.

-# `@model` macro
+# `@model` macro and `ModelGen`

The following are the main jobs of the `@model` macro:
-1. Parse `~` lines, e.g. `y ~ Normal(c*x, 1.0)`
+1. Parse `~` and `.~` lines, e.g. `y .~ Normal.(c*x, 1.0)`
2. Figure out if a variable belongs to the data `D` or to the parameters `P`
3. Enable the handling of missing data variables in `D` when defining a `Model`, treating them as parameter variables in `P` instead
4. Enable the tracking of random variables using the data structures `VarName` and `VarInfo`
-5. Change `~` lines with a variable in `P` on the LHS to a call to an `assume`-block
-6. Change `~` lines with a variable in `D` on the LHS to a call to an `observe`-block
+5. Change `~`/`.~` lines with a variable in `P` on the LHS to a call to an `assume`/`dot_assume`-block
+6. Change `~`/`.~` lines with a variable in `D` on the LHS to a call to an `observe`/`dot_observe`-block
7. Enable type-stable automatic differentiation of the model using type parameters

Let's take the following model as an example:
@@ -40,13 +40,24 @@
    y ~ Normal(p[2], sqrt(p[1]))
end
```
-A `model::Model` can be defined using `gauss(rand(3), 1.0)` or `gauss(x = rand(3), y = 1.0)`. While constructing the model, if an argument is not passed in, it will be assigned to its default value. If there is no default value given, an error will be thrown. If an argument has a default value of `missing`, it will be treated as a random variable when not passed in. For variables that require initialization because we need to loop or broadcast over their elements, such as `x` above, the following needs to be done:
+The above call of the `@model` macro defines an instance of `ModelGen` called `gauss`. A `model::Model` can be defined using `gauss(rand(3), 1.0)` or `gauss(x = rand(3), y = 1.0)`. While constructing the model, if an argument is not passed in, it is assigned its default value. If there is no default value given, an error is thrown. If an argument has a default value of `missing`, it is treated as a random variable when not passed in. For variables that require initialization because we need to loop or broadcast over their elements, such as `x` above, the following needs to be done:
```julia
if x === missing
    x = ...
end
```
-If `x` will be sampled as a whole from a multivariate distribution, e.g. `x ~ MvNormal(...)`, there is no need to initialize it in an `if`-block.
+If `x` is sampled as a whole from a multivariate distribution, e.g. `x ~ MvNormal(...)`, there is no need to initialize it in an `if`-block.
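
As a concrete sketch of this pattern, here is what such a model could look like. Since the full example model is collapsed in this diff, the prior for `p[2]` and the length of `x` are assumed for illustration; the `if`-block is the point being shown:
```julia
@model gauss(x = missing, y = 1.0, ::Type{TV} = Vector{Float64}) where {TV <: AbstractVector} = begin
    if x === missing
        # `x` was not observed, so allocate it before broadcasting over its elements below.
        x = TV(undef, 2)
    end
    p = TV(undef, 2)
    p[1] ~ InverseGamma(2, 3)
    p[2] ~ Normal(0, 1)  # assumed prior, for illustration only
    @. x ~ Normal(p[2], sqrt(p[1]))
    y ~ Normal(p[2], sqrt(p[1]))
end

model = gauss(y = 0.5)  # `x` is not passed in, so it is sampled as a random variable
```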

`ModelGen` is defined as:
```julia
struct ModelGen{Targs, F, Tdefaults} <: Function
    f::F
    defaults::Tdefaults
end
ModelGen{Targs}(args...) where {Targs} = ModelGen{Targs, typeof.(args)...}(args...)
(m::ModelGen)(args...; kwargs...) = m.f(args...; kwargs...)
```
`Targs` is the tuple of the symbols of the model's arguments, `(:x, :y, :TV)`. `defaults` is the `NamedTuple` of default values `(x = missing, y = 1.0, TV = Vector{Float64})`.
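
A minimal usage sketch of this struct, outside the macro machinery (`f` here is a hypothetical stand-in for the generated `outer_function` shown later):
```julia
# Hypothetical generator function standing in for `outer_function`.
f(x, y, TV) = (x = x, y = y, TV = TV)

gen = ModelGen{(:x, :y, :TV)}(f, (x = missing, y = 1.0, TV = Vector{Float64}))
gen.defaults.y                       # 1.0
gen(rand(3), 2.0, Vector{Float64})   # forwarded to `f` by the call overload
```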

The `@model` macro is defined as:
```julia
@@ -63,7 +74,8 @@ The first stop that the model definition takes is `build_model_info`. This funct
- `main_body`: the model body excluding the header and `end`.
- `arg_syms`: the argument symbols, e.g. `[:x, :y, :TV]` above.
- `args`: a modified version of the arguments changing `::Type{TV}=Vector{Float64}` and `where {TV <: AbstractVector}` to `TV::Type{<:AbstractVector}=Vector{Float64}`. This is `[:(x = missing), :(y = 1.0), :(TV::Type{<:AbstractVector}=Vector{Float64})]` in the example above.
-- `args_nt`: an expression constructing a `NamedTuple` of the input arguments, e.g. `:((x = x, y = y, T = T))` in the example above.
+- `args_nt`: an expression constructing a `NamedTuple` of the input arguments, e.g. `:((x = x, y = y, TV = TV))` in the example above.
+- `defaults_nt`: an expression constructing a `NamedTuple` of the default values of the input arguments, if any, e.g. `:((x = missing, y = 1.0, TV = Vector{Float64}))` in the example above.
and returns it as a dictionary called `model_info`.
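
Schematically, and using the values from the example above, `model_info` has the following shape (a sketch of the dictionary, not its exact contents):
```julia
model_info = Dict(
    :arg_syms    => [:x, :y, :TV],
    :args        => [:(x = missing), :(y = 1.0), :(TV::Type{<:AbstractVector} = Vector{Float64})],
    :args_nt     => :((x = x, y = y, TV = TV)),
    :defaults_nt => :((x = missing, y = 1.0, TV = Vector{Float64})),
    # plus :main_body, holding the model body as an `Expr`
)
```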

## `replace_tilde!`
@@ -75,8 +87,9 @@ In the above example, `p[1] ~ InverseGamma(2, 3)` is replaced with:
temp_right = InverseGamma(2, 3)
Turing.Core.assert_dist(temp_right, msg = ...)
preprocessed = Turing.Core.@preprocess(Val((:x, :y, :TV)), Turing.getmissing(model), p[1])
-if preprocessed isa Turing.VarName
-    out = Turing.Inference.tilde(ctx, sampler, temp_right, preprocessed, vi)
+if preprocessed isa Tuple
+    vn, inds = preprocessed
+    out = Turing.Inference.tilde(ctx, sampler, temp_right, vn, inds, vi)
    p[1] = out[1]
    vi.logp += out[2]
else
@@ -89,9 +102,9 @@ where `ctx::AbstractContext`, `sampler::AbstractSampler` and `vi::VarInfo` will
3. If neither of the above is true, but the value of `p[1]` is `missing`, then `p[1]` will still be treated as a random variable.
4. Otherwise, `p[1]` is treated as an observation.

-If `@preprocess` treats `p[1]` as a random variable, it will return a variable identifier `vn::VarName = Turing.@varname p[1]`. Otherwise, it returns the value of `p[1]`. `Turing.@varname` and `VarName` will be explained later. The above checks by `@preprocess` were carefully written to make sure that the Julia compiler can compile them away, so no checks happen at runtime and only the correct branch runs straight away.
+If `@preprocess` treats `p[1]` as a random variable, it will return a `2-Tuple` of: 1) a variable identifier `vn::VarName = Turing.@varname p[1]`, and 2) a tuple of tuples of the indices used in `vn`, `((1,),)` in this example. Otherwise, `@preprocess` returns the value of `p[1]`. `Turing.@varname` and `VarName` will be explained later. The above checks by `@preprocess` were carefully written to make sure that the Julia compiler can compile them away, so no checks happen at runtime and only the correct branch runs straight away.

-When the output of `@preprocess` is a `VarName`, i.e. `p[1]` is a random variable, the `Turing.Inference.tilde` function will dispatch to a different method than when the output is of another type, i.e. `p[1]` is an observation. In the former case, `Turing.Inference.tilde` returns two outputs, the value of the random variable and the log probability, while in the latter case, only the log probability is returned. The log probabilities then get accumulated, and if `p[1]` is a random variable, the first output returned by `Turing.Inference.tilde` gets assigned to it.
+When the output of `@preprocess` is a `Tuple`, i.e. `p[1]` is a random variable, the `Turing.Inference.tilde` function will dispatch to a different method than when the output is of another type, i.e. `p[1]` is an observation. In the former case, `Turing.Inference.tilde` returns two outputs, the value of the random variable and the log probability, while in the latter case, only the log probability is returned. The log probabilities then get accumulated, and if `p[1]` is a random variable, the first output returned by `Turing.Inference.tilde` gets assigned to it.
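
Sketching the two return shapes described above (the observation-branch call is assumed here, since that branch is collapsed in this diff):
```julia
# `p[1]` is a random variable: `tilde` returns (value, log probability).
out = Turing.Inference.tilde(ctx, sampler, temp_right, vn, inds, vi)
p[1] = out[1]
vi.logp += out[2]

# `p[1]` is an observation: `tilde` returns the log probability only.
vi.logp += Turing.Inference.tilde(ctx, sampler, temp_right, p[1], vi)
```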

Note that `Core.tilde` is different from `Inference.tilde`. `Core.tilde` returns the expression block that will be run instead of the `~` line. A part of this expression block is a call to `Inference.tilde` as shown above. `Core.tilde` is defined in the `compiler.jl` file, while `Inference.tilde` is defined in the `Inference.jl` file.

@@ -100,9 +113,10 @@ The `dot_tilde!` function does something similar for expressions of the form `@.
temp_right = Normal(p[2], sqrt(p[1]))
Turing.Core.assert_dist(temp_right, msg = ...)
preprocessed = Turing.Core.@preprocess(Val((:x, :y, :TV)), Turing.getmissing(model), x[1:2])
-if preprocessed isa Turing.VarName
+if preprocessed isa Tuple
+    vn, inds = preprocessed
    temp_left = x[1:2]
-    out = Turing.Inference.dot_tilde(ctx, sampler, temp_right, temp_left, preprocessed, vi)
+    out = Turing.Inference.dot_tilde(ctx, sampler, temp_right, temp_left, vn, inds, vi)
    x[1:2] .= out[1]
    vi.logp += out[2]
else
@@ -126,33 +140,59 @@ Every `model::Model` can be called as a function with arguments:

The `Model` struct is defined as follows:
```julia
-struct Model{F, Targs <: NamedTuple}
+struct Model{F, Targs <: NamedTuple, Tmodelgen, Tmissings <: Val}
    f::F
    args::Targs
+    modelgen::Tmodelgen
+    missings::Tmissings
end
+Model(f, args::NamedTuple, modelgen) = Model(f, args, modelgen, getmissing(args))
(model::Model)(vi) = model(vi, SampleFromPrior())
(model::Model)(vi, spl) = model(vi, spl, DefaultContext())
(model::Model)(args...; kwargs...) = model.f(args..., model; kwargs...)
```
-`model.f` is an internal function that is called when `model` is called, where `model::Model`. When `model` is called, `model` itself is passed as an argument to `model.f` because we need to access `model.args` among other things inside `f`. `model.args` is a `NamedTuple` of all the arguments that were passed to the model generating function when constructing an instance of `Model`.
+`model.f` is an internal function that is called when `model` is called, where `model::Model`. When `model` is called, `model` itself is passed as an argument to `model.f` because we need to access `model.args` among other things inside `f`. `model.args` is a `NamedTuple` of all the arguments that were passed to the model generating function when constructing an instance of `Model`. `modelgen` is the instance of `ModelGen` that was used to construct `model`. `missings` is an instance of `Val`, e.g. `Val{(:a, :b)}()`. `getmissing` returns a `Val` instance of all the symbols in `args` with a value `missing`. This is the default definition of `missings`. All variables in `missings` are treated as random variables rather than observations.
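
For example, with the `gauss` generator above (a sketch; the `Val` value is what the description implies):
```julia
model = gauss(x = missing, y = 1.0)
model.missings   # Val{(:x,)}(): `x` will be treated as a random variable
model.args.y     # 1.0
```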

In some non-traditional use-cases, `missings` is defined differently, e.g. when computing the log joint probability of the random variables and only some observations simultaneously, possibly conditioned on the remaining observations. An example using the model above is `logprob"x = rand(3), p = rand(2) | model = gauss, y = nothing"`. To evaluate this, the model argument `x` on the LHS of `|` is treated as a random variable, leading to a call to the `assume` or `dot_assume` function in place of the `~` or `.~` expressions, respectively. The model is then run in the `PriorContext`, which ignores the `observe` and `dot_observe` functions and only runs the `assume` and `dot_assume` ones. This returns the correct log probability. The reason why a model input argument, such as `x`, cannot be initialized to `missing` when on the LHS of `|` is somewhat subtle. The model body may call, e.g., `length(x)` before looping over the elements of `x` and calling `~` on each element; if `x` were initialized to `missing`, this would error because `length(missing)` is not defined. Moreover, it is not intuitive to require the user to handle the `x === missing` case, because the user never assigned `x` to be `missing` in the first place; `missing` is merely an implementation detail here that users need not concern themselves with. Therefore in this case, it makes sense to de-couple the `missings` field from the values of the arguments.

## `build_output`

Now that we have all the information we need in the `@model` macro, we can start building the model generator. The model generator `gauss` will be defined as:
```julia
-gauss(; x = missing, y = 1.0, TV::Type{<:AbstractVector} = Vector{Float64}) = gauss(x, y, TV)
-function gauss(x = missing, y = 1.0, TV::Type{<:AbstractVector} = Vector{Float64})
function outer_function(;
    x = missing,
    y = 1.0,
    TV::Type{<:AbstractVector} = Vector{Float64},
)
    return outer_function(x, y, TV)
end
function outer_function(
    x = missing,
    y = 1.0,
    TV::Type{<:AbstractVector} = Vector{Float64},
)
    function inner_function(vi::Turing.VarInfo, sampler::Turing.AbstractSampler, ctx::AbstractContext, model)
        ...
    end
-    return Turing.Model(inner_function, (x = x, y = y, TV = TV))
    return Turing.Model(
        inner_function,
        (x = x, y = y, TV = TV),
        Turing.Core.ModelGen{(:x, :y, :TV)}(
            outer_function,
            (x = missing, y = 1.0, TV = Vector{Float64}),
        ),
    )
end
gauss = Turing.Core.ModelGen{(:x, :y, :TV)}(
    outer_function,
    (x = missing, y = 1.0, TV = Vector{Float64}),
)
```
-The above two methods enable constructing the model using positional or keyword arguments. The second argument to the `Turing.Model` constructor is the expression called `args_nt` stored in `model_info`. The body of `inner_function` is explained below.
+The above two methods enable constructing the model using positional or keyword arguments. The second argument to the `Turing.Model` constructor is the expression called `args_nt` stored in `model_info`. The second argument to the `ModelGen` constructor, inside `outer_function` and outside, is the expression called `defaults_nt` stored in `model_info`. The body of `inner_function` is explained below.
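
Under these definitions, both of the following construct a `Model` with the same structure (a sketch):
```julia
model1 = gauss(rand(3), 1.0)           # positional arguments
model2 = gauss(x = rand(3), y = 1.0)   # keyword arguments, via the kwargs method
model1.args.y == model2.args.y         # true
```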

## `inner_function`

-The main method of `inner_function` does some pre-processing, defining all the input variables from the model definition, `x`, `y` and `TV` in the example above. Then the rest of the model body is run as normal Julia code, with the `L ~ R` and `@. L ~ R` lines replaced with the calls to `Inference.tilde` and `Inference.dot_tilde` respectively, as shown earlier.
+The main method of `inner_function` does some pre-processing, defining all the input variables from the model definition, `x`, `y` and `TV` in the example above. Then the rest of the model body is run as normal Julia code, with the `L ~ R` and `@. L ~ R` lines replaced with the calls to `Inference.tilde` and `Inference.dot_tilde` respectively, as shown earlier.
```julia
function inner_function(vi::Turing.VarInfo, sampler::Turing.AbstractSampler, ctx::AbstractContext, model)
    temp_x = model.args.x
    ...
```
30 changes: 29 additions & 1 deletion docs/src/using-turing/guide.md
@@ -218,7 +218,7 @@ Output:
#### Treating observations as random variables


-Inputs to the model that have a value `missing` are treated as parameters, aka random variables, to be estimated/sampled. This can be useful if you want to simulate draws for that parameter, or if you are sampling from a conditional distribution. Turing v0.6.7 supports the following syntax:
+Inputs to the model that have a value `missing` are treated as parameters, aka random variables, to be estimated/sampled. This can be useful if you want to simulate draws for that parameter, or if you are sampling from a conditional distribution. Turing supports the following syntax:


```julia
    ...
```
@@ -312,6 +312,34 @@ Similarly, when using a particle sampler, the Julia variable used should either
2. An instance of some type parameter `T` defined in the model header using the type parameter syntax, e.g. `gdemo(x, ::Type{T} = Vector{Float64}) where {T} = begin`.


### Querying Probabilities from Model or Chain


Consider the following `gdemo` model:
```julia
@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end
```

The following are examples of valid queries of the `Turing` model or chain:

- `prob"x = 1.0, y = 1.0 | model = gdemo, s = 1.0, m = 1.0"` calculates the likelihood of `x = 1` and `y = 1` given `s = 1` and `m = 1`.

- `prob"s = 1.0, m = 1.0 | model = gdemo, x = nothing, y = nothing"` calculates the joint probability of `s = 1` and `m = 1` ignoring `x` and `y`. `x` and `y` are ignored so they can be optionally dropped from the RHS of `|`, but it is recommended to define them.

- `prob"s = 1.0, m = 1.0, x = 1.0 | model = gdemo, y = nothing"` calculates the joint probability of `s = 1`, `m = 1` and `x = 1` ignoring `y`.

- `prob"s = 1.0, m = 1.0, x = 1.0, y = 1.0 | model = gdemo"` calculates the joint probability of all the variables.

- After the MCMC sampling, given a `chain`, `prob"x = 1.0, y = 1.0 | chain = chain"` calculates the element-wise likelihood of `x = 1.0` and `y = 1.0` for each sample in `chain`.

In all the above cases, `logprob` can be used instead of `prob` to calculate the log probabilities.
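
Putting it together, an end-to-end sketch (the sampler settings are arbitrary):
```julia
using Turing

@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

chain = sample(gdemo(1.0, 1.0), HMC(0.1, 5), 1000)

# One log likelihood per sample in `chain`.
lps = logprob"x = 1.0, y = 1.0 | chain = chain"
```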


## Beyond the Basics

