4 changes: 1 addition & 3 deletions Project.toml
@@ -25,15 +25,14 @@ SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
Tracker = "9f7883ad-71c0-57eb-9f7f-b5c9e6d3789c"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

[compat]
AbstractMCMC = "~0.1"
AdvancedHMC = "0.2.4"
Bijectors = "0.4.0"
BinaryProvider = "0.5.6"
Distributions = "0.21.11"
DistributionsAD = "0.1.0"
DistributionsAD = "0.1.2"
FiniteDifferences = "0.9"
ForwardDiff = "0.10.3"
Libtask = "0.3.1"
@@ -47,7 +46,6 @@ Requires = "1.0"
SpecialFunctions = "0.7.2, 0.8, 0.9"
StatsFuns = "0.8, 0.9"
Tracker = "0.2.3"
Zygote = "~0.3.4, 0.4.1"
julia = "1"

[extras]
22 changes: 14 additions & 8 deletions docs/src/using-turing/advanced.md
@@ -105,19 +105,25 @@ mf(vi, sampler, ctx, model) = begin
    x = model.args.x

    # Assume s has an InverseGamma distribution.
-    s, lp = Turing.assume(sampler,
-                          InverseGamma(2, 3),
-                          Turing.@varname(s), vi)
+    s, lp = Turing.Inference.tilde(
+        ctx,
+        sampler,
+        InverseGamma(2, 3),
+        Turing.@varname(s),
+        (),
+        vi,
+    )

    # Add the lp to the accumulated logp.
    vi.logp += lp

    # Assume m has a Normal distribution.
    m, lp = Turing.Inference.tilde(
        ctx,
        sampler,
        Normal(0, sqrt(s)),
        Turing.@varname(m),
+        (),
        vi,
    )

@@ -136,7 +142,7 @@ mf(vi, sampler, ctx, model) = begin
end

# Instantiate a Model object.
-model = Turing.Model(mf, data)
+model = Turing.Model(mf, data, Turing.Core.ModelGen{()}(nothing, nothing))

# Sample the model.
chain = sample(model, HMC(0.1, 5), 1000)
84 changes: 62 additions & 22 deletions docs/src/using-turing/compiler.md
@@ -11,19 +11,19 @@ The following terminology will be used in this section:
- `D`: observed data variables conditioned upon in the posterior,
- `P`: parameter variables distributed according to the prior distributions, these will also be referred to as random variables,
- `Model`: a fully defined probabilistic model with input data, and
-- `Model` generator: a function that can be used to instantiate a `Model` instance by inputting data `D`.
+- `ModelGen`: a model generator function that can be used to instantiate a `Model` instance by inputting data `D`.

-`Turing`'s `@model` macro defines a `Model` generator that can be used to instantiate a `Model` by passing in the observed data `D`.
+`Turing`'s `@model` macro defines a `ModelGen` that can be used to instantiate a `Model` by passing in the observed data `D`.

-# `@model` macro
+# `@model` macro and `ModelGen`

The following are the main jobs of the `@model` macro:
-1. Parse `~` lines, e.g. `y ~ Normal(c*x, 1.0)`
+1. Parse `~` and `.~` lines, e.g. `y .~ Normal.(c*x, 1.0)`
2. Figure out if a variable belongs to the data `D` or to the parameters `P`
3. Enable the handling of missing data variables in `D` when defining a `Model`, treating them as parameter variables in `P` instead
4. Enable the tracking of random variables using the data structures `VarName` and `VarInfo`
-5. Change `~` lines with a variable in `P` on the LHS to a call to an `assume`-block
-6. Change `~` lines with a variable in `D` on the LHS to a call to an `observe`-block
+5. Change `~`/`.~` lines with a variable in `P` on the LHS to a call to an `assume`/`dot_assume`-block
+6. Change `~`/`.~` lines with a variable in `D` on the LHS to a call to an `observe`/`dot_observe`-block
7. Enable type-stable automatic differentiation of the model using type parameters

Let's take the following model as an example:
@@ -40,13 +40,24 @@
    y ~ Normal(p[2], sqrt(p[1]))
end
```
-A `model::Model` can be defined using `gauss(rand(3), 1.0)` or `gauss(x = rand(3), y = 1.0)`. While constructing the model, if an argument is not passed in, it will be assigned to its default value. If there is no default value given, an error will be thrown. If an argument has a default value of `missing`, it will be treated as a random variable when not passed in. For variables that require initialization because we need to loop or broadcast over their elements, such as `x` above, the following needs to be done:
+The above call of the `@model` macro defines an instance of `ModelGen` called `gauss`. A `model::Model` can be defined using `gauss(rand(3), 1.0)` or `gauss(x = rand(3), y = 1.0)`. While constructing the model, if an argument is not passed in, it is assigned its default value. If there is no default value given, an error is thrown. If an argument has a default value of `missing`, it is treated as a random variable when not passed in. For variables that require initialization because we need to loop or broadcast over their elements, such as `x` above, the following needs to be done:
```julia
if x === missing
    x = ...
end
```
-If `x` will be sampled as a whole from a multivariate distribution, e.g. `x ~ MvNormal(...)`, there is no need to initialize it in an `if`-block.
+If `x` is sampled as a whole from a multivariate distribution, e.g. `x ~ MvNormal(...)`, there is no need to initialize it in an `if`-block.
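
As a concrete sketch of this pattern, here is what such a model could look like. Since the full example model is collapsed in this diff, the prior for `p[2]` and the length of `x` are assumed for illustration; the `if`-block is the point being shown:
```julia
@model gauss(x = missing, y = 1.0, ::Type{TV} = Vector{Float64}) where {TV <: AbstractVector} = begin
    if x === missing
        # `x` was not observed, so allocate it before broadcasting over its elements below.
        x = TV(undef, 2)
    end
    p = TV(undef, 2)
    p[1] ~ InverseGamma(2, 3)
    p[2] ~ Normal(0, 1)  # assumed prior, for illustration only
    @. x ~ Normal(p[2], sqrt(p[1]))
    y ~ Normal(p[2], sqrt(p[1]))
end

model = gauss(y = 0.5)  # `x` is not passed in, so it is sampled as a random variable
```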

`ModelGen` is defined as:
```julia
struct ModelGen{Targs, F, Tdefaults} <: Function
    f::F
    defaults::Tdefaults
end
ModelGen{Targs}(args...) where {Targs} = ModelGen{Targs, typeof.(args)...}(args...)
(m::ModelGen)(args...; kwargs...) = m.f(args...; kwargs...)
```
`Targs` is the tuple of the symbols of the model's arguments, `(:x, :y, :TV)`. `defaults` is the `NamedTuple` of default values `(x = missing, y = 1.0, TV = Vector{Float64})`.
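
A minimal usage sketch of this struct, outside the macro machinery (`f` here is a hypothetical stand-in for the generated `outer_function` shown later):
```julia
# Hypothetical generator function standing in for `outer_function`.
f(x, y, TV) = (x = x, y = y, TV = TV)

gen = ModelGen{(:x, :y, :TV)}(f, (x = missing, y = 1.0, TV = Vector{Float64}))
gen.defaults.y                       # 1.0
gen(rand(3), 2.0, Vector{Float64})   # forwarded to `f` by the call overload
```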

The `@model` macro is defined as:
```julia
@@ -63,7 +74,8 @@ The first stop that the model definition takes is `build_model_info`. This funct
- `main_body`: the model body excluding the header and `end`.
- `arg_syms`: the argument symbols, e.g. `[:x, :y, :TV]` above.
- `args`: a modified version of the arguments changing `::Type{TV}=Vector{Float64}` and `where {TV <: AbstractVector}` to `TV::Type{<:AbstractVector}=Vector{Float64}`. This is `[:(x = missing), :(y = 1.0), :(TV::Type{<:AbstractVector}=Vector{Float64})]` in the example above.
-- `args_nt`: an expression constructing a `NamedTuple` of the input arguments, e.g. `:((x = x, y = y, T = T))` in the example above.
+- `args_nt`: an expression constructing a `NamedTuple` of the input arguments, e.g. `:((x = x, y = y, TV = TV))` in the example above.
+- `defaults_nt`: an expression constructing a `NamedTuple` of the default values of the input arguments, if any, e.g. `:((x = missing, y = 1.0, TV = Vector{Float64}))` in the example above.
and returns it as a dictionary called `model_info`.
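
Schematically, and using the values from the example above, `model_info` has the following shape (a sketch of the dictionary, not its exact contents):
```julia
model_info = Dict(
    :arg_syms    => [:x, :y, :TV],
    :args        => [:(x = missing), :(y = 1.0), :(TV::Type{<:AbstractVector} = Vector{Float64})],
    :args_nt     => :((x = x, y = y, TV = TV)),
    :defaults_nt => :((x = missing, y = 1.0, TV = Vector{Float64})),
    # plus :main_body, holding the model body as an `Expr`
)
```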

## `replace_tilde!`
@@ -75,8 +87,9 @@ In the above example, `p[1] ~ InverseGamma(2, 3)` is replaced with:
temp_right = InverseGamma(2, 3)
Turing.Core.assert_dist(temp_right, msg = ...)
preprocessed = Turing.Core.@preprocess(Val((:x, :y, :TV)), Turing.getmissing(model), p[1])
-if preprocessed isa Turing.VarName
-    out = Turing.Inference.tilde(ctx, sampler, temp_right, preprocessed, vi)
+if preprocessed isa Tuple
+    vn, inds = preprocessed
+    out = Turing.Inference.tilde(ctx, sampler, temp_right, vn, inds, vi)
    p[1] = out[1]
    vi.logp += out[2]
else
@@ -89,9 +102,9 @@ where `ctx::AbstractContext`, `sampler::AbstractSampler` and `vi::VarInfo` will
3. If neither of the above is true, but the value of `p[1]` is `missing`, then `p[1]` will still be treated as a random variable.
4. Otherwise, `p[1]` is treated as an observation.

-If `@preprocess` treats `p[1]` as a random variable, it will return a variable identifier `vn::VarName = Turing.@varname p[1]`. Otherwise, it returns the value of `p[1]`. `Turing.@varname` and `VarName` will be explained later. The above checks by `@preprocess` were carefully written to make sure that the Julia compiler can compile them away, so no checks happen at runtime and only the correct branch runs straight away.
+If `@preprocess` treats `p[1]` as a random variable, it will return a `2-Tuple` of: 1) a variable identifier `vn::VarName = Turing.@varname p[1]`, and 2) a tuple of tuples of the indices used in `vn`, `((1,),)` in this example. Otherwise, `@preprocess` returns the value of `p[1]`. `Turing.@varname` and `VarName` will be explained later. The above checks by `@preprocess` were carefully written to make sure that the Julia compiler can compile them away, so no checks happen at runtime and only the correct branch runs straight away.

-When the output of `@preprocess` is a `VarName`, i.e. `p[1]` is a random variable, the `Turing.Inference.tilde` function will dispatch to a different method than when the output is of another type, i.e. `p[1]` is an observation. In the former case, `Turing.Inference.tilde` returns two outputs, the value of the random variable and the log probability, while in the latter case, only the log probability is returned. The log probabilities then get accumulated, and if `p[1]` is a random variable, the first output returned by `Turing.Inference.tilde` gets assigned to it.
+When the output of `@preprocess` is a `Tuple`, i.e. `p[1]` is a random variable, the `Turing.Inference.tilde` function will dispatch to a different method than when the output is of another type, i.e. `p[1]` is an observation. In the former case, `Turing.Inference.tilde` returns two outputs, the value of the random variable and the log probability, while in the latter case, only the log probability is returned. The log probabilities then get accumulated, and if `p[1]` is a random variable, the first output returned by `Turing.Inference.tilde` gets assigned to it.
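
Sketching the two return shapes described above (the observation-branch call is assumed here, since that branch is collapsed in this diff):
```julia
# `p[1]` is a random variable: `tilde` returns (value, log probability).
out = Turing.Inference.tilde(ctx, sampler, temp_right, vn, inds, vi)
p[1] = out[1]
vi.logp += out[2]

# `p[1]` is an observation: `tilde` returns the log probability only.
vi.logp += Turing.Inference.tilde(ctx, sampler, temp_right, p[1], vi)
```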

Note that `Core.tilde` is different from `Inference.tilde`. `Core.tilde` returns the expression block that will be run instead of the `~` line. A part of this expression block is a call to `Inference.tilde` as shown above. `Core.tilde` is defined in the `compiler.jl` file, while `Inference.tilde` is defined in the `Inference.jl` file.

@@ -100,9 +113,10 @@ The `dot_tilde!` function does something similar for expressions of the form `@.
temp_right = Normal(p[2], sqrt(p[1]))
Turing.Core.assert_dist(temp_right, msg = ...)
preprocessed = Turing.Core.@preprocess(Val((:x, :y, :TV)), Turing.getmissing(model), x[1:2])
-if preprocessed isa Turing.VarName
+if preprocessed isa Tuple
+    vn, inds = preprocessed
    temp_left = x[1:2]
-    out = Turing.Inference.dot_tilde(ctx, sampler, temp_right, temp_left, preprocessed, vi)
+    out = Turing.Inference.dot_tilde(ctx, sampler, temp_right, temp_left, vn, inds, vi)
    x[1:2] .= out[1]
    vi.logp += out[2]
else
@@ -126,33 +140,59 @@ Every `model::Model` can be called as a function with arguments:

The `Model` struct is defined as follows:
```julia
-struct Model{F, Targs <: NamedTuple}
+struct Model{F, Targs <: NamedTuple, Tmodelgen, Tmissings <: Val}
    f::F
    args::Targs
+    modelgen::Tmodelgen
+    missings::Tmissings
end
+Model(f, args::NamedTuple, modelgen) = Model(f, args, modelgen, getmissing(args))
(model::Model)(vi) = model(vi, SampleFromPrior())
(model::Model)(vi, spl) = model(vi, spl, DefaultContext())
(model::Model)(args...; kwargs...) = model.f(args..., model; kwargs...)
```
-`model.f` is an internal function that is called when `model` is called, where `model::Model`. When `model` is called, `model` itself is passed as an argument to `model.f` because we need to access `model.args` among other things inside `f`. `model.args` is a `NamedTuple` of all the arguments that were passed to the model generating function when constructing an instance of `Model`.
+`model.f` is an internal function that is called when `model` is called, where `model::Model`. When `model` is called, `model` itself is passed as an argument to `model.f` because we need to access `model.args` among other things inside `f`. `model.args` is a `NamedTuple` of all the arguments that were passed to the model generating function when constructing an instance of `Model`. `modelgen` is the instance of `ModelGen` that was used to construct `model`. `missings` is an instance of `Val`, e.g. `Val{(:a, :b)}()`. `getmissing` returns a `Val` instance of all the symbols in `args` with a value `missing`. This is the default definition of `missings`. All variables in `missings` are treated as random variables rather than observations.
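
For example, with the `gauss` generator above (a sketch; the `Val` value is what the description implies):
```julia
model = gauss(x = missing, y = 1.0)
model.missings   # Val{(:x,)}(): `x` will be treated as a random variable
model.args.y     # 1.0
```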

In some non-traditional use-cases, `missings` is defined differently, e.g. when computing the log joint probability of the random variables and only some observations simultaneously, possibly conditioned on the remaining observations. An example using the model above is `logprob"x = rand(3), p = rand(2) | model = gauss, y = nothing"`. To evaluate this, the model argument `x` on the LHS of `|` is treated as a random variable, leading to a call to the `assume` or `dot_assume` function in place of the `~` or `.~` expressions, respectively. The model is then run in the `PriorContext`, which ignores the `observe` and `dot_observe` functions and only runs the `assume` and `dot_assume` ones. This returns the correct log probability. The reason why a model input argument, such as `x`, cannot be initialized to `missing` when on the LHS of `|` is somewhat subtle. The model body may call, e.g., `length(x)` before looping over the elements of `x` and calling `~` on each element; if `x` were initialized to `missing`, this would error because `length(missing)` is not defined. Moreover, it is not intuitive to require the user to handle the `x === missing` case, because the user never assigned `x` to be `missing` in the first place; `missing` is merely an implementation detail here that users need not concern themselves with. Therefore in this case, it makes sense to de-couple the `missings` field from the values of the arguments.

## `build_output`

Now that we have all the information we need in the `@model` macro, we can start building the model generator. The model generator `gauss` will be defined as:
```julia
-gauss(; x = missing, y = 1.0, TV::Type{<:AbstractVector} = Vector{Float64}) = gauss(x, y, TV)
-function gauss(x = missing, y = 1.0, TV::Type{<:AbstractVector} = Vector{Float64})
function outer_function(;
    x = missing,
    y = 1.0,
    TV::Type{<:AbstractVector} = Vector{Float64},
)
    return outer_function(x, y, TV)
end
function outer_function(
    x = missing,
    y = 1.0,
    TV::Type{<:AbstractVector} = Vector{Float64},
)
    function inner_function(vi::Turing.VarInfo, sampler::Turing.AbstractSampler, ctx::AbstractContext, model)
        ...
    end
-    return Turing.Model(inner_function, (x = x, y = y, TV = TV))
    return Turing.Model(
        inner_function,
        (x = x, y = y, TV = TV),
        Turing.Core.ModelGen{(:x, :y, :TV)}(
            outer_function,
            (x = missing, y = 1.0, TV = Vector{Float64}),
        ),
    )
end
gauss = Turing.Core.ModelGen{(:x, :y, :TV)}(
    outer_function,
    (x = missing, y = 1.0, TV = Vector{Float64}),
)
```
-The above two methods enable constructing the model using positional or keyword arguments. The second argument to the `Turing.Model` constructor is the expression called `args_nt` stored in `model_info`. The body of `inner_function` is explained below.
+The above two methods enable constructing the model using positional or keyword arguments. The second argument to the `Turing.Model` constructor is the expression called `args_nt` stored in `model_info`. The second argument to the `ModelGen` constructor, inside `outer_function` and outside, is the expression called `defaults_nt` stored in `model_info`. The body of `inner_function` is explained below.
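
Under these definitions, both of the following construct a `Model` with the same structure (a sketch):
```julia
model1 = gauss(rand(3), 1.0)           # positional arguments
model2 = gauss(x = rand(3), y = 1.0)   # keyword arguments, via the kwargs method
model1.args.y == model2.args.y         # true
```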

## `inner_function`

-The main method of `inner_function` does some pre-processing, defining all the input variables from the model definition, `x`, `y` and `TV` in the example above. Then the rest of the model body is run as normal Julia code, with the `L ~ R` and `@. L ~ R` lines replaced with the calls to `Inference.tilde` and `Inference.dot_tilde` respectively, as shown earlier.
+The main method of `inner_function` does some pre-processing, defining all the input variables from the model definition, `x`, `y` and `TV` in the example above. Then the rest of the model body is run as normal Julia code, with the `L ~ R` and `@. L ~ R` lines replaced with the calls to `Inference.tilde` and `Inference.dot_tilde` respectively, as shown earlier.
```julia
function inner_function(vi::Turing.VarInfo, sampler::Turing.AbstractSampler, ctx::AbstractContext, model)
    temp_x = model.args.x
    ...
```
30 changes: 29 additions & 1 deletion docs/src/using-turing/guide.md
@@ -218,7 +218,7 @@ Output:
#### Treating observations as random variables


-Inputs to the model that have a value `missing` are treated as parameters, aka random variables, to be estimated/sampled. This can be useful if you want to simulate draws for that parameter, or if you are sampling from a conditional distribution. Turing v0.6.7 supports the following syntax:
+Inputs to the model that have a value `missing` are treated as parameters, aka random variables, to be estimated/sampled. This can be useful if you want to simulate draws for that parameter, or if you are sampling from a conditional distribution. Turing supports the following syntax:


```julia
    ...
```
@@ -312,6 +312,34 @@ Similarly, when using a particle sampler, the Julia variable used should either
2. An instance of some type parameter `T` defined in the model header using the type parameter syntax, e.g. `gdemo(x, ::Type{T} = Vector{Float64}) where {T} = begin`.


### Querying Probabilities from Model or Chain


Consider the following `gdemo` model:
```julia
@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end
```

The following are examples of valid queries of the `Turing` model or chain:

- `prob"x = 1.0, y = 1.0 | model = gdemo, s = 1.0, m = 1.0"` calculates the likelihood of `x = 1` and `y = 1` given `s = 1` and `m = 1`.

- `prob"s = 1.0, m = 1.0 | model = gdemo, x = nothing, y = nothing"` calculates the joint probability of `s = 1` and `m = 1` ignoring `x` and `y`. `x` and `y` are ignored so they can be optionally dropped from the RHS of `|`, but it is recommended to define them.

- `prob"s = 1.0, m = 1.0, x = 1.0 | model = gdemo, y = nothing"` calculates the joint probability of `s = 1`, `m = 1` and `x = 1` ignoring `y`.

- `prob"s = 1.0, m = 1.0, x = 1.0, y = 1.0 | model = gdemo"` calculates the joint probability of all the variables.

- After the MCMC sampling, given a `chain`, `prob"x = 1.0, y = 1.0 | chain = chain"` calculates the element-wise likelihood of `x = 1.0` and `y = 1.0` for each sample in `chain`.

In all the above cases, `logprob` can be used instead of `prob` to calculate the log probabilities.
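
Putting it together, an end-to-end sketch (the sampler settings are arbitrary):
```julia
using Turing

@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

chain = sample(gdemo(1.0, 1.0), HMC(0.1, 5), 1000)

# One log likelihood per sample in `chain`.
lps = logprob"x = 1.0, y = 1.0 | chain = chain"
```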


## Beyond the Basics

