
Conversation

@devmotion (Member)

This is an updated version of the ESS implementation in #991. I noticed that the bias visible in the plots in #991 is real: the mean of the MCMC chain does not converge to the true posterior mean. The problem was that I computed the joint probabilities, although ESS requires the likelihood. I tried to fix this by changing the assume function. Moreover, I added some normality checks and restricted the sampler to a single parameter.

The example in the original issue and two other examples included in the tests seem to converge now, but the gdemo example still fails. I'm not sure why this is the case and assume there is some problem with how ESS is integrated in the Gibbs sampler - I guess I forgot to implement something related to Turing's internals or implemented it incorrectly.

For comparison, you can check the MATLAB implementation by Ian Murray, one of the original authors: https://homepages.inf.ed.ac.uk/imurray2/pub/10ess/elliptical_slice.m
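For readers unfamiliar with the algorithm, the ESS transition from that reference can be sketched as follows (an illustrative Python translation with names of my choosing, not the Turing implementation):

```python
import numpy as np

def elliptical_slice(f, sample_prior, log_lik, rng):
    """One transition of elliptical slice sampling (Murray, Adams & MacKay, 2010).

    f            -- current state, assumed to have a zero-mean Gaussian prior
    sample_prior -- callable returning a fresh draw nu ~ N(0, Sigma)
    log_lik      -- callable returning the log-likelihood of a state
    rng          -- a numpy random Generator
    """
    nu = sample_prior()
    log_y = log_lik(f) + np.log(rng.uniform())         # slice threshold
    theta = rng.uniform(0.0, 2.0 * np.pi)              # initial proposal angle
    theta_min, theta_max = theta - 2.0 * np.pi, theta  # full bracket
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
        if log_lik(f_new) > log_y:
            return f_new                                # accept
        # shrink the bracket towards theta = 0 and try again
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)
```

Note that the transition uses only the log-likelihood; the Gaussian prior enters solely through the auxiliary draw `nu`, which is why evaluating the log joint instead biases the sampler.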

@devmotion (Member, Author)

I updated the PR and in particular changed the implementation of assume and observe (basically, they are not doing anything special anymore apart from not evaluating the logpdf of the prior distribution, since it is never needed). MCMC now converges for both models

```julia
@model demo(x) = begin
    m ~ Normal(0, 1)
    # σ² is a fixed observation variance defined outside the model
    x ~ MvNormal(fill(m, length(x)), sqrt(σ²))
end
```

and

```julia
@model gdemo(x, y) = begin
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end
```

successfully. I created plots of the trajectories and of the true (dashed blue) and approximated (orange) posteriors (code):

- ESS (demo): [demo_ESS plot]
- NUTS (demo): [demo_NUTS plot]
- ESS (gdemo): [gdemo_ESS plot]
- HMC (gdemo): [gdemo_HMC plot]

@trappmartin (Member) left a comment


Thanks for the work, I left a few remarks.

@mohamed82008 (Contributor)

I would suggest keeping lateral changes for another PR. Also if you are trying to compute the log likelihood, I recommend overloading tilde and dot_tilde for the ESS sampler with ctx::DefaultContext and call the tilde or dot_tilde functions using the spl::SampleFromPrior and ctx::LikelihoodContext. This will significantly simplify this PR.

@devmotion (Member, Author)

> Also if you are trying to compute the log likelihood, I recommend overloading tilde and dot_tilde for the ESS sampler with ctx::DefaultContext and call the tilde or dot_tilde functions using the spl::SampleFromPrior and ctx::LikelihoodContext. This will significantly simplify this PR.

If I understand you correctly, then you suggest using

```julia
function tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vi)
    if left isa VarName && left in getspace(sampler)
        return tilde(LikelihoodContext(), SampleFromPrior(), right, left, vi)
    else
        return tilde(ctx, SampleFromPrior(), right, left, vi)
    end
end

function dot_tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vn::VarName, vi)
    if vn in getspace(sampler)
        return dot_tilde(LikelihoodContext(), SampleFromPrior(), right, left, vn, vi)
    else
        return dot_tilde(ctx, SampleFromPrior(), right, left, vn, vi)
    end
end
```

instead of the assume and observe implementation. Everything else would be unaffected, it seems, so I'm not sure if this is actually the significant simplification you're referring to. Still, I think it would be nicer to use, and it seems to work as well.

However, in general the approach of sampling from the prior and computing the proposals in the step! function seems problematic, since it can't cope with arrays of distributions. I'm not sure what the best way to handle this would be; ideally one would use runmodel! also for sampling from the prior - in that case, however, one can't forward to SampleFromPrior, since that sampler does not overwrite existing variables. One solution might be to always sample the variables of interest in the DefaultContext and use ctx::LikelihoodContext instead of ctx::DefaultContext in the examples above. But even in that case I would have to retrieve the mean of all different distributions to correct for them...
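As an aside, the standard way to handle a Gaussian prior with nonzero mean in ESS (used in the reference implementation linked above) is to center the state before rotating on the ellipse and shift back afterwards; a minimal Python sketch (function name is hypothetical):

```python
import numpy as np

def centered_ess_proposal(f, nu, mu, theta):
    """ESS proposal for a Gaussian prior with mean mu: center the current
    state f and the auxiliary draw nu ~ N(mu, Sigma), rotate by theta on the
    ellipse, then shift the mean back in."""
    return mu + (f - mu) * np.cos(theta) + (nu - mu) * np.sin(theta)
```

At theta = 0 this recovers the current state and at theta = pi/2 the auxiliary draw, exactly as in the zero-mean case.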

@cpfiffer (Member)

Perhaps it would be nice if we exposed something similar to vi[spl] that returned a vector of distributions -- would this help at all?

@mohamed82008 (Contributor)

I meant something like this (only valid after #997 goes in).

```julia
# assume
function tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, vn::VarName, inds, vi)
    return tilde(LikelihoodContext(), SampleFromPrior(), right, vn, inds, vi)
end
# observe
function tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vi)
    return tilde(LikelihoodContext(), SampleFromPrior(), right, left, vi)
end

# dot_assume
function dot_tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vn::VarName, inds, vi)
    return dot_tilde(LikelihoodContext(), SampleFromPrior(), right, left, vn, inds, vi)
end
# dot_observe
function dot_tilde(ctx::DefaultContext, sampler::Sampler{<:ESS}, right, left, vi)
    return dot_tilde(LikelihoodContext(), SampleFromPrior(), right, left, vi)
end
```

This takes care of the assume, observe, dot_assume and dot_observe for you. It's also clearer than having to define those functions ourselves.

@mohamed82008 (Contributor)

> Perhaps it would be nice if we exposed something similar to vi[spl] that returned a vector of distributions -- would this help at all?

If it's helpful, we can have a wrapper around vi that gives us the distributions when we use getindex. For example, dists(vi) isa DistsWrapper and dists(vi)[spl] gives us what we want. dists(vi)[vn] also does the obvious.
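A minimal Python sketch of such a wrapper, assuming a dict-like varinfo that maps variable names to (value, distribution) pairs (all names here are hypothetical, not Turing's actual API):

```python
# Hypothetical sketch of the proposed dists(vi) wrapper: indexing it returns
# the distribution attached to a variable instead of the variable's value.
class DistsWrapper:
    def __init__(self, varinfo):
        # varinfo is assumed to map variable names to (value, distribution)
        self._varinfo = varinfo

    def __getitem__(self, key):
        return self._varinfo[key][1]  # return the distribution, not the value

def dists(varinfo):
    return DistsWrapper(varinfo)
```

The same indexing pattern would extend naturally to sampler-based lookups like `dists(vi)[spl]` by collecting the distributions of all variables owned by the sampler.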

@devmotion (Member, Author)

> I meant something like this

Ah OK, but then it's basically what I outlined above. Your example wouldn't work, however, since the log likelihood should be computed based only on the variable of interest. But it seems #997 would provide a simpler way to achieve what I sketched above, by specifying the variables of interest vns in LikelihoodContext(vns)?

@yebai (Member) commented Dec 26, 2019

@mohamed82008 This PR contains some modifications to RandomVariables. It might be useful to sync these changes with DynamicPPL.

@mohamed82008 (Contributor)

@yebai sure.

@mohamed82008 (Contributor) commented Dec 26, 2019

@devmotion in your implementation, the log prior will be counted in for the random variables that are not under the ESS sampler. I assume this is your intention. If so, then this PR looks good to me but I can't speak on the correctness of the logic.

@devmotion (Member, Author)

> @devmotion in your implementation, the log prior will be counted in for the random variables that are not under the ESS sampler. I assume this is your intention.

Yes, that's exactly my intention. Basically, I'm interested in the log joint minus the log prior of the random variables that are sampled by the ESS sampler. A motivating example is the hierarchical model

```julia
@model demo(x) = begin
    m ~ Normal()
    k ~ Normal(m, 0.5)
    x ~ Normal(k, 0.5)
end
```

in which we might want to sample from the posterior distribution of m and k using a Gibbs sampler with ESS for both variables. In this model, if k is given, the log likelihood of x (as evaluated by Turing's LikelihoodContext, without the log priors) is independent of the choice of m. Clearly, that's not useful for deciding which sample on the elliptical slice to accept; instead we are interested in how the likelihood of k changes with the choice of m. Hence we have to evaluate the log priors of all variables that are not under the ESS sampler, since they might be affected by a change in m.
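To make the target concrete: for the model above, the quantity evaluated when sampling m (the log joint minus the log prior of m) could be written as follows (an illustrative Python/SciPy sketch, not the Turing code; the helper name is mine):

```python
from scipy.stats import norm

def ess_target_for_m(m, k, x):
    """Log joint minus the log prior of m: the prior logpdf of k (which
    depends on m, so it must be kept) plus the likelihood of x."""
    return norm.logpdf(k, loc=m, scale=0.5) + norm.logpdf(x, loc=k, scale=0.5)
```

Dropping the `logpdf` of k here (as a pure LikelihoodContext evaluation would) makes the target constant in m, which is exactly the failure mode described above.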

@yebai merged commit 20e81cc into TuringLang:master on Dec 27, 2019
@yebai (Member) commented Dec 27, 2019

Great work, many thanks @devmotion!

@devmotion deleted the ess branch on December 27, 2019
