Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify MvNormal Cholesky decomposition API #3881

Merged
merged 12 commits into from
Apr 28, 2020

Conversation

AlexAndorra
Copy link
Contributor

Thanks to the great advice of @aseyboldt (vielen Dank!), here is a PR to simplify the API for Cholesky decomposition of MvNormal, as detailed here.

  • Thanks to a wrapper function on LKJCholeskyCov class, the new behavior, triggered by the compute_corr argument, automatically computes and returns the expanded Cholesky matrix, the correlations matrix and the standard deviations. The posterior distributions of those parameters are also automatically returned in the trace. Here is an example:
sd_dist = pm.Exponential.dist(1.0, shape=3)
chol, corr, stds = pm.LKJCholeskyCov('chol_cov', n=3, eta=2, sd_dist=sd_dist, compute_corr=True)

mu = pm.Normal('mu', mu=0., sd=1., shape=3)
vals = pm.MvNormal('vals', mu=mu, chol=chol, shape=(20, 3))

The former behavior, which is still the default for backwards-compatibility, can be recovered with:

sd_dist = pm.Exponential.dist(1.0, shape=3)
packed_chol = pm.LKJCholeskyCov('packed_chol', n=3, eta=2, sd_dist=sd_dist)

chol = pm.expand_packed_triangular(3, packed_chol, lower=True)
cov = pm.math.dot(chol, chol.T)
stds = pm.Deterministic('stds', tt.sqrt(tt.diag(cov)))
corr = tt.diag(stds**-1).dot(cov.dot(tt.diag(stds**-1)))
corr = pm.Deterministic('Rho', corr[np.triu_indices(3, k=1)])

mu = pm.Normal('mu', mu=0., sd=1., shape=3)
vals = pm.MvNormal('vals', mu=mu, chol=chol, shape=(20, 3))
  • I also updated the docstrings and the examples in LKJCholeskyCov and MvNormal.
  • Once the implementation is definitive, I'll be happy to update the example notebook for the webite in the same PR.

I'm guessing this is not yet the optimal implementation, so I'm avalaible for any changes, and thank you in advance for the reviews 🖖

@codecov
Copy link

codecov bot commented Apr 15, 2020

Codecov Report

Merging #3881 into master will decrease coverage by 0.03%.
The diff coverage is 35.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3881      +/-   ##
==========================================
- Coverage   90.71%   90.67%   -0.04%     
==========================================
  Files         135      135              
  Lines       21314    21330      +16     
==========================================
+ Hits        19335    19342       +7     
- Misses       1979     1988       +9     
Impacted Files Coverage Δ
pymc3/distributions/multivariate.py 78.13% <35.71%> (-0.84%) ⬇️
pymc3/backends/report.py 93.07% <0.00%> (+0.10%) ⬆️
pymc3/distributions/timeseries.py 71.05% <0.00%> (+0.15%) ⬆️

Copy link
Member

@aseyboldt aseyboldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return the correlation matrix (or possibly the covariance) instead of rho.
For rho it is difficult to figure out what value is which. The correlation / covariance has the naturally correct shape.

pymc3/distributions/multivariate.py Show resolved Hide resolved
pymc3/distributions/multivariate.py Show resolved Hide resolved
pymc3/distributions/multivariate.py Show resolved Hide resolved
pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved
pymc3/distributions/multivariate.py Show resolved Hide resolved
@AlexAndorra
Copy link
Contributor Author

Thanks for the thorough review Adrian!
I integrated your comments and added a release note.

Copy link
Member

@aseyboldt aseyboldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another one, sorry :-)
I think it would be better to not specify the full name of the stds as arguments, but instead a postfix. Default could be std_postifx='_stds' and corr_postfix='_corr'. Then the packed name should be {name}_packed__, the stds {name}_stds and corr {name}_corr. If the postix is None, we don't store the variable as deterministic.
That way it matches the pymc3 behavior for transformed variables for the most part.

@AlexAndorra
Copy link
Contributor Author

Mmmh yeah, I think I get what you're saying. That's a good idea.
Then I think the postfix should default to None, since compute_corr defaults to False.
Will work on it ASAP 👌

@aseyboldt
Copy link
Member

A non-None default is fine. There is no need for backward comp, because it is off anyway. And I don't think you should have to specify the postfix every time.

@AlexAndorra
Copy link
Contributor Author

Just updated the code. While working on it, I realized it doesn't really make sense to let users choose the postfix, since the customizable part should be the name in {name}_corr, and this is already the case.
So I just added a store_in_trace arg (defaults to True): when compute_corr=True, then the correlations and stds are stored in the trace as {name}_corr and {name}_stds respectively. If the user doesn't need them in the trace, she just has to specify store_in_trace=False.
Hope this makes sense! Tell me otherwise and I'll make the changes :)

@aseyboldt
Copy link
Member

Looks good to me.
There seems to be something off with the coverage. Anyone got an idea what that is about?

@AlexAndorra
Copy link
Contributor Author

Thanks Adrian! I'll work on updating the relevant notebooks then, so please do not merge yet.
Regarding coverage, I'm no expert, but what I got from other PRs I did/saw on PyMC/ArviZ, there seems to be a bug currently with codecov 🤷‍♂️

@twiecki twiecki added the WIP label Apr 19, 2020
@AlexAndorra
Copy link
Contributor Author

AlexAndorra commented Apr 19, 2020

Actually, I can't update the notebooks until issue #3884 is resolved.
How do you want to proceed? Merge this and I'll open a new PR once the issue is resolved? Or wait and have only this PR?

EDIT: I found a workaround for the issue -- I'll update the NBs accordingly and push everything on this PR when it's ready 🎉

@AlexAndorra
Copy link
Contributor Author

I just updated the notebooks, so this is ready for review, and ready for merge if the review is positive 🎉 Thanks in advance!
cc @aseyboldt @twiecki

@AlexAndorra AlexAndorra removed the WIP label Apr 22, 2020
@aseyboldt aseyboldt self-requested a review April 22, 2020 20:39
@AlexAndorra
Copy link
Contributor Author

Do you think this is ready for merge?
Tests have passed and the code changes were already approved above -- only the two NBs need to be reviewed now.

@twiecki twiecki merged commit ae54ba2 into pymc-devs:master Apr 28, 2020
@AlexAndorra
Copy link
Contributor Author

Thanks Thomas! I'll get to play with my new toy right now 🍾

@AlexAndorra AlexAndorra deleted the improve-mvnormal-api branch May 13, 2020 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants