
[WIP] Added implementation of SGHMC and example. #415

Merged
merged 4 commits into blei-lab:master from dwadden:sghmc
Jan 28, 2017
Conversation

dwadden
Contributor

@dwadden dwadden commented Jan 19, 2017

I'm working with @nfoti and @yianma. We have some questions on a few small parts of the code. See comments in changed files.

decreases.
Implements the update equations from (15) of Chen et al., 2014.
"""
# TODO: Would it be more memory-efficient to use tf.control_dependencies,
Contributor Author

Does making aliases make copies of the tensors? If so, would it save memory to use tf.control_dependencies and mutate the objects directly?

Member

i'm not sure i understood the question but for example:

>>> x = tf.constant(0)
>>> y = x
>>> x, y
(<tf.Tensor 'Const:0' shape=() dtype=int32>, <tf.Tensor 'Const:0' shape=() dtype=int32>)

y has a reference to the same tensor and doesn't recreate it from scratch, if that's what you're asking.

Contributor Author

Yep, that was the question. Just wanted to make sure it wasn't recreating.
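The aliasing behavior discussed above is plain Python semantics, not anything TensorFlow-specific; a minimal sketch (names here are illustrative, not from the PR):

```python
# Assignment binds a second name to the same object; it does not copy.
x = [1, 2, 3]
y = x          # y is an alias for the same list object
y.append(4)    # mutating through y is visible through x

print(x is y)  # True: both names reference one object
print(x)       # [1, 2, 3, 4]
```

The same holds for the `tf.Tensor` example above: `y = x` just gives the existing tensor a second name, so no extra graph node or memory is created.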


# Simulate Hamiltonian dynamics with friction.
friction = tf.constant(self.friction, dtype=tf.float32)
# TODO: Allow option for exponentially decaying learning rate, or similar.
Contributor Author

For SGLD, we found that using a constant learning rate performed better than using exponential decay. Perhaps this should be an option passed into the inference method?

Member

sure, that makes sense. note one difficulty i had when implementing SGLD was that i could not leverage any of the existing tensorflow optimizers, and if more learning rates were an option, one would need to implement them all from scratch. if there were a way to use the tensorflow optimizers, it would be nice so long as the implementation doesn't ruin the readability and/or speed of the code.

Contributor Author

Got it re: tensorflow.

I'll make a new PR that adds an option to keep the learning rate constant, along with an example showing cases where this may be a bit easier to work with.
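For reference, the decaying schedule being contrasted with a constant rate is usually the polynomial decay of Welling & Teh (2011), eps_t = a * (b + t) ** (-gamma). A minimal plain-Python sketch of both options (the function and parameter names are illustrative, not part of this PR's API):

```python
def step_size(t, a=0.01, b=1.0, gamma=0.55, constant=False):
    """Step size at iteration t.

    constant=True fixes the rate at a, the alternative discussed above;
    otherwise apply the polynomial decay eps_t = a * (b + t) ** (-gamma)
    from Welling & Teh (2011), where gamma in (0.5, 1] is required for
    the SGLD convergence argument.
    """
    if constant:
        return a
    return a * (b + t) ** (-gamma)

print(step_size(0))                 # 0.01: a * (b + 0) ** (-gamma) with b = 1
print(step_size(0, constant=True))  # 0.01: the constant-rate option
```

Exposing `constant` (or the schedule itself) as an argument to the inference method would cover both behaviors without touching the update equations.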

z_sample : dict
Latent variable keys to samples.
"""
# TODO: This appears to be identical across HMC, SGHMC, SGLD. Should it be
Contributor Author

The _log_joint function looks like it's the same. Refactor?

Member

yeah, that sounds useful to refactor (although i'm not sure where the best location is to put it)

Contributor Author

OK. Will think about this and submit another PR.

from edward.models import Empirical, MultivariateNormalFull

# Helper functions
# TODO: Where should these go?
Contributor Author

I added some helper functions to visualize the contours of a bivariate Gaussian (or more generally, any Edward distribution with a pdf method). Should this be refactored?

Member

that's awesome. maybe we can leave it here for now and think about refactoring it in another PR that focuses more on general visualizations for edward

Contributor Author

Cool.
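The core of such a contour helper is evaluating the density on a grid before handing it to `plt.contour`. A minimal NumPy-only sketch of that step for a bivariate Gaussian (the names `mvn_pdf`, `pts`, and `Z` are illustrative, not the PR's helpers):

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Density of a d-variate Gaussian evaluated at each row of x (n, d)."""
    d = mean.shape[0]
    diff = x - mean
    inv = np.linalg.inv(cov)
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), one value per row.
    quad = np.einsum('ni,ij,nj->n', diff, inv, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

# Evaluate on a grid; X, Y, Z are ready for plt.contour(X, Y, Z).
mean = np.zeros(2)
cov = np.eye(2)
xs = np.linspace(-3.0, 3.0, 50)
X, Y = np.meshgrid(xs, xs)
pts = np.column_stack([X.ravel(), Y.ravel()])
Z = mvn_pdf(pts, mean, cov).reshape(X.shape)
```

In the PR's setting the density call would come from the Edward distribution's pdf method instead of a hand-rolled `mvn_pdf`, which is what makes the helper generic.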

if label:
plt.clabel(cs, inline=1, fontsize=10)

def retrieve_samples(qz):
Contributor Author

It took a bit of work to figure out how to grab the actual MCMC samples. Is there a method to do this that we missed, or should we move this elsewhere to allow for more general use?

Member

if you just want the samples you can do qz.params

Contributor Author

Nice, thanks

Member

@dustinvtran dustinvtran left a comment

thanks for submitting the PR! this is excellent work. excited to have SGMCMC methods to experiment with.


nfoti and others added 2 commits January 20, 2017 14:20
Figured out how to retrieve MCMC trace more easily.

Clean up normal_sghmc

- Make indent 2 spaces instead of 4
- Get rid of TODO's.

Clean up sghmc.py

Get rid of TODO's.
@dwadden
Contributor Author

dwadden commented Jan 20, 2017

Cleaned things up a bit. Got rid of the README notes since we've now discussed them. Squashed into previous commit, which I now know is a bad thing to do because our discussions are now "outdated". Lesson learned.

Let me know if there are other things you'd like changed on this PR. Otherwise, I'll start thinking about the other things we discussed: step size, code viz, and refactor of _log_joint.

@dustinvtran
Member

dustinvtran commented Jan 21, 2017

looks good to me. can you add an example with data? e.g., linear/logistic regression is fine.

@dwadden
Contributor Author

dwadden commented Jan 23, 2017

Will do.

@dwadden
Contributor Author

dwadden commented Jan 26, 2017

Git style question: I want to update this branch with the latest changes from master, which contains relevant bug fixes for #412. I'm also going to add my integer type fix.

In general, this can be done either by merging master into this branch or by rebasing the branch onto master. The problem with the former approach is that it makes the history messy, while the latter makes it impossible to view changes made before the rebase (like what's happened above).

Which do you prefer?

@dustinvtran
Member

i personally prefer to rebase to keep the commit history clean. that said, we're always squashing+merging pull requests, so feel free to do either.

@dwadden
Contributor Author

dwadden commented Jan 27, 2017

Cool, rebase works for me.

@dustinvtran
Member

i ran the examples in tensorflow 0.11.0 with python 2 and 3. i also double-checked upon merging that they work in tensorflow 1.0.0rc0. merging now.

@dustinvtran dustinvtran merged commit 7164162 into blei-lab:master Jan 28, 2017
@dwadden dwadden deleted the sghmc branch January 29, 2017 11:06