
PPCA Tutorial #499

Merged
merged 5 commits into from
Mar 8, 2017

Conversation

timshell
Contributor

@timshell timshell commented Mar 4, 2017

Re: discussion in Gitter channel

Not sure what level of detail is expected. I used the other tutorials as my guide. Wanted to get one tutorial done in line with what you expect before I start doing more.

All feedback appreciated!

@dustinvtran
Member

dustinvtran commented Mar 4, 2017

Thanks for writing this. Your description of PPCA is accurate and succinct.

I think the ideal for a model tutorial should roughly explain its key ideas through an illustration with a data set; followed by the model; followed by an algorithm to infer it; followed by a check of its fit. (Good references are http://edwardlib.org/tutorials/unsupervised and http://edwardlib.org/tutorials/gan. Some of the other model tutorials are lacking in this regard.) So you could probably scaffold your current writing into sections with the data and output of the script.

@timshell
Contributor Author

timshell commented Mar 7, 2017

Thanks for the quick feedback @dustinvtran and sorry it took this long to revise. Let me know if I can improve this in any way. Thanks!

docs/tex/bib.bib Outdated
year = {1999},
volume = {61},
pages = {611--622}

Member

missing }. adding the ending brace lets it compile successfully for me.

docs/tex/bib.bib Outdated
@@ -667,6 +667,14 @@ @article{marin2012approximate
pages = {1167--1180}
}

@ARTICLE{Tipping99probabilisticprincipal,
Member

i recommend the citekey format tipping1999probabilistic, following google scholar.


\subsubsection{Data}

We simulate our data points below. We'll talk about the individual variables and what they stand for in the next section. For this example, $\mathbf{x}\in\mathbb{R}^2$.
Member

"For this example, each data point is 2-dimensional, $\mathbf{x}_n\in\mathbb{R}^2$."
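The simulation step the tutorial describes can be sketched in NumPy. The dimensions here (D = 2 observed, K = 1 latent) match the 2-dimensional example, but the function name, scales, and seed are illustrative assumptions, not the tutorial's exact script:

```python
import numpy as np

def build_toy_dataset(N, D=2, K=1, sigma=1.0, seed=0):
    """Simulate N points from the PPCA generative process:
    z_n ~ N(0, I_K), x_n ~ N(W z_n, sigma^2 I_D)."""
    rng = np.random.RandomState(seed)
    w = rng.normal(0.0, 2.0, size=(D, K))   # principal axes
    z = rng.normal(0.0, 1.0, size=(K, N))   # latent variables
    x = np.dot(w, z) + rng.normal(0.0, sigma, size=(D, N))
    return x, w, z

x_train, w_true, z_true = build_toy_dataset(N=5000)
print(x_train.shape)  # (2, 5000)
```

Each column of `x_train` is one 2-dimensional data point $\mathbf{x}_n$.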


\subsubsection{Model}

Consider a dataset $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$ where $\mathbf{x}_i \in \mathbb{R}^D$.
Member

Following the code, I prefer the notation that N denotes the data set size and n denotes an index to one data point.

Contributor Author

Just to clarify, are you saying to change the i's to n? Or something else?

Contributor Author

Ignore - figured it out by looking at http://edwardlib.org/tutorials/unsupervised


Note that regular PCA is simply the special case of Probabilistic PCA in the limit $\sigma^2 \to 0$.

We set up our model below.
Member

I would also describe in the model section that you're placing a distribution over principle axes, either viewed as a prior or as a regularizer.
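One concrete way to see why the prior/likelihood decomposition is natural: marginalizing out the latents $\mathbf{z}_n$ gives $\mathbf{x}_n \sim N(\mathbf{0}, \mathbf{W}\mathbf{W}^\top + \sigma^2\mathbf{I})$. A quick NumPy check of this identity, with a fixed $\mathbf{W}$ and $\sigma$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.RandomState(1)
N, D, K, sigma = 200000, 2, 1, 0.5
W = np.array([[2.0], [1.0]])                  # fixed principal axes

# Simulate from the PPCA generative process.
z = rng.normal(size=(K, N))                   # z_n ~ N(0, I_K)
x = W @ z + sigma * rng.normal(size=(D, N))   # x_n ~ N(W z_n, sigma^2 I_D)

emp_cov = np.cov(x)                           # empirical covariance of x
model_cov = W @ W.T + sigma**2 * np.eye(D)    # marginal covariance W W^T + sigma^2 I
print(np.max(np.abs(emp_cov - model_cov)))    # small for large N
```

The empirical covariance matches $\mathbf{W}\mathbf{W}^\top + \sigma^2\mathbf{I}$ up to sampling error, which also makes the $\sigma^2 \to 0$ connection to classical PCA concrete: the noise term vanishes and all covariance comes from the principal axes.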


\subsubsection{Inference}

Since $\mathbf{W}$ cannot be analytically determined, we must use some approximation method. Below, we set up our inference variables and then run the approximation algorithm. For this example, our method is to minimize the $\text{KL}(q\|p)$ divergence measure.
Member

"Since ... determined, we must use some inference method." In this model, I think the posterior is actually normally distributed.
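As this comment notes, part of the model is tractable in closed form: conditional on $\mathbf{W}$ and $\sigma$, the posterior over each latent $\mathbf{z}_n$ is Gaussian, $\mathbf{z} \mid \mathbf{x} \sim N(\mathbf{M}^{-1}\mathbf{W}^\top\mathbf{x},\, \sigma^2\mathbf{M}^{-1})$ with $\mathbf{M} = \mathbf{W}^\top\mathbf{W} + \sigma^2\mathbf{I}$ (Tipping and Bishop, 1999). A NumPy sketch of that exact posterior, with function and variable names chosen for illustration:

```python
import numpy as np

def ppca_latent_posterior(x, W, sigma):
    """Exact posterior over z given x for known W and sigma:
    z | x ~ N(M^{-1} W^T x, sigma^2 M^{-1}), M = W^T W + sigma^2 I."""
    K = W.shape[1]
    M = W.T @ W + sigma**2 * np.eye(K)
    M_inv = np.linalg.inv(M)
    mean = M_inv @ W.T @ x    # posterior mean of z
    cov = sigma**2 * M_inv    # posterior covariance of z
    return mean, cov

W = np.array([[2.0], [1.0]])
x = np.array([4.0, 2.0])
mean, cov = ppca_latent_posterior(x, W, sigma=1.0)
print(mean, cov)  # mean ~ 10/6, cov ~ 1/6
```

Variational inference with $\text{KL}(q\|p)$, as in the tutorial, is still the natural route in Edward once $\mathbf{W}$ itself is treated as a random variable to be inferred.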

@dustinvtran
Member

@timshell: The tutorial looks great. Only minor suggestions above. Happy to merge it once those are made.

@timshell
Contributor Author

timshell commented Mar 8, 2017

Sweet, just made the changes @dustinvtran !

@dustinvtran dustinvtran merged commit e65a5f9 into blei-lab:master Mar 8, 2017
@timshell timshell deleted the tutorials branch March 8, 2017 11:19