Commit 3da1ab8

Mention Bayesian GMM to determine optimal number of components
1 parent 6e009a8 commit 3da1ab8

1 file changed: +2 -0 lines changed

latent_variable_models_part_1.ipynb

Lines changed: 2 additions & 0 deletions
@@ -563,6 +563,8 @@
 "source": [
 "There is a strong increase in the lower bound value until $C = 3$ and then the lower bound more or less doesn't increase any more. With more components there are of course more options to overfit but the simplest model that reaches a relatively high lower bound value is a GMM with 3 components. This is exactly the number of components used to generate the data.\n",
 "\n",
+"A more principled approach to determine the optimal number of components requires a Bayesian treatment of model parameters. In this case the lower bound would also take into account model complexity and we would see decreasing lower bound values for $C \\gt 3$ and a maximum at $C = 3$. For details see section 10.2.4 in \\[1\\].\n",
+"\n",
 "### Implementation with scikit-learn\n",
 "\n",
 "The low-level implementation above was just for illustration purposes. Scikit-learn already comes with a `GaussianMixture` class that can be readily used."

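The note added in this commit can be tried out with scikit-learn. The sketch below is not taken from the notebook: the synthetic data, the range of $C$, and the `0.01` weight threshold are assumptions chosen for illustration. It first records `lower_bound_` of a maximum-likelihood `GaussianMixture` for several values of $C$. It then fits a `BayesianGaussianMixture`, which is a related but different technique from the lower-bound comparison described in the added paragraph: instead of comparing bounds across models, it takes an upper limit on the number of components and tends to shrink the weights of components it does not need.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture, GaussianMixture

# Assumed toy data: three well-separated 2D Gaussian clusters
# (a stand-in for the notebook's data, which is not shown in this diff).
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(200, 2)) for m in means])

# Maximum-likelihood GMMs: the lower bound typically rises steeply up to the
# true number of components and then flattens instead of decreasing.
candidates = range(1, 7)
lower_bounds = []
for c in candidates:
    gmm = GaussianMixture(n_components=c, n_init=3, random_state=0).fit(X)
    lower_bounds.append(gmm.lower_bound_)

# Bayesian GMM: n_components is only an upper limit here. A small
# weight_concentration_prior favours solutions with few active components.
bgmm = BayesianGaussianMixture(
    n_components=6,
    weight_concentration_prior=1e-2,
    max_iter=500,
    random_state=0,
).fit(X)

# Count components with non-negligible weight (the 0.01 threshold is arbitrary).
active = int(np.sum(bgmm.weights_ > 0.01))

for c, lb in zip(candidates, lower_bounds):
    print(f"C = {c}: lower bound = {lb:.3f}")
print("Bayesian GMM weights:", np.round(bgmm.weights_, 3))
print("Components with weight > 0.01:", active)
```

How many components stay active depends on the data, the prior, and the random initialization, so the printed weights should be read as a diagnostic rather than a definitive model selection result.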