Density doesn't normalise in VBGMM and DPGMM #4267

@markvdw

I'm having trouble using VBGMM and DPGMM for density estimation. As far as I understand, both should have the same interface as the "normal" GMM. However, while the "normal" GMM produces a good fit, VBGMM and DPGMM produce bad fits and non-normalised densities. This makes me wonder whether something deeper is wrong, rather than me simply using the code incorrectly.

The problem shows up in the density estimation example when the following line is appended:

print(np.sum(np.exp(-Z)) * (x[1] - x[0]) * (y[1] - y[0]))

This is approximately 1 when using a normal GMM, but much smaller when using the VB or DP GMMs.
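As a sanity check on the appended line itself, here is a minimal sketch that applies the same grid sum to a density known to be normalised. The standard bivariate normal and the grid limits are assumptions chosen for illustration; they stand in for the fitted model and the grid of the density estimation example, which are not reproduced here.

```python
import numpy as np

def neg_log_density(xx, yy):
    # Negative log-density of a standard bivariate normal.
    # Assumption: a stand-in for the grid Z of negative log-likelihoods
    # that a fitted model would produce.
    return np.log(2.0 * np.pi) + 0.5 * (xx ** 2 + yy ** 2)

# Regular grid wide enough to hold essentially all of the probability mass.
x = np.linspace(-8.0, 8.0, 401)
y = np.linspace(-8.0, 8.0, 401)
XX, YY = np.meshgrid(x, y)
Z = neg_log_density(XX, YY)

# Riemann sum: density values times the area of one grid cell.
integral = np.sum(np.exp(-Z)) * (x[1] - x[0]) * (y[1] - y[0])
print(integral)
```

For a properly normalised density on a grid that covers its support, the printed value should be close to 1, so a much smaller value points at the density values rather than at the check.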

The same behaviour appears in a toy 1D density estimation problem:

import numpy as np
import numpy.random as rnd
import sklearn.mixture as skmix
import matplotlib.pyplot as plt

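# 70% of the samples from N(-5, 1), 30% from N(3, 0.3^2); sample counts must be integers
X = rnd.randn(int(round(0.7 * 300)), 1) - 5
X = np.vstack((X, rnd.randn(int(round(0.3 * 300)), 1) * 0.3 + 3))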

# gmm = skmix.GMM(2)
gmm = skmix.DPGMM(2)
gmm.fit(X)

x = np.linspace(-10, 10, 1000)
p = np.exp(gmm.score(x))

plt.hist(X, bins=50, normed=True)
plt.plot(x, p)
plt.show()

# Riemann sum of the predictive density over the grid; should be close to 1
integral = np.sum(p) * (x[1] - x[0])
print(integral)
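For comparison, a rough sketch of the value this check gives for the true generating density of the toy problem, evaluated on the same grid. The helper `normal_pdf` and the names `p_true` and `integral_true` are introduced here for illustration only; the weights, means and standard deviations are read off the sampling code above.

```python
import numpy as np

def normal_pdf(x, mean, std):
    # Density of a univariate normal distribution N(mean, std^2).
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Same evaluation grid as in the toy problem.
x = np.linspace(-10, 10, 1000)

# True mixture: 70% N(-5, 1) and 30% N(3, 0.3^2).
p_true = 0.7 * normal_pdf(x, -5.0, 1.0) + 0.3 * normal_pdf(x, 3.0, 0.3)

# Same Riemann sum as used for the fitted model's density.
integral_true = np.sum(p_true) * (x[1] - x[0])
print(integral_true)
```

The result should be very close to 1, which suggests the grid and spacing are adequate and that a badly normalised result would come from the model's scores.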

Is this behaviour just the result of a poor fit, for example due to a local optimum? The fact that the predictive densities don't normalise leads me to believe it's something else.

I asked the same question on StackOverflow.
