
Commit

Fix issues in 05.11
jakevdp committed Dec 1, 2016
1 parent d021555 commit 7168703
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions notebooks/05.11-K-Means.md
@@ -310,7 +310,7 @@ Let's see how it does:
from sklearn.manifold import TSNE

# Project the data: this step will take several seconds
-tsne = TSNE(n_components=2, init='pca', random_state=0)
+tsne = TSNE(n_components=2, init='random', random_state=0)
digits_proj = tsne.fit_transform(digits.data)

# Compute the clusters
@@ -327,7 +327,7 @@ for i in range(10):
accuracy_score(digits.target, labels)
```

-That's nearly 94% classification accuracy *without using the labels*.
+That's nearly 92% classification accuracy *without using the labels*.
This is the power of unsupervised learning when used carefully: it can extract information from the dataset that might be difficult to extract by hand or by eye.
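The cluster-matching step elided by the diff above can be sketched as follows. This is a minimal, hedged version that runs plain *k*-means on the raw pixels (the notebook clusters the t-SNE projection, which scores higher): because *k*-means cluster indices are arbitrary, each cluster is relabeled with the most common true digit it contains before computing accuracy.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score

digits = load_digits()

# Cluster the raw 64-pixel vectors into 10 groups; cluster IDs are arbitrary.
clusters = KMeans(n_clusters=10, random_state=0).fit_predict(digits.data)

# Relabel each cluster with the most common true digit it contains.
labels = np.zeros_like(clusters)
for i in range(10):
    mask = (clusters == i)
    labels[mask] = np.bincount(digits.target[mask]).argmax()

print(accuracy_score(digits.target, labels))
```

On the raw pixels this typically lands near 80%; projecting with t-SNE first is what pushes the score past 90%.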


@@ -393,6 +393,8 @@ Now let's reduce these 16 million colors to just 16 colors, using a *k*-means clustering.
Because we are dealing with a very large dataset, we will use the mini batch *k*-means, which operates on subsets of the data to compute the result much more quickly than the standard *k*-means algorithm:
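The idea can be sketched in a self-contained way using random stand-in pixel data rather than the notebook's image: fit 16 cluster centers on mini-batches, then replace each pixel with its nearest center.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
pixels = rng.rand(10_000, 3)  # stand-in for (n_pixels, 3) RGB values in [0, 1]

# Fit 16 cluster centers, updating them on small random subsets of the data.
kmeans = MiniBatchKMeans(n_clusters=16, random_state=0).fit(pixels)

# Recolor: replace every pixel with its nearest cluster center.
new_pixels = kmeans.cluster_centers_[kmeans.predict(pixels)]

print(new_pixels.shape)                     # same shape as the input
print(len(np.unique(new_pixels, axis=0)))   # at most 16 distinct colors
```

The recolored array has the same shape as the input but draws from at most 16 distinct colors, which is exactly the compression the notebook applies to the image.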

```python
+import warnings; warnings.simplefilter('ignore') # Fix NumPy issues.

from sklearn.cluster import MiniBatchKMeans
kmeans = MiniBatchKMeans(16)
kmeans.fit(data)
```
