DOC: explain the importance of iid index and queries in bench script
ogrisel authored and larsmans committed Dec 18, 2014
1 parent a43b91e commit fa70b57
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion benchmarks/bench_plot_approximate_neighbors.py
@@ -29,8 +29,12 @@ def make_data(n_samples, n_features, n_queries, seed=0):
     print('Generating random blob-ish data')
     X, _ = make_blobs(n_samples=n_samples + n_queries,
                       n_features=n_features, centers=100,
-                      random_state=seed)
+                      shuffle=True, random_state=seed)
 
+    # Keep the last samples as held out query vectors: note since we used
+    # shuffle=True we have ensured that index and query vectors are
+    # samples from the same distribution (a mixture of 100 gaussians in this
+    # case)
     return X[:n_samples], X[n_samples:]
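
The reasoning in the new comment can be illustrated with a small standard-library sketch. It is not the benchmark's code: `toy_blobs` and `query_labels` are hypothetical helpers, and `toy_blobs` is only a 1-D stand-in for `make_blobs` that assumes samples are generated center by center. Under that assumption, slicing off the tail without shuffling yields queries from a single center, while shuffling first spreads the queries across the mixture.

```python
import random


def toy_blobs(n_total, n_centers, shuffle, seed=0):
    """Hypothetical stand-in for make_blobs: 1-D Gaussian blobs with labels.

    Samples are generated center by center, so without shuffling they
    are ordered by the center they came from.
    """
    rng = random.Random(seed)
    centers = [rng.uniform(-10.0, 10.0) for _ in range(n_centers)]
    samples = []
    for label, center in enumerate(centers):
        # Spread samples as evenly as possible across the centers.
        n = n_total // n_centers + (1 if label < n_total % n_centers else 0)
        samples.extend((rng.gauss(center, 1.0), label) for _ in range(n))
    if shuffle:
        rng.shuffle(samples)
    return samples


def query_labels(n_samples, n_queries, shuffle):
    """Return the set of center labels seen in the held-out tail slice."""
    data = toy_blobs(n_samples + n_queries, n_centers=10, shuffle=shuffle)
    queries = data[n_samples:]  # same split as X[n_samples:] in the script
    return {label for _, label in queries}


ordered = query_labels(n_samples=1000, n_queries=50, shuffle=False)
shuffled = query_labels(n_samples=1000, n_queries=50, shuffle=True)
print(len(ordered), len(shuffled))
```

In the unshuffled case every query comes from the last center, so the queries would not be drawn from the same distribution as the index; that is the situation the explicit `shuffle=True` guards against.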

