|
| 1 | +# ✅ Quiz M4.01 |
| 2 | + |
| 3 | +```{admonition} Question |
| 4 | +Imagine you work for a music streaming platform that hosts a vast library of |
| 5 | +songs, playlists, and podcasts. You have access to detailed listening data from |
| 6 | +millions of users. For each user, you know their most-listened genres, the |
| 7 | +devices they use, their average session length, and how often they explore new |
| 8 | +content. |
| 9 | + |
| 10 | +You want to segment users based on their listening patterns to improve |
| 11 | +personalized recommendations, without relying on rigid, predefined labels like |
| 12 | +"pop fan" or "casual listener" which may fail to capture the complexity of |
| 13 | +their behavior. |
| 14 | + |
| 15 | +What kind of problem are you dealing with? |
| 16 | + |
| 17 | +- a) a supervised task |
| 18 | +- b) an unsupervised task |
| 19 | +- c) a classification task |
| 20 | +- d) a clustering task |
| 21 | + |
| 22 | +_Select all answers that apply_ |
| 23 | +``` |
| 24 | + |
| 25 | ++++ |
| 26 | + |
| 27 | +```{admonition} Question |
| 28 | +The plots below show the cluster labels found by k-means with 3 clusters, |
| 29 | +differing only in the scaling step. Based on this, which conclusion can be drawn? |
| 30 | + |
| 31 | + |
| 32 | + |
| 33 | + |
| 34 | +- a) without scaling, cluster assignment is dominated by the feature on the vertical axis |
| 35 | +- b) without scaling, cluster assignment is dominated by the feature on the horizontal axis |
| 36 | +- c) without scaling, both features contribute equally to cluster assignment |
| 37 | + |
| 38 | +_Select a single answer_ |
| 39 | +``` |
| 40 | + |
| 41 | ++++ |
| 42 | + |
| 43 | +```{admonition} Question |
| 44 | +Which of the following statements correctly describe factors that affect the |
| 45 | +stability of k-means clustering across different resampling iterations of the data? |
| 46 | + |
| 47 | +- a) K-means can produce different results on resampled datasets due to |
| 48 | + sensitivity to initialization. |
| 49 | +- b) If the data is unevenly distributed, stability improves when increasing the |
| 50 | + `n_init` parameter with the "k-means++" initialization. |
| 51 | +- c) Stability under resampling is guaranteed after feature scaling. |
| 52 | +- d) Increasing the number of clusters always reduces the variability of |
| 53 | + results across resamples. |
| 54 | + |
| 55 | +_Select all answers that apply_ |
| 56 | +``` |
| 57 | + |
| 58 | ++++ |
| 59 | + |
| 60 | +```{admonition} Question |
| 61 | +Which of the following statements correctly describe how WCSS (within-cluster |
| 62 | +sum of squares, or inertia) behaves in k-means clustering? |
| 63 | + |
| 64 | +- a) For a fixed number of clusters, WCSS is lower when clusters are compact. |
| 65 | +- b) For a fixed number of clusters, WCSS is lower for wider clusters. |
| 66 | +- c) For a fixed number of clusters, lower WCSS implies lower computational cost |
| 67 | + during training. |
| 68 | +- d) Assuming `n_init` is large enough to ensure convergence, WCSS always |
| 69 | + decreases as the number of clusters increases. |
| 70 | + |
| 71 | +_Select all answers that apply_ |
| 72 | +``` |
| 73 | + |
| 74 | ++++ |
| 75 | + |
| 76 | +```{admonition} Question |
| 77 | +Which of the following statements correctly describe differences between |
| 78 | +supervised and unsupervised clustering metrics? |
| 79 | + |
| 80 | +- a) Supervised clustering metrics such as ARI and AMI require access to ground |
| 81 | + truth labels to evaluate clustering performance. |
| 82 | +- b) WCSS and the silhouette score evaluate internal cluster structure without |
| 83 | + needing reference labels. |
| 84 | +- c) V-measure is zero when labels are assigned completely at random. |
| 85 | +- d) Supervised clustering metrics are not useful if the number of clusters does |
| 86 | + not match the number of predefined classes. |
| 87 | + |
| 88 | +_Select all answers that apply_ |
| 89 | +``` |