Description
🚀 Feature
Let's add clustering metrics to TorchMetrics (TM); see the usage sketch after this list:
- Dunn index -> New Metric: Dunn Index #2049
- Silhouette @stancld
- Rand score -> New metric: Rand Score #2025
- Adjusted mutual info score -> New metric: Adjusted mutual info score #2058
- Adjusted rand score -> New metric: Adjusted Rand Score #2032
- Calinski Harabasz score -> New metric: Calinski Harabasz Score #2036
- Completeness score -> New metrics: Homogeneity, Completness, V-Measure #2053
- Fowlkes mallows score -> New metric: Fowlkes-Mallows Index #2066
- Homogeneity completeness measure -> New metrics: Homogeneity, Completness, V-Measure #2053
- Mutual info score -> Mutual Information Score #2008
- Normalized mutual info score -> New metric: Normalized Mutual Information Score #2029
- Davies–Bouldin index -> New metric: Davies bouldin score #2071
- V measure score -> New metrics: Homogeneity, Completness, V-Measure #2053
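
Once these land, usage should mirror the standard TorchMetrics module interface (update/compute). A minimal sketch, assuming the new metrics end up under a `torchmetrics.clustering` namespace; the `RandScore` class name and import path below are taken from the linked issue titles, not a finalized API:

```python
import torch

# Hypothetical usage -- the `torchmetrics.clustering` namespace and the
# `RandScore` class name are assumptions based on the linked issue titles.
from torchmetrics.clustering import RandScore

preds = torch.tensor([0, 0, 1, 1, 2, 2])   # predicted cluster assignments
target = torch.tensor([0, 0, 1, 1, 1, 2])  # ground-truth labels

metric = RandScore()
metric.update(preds, target)
print(metric.compute())  # pairwise agreement between the two labelings
```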
Motivation
In Supervised Learning, the labels are known, so evaluation amounts to comparing the predicted values against those labels and measuring how well they agree. In Unsupervised Learning, however, the labels are not known, which makes it hard to assess correctness because there is no ground truth.
That said, a good clustering algorithm still produces clusters with small within-cluster variance (data points in a cluster are similar to each other) and large between-cluster variance (clusters are dissimilar to one another); intrinsic metrics such as the Calinski-Harabasz score listed above quantify exactly this trade-off (see the sketch below).
ref: https://towardsdatascience.com/7-evaluation-metrics-for-clustering-algorithms-bdc537ff54d2
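
To make that criterion concrete, here is a minimal PyTorch sketch of the Calinski-Harabasz score, which rates a clustering by the ratio of between-cluster to within-cluster dispersion (higher is better). This is an illustrative reference computation only, not the proposed TorchMetrics implementation:

```python
import torch


def calinski_harabasz_sketch(data: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Illustrative Calinski-Harabasz computation: ratio of between-cluster
    to within-cluster dispersion (higher is better). Not the proposed
    TorchMetrics implementation."""
    n_samples = data.shape[0]
    clusters = labels.unique()
    k = clusters.numel()
    overall_mean = data.mean(dim=0)

    between = data.new_zeros(())
    within = data.new_zeros(())
    for c in clusters:
        members = data[labels == c]
        centroid = members.mean(dim=0)
        # between-cluster dispersion: how far each centroid sits from the global mean
        between += members.shape[0] * torch.sum((centroid - overall_mean) ** 2)
        # within-cluster dispersion: how tightly points sit around their centroid
        within += torch.sum((members - centroid) ** 2)

    return (between / (k - 1)) / (within / (n_samples - k))


# Two tight, well-separated blobs -> large score
data = torch.tensor([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = torch.tensor([0, 0, 1, 1])
print(calinski_harabasz_sketch(data, labels))
```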
CTA
Please also check our contribution guide: https://torchmetrics.readthedocs.io/en/stable/generated/CONTRIBUTING.html