Divide a numerical dataset into k clusters using the divisive approach.
- We start with a cluster made up of all the points.
- The point in that cluster that is farthest from the other points (the dissident) is located. The dissident forms its own cluster.
- Then we iterate through all the points, checking their average distance to both clusters (or to all the clusters formed so far); each point is assigned to the closest cluster. We keep iterating until there are no more changes.
- We repeat from the second step until the k specified clusters are formed.
- Distance is Euclidean.
- Distance between clusters is the average distance between all possible pairs of their points.
- For now it only works with 2 variables (feel free to collaborate).
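
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code behind `cluster_divisivo`: the function name `divisive_clustering`, the `max_iter` safety cap, the choice to split the largest cluster on each round, and the rule that a point which is alone in its cluster stays put are all assumptions made for the example.

```python
import numpy as np


def divisive_clustering(data, k, max_iter=100):
    """Hypothetical sketch: return one integer label (0..k-1) per row."""
    X = np.asarray(data, dtype=float)
    n = len(X)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise Euclidean distances, computed once and reused.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    labels = np.zeros(n, dtype=int)  # start: one cluster with every point

    for new_label in range(1, k):
        # Assumption: the dissident is taken from the largest cluster.
        sizes = np.bincount(labels, minlength=new_label)
        members = np.flatnonzero(labels == sizes.argmax())

        # Dissident: the member with the highest mean distance to the
        # other members of its cluster. It starts the new cluster.
        sub = dist[np.ix_(members, members)]
        mean_to_others = sub.sum(axis=1) / (len(members) - 1)
        labels[members[mean_to_others.argmax()]] = new_label

        # Reassign points to the closest cluster until nothing changes.
        for _ in range(max_iter):  # assumption: cap in case of cycling
            changed = False
            for i in range(n):
                # Assumption: a lone point stays, so no cluster empties.
                if np.count_nonzero(labels == labels[i]) == 1:
                    continue
                not_self = np.arange(n) != i
                # Average distance from point i to each current cluster.
                avg = [
                    dist[i, not_self & (labels == c)].mean()
                    for c in range(new_label + 1)
                ]
                best = int(np.argmin(avg))
                if best != labels[i]:
                    labels[i] = best
                    changed = True
            if not changed:
                break
    return labels
```

Because the input goes through `np.asarray`, the sketch should accept either a NumPy array or a numeric DataFrame, and nothing in it is tied to two columns. Precomputing the full distance matrix keeps the code short but needs memory proportional to the square of the number of points.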
import seaborn as sns
iris = sns.load_dataset('iris')
iris = iris[['sepal_length','sepal_width']]
graficar_labels(iris,cluster_divisivo(iris,3))