Project developed to implement an algorithm for data mining over the data sets included in the project. The idea was to extract metrics to use as input for visualization.
Project: K-Means Algorithm
Objective: Personal
Status: Completed
Technology: Java
Date: Nov 2015
The K-Means algorithm works as follows:
- First, specify the number of clusters, K.
- Shuffle the data and pick K data points. These are the initial centroids.
- Iterate over the data until the centroids stop changing. In each iteration:
  - Compute the squared Euclidean distance between every data point and every centroid (square the difference in each dimension and sum), and assign each point to its nearest centroid.
  - For each cluster, take the average of its data points. This average becomes the cluster's new centroid.
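The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the project's actual code: the class name `KMeansSketch`, the methods `cluster` and `squaredDistance`, the `seed` parameter, and the cap of 100 iterations are all assumptions introduced for the example. It also assumes a non-empty data set with at least K rows, and it keeps the old centroid when a cluster ends up empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative K-Means sketch; all names here are hypothetical. */
final class KMeansSketch {

    /** Squared Euclidean distance: square each coordinate difference and sum. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Returns the cluster index (0..k-1) assigned to each row of data. */
    static int[] cluster(double[][] data, int k, long seed) {
        // Shuffle the row indices and take the first k rows as initial centroids.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < data.length; i++) order.add(i);
        Collections.shuffle(order, new Random(seed));
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = data[order.get(c)].clone();

        int dims = data[0].length;
        int[] labels = new int[data.length];
        boolean changed = true;
        // Assumed safety cap of 100 iterations in case the centroids never settle.
        for (int iter = 0; iter < 100 && changed; iter++) {
            // Assignment: each point goes to the centroid with the smallest squared distance.
            for (int i = 0; i < data.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(data[i], centroids[c])
                            < squaredDistance(data[i], centroids[best])) {
                        best = c;
                    }
                }
                labels[i] = best;
            }
            // Update: each centroid becomes the mean of the points assigned to it.
            changed = false;
            for (int c = 0; c < k; c++) {
                double[] mean = new double[dims];
                int count = 0;
                for (int i = 0; i < data.length; i++) {
                    if (labels[i] != c) continue;
                    for (int d = 0; d < dims; d++) mean[d] += data[i][d];
                    count++;
                }
                if (count == 0) continue; // empty cluster: keep its old centroid (assumption)
                for (int d = 0; d < dims; d++) mean[d] /= count;
                if (!Arrays.equals(mean, centroids[c])) {
                    centroids[c] = mean;
                    changed = true;
                }
            }
        }
        return labels;
    }
}
```

A call such as `KMeansSketch.cluster(data, 2, 42L)` returns one label per row. Which group receives label 0 and which receives label 1 depends on the shuffle, so only the grouping itself is meaningful, not the label numbers.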
The code produces results for two data sets: Iris and Yeast Gene. It groups each data set into clusters so that the data can be analyzed.
Made by Gabriel Ferreira 💻 Find me on LinkedIn!