This repository contains my work for the fifth project of the GE-461: Introduction to Data Science course.
The objective of this project is to explore the classification of data streams. We generate data streams with varying noise and drift characteristics. We then implement K Nearest Neighbors (KNN), Hoeffding Tree (HT), and Naïve Bayes (NB) classifiers with varying batch sizes to analyze online learners, and we implement Majority Voting (MV) and Weighted Majority Voting (WMV) to observe the performance of ensemble methods. Finally, we look at one additional way to improve the performance of our models. My report can be accessed here: https://xmassmx.github.io/GE-461-Data-Stream-Mining/
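As a rough illustration of the two ensemble schemes mentioned above, here is a minimal sketch in plain Python. It is not the code used in the report: the function names, the example predictions, and the multiplicative penalty `beta` (a common choice for weighted majority voting) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label predicted by the most learners.

    Ties are broken in favor of the label that appears first.
    """
    return Counter(predictions).most_common(1)[0][0]


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight behind it."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def update_weights(predictions, weights, true_label, beta=0.5):
    """Shrink the weight of every learner that predicted the wrong label.

    beta is a hypothetical penalty factor in (0, 1); correct learners
    keep their weight unchanged.
    """
    return [
        w if p == true_label else w * beta
        for p, w in zip(predictions, weights)
    ]


# Hypothetical predictions from three learners (e.g. KNN, HT, NB)
# for a single incoming sample, with made-up weights.
predictions = [1, 0, 0]
weights = [0.9, 0.3, 0.3]

mv_label = majority_vote(predictions)                     # two of three learners say 0
wmv_label = weighted_majority_vote(predictions, weights)  # 0.9 for label 1 vs 0.6 for label 0
new_weights = update_weights(predictions, weights, true_label=1)
```

In this toy case the two schemes disagree: plain voting follows the two weaker learners, while the weighted vote follows the single learner with the highest weight. In a streaming setting the weights would typically be updated after each sample or batch once the true label is revealed.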
Note: Please do not copy or plagiarize this work. The work in this repository is my own solution and is meant to be used as a guide only.