This is a project I worked on over the summer with Oklahoma State University's Research Experiences for Undergraduates (REU) Program.
The algorithm this project implements is based on Dr. Charu C. Aggarwal's paper "The Setwise Stream Classification Problem" (Link: https://www.researchgate.net/publication/266660383_The_setwise_stream_classification_problem?enrichId=rgreq-32f5c639b9bb483b8bdb0fbc0f364ad3-XXX&enrichSource=Y292ZXJQYWdlOzI2NjY2MDM4MztBUzoyNTk5OTk3Nzk3ODI2NjBAMTQzOTAwMDE4NjU3OA%3D%3D&el=1_x_2&_esc=publicationCoverPdf).
The main goal of this project was to learn the basics of machine learning (classification in particular) and to implement Dr. Aggarwal's algorithm on a dataset collected from a real-world healthcare study, both to test the algorithm and to see what improvements could be made in the future. The algorithm is a new approach to classification for datasets composed of multiple subsets of data, and it allows for more accurate classification than density-based classification. The algorithm is also designed to work in real time; my implementation, however, does not run in real time because the dataset I used had already been collected and stored.
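To make the setwise idea more concrete, here is a minimal Python sketch. It is not the code in this project and not Dr. Aggarwal's exact method; it is a simplified illustration under my own assumptions. Each record is an (entity, point) pair, the points of one entity are summarized as a "fingerprint" (the fraction of its points that fall nearest to each of a few fixed anchor points), and a new entity is labeled by its nearest labeled fingerprint. The anchors, the toy records, the labels, and all function names are hypothetical. The records are replayed one at a time from a stored list, which mirrors how a stored dataset can stand in for a live stream.

```python
import math
from collections import defaultdict


def nearest_anchor(point, anchors):
    """Return the index of the anchor closest to point (Euclidean distance)."""
    return min(range(len(anchors)), key=lambda i: math.dist(point, anchors[i]))


def build_fingerprints(stream, anchors):
    """Consume (entity_id, point) records one at a time and count, per entity,
    how many of its points fell nearest to each anchor."""
    counts = defaultdict(lambda: [0] * len(anchors))
    for entity_id, point in stream:
        counts[entity_id][nearest_anchor(point, anchors)] += 1
    # Normalize counts to relative frequencies so that entities with
    # different numbers of points remain comparable.
    return {e: [c / sum(row) for c in row] for e, row in counts.items()}


def classify(fingerprint, labeled):
    """Return the label of the closest labeled fingerprint (1-nearest neighbor)."""
    best = min(labeled, key=lambda item: math.dist(fingerprint, item[0]))
    return best[1]


# Hypothetical toy data: two anchors and two labeled training entities.
anchors = [(0.0, 0.0), (10.0, 10.0)]
train_stream = [
    ("a", (0.5, 0.2)), ("b", (9.0, 9.5)), ("a", (1.0, 0.0)),
    ("b", (10.5, 10.0)), ("a", (9.0, 9.0)), ("b", (11.0, 9.0)),
]
train_labels = {"a": "walking", "b": "sitting"}

train_fp = build_fingerprints(train_stream, anchors)
labeled = [(train_fp[e], train_labels[e]) for e in train_fp]

# An unlabeled entity "x" whose points are interleaved in a replayed stream.
test_stream = [("x", (0.1, 0.3)), ("x", (0.4, 0.9)), ("x", (8.0, 9.0))]
test_fp = build_fingerprints(test_stream, anchors)
prediction = classify(test_fp["x"], labeled)
print(prediction)
```

In this toy run, entity "x" has the same fingerprint as entity "a" (two of three points near the first anchor), so it receives the label of "a". A real setup would choose the anchors from the data (for example by clustering a sample) rather than fixing them by hand.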
References:
(Algorithm)
- Charu C. Aggarwal. 2014. The setwise stream classification problem. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '14). Association for Computing Machinery, New York, NY, USA, 432–441. DOI: https://doi.org/10.1145/2623330.2623751
(Dataset)
- Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, and Jorge L. Reyes-Ortiz. A Public Domain Dataset for Human Activity Recognition Using Smartphones. 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013. Bruges, Belgium, 24-26 April 2013.