In this repository, I have used open-source data published by OSMI (Open Sourcing Mental Illness). This data set contains the findings of the largest mental health survey conducted in the tech industry in 2014.
The repository contains detailed information on data cleaning, the challenges faced during cleaning, and data processing for several basic machine learning algorithms: the KNN classifier, Decision Tree, Random Forest, Support Vector Machine, and Gaussian Naive Bayes.
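
As a rough illustration of how these five classifiers can be compared side by side, here is a minimal sketch using scikit-learn. It is not the exact code from this repository: the data is a synthetic stand-in generated with `make_classification` (not the OSMI survey), and the hyperparameters and the 5-fold cross-validation setup are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data (assumption): in practice X and y would come
# from the cleaned and encoded survey columns.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The five classifiers named above, with illustrative settings.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Evaluating every model on the same folds keeps the comparison fair; distance-based models such as KNN and SVM usually also benefit from feature scaling, which is left out here for brevity.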
Aside from that, I have used NLP (Natural Language Processing) to identify which words in the survey responses help indicate the mental health of employees. As part of the end result, I have also generated a word cloud that shows which words stand out the most in the survey results. You can check out the word cloud here, and for more information you can check out Report.pdf.
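
A word cloud sizes each word by how often it appears, so the underlying step is a word-frequency count. The sketch below shows that step with the Python standard library only. It is an illustration, not the code used for the report: the `word_frequencies` helper, the small `STOPWORDS` set, and the sample responses are all hypothetical placeholders, not real survey answers.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption); a real pipeline would use
# a fuller list from an NLP library.
STOPWORDS = {"the", "and", "for", "are", "with", "that", "this", "have"}


def word_frequencies(responses, stopwords=STOPWORDS):
    """Count words across free-text responses, skipping stopwords and very short tokens."""
    counts = Counter()
    for text in responses:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if len(t) > 2 and t not in stopwords)
    return counts


# Made-up example responses, used only to exercise the function.
sample_responses = [
    "I feel stress at work",
    "Work stress affects my sleep",
    "My manager is supportive",
]

freqs = word_frequencies(sample_responses)
print(freqs.most_common(5))
```

A frequency mapping like `freqs` is the kind of input a rendering library can turn into an image; for example, the `wordcloud` package provides `WordCloud.generate_from_frequencies` for this purpose.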