Visualization, sampling, statistical analysis, and classification of wines into GOOD and BAD categories, using the "Wine Quality" dataset.
This was a project for a college course. The goal was to understand and apply:
- Probability and statistics concepts to extract new info from data.
- Data visualization with different plotting methods to understand the data distribution and get a sense of the extracted statistics.
- Random sampling, comparing the statistics of each sample with those of the other samples.
- Two different Machine Learning algorithms (LogisticRegression, SVM) to predict wine quality (good or bad) and compare accuracies of the two models.
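The random-sampling goal above can be illustrated with a minimal sketch. It does not load the real dataset: the `wine` DataFrame is synthetic stand-in data, and the column names `alcohol` and `quality`, the sample size, the seeds, and the helper `sample_stats` are all assumptions made for illustration, not the project's actual code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the wine data (assumption: the real data has
# numeric columns such as "alcohol" and an integer "quality" score).
rng = np.random.default_rng(0)
wine = pd.DataFrame({
    "alcohol": rng.normal(10.4, 1.1, size=1000),
    "quality": rng.integers(3, 9, size=1000),  # integers from 3 to 8
})


def sample_stats(df, n, seeds):
    """Draw one random sample of size n per seed and summarize each one.

    Hypothetical helper, written only for this example.
    """
    rows = []
    for seed in seeds:
        sample = df.sample(n=n, random_state=seed)
        rows.append({
            "seed": seed,
            "alcohol_mean": sample["alcohol"].mean(),
            "alcohol_std": sample["alcohol"].std(),
            "quality_mean": sample["quality"].mean(),
        })
    return pd.DataFrame(rows).set_index("seed")


# One row per sample, so the samples can be compared side by side
# and against the statistics of the full data.
stats = sample_stats(wine, n=100, seeds=[1, 2, 3])
population_mean = wine["alcohol"].mean()
print(stats)
print("full-data alcohol mean:", population_mean)
```

Each row of `stats` describes one sample, so differences between rows show how much the estimates vary from sample to sample.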
I used Python and some packages like:
- NumPy.
- Pandas.
- Matplotlib.
- Seaborn.
- scikit-learn.
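
With these packages, the comparison of the two classifiers could be sketched roughly as follows. This is not the project's actual code: the features are synthetic, the rule "quality of 7 or more is good" is an assumed threshold, and the scaling step, split ratio, and default hyperparameters are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in features and quality scores (assumption: the real
# project would use the dataset's measured features instead).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
raw_score = 6 + X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600)
quality = np.clip(np.round(raw_score), 3, 8)

# Assumed labeling rule: quality >= 7 is "good" (1), otherwise "bad" (0).
y = (quality >= 7).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Both models get the same standardized features so the comparison is fair.
models = {
    "LogisticRegression": make_pipeline(StandardScaler(), LogisticRegression()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in accuracies.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

Accuracy alone can be misleading when one class is much more common than the other, so a confusion matrix may be worth checking alongside it.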