
Data analysis of red and white wine using their chemical properties. The wines were rated by sommeliers. A correlation map was produced to show how each chemical property affected red and white wines. Machine learning was then used to try to predict a wine's taste rating from its chemical properties. Tableau was used as a visualization aid.


mcgeec91/Final-Project-UCF-Coding-Bootcamp


Proposal:

Our story: wine tasting is complex, and finding the perfect wine is hard. Our group is tasked with analyzing unknown Vinho Verde wines of Portugal for a new wine company, PyWines Co. We are going to use ML technology and modeling to find and fit the "best" wine based on taste attributes.

Our group is collecting data from Kaggle, where we are analyzing and visualizing a wine quality dataset based on the physicochemical properties of the red and white variants of the Portuguese "Vinho Verde" wine.

Objective:

Predict and visualize the quality of red and white wines based on taste attributes using ML (scikit-learn).
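As a rough illustration of this objective, the sketch below trains a scikit-learn random forest to predict a three-level quality label. It is not the project's actual code. The data is synthetic, the three feature names are only stand-ins for the dataset's chemical columns, and the scoring rule and cut-off values (7.8 and 9.2) are assumptions made up so that the example has something to learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_wines = 600

# Synthetic stand-ins for three chemical properties (not real measurements).
alcohol = rng.normal(10.5, 1.0, n_wines)
volatile_acidity = rng.normal(0.5, 0.15, n_wines)
sulphates = rng.normal(0.65, 0.15, n_wines)
X = np.column_stack([alcohol, volatile_acidity, sulphates])

# Hypothetical rule linking chemistry to quality, plus noise (an assumption).
score = alcohol - 4.0 * volatile_acidity + rng.normal(0.0, 0.3, n_wines)
y = np.where(score < 7.8, "below avg",
             np.where(score > 9.2, "above avg", "avg"))

# Hold out 25% of the wines, keeping the class proportions in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

With the real dataset, `X` would hold the chemical columns and `y` the binned sommelier ratings. The stratified split matters here because the rating classes are unevenly sized.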

Requirements:

ML (scikit-learn), Python (pandas), Tableau, Front-End (HTML, CSS, Bootstrap)

Link to the dataset: https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009

Conclusions:

Our analysis indicates that red and white wines each have distinct chemical characteristics associated with the best quality ratings. We also combined all wines into a single dataset. The machine learning methods used in this project produced accurate results on the combined data, which suggests that wines in general have levels of sugar, sulphates, citric acid, and other properties that are associated with a high rating. However, the combined models were not as precise as the separate red and white models, and their high accuracy scores can be attributed to having the most data for the machine learning methods to learn from. The Random Forest method consistently produced the highest accuracy scores throughout testing.

In addition to the multiple machine learning methods, we used several sorting techniques in Python to try to make our dataset less skewed, since most wines fell into the 5-6 rating range, which was originally labeled as "avg". With more data in the "above avg" and "below avg" categories, more accurate conclusions could be reached.
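The rating skew described above can be made concrete with a small sketch. It uses only the Python standard library and a made-up list of ratings. The 5-6 range for "avg" comes from the text; treating 4 and below as "below avg" and 7 and above as "above avg" is an assumption. The random oversampling step is one common way to even out class sizes, shown here for illustration only, and is not necessarily the technique this project used.

```python
import random
from collections import Counter


def rating_label(quality):
    """Map a numeric quality rating to a coarse label (thresholds partly assumed)."""
    if quality <= 4:
        return "below avg"
    if quality <= 6:
        return "avg"
    return "above avg"


def oversample(rows, labels, seed=0):
    """Duplicate random rows of the smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((row, label) for row in group + extra)
    return balanced


# Made-up ratings that mimic the skew toward the 5-6 range.
qualities = [3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8]
labels = [rating_label(q) for q in qualities]
counts = Counter(labels)
print(counts)

balanced = oversample(qualities, labels)
balanced_counts = Counter(label for _, label in balanced)
print(balanced_counts)
```

If a step like this is used before model training, it is usually applied to the training split only, so that duplicated rows do not leak into the test set and inflate the accuracy score.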
