Case study: Analyze the candy power ranking to identify and recommend popular candy characteristics.
The dataset by FiveThirtyEight is distributed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
A new candy is to be produced, but the project team has not reached consensus on its characteristics.
Based on a dataset from market analysis, the task is to give a clear recommendation for which characteristics the new product should have.
The results are compiled in a presentation with a clear recommendation. The presentation is in German, but the numbers speak for themselves.
- Scipy
- Scikit-learn
- Seaborn
See requirements.txt.
- Small dataset (86 rows/samples)
- Data is aggregated over brands (e.g. win percentage)
- Study design might not be fair (not blind)
Treating the problem as a classification rather than a regression, and applying statistical tests, makes it possible to identify features that are statistically associated with popular brands. Interaction terms then allow combinations of successful characteristics to be recommended.
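The approach above can be sketched roughly as follows. This is a minimal illustration, not the analysis used in the presentation: the rows are invented toy data, the column names (chocolate, peanutyalmondy, winpercent) are assumed, the median split used to define "popular" is an assumption, and Fisher's exact test is just one possible choice of test for small samples.

```python
import numpy as np
from scipy.stats import fisher_exact

# Toy data, invented for illustration only (NOT the real dataset).
# Each row: (chocolate, peanutyalmondy, winpercent); column names are assumed.
rows = [
    (1, 1, 84.0), (1, 1, 76.5), (1, 0, 66.9), (1, 0, 60.2),
    (1, 1, 71.3), (0, 0, 38.1), (0, 0, 42.7), (0, 1, 46.4),
    (0, 0, 29.8), (0, 0, 56.0), (1, 0, 51.0), (0, 0, 34.6),
]
data = np.array(rows)
chocolate = data[:, 0].astype(int)
peanut = data[:, 1].astype(int)
winpercent = data[:, 2]

# Classification target (assumption): a candy is "popular" if its
# win percentage lies above the median.
popular = (winpercent > np.median(winpercent)).astype(int)


def association(feature, target):
    """Return the 2x2 contingency table and the Fisher exact p-value.

    Table rows: feature present / absent.
    Table columns: popular / not popular.
    """
    table = np.array([
        [int(np.sum((feature == f) & (target == t))) for t in (1, 0)]
        for f in (1, 0)
    ])
    _, p_value = fisher_exact(table)  # two-sided by default
    return table, p_value


features = {
    "chocolate": chocolate,
    "peanutyalmondy": peanut,
    # Interaction term: 1 only if both characteristics are present.
    "chocolate*peanutyalmondy": chocolate * peanut,
}

results = {name: association(col, popular) for name, col in features.items()}
for name, (table, p_value) in results.items():
    print(f"{name}: table={table.tolist()}, p={p_value:.3f}")
```

With a sample this small, the p-values are only a rough guide. When many features and interactions are tested, a correction for multiple comparisons would be advisable.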