Leveraged unsupervised learning techniques, namely principal component analysis (PCA) and k-means clustering, to compare demographics data for the customers of a German mail-order company against demographics of the German population at large. Used the pandas and scikit-learn libraries to wrangle the data, reduce its dimensionality, and cluster it. Compared the share of each cluster in the general population with its share in the customer base to identify the features that characterize the company's target audience.
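The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is randomly generated, and the column names, the number of principal components (3), and the number of clusters (4) are all assumptions chosen to keep the example small.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real demographics tables (hypothetical data).
columns = [f"feature_{i}" for i in range(8)]
population = pd.DataFrame(rng.normal(size=(1000, 8)), columns=columns)
customers = pd.DataFrame(rng.normal(loc=0.5, size=(300, 8)), columns=columns)

# Fit every transformation on the general population only,
# then reuse the fitted objects for the customer data.
scaler = StandardScaler().fit(population)
pca = PCA(n_components=3, random_state=0).fit(scaler.transform(population))
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
kmeans.fit(pca.transform(scaler.transform(population)))


def cluster_shares(df):
    """Return the fraction of rows in df assigned to each cluster."""
    labels = kmeans.predict(pca.transform(scaler.transform(df)))
    counts = np.bincount(labels, minlength=kmeans.n_clusters)
    return counts / counts.sum()


# Side-by-side cluster proportions for the two groups.
comparison = pd.DataFrame({
    "population_share": cluster_shares(population),
    "customer_share": cluster_shares(customers),
})
comparison["difference"] = (
    comparison["customer_share"] - comparison["population_share"]
)
print(comparison.round(3))
```

Clusters with a positive `difference` are overrepresented among customers relative to the general population. To interpret such a cluster in terms of the original features, one option is to map its centroid back through `pca.inverse_transform` and `scaler.inverse_transform`.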
This project requires Python 3.x and the Python libraries it uses, including pandas and scikit-learn.
You will also need software installed to run an IPython Notebook.