ML Unsupervised Learning: Identify Customer Segments with Machine Learning

In this project I apply unsupervised machine learning techniques to identify customer segments within a population.
These segments will then be used for direct marketing campaigns aimed at the audiences with the highest expected rate of return.
This population forms the core customer base for a mail-order sales company in Europe.
Both the data provider and the data itself have been anonymized.

Project Overview

There are four files associated with this project:

File Description

Anonymous_MAIN_Subset.csv: Demographics data for the general population; 891211 persons (rows) x 85 features (columns).
Anonymous_CUSTOMERS_Subset.csv: Demographics data for customers of a mail-order company; 191652 persons (rows) x 85 features (columns).
Data_Dictionary.md: Detailed information file about the features in the provided datasets.
MAIN_Feature_Summary.csv: Summary of feature attributes for demographics data; 85 features (rows) x 4 columns.

Data Description

Each row of the demographics files represents a single person, but also includes information beyond the individual, covering their household, building, and neighborhood. You will use this information to cluster the general population into groups with similar demographic properties. Then, you will see how the people in the customers dataset fit into those clusters. The hope is that certain clusters are over-represented in the customers data compared to the general population; those over-represented clusters will be assumed to be part of the core userbase. This information can then be used for further applications, such as targeting for a marketing campaign.
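The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the README does not name a clustering algorithm, so k-means from scikit-learn is an assumed choice, the data is synthetic, and names such as `cluster_shares` and `comparison` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real data: two well-separated groups, two features.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
population = np.vstack([rng.normal(c, 1.0, size=(500, 2)) for c in centers])

# Assumption for illustration: customers come mostly from the second group.
customers = np.vstack([
    rng.normal(centers[0], 1.0, size=(50, 2)),
    rng.normal(centers[1], 1.0, size=(450, 2)),
])

# Fit the scaler and the clusters on the general population only.
scaler = StandardScaler().fit(population)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(scaler.transform(population))

# Assign customers to the clusters learned from the population.
pop_labels = kmeans.labels_
cust_labels = kmeans.predict(scaler.transform(customers))


def cluster_shares(labels, n_clusters):
    """Return the fraction of rows that fall into each cluster."""
    counts = np.bincount(labels, minlength=n_clusters)
    return counts / counts.sum()


comparison = pd.DataFrame({
    "population_share": cluster_shares(pop_labels, 2),
    "customer_share": cluster_shares(cust_labels, 2),
})
# A ratio above 1 marks a cluster that is over-represented among customers.
comparison["ratio"] = comparison["customer_share"] / comparison["population_share"]
print(comparison)
```

On the real data the same transformations fitted on the general population would be reused for the customers file, so that both datasets are described in terms of the same clusters.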

Directions

To start, load the demographics data for the general population into a pandas DataFrame, and do the same for the feature attributes summary. All of the .csv data files in this project are semicolon (;) delimited, so you'll need an additional argument in your read_csv() call to read the data properly. Given the size of the main dataset, it may also take some time to load completely.
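A minimal sketch of the loading step, assuming pandas is installed. The helper name `load_demographics` and the sample column names are made up for illustration; an in-memory buffer stands in for the real files so the snippet runs on its own.

```python
import io

import pandas as pd


def load_demographics(source):
    """Read a semicolon-delimited file (path or buffer) into a DataFrame."""
    return pd.read_csv(source, sep=";")


# Tiny in-memory stand-in for a real data file (hypothetical columns).
sample = io.StringIO("AGE_GROUP;GENDER;HOUSEHOLD_SIZE\n1;2;3\n4;1;\n")
demo = load_demographics(sample)
print(demo.shape)  # (2, 3)
```

With the project files in the working directory, the same helper would be called as `load_demographics("Anonymous_MAIN_Subset.csv")` and `load_demographics("MAIN_Feature_Summary.csv")`.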

Once the dataset is loaded, it's recommended that you take some time to browse the general structure of the dataset and the feature summary file. You'll be getting deep into the details of the cleaning in the first major step of the project, so gaining some general familiarity now can help you get your bearings.
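One possible way to get a first overview is a per-column summary of types, missing values, and distinct values. The helper `summarize_columns` and the toy DataFrame below are hypothetical and only illustrate the idea.

```python
import pandas as pd


def summarize_columns(df):
    """Return one row per column: dtype, missing count, distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),
    })


# Toy data standing in for the demographics DataFrame.
toy = pd.DataFrame({"A": [1, 2, None], "B": ["x", "x", "y"]})
summary = summarize_columns(toy)
print(summary)
```

Calls such as `df.head()` and `df.describe()` complement this summary when browsing the real data.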

