Unsupervised Learning on the Flags Dataset: PCA and K-Means Clustering

This project explores how Principal Component Analysis (PCA) and K-Means clustering perform on the Flags dataset from the UCI Machine Learning Repository. The study highlights how datasets with mixed feature types influence interpretability and model performance.

Project Structure

flags_pca_clustering.ipynb — Main Jupyter notebook containing code and explanations.
flags_pca_clustering.html — Exported HTML version of the notebook.
flags_pca_clustering.pdf — Exported PDF version of the notebook.
figures/ — Contains exported plots mainly for reference; all key figures are already embedded in the outputs.

Key Points

Perform PCA and K-Means clustering on the Flags dataset.
Conduct exploratory data analysis to visualize trends and correlations.
Assess the PCA projection of the data onto the first two principal components.
Analyze clustering results and discuss the tradeoff between parsimony and interpretability.
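
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the DataFrame is synthetic stand-in data with hypothetical column names, the choice of three clusters is arbitrary, and running K-Means on the 2D projection (rather than on the full feature matrix) is an assumption made for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in data with mixed feature types (NOT the real Flags data).
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "area": rng.normal(500, 200, n),                          # continuous
    "population": rng.normal(30, 10, n),                      # continuous
    "stripes": rng.integers(0, 5, n),                         # count
    "main_colour": rng.choice(["red", "green", "blue"], n),   # categorical
})

numeric_cols = ["area", "population", "stripes"]
categorical_cols = ["main_colour"]

# Scale numeric columns and one-hot encode categorical ones so that
# PCA and K-Means see features on comparable scales.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array for PCA
)
X = preprocess.fit_transform(df)

# Project onto the first two principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Cluster the projected points; k=3 is an arbitrary illustrative choice.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_2d)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("cluster sizes:", np.bincount(labels))
```

One-hot encoding makes the categorical columns usable by PCA, but it also spreads each category across several binary dimensions, which is one way mixed feature types can make the resulting components harder to interpret.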

Requirements

Python (3.10.16 recommended)
Jupyter Notebook / Jupyter Lab
Python packages: pandas, numpy, matplotlib, seaborn, altair, scikit-learn, ucimlrepo

You can install the required packages using:

pip install pandas numpy matplotlib seaborn altair scikit-learn ucimlrepo
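
After installing, a quick check such as the one below can confirm that each package is importable. It is only a convenience sketch; the helper name `missing_packages` is hypothetical and not part of the repository. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages listed above; scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "altair", "sklearn", "ucimlrepo"]


def missing_packages(names=REQUIRED):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```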

How to Use

Clone or download this repository.
Open flags_pca_clustering.ipynb in Jupyter Notebook or Jupyter Lab and load the dataset via the ucimlrepo package.
Run all cells to reproduce results, figures, and exported HTML/PDF outputs.
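
For reference, loading the dataset with ucimlrepo typically looks like the sketch below. The wrapper function `load_flags` is hypothetical, and the dataset ID of 40 is an assumption that should be checked against the Flags page in the UCI repository; the notebook may load the data differently.

```python
def load_flags():
    """Fetch the Flags dataset and return (features, targets)."""
    # Imported lazily so this function can be defined even if
    # ucimlrepo is not installed yet.
    from ucimlrepo import fetch_ucirepo

    # Assumption: the Flags dataset has ID 40 in the UCI repository.
    flags = fetch_ucirepo(id=40)
    return flags.data.features, flags.data.targets
```

Calling `X, y = load_flags()` requires network access. Inspecting `X.dtypes` afterwards is a quick way to see which columns are numeric and which are categorical before preprocessing.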

alan-c-lin/flags_pca_clustering
