Skip to content

A Jupyter Notebook with a Clustering and PCA Analysis of a Spotify songs dataset.

Notifications You must be signed in to change notification settings

santiagoarevalo/clustering-spotify-songs-dataset

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 

Repository files navigation

Spotify Song Clustering and PCA Analysis

In this project, we aim to use clustering and principal component analysis (PCA) to identify similarities between songs on Spotify. We will be using a dataset of songs that includes various audio features provided by Spotify's API.

Dataset 🔢

The dataset used in this project can be found on Kaggle. The dataset contains information on 12,000 songs, including their track ID, artist name, song title, and various audio features such as danceability, energy, loudness, and tempo.

Clustering 📉

Clustering is a technique used to group similar objects together based on their features. In this project, we will use clustering to group similar songs together based on their audio features. We will use the K-means algorithm for clustering, which is a popular clustering algorithm that aims to minimize the sum of squared distances between the data points and their assigned centroids.

PCA Analysis 📊

PCA is a technique used to reduce the dimensionality of a dataset while preserving the important information. In this project, we will use PCA to identify the most important audio features that contribute to the similarity between songs. We will use the scikit-learn library in Python to perform PCA analysis.

Dependencies and Libraries

This project requires the following dependencies:

  • pandas
  • numpy
  • scikit-learn
  • matplotlib
  • scipy

Running the Code

To run the code, follow these steps:

  • Clone this repository.
  • Download the dataset from Kaggle and place it in the same directory as the code.
  • Install the required dependencies.
  • Open the Jupyter notebook clustering.ipynb
  • Run the code cells in order.

About

A Jupyter Notebook with a Clustering and PCA Analysis of a Spotify songs dataset.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published