Figure: A semantic network of the top 50 words in Ariana's discography
This project is available as a static notebook and as a Google Colaboratory notebook.
I've been an Ariana Grande fan since 2013. Since then, she has released six albums (and counting), so I figured that more than 60 songs would be plenty of data to explore how her lyrics have changed over time.
The goal of this project is to use exploratory data analysis to uncover trends in the song lyrics using visualizations of the data. Specifically, I want to answer these questions:
- How do Ariana Grande's songs and albums differ in terms of sentiment? How are they similar?
- What are the most frequent words in her songs? Which bigrams does she use the most?
- How have Ariana Grande's songs changed over time?
This notebook:
- Retrieves song information by scraping data from the web using requests and BeautifulSoup
- Cleans the data
- Structures and munges the data using pandas
- Applies common NLP techniques such as sentiment analysis
- Provides insightful visualizations with matplotlib and Gephi
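As a small illustration of the word and bigram counts mentioned in the questions above, here is a minimal sketch using only the standard library. The `tokenize` and `top_ngrams` helpers and the placeholder lines are assumptions made for this example and are not taken from the notebook.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def top_ngrams(lines, n=2, k=3):
    """Count n-grams line by line and return the k most common."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        # zip over n shifted copies of the token list yields consecutive n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

# Placeholder text, not real lyrics
sample_lines = ["la la la love", "love me la la", "la la love me"]

top_words = top_ngrams(sample_lines, n=1, k=1)
top_bigrams = top_ngrams(sample_lines, n=2, k=1)
print(top_words, top_bigrams)
```

Counting per line means a bigram never spans a line break, which is one possible design choice; counting over the whole song would be just as reasonable.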
To get the data, I used BeautifulSoup to parse text from two websites: discogs.com and genius.com. I pulled the song titles from Discogs and then used each title to request that song's lyrics from Genius. I stored the data in a pandas DataFrame.
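The parsing step can be sketched roughly as follows. The HTML snippet, the `track-title` class name, and the `extract_titles` helper are all hypothetical; the real Discogs and Genius page layouts are not reproduced here. In live use the HTML would come from something like `requests.get(url).text`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded tracklist page
SAMPLE_HTML = """
<table>
  <tr><td class="track-title">Song One</td></tr>
  <tr><td class="track-title"> Song Two </td></tr>
</table>
"""

def extract_titles(html):
    """Return the text of every cell carrying the (assumed) track-title class."""
    soup = BeautifulSoup(html, "html.parser")
    cells = soup.find_all("td", class_="track-title")
    return [cell.get_text(strip=True) for cell in cells]

titles = extract_titles(SAMPLE_HTML)
# One record per song; lyrics would be filled in by a second request per title
records = [{"title": title, "lyrics": None} for title in titles]
print(records)
```

A list of records like this can be passed directly to `pandas.DataFrame(records)` to get one row per song.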
Through this project, I discovered the most prominent words and bigrams that appear in Ariana’s lyrics. Using sentiment analysis, I found the following:
- The high and low points in each of her albums
- Which albums were similar by sentiment
- Clusters of similar songs
- Songs that were outliers (had drastically different sentiment scores than others)
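To make the idea of sentiment scores and outliers concrete, here is a toy sketch. It is not the method used in the notebook: the lexicon, the song texts, and the 1.5 standard deviation threshold are made up for illustration, and a real analysis would rely on an established sentiment tool.

```python
from statistics import mean, pstdev

# Tiny made-up lexicon; words not listed count as neutral (0.0)
TOY_LEXICON = {"love": 1.0, "happy": 1.0, "good": 0.5,
               "cry": -1.0, "bad": -0.5, "lonely": -1.0}

def sentiment_score(lyrics):
    """Average lexicon value over all words in the text."""
    words = lyrics.lower().split()
    if not words:
        return 0.0
    return sum(TOY_LEXICON.get(word, 0.0) for word in words) / len(words)

def find_outliers(scores, threshold=1.5):
    """Return titles whose score is more than `threshold` standard deviations from the mean."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [title for title, score in scores.items()
            if abs(score - mu) / sigma > threshold]

# Placeholder songs, not real lyrics
songs = {
    "song a": "love love happy good",
    "song b": "good love happy love",
    "song c": "happy good love good",
    "song d": "love happy love happy",
    "song e": "cry lonely bad cry",
}
scores = {title: sentiment_score(text) for title, text in songs.items()}
outliers = find_outliers(scores)
print(scores, outliers)
```

Here "song e" is the only song flagged, because its score sits far below the other four.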
Using a co-occurrence matrix of keywords, I produced a semantic network highlighting clusters of interconnected words. From this we can conclude that love is an overarching topic in her songs.
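A co-occurrence count of this kind can be sketched as below. The `cooccurrence_edges` helper, the placeholder texts, and the choice to count two keywords as co-occurring when they appear in the same text are assumptions for this example; the notebook may define co-occurrence differently.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents, keywords):
    """Count how many documents contain each unordered pair of keywords."""
    keyword_set = set(keywords)
    edges = Counter()
    for doc in documents:
        # Sort so each pair is always stored in the same order
        present = sorted(keyword_set & set(doc.lower().split()))
        edges.update(combinations(present, 2))
    return edges

# Placeholder texts, not real lyrics
docs = ["love you baby", "baby love me", "you and me"]
keywords = ["love", "baby", "you", "me"]

edges = cooccurrence_edges(docs, keywords)

# Flatten to (source, target, weight) rows, a common edge-list layout
edge_rows = [(a, b, weight) for (a, b), weight in sorted(edges.items())]
print(edge_rows)
```

Rows in this source, target, weight shape can be written to a CSV file and loaded into a graph tool as a weighted edge list.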
This project helped me understand trends in Ariana Grande's albums, her lexical diversity and how her albums have changed over time.
To extend this project, we could run ML algorithms on the data to see whether a model can accurately predict the sentiment of a song from its position on the album, or predict which words are likely to appear in future Ariana Grande albums. We could also try recommending similar songs based on sentiment, or repeat this process for a different artist.
October 2020 Update: Ariana released a new album, and I am planning on incorporating lyrics from Positions into this project.
January 2021 Update: The notebook has been updated to include songs from Positions.

