
Generate genre-specific lyrics with AI and save your hit songs as an .mp4 file.


marcelacastano/NLP-and-Machine-Music

 
 


NLP & Machine Music

robot_sings

Objectives

  • Analyze lyric data with Natural Language Processing techniques
    • Tokenization
    • Sentiment analysis
    • N-grams, frequency analysis
    • Named entity recognition
  • Become familiar with text prediction algorithms using machine learning
  • Explore text prediction methodologies
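As a rough illustration of the tokenization and n-gram frequency steps listed above, here is a minimal standard-library sketch. The regex tokenizer and the toy lyric line are assumptions made for this example; the notebooks may rely on NLTK or spaCy tokenizers instead.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes are kept)."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy line, invented for illustration (not taken from the project's dataset).
tokens = tokenize("Yeah, I know you know I got love")
unigram_counts = Counter(tokens)
bigram_counts = Counter(ngrams(tokens, 2))

print(unigram_counts.most_common(2))  # the two most frequent words
print(bigram_counts.most_common(1))   # the most frequent word pair
```

The same `Counter` objects can feed a bar chart or a word cloud once stop words have been removed.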

Notebooks

Open the notebook for your genre of choice to see its analysis:

Country

EDM

Hip Hop

Rock

RnB

POP

To view the summary for all genres, check out:

Visualizations for All Genres

Predictive Models

  • We ran next-word prediction on the lyric data from a specific genre, using the following approaches:
    • Markov Chains
    • Maximum Likelihood Estimator Algorithm

Markov Chains: Randomized text prediction

A Markov chain is a stochastic process that must be "memory-less": the probability of the next state depends only on the present state, not on the steps that led up to it. This is called the Markov property. [1]
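The memory-less idea can be sketched in a few lines of Python. This is a minimal first-order word chain under assumed names (`build_chain`, `generate`) and a made-up toy corpus. It is not the code used in the notebooks.

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word to the list of words that follow it in the text."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)

def generate(chain, start, length, seed=None):
    """Walk the chain: each next word depends only on the current word."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Toy corpus, invented for illustration (not from the project's dataset).
toy_lyrics = "i feel like a river and i feel like a road".split()
chain = build_chain(toy_lyrics)
print(generate(chain, start="i", length=8, seed=0))
```

Because repeated followers are stored as duplicates in each list, `rng.choice` samples the next word in proportion to how often it followed the current word in the training text.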

markovgif

markov_diagram

For more details, we recommend the following video.

Language models with NLTK

natural_language_toolkit

For further reading, consider this Medium Article.

Also refer to the MLE documentation.

Maximum Likelihood Estimator from NLTK

Maximum_Likelihood_Estimation

This article served as a starting point to our endeavor in text prediction.

Check out the Language Model Module from NLTK for more information on the different models to choose from.

Natural Language Processing

Data Preprocessing

We obtained our lyric data from the Shazam Core API at RapidAPI.com.

The specific API endpoints used were:

endpoints

  • World Chart by Genre endpoint:

    • Input: a genre and the maximum number of songs to retrieve
    • Output: the top chart for that genre, with trackID, artist, and song name
  • Track Details endpoint:

    • Input: a trackID
    • Output: the lyrics for the song

We then generated a dataframe with the lyrics and dropped any chart songs for which lyrics could not be obtained through the API.
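The filtering step can be pictured with plain Python records standing in for the dataframe. The field names (`track_id`, `artist`, `song`, `lyrics`) and the sample rows are hypothetical and do not reflect the actual API response schema.

```python
# Hypothetical records shaped like merged chart + track-details results.
chart = [
    {"track_id": "1", "artist": "Artist A", "song": "Song A", "lyrics": "la la la"},
    {"track_id": "2", "artist": "Artist B", "song": "Song B", "lyrics": None},
    {"track_id": "3", "artist": "Artist C", "song": "Song C", "lyrics": ""},
]

def drop_missing_lyrics(rows):
    """Keep only rows whose lyrics field holds non-blank text."""
    return [row for row in rows if (row.get("lyrics") or "").strip()]

songs_with_lyrics = drop_missing_lyrics(chart)
print([row["song"] for row in songs_with_lyrics])
```

With a dataframe, the equivalent operation is dropping the rows whose lyrics column is missing or empty.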

lyrics df

Genre Top Song Charts

country

EDM

Hip Hop

RnB

POP

Rock

Sentiment Analysis

country-sent

EDM

Hip Hop

RnB

POP

Rock

Ngrams and Frequency Analysis

Top Word Frequency Analysis

country

EDM

hiphop

RnB

POP

Rock

Named Entity Recognition

country-ner

EDM

hiphop

RnB

POP

Rock

Word Clouds

Country

country

EDM

EDM

Hip Hop

Hip Hop

RnB

RnB

POP

POP

Rock

Rock

Next Word Prediction

We used Google's Text-to-Speech library (gTTS) to generate .mp4 files of our Markov chain and AI-generated lyrics.

gtts

Here is a look at our code with Markov chains:

markov_gif

Here is a look at our code with the Maximum Likelihood Estimator:

mle_gif

Here are lyric snippets for each genre:

Snippets of Country MLE Algorithm

country_mle_text.mp4
country_mle_text_OG.mp4

Snippet of EDM MLE Algorithm

mle_lyrics_EDM.mp4

Snippet of Hip Hop Markov

hiphop_markovchains_snippet.mp4

Snippet of Hip Hop MLE Algorithm

hiphop_mle_snippet.mp4

Snippet of RnB Markov

rnb_marchov_snippet.mp4

Snippet of RnB MLE Algorithm

rnb_mle_snippet.mp4

Snippet of Pop Markov

pop_marchov_snippet.mp4

Snippet of Rock MLE Algorithm

mle_lyrics_Rock.mp4

Pop Text MLE Algorithm

pop_mle_text.mp4

Snippet of Pop MLE Algorithm

pop_mle_snippet.mp4

Model Scores

MLE is an n-gram model algorithm: it scores a word by how often that word followed the given context in the training lyrics.

scores

Some examples from the Country Lyrics Model:

  • The probability of 'woman' appearing in the text is 0.00139.

  • The probability of 'feel like' being followed by 'a' is 0.5.

  • The probability of 'feel like a' being followed by 'woman' is 1.0.
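Scores like these are relative frequencies. The sketch below computes them by hand on an invented ten-word corpus, so the unigram value differs from the one reported for the country model. The helper name `mle_score` is an assumption for this example and is not NLTK's implementation.

```python
from collections import Counter

def mle_score(tokens, word, context=()):
    """Relative-frequency estimate of P(word | context) from a token list."""
    context = tuple(context)
    n = len(context) + 1
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not context:
        # Unigram case: share of all tokens that are `word`.
        return grams[(word,)] / len(tokens)
    # Count every n-gram that starts with the context, then the one ending in `word`.
    context_total = sum(c for gram, c in grams.items() if gram[:-1] == context)
    return grams[context + (word,)] / context_total if context_total else 0.0

# Toy corpus, invented for illustration.
tokens = "i feel like a woman but you feel like home".split()
print(mle_score(tokens, "woman"))                       # share of tokens equal to 'woman'
print(mle_score(tokens, "a", ("feel", "like")))         # P('a' | 'feel like')
print(mle_score(tokens, "woman", ("feel", "like", "a")))  # P('woman' | 'feel like a')
```

In this toy text, 'feel like' is followed once by 'a' and once by 'home', which gives the same 0.5 pattern as the country example. A context that never occurs returns 0.0 here.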

unigram

jingle

Perplexity

perplexity

From the Country Lyrics Model:

  • The perplexity of 'aliens are' is: inf

  • The perplexity of 'old man' is: 7.667

  • The perplexity of 'bell rock' is: 3.4

  • The perplexity of 'jingle bell' is: 1.333

  • The perplexity of 'country boy' is: 1.273
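To show why an unseen phrase scores `inf` while a frequent one scores close to 1, here is a minimal sketch of perplexity under an unsmoothed bigram model, computed as 2 raised to the average negative log2 probability. The function name and toy training text are assumptions for this example, and NLTK's own padding and vocabulary handling are left out.

```python
import math
from collections import Counter

def bigram_perplexity(train_tokens, test_bigrams):
    """Perplexity of the test bigrams under an unsmoothed (MLE) bigram model."""
    bigram_counts = Counter(zip(train_tokens, train_tokens[1:]))
    context_counts = Counter(train_tokens[:-1])
    log_prob_sum = 0.0
    for context, word in test_bigrams:
        seen_context = context_counts[context]
        prob = bigram_counts[(context, word)] / seen_context if seen_context else 0.0
        if prob == 0.0:
            return math.inf  # one unseen bigram makes the whole sequence impossible
        log_prob_sum += math.log2(prob)
    cross_entropy = -log_prob_sum / len(test_bigrams)
    return 2 ** cross_entropy

# Toy training text, invented for illustration.
train = "jingle bell jingle bell jingle bell rock country boy".split()
print(bigram_perplexity(train, [("jingle", "bell")]))  # always-seen continuation
print(bigram_perplexity(train, [("bell", "rock")]))    # one of three continuations
print(bigram_perplexity(train, [("aliens", "are")]))   # never seen in training
```

Lower perplexity means the model is less "surprised" by the phrase. Without smoothing, any phrase absent from the training lyrics has probability zero and therefore infinite perplexity.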

Feel free to read up more on Perplexity and Language Models or watch this video.

Conclusions

Summary of All Genres

Here are the overall results for the sentiment analysis:

Sentiment

VADER classified most of the top chart songs across genres as positive.


These are the most used words in the Top Chart Songs for the analyzed genres:

Top Words

This is the word cloud for the most used words across genres:

WC

'like', 'yeah', 'know', 'got' and 'love' are the most used words across all analyzed genres.


These are the frequencies of each Named Entity found in the Top Chart Songs for all genres:

Entities

The main focus across the analyzed genres seems to be people.


Limitations

  • We were not able to compare MLE to other language model algorithms due to time constraints.

  • Detokenizer: The returned text is readable but lacks punctuation and paragraph structure.

  • NER: spaCy's NER misidentified many PERSON and GPE named entities. The image below is from the country dataset.

    limitations NER


Miami FinTech Bootcamp 2021-2022

Monique Ferguson, Andrew Hidalgo, Frank Lau and Marcela Castaño

Footnotes

  1. https://brilliant.org/wiki/markov-chains/
