Music or Lyrics

Predictive Models
Python
NLP
Classification
Which is the better predictor of music genre: a song's audio features or its lyrics? This project uses NLP and other classification models to investigate.
Published

March 2, 2022

This project explores which set of song features makes the stronger predictor of music genre: the music or the lyrics?

To answer this question, the project collected audio features from Spotify and lyrics from Lyric Genius. Although both datasets cover the same songs, they were treated independently to build two separate models: one trained only on audio feature data, the other only on lyric data (using NLP).
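To make the two-model setup concrete, here is a minimal sketch that trains one classifier on audio features and another on lyrics for the same songs. Everything in it is an assumption for illustration: the records are invented toy data, the two audio features are hypothetical, and the classifiers (a nearest-centroid model and a small naive Bayes model) are simple stand-ins, since the write-up does not say which models the project actually used.

```python
import math
from collections import Counter, defaultdict

# Toy, invented records for illustration only (not the project's data).
# "audio" holds two hypothetical features, e.g. [danceability, energy].
TRAIN = [
    {"genre": "Country", "audio": [0.55, 0.50], "lyrics": "truck road whiskey home"},
    {"genre": "Country", "audio": [0.60, 0.55], "lyrics": "home town road heart"},
    {"genre": "Hip Hop", "audio": [0.85, 0.65], "lyrics": "money street flow city"},
    {"genre": "Hip Hop", "audio": [0.80, 0.70], "lyrics": "city money beat flow"},
]


def fit_audio_model(rows):
    """Nearest-centroid model: the average audio feature vector per genre."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["genre"]].append(row["audio"])
    return {
        genre: [sum(col) / len(col) for col in zip(*vectors)]
        for genre, vectors in grouped.items()
    }


def predict_audio(centroids, features):
    """Return the genre whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda g: math.dist(centroids[g], features))


def fit_lyrics_model(rows):
    """Collect per-genre word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    genre_counts = Counter()
    for row in rows:
        genre_counts[row["genre"]] += 1
        word_counts[row["genre"]].update(row["lyrics"].lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, genre_counts, vocab


def predict_lyrics(model, text):
    """Return the genre with the highest log-posterior (add-one smoothing)."""
    word_counts, genre_counts, vocab = model
    total_songs = sum(genre_counts.values())

    def score(genre):
        total_words = sum(word_counts[genre].values())
        log_prob = math.log(genre_counts[genre] / total_songs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            log_prob += math.log(
                (word_counts[genre][word] + 1) / (total_words + len(vocab))
            )
        return log_prob

    return max(genre_counts, key=score)


# The two models never see each other's inputs.
audio_model = fit_audio_model(TRAIN)
lyrics_model = fit_lyrics_model(TRAIN)

audio_guess = predict_audio(audio_model, [0.82, 0.68])
lyrics_guess = predict_lyrics(lyrics_model, "money in the city")
print(audio_guess, lyrics_guess)
```

Because each model only ever sees its own view of a song, their accuracy on the same held-out songs can be compared directly, which is the comparison the project is after.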

The dataset consists of ~3500 songs, each belonging to one of four music genres: Country, Dance Pop, Hip Hop, or Rock. Genres are assigned using the classifications from Every Noise at Once, a project created by Glenn McDonald.

You can view the code for this project here.