Visualisations and Analysis

The following visualisations are intended to show correlations and relationships between our focus variables (artist gender, genre, emotional categories, and lyrical sentiment scores), as well as highlighting any changes over time, in a way that is easy to understand and that helps us answer the questions posed in our research aims.

SUMMARY PLOTS

These BAR GRAPHS, separated into songs with positive and negative NLTK sentiment, summarise how many male, female, and group songs fall under each of the eight emotional categories in the NLP toolkit.


For songs with a positive sentiment, male, female, and group songs all follow the same pattern: the most songs are categorised under joy, followed by anticipation, trust, and so on. There is a slight difference in the least populated categories: males and groups have more songs in surprise than in disgust, whereas females have the opposite. For songs with negative sentiment, the only difference in the ordering is that sadness is the most populated category for males and females, with anger second, while for groups anger comes first and sadness second. Despite these differences, male-, female-, and group-led songs tend to populate similar emotional categories for each type of sentiment.

As one may expect, by far the most populated emotion for songs with positive sentiment is joy. For songs with negative sentiment, the most populated emotions are anger, sadness, and fear. The reason there are more negative songs under joy, anticipation, and trust than under disgust, stereotypically a more negative emotion, is probably the choice of popular subjects in songs. The same goes for surprise, which could be positive or negative, being lower than sadness, fear, and anger in the positive sentiment graph. According to a study on the themes of US songs from 1960 to 2010, the most consistent common themes across the 50 years are romantic and sexual relationships (Christenson et al., 2018). These topics fit better with the other emotions mentioned than with disgust and surprise.

The following STACKED BAR GRAPH shows what percentage of songs each year were led by males, females, or groups. As the sentiment/emotion bar graphs above also show, our data set contains more than double the number of male and group songs compared to female songs. On average, each year has 43 group songs, 40 male songs, and 15 female songs (averages rounded to the nearest whole number). These add up to 98 rather than 100 because a few Billboard songs could not be processed. We have a total of 6055 songs, about 98 per year.
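The yearly percentages behind a stacked bar graph of this kind can be computed as in the following minimal sketch. The song records and the `yearly_shares` helper are hypothetical illustrations, not the project's actual data or code; only the totals in the last two lines come from the text.

```python
from collections import Counter, defaultdict

def yearly_shares(songs):
    """Return {year: {category: percentage}} for (year, category) records."""
    counts = defaultdict(Counter)
    for year, category in songs:
        counts[year][category] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {cat: 100 * n / total for cat, n in counter.items()}
    return shares

# Hypothetical records: (chart year, lead artist category).
songs = [
    (1960, "male"), (1960, "male"), (1960, "group"), (1960, "female"),
    (1961, "group"), (1961, "group"), (1961, "male"), (1961, "female"),
]
shares = yearly_shares(songs)

# Figures quoted in the text: 6055 songs over the 62 chart years 1960-2021.
songs_per_year = round(6055 / 62)      # about 98
average_sum = 43 + 40 + 15             # group + male + female averages
```

Normalising within each year keeps the bars comparable even though the number of processed songs differs slightly from year to year.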

Songs led by individual males, individual females, or group of artists

Percentage of songs led by each category of artist in each year

This summary graph flags the low number of female-led songs, which may interfere with some of the analysis, as it is difficult to draw reliable conclusions from a limited set of data. It may be the case that females are present in the Billboard charts more often as groups or in collaborations, or that male artists dominated popular music.

Overall Sentiment Distribution:

[Figure: density plot of sentiment scores for male-led and female-led songs]

This DENSITY PLOT shows the frequency of different sentiment scores across all the male-led and female-led songs in the dataset. For both genders, the scores cluster at the extremes, between -1 and -0.9 and between 0.9 and 1. The curve fluctuates very little between -0.9 and 0.9, indicating that there are few songs with semi-positive, neutral, or semi-negative sentiments. The red peaks rising above the blue show that female artists have slightly more songs at the sentiment score extremes, while male songs are marginally more spread out, with more songs between -0.95 and -0.8 and between 0.6 and 0.9 than females. There are more than twice as many positive songs as negative ones.

As stated, the most common song subjects are romantic and sexual relationships, with an increase in topics such as dancing/partying, wealth/status, substance use, and pain/loss in recent decades (Christenson et al., 2018; Tunedly Team, 2021). These themes are personal human experiences or are related to conventional fun activities, neither of which suits an overall tone of neutrality. Therefore, it makes sense that most songs have strong sentiments and that many more are on the positive side of the NLTK score spectrum, as shown in the distribution graph.

VIOLIN PLOTS

The following violin plots compare song sentiment scores between male- and female-led songs in each decade (the final decade includes 2020 and 2021).


Median Values For Each Plot

A violin plot combines the information of a box plot with a smoothed density curve, showing both summary statistics of the data (the median and interquartile range) and an estimate of the probability distribution of the data.

The grey rectangle shows the interquartile range of the sentiment scores for all songs in each year. In this case, the ‘whiskers’ are not as obvious because the majority of songs cluster close to the extremes of the NLTK scores, i.e. 1 and -1. The white dot shows the median NLTK score for all songs, regardless of gender.

The probability aspect of the violin plot may be misleading. The violin plots are being used to compare male and female data distributions rather than to investigate the NLTK scores themselves. In the graphs, it appears possible to get a data point with a sentiment score above 1 or below -1, even though these are the score’s maximum and minimum limits, respectively. This is an artefact of the smoothing used to draw the curve, not a feature of the data. To correct this, more ‘bins’ were added, but this made the overall shape less clear and a male vs female comparison more difficult. Since our project focuses on gender comparison, we use these graphs for a side-by-side comparison of the distributions. The probability distribution also clearly highlights that there are two peaks, correctly reflecting that the data clusters near the extremes of NLTK scores of -1 and 1, which could not be seen on a box plot alone.
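The overshoot beyond -1 and 1 can be reproduced with a small kernel density estimate. The sketch below is only an illustration under stated assumptions: the scores, the bandwidth of 0.1, and the `gaussian_kde` helper are hypothetical, and clipping the evaluation grid is an alternative to the binning adjustment described above, not the method used for the plots. Some plotting libraries expose this as an option (seaborn's `violinplot`, for example, has a `cut` parameter).

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density at x with Gaussian kernels."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density

# Hypothetical scores clustered near both extremes of the scale.
scores = [-0.99, -0.97, -0.95, 0.93, 0.96, 0.98, 0.99, 0.99]
density = gaussian_kde(scores, bandwidth=0.1)

# The smoothed curve is non-zero beyond the valid range of the score...
outside = density(1.1)

# ...so one option is to evaluate it only on a grid clipped to the data range.
lo, hi = min(scores), max(scores)
grid = [lo + i * (hi - lo) / 50 for i in range(51)]
curve = [(x, density(x)) for x in grid]
```

Clipping hides the impossible values but does not change the estimate inside the range, so the two peaks near -1 and 1 remain visible.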

In general, the sentiment scores do not vary greatly between the two genders. There are a few years where the female distribution is much flatter compared to the two distinct peaks of the male songs, such as 1966, 1967, 1971, 1972, and 1974. However, rather than indicating that female songs in these years were more evenly distributed across the NLTK scale, this is likely due to the lack of female song data points available. For example, in 1966, there were 41 male songs but only 6 female songs in the January Billboard Top 100. In 1972, there is data from 37 male songs but only 7 female songs. On the male side, there is enough data to form a more reliable distribution curve than on the female side; the lack of female song data shows up as an indistinct shape.

There are a few instances where female songs tend to be more positive than the male songs, such as in the years 1963, 1976, 1980, 1989, and 2008. In fact, in 1976, the violin plot does not display properly because there are only positive NLTK scores for the female songs. It initially appears that there is very little female data and no male data for that year in the graph ‘Male vs Female NLTK Scores 1970-1979’. There are 50 data points that should be plotted for 1976; however, all 7 female songs in that year are positive, so the graph does not form a ‘proper’ violin shape, as there is no probability that a female song could be negative.

Separating the male and female song data for 1976 produces the graph below:

[Figure: NLTK score distributions for male and female songs in 1976, plotted separately]

Unrelated to gender, the median NLTK score every year is consistently high, generally falling between 0.9 and 1. The few exceptions are 1966, 1973, 1995, and 2006, where the score drops slightly to between 0.8 and 0.9. However, these are minor differences compared to 2015, where the median drops noticeably to just above 0, at 0.077, meaning there are almost as many sentimentally negative as positive songs, whereas in other years there are consistently many more positive songs. Our dataset is the Billboard Top 100 songs for January each year. This drastic change in the median may correlate with world events: the ‘Paris Attacks’, or Charlie Hebdo incident, and the series of attacks around France that followed were significant events that were given worldwide media attention throughout January 2015 (an article was published in Time magazine the same day, the incidents gained the attention of thousands on social media, former US president Obama released a statement, etc.). However, this may be a coincidence, as other years have not reflected the potential impact of negative worldwide phenomena such as the current pandemic: the median scores for 2019, 2020, and 2021 are consistently high, higher in fact than in pre-pandemic 2018.
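Because the scores cluster near -1 and 1, the yearly median is sensitive to the balance between negative and positive songs: once roughly half of a year's songs are negative, the median falls into the sparse middle of the scale. The following minimal sketch shows how yearly medians can be computed and unusual years flagged. The records, the `median_by_year` helper, and the 0.8 threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import median

def median_by_year(records):
    """Return {year: median score} for (year, score) records."""
    by_year = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)
    return {year: median(scores) for year, scores in by_year.items()}

# Hypothetical scores: one mostly positive year and one evenly split year.
records = [
    (2014, 0.99), (2014, 0.97), (2014, 0.95), (2014, -0.98), (2014, 0.90),
    (2015, 0.99), (2015, -0.97), (2015, 0.10), (2015, -0.95), (2015, 0.96),
]
medians = median_by_year(records)

# Flag years whose median falls below an illustrative threshold of 0.8.
low_years = sorted(year for year, m in medians.items() if m < 0.8)
```

In this toy data only the evenly split year is flagged, mirroring the way 2015 stands out in the plots.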

Gender

Comparing male and female song sentiment categorised by the song’s most prominent emotion

Similar to the information shown in the emotion/song frequency bar graph, the sentimental distribution in these violin plots makes generally intuitive sense. Songs categorised under joy, anticipation, surprise, and trust are mostly positive; songs categorised under anger are mostly negative; songs categorised under disgust are overall more negative than positive. Counterintuitively, the emotions sadness and fear have a similar number of sentimentally positive and negative songs. The testing accuracy section gives examples and explains how there may be songs with a sentiment score that doesn’t seem to match the emotion it has been categorised as.

For most of the emotional categories, the distribution of NLTK scores does not vary greatly between the genders. The exceptions are surprise and fear. Under surprise, the female songs’ sentiment tends towards more positive scores, petering out on the negative side below 0, whereas the male songs show two distinct peaks around 1 and -1. Under fear, there are more negative than positive male songs, but the distribution is even for female songs. This could indicate that female songs are slightly more likely to be positive, perhaps due to song genres, discussed with the bar graphs below.

Scatter Plots: Trends in Emotion

Although the overall distribution of the NLTK sentiment score and the frequency distribution of songs in each emotional category do not differ significantly between genders, there may be some gender differences in the change in the NLTK sentiment score within emotional categories.

The scatter plot for male songs shows a slight decrease in the sentiment scores for songs about joy, a slightly steeper decrease in the sentiment scores for songs about anticipation, trust, and surprise, consistent scores for songs about anger and fear, and an increase in scores for songs about sadness. The scatter plot for female songs shows a decrease in sentiment scores for songs about anger only. Trust and sadness songs have consistent sentiment scores, there is a slight increase for songs about joy, steeper positive inclines for songs about disgust and fear, and a drastic increase in the NLTK scores for songs about surprise.

Generally, the changes in score work to cancel out differences between genders, which is why the violin plots showed that the sentiment score distribution was similar between male and female songs. For example, during our time frame from 1960-2021, the average sentiment scores for anticipation and trust for male songs drop from around 0.6-0.65 to around 0.4, while the same emotions hover consistently around 0.5 for female songs. Sentiment scores for songs about disgust decrease by about 0.3 for males and increase by around 0.35 for females, although the highest average female scores remain lower than the initial male scores. For both genders, songs about joy have a consistently high average sentiment score, staying around 0.75. This information is consistent with previous findings that female songs are overall slightly more positive than male songs, but the overall song scores when collated do not show great differences between the genders, except for surprise.

It is important to note that the trend lines shown are subject to varying margins of error. The trend lines for emotions that are rarer in the dataset, such as anger, disgust, and surprise, are less reliable. This is compounded in the female song graph by the lack of individual female artists in the dataset.
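The effect of sparse data on a trend line can be made concrete with an ordinary least-squares fit and the standard error of its slope. This is a minimal sketch under assumptions: the yearly averages are invented, `trend_line` is a hypothetical helper, and the text does not say how the plotted trend lines were actually fitted.

```python
import math

def trend_line(xs, ys):
    """Least-squares slope, intercept, and standard error of the slope."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, slope_se

# Hypothetical average scores for one emotion, one value per decade.
years = [1960, 1970, 1980, 1990, 2000, 2010, 2020]
scores = [0.62, 0.60, 0.55, 0.50, 0.47, 0.43, 0.40]
slope, intercept, slope_se = trend_line(years, scores)

# Refitting on only three of those points gives a larger standard error here.
sparse_slope, _, sparse_se = trend_line(years[::3], scores[::3])
```

Both fits point downwards, but the slope estimated from fewer points carries more uncertainty, which is the concern raised for the rarer emotions and for the female song graph.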

Scatter Plot: NLTK Scores

To calculate its overall sentiment, each song is assigned a positive and a negative score based on its lyrics by the NLTK programme. The intention was to perform an investigation using clustering on the graphs to see if there were any emergent patterns between positive/negative scores and another variable such as the gender of the artist, the year/decade it was written in, the emotion behind it, etc. However, unlike the overall NLTK scores, the positive and negative scores rarely approach the maximum score of 1. Most songs have a negative score between 0 and 0.3 and a positive score between 0 and 0.4. Therefore, no clear emergent groups were found.
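One possible reason the overall scores behave differently from the positive and negative scores is the way they are computed. The text does not name the analyser, so the following is an assumption: if the scores come from NLTK's VADER analyser, the positive and negative scores are proportions of the text, while the overall (compound) score is a summed word valence squashed into the range -1 to 1 with a normalisation constant of 15. The sketch reproduces only that normalisation step; the input sums are hypothetical.

```python
import math

def normalize(total_valence, alpha=15):
    """Squash an unbounded valence sum into (-1, 1), VADER-style."""
    return total_valence / math.sqrt(total_valence ** 2 + alpha)

# A short text with a small valence sum stays far from the extremes...
short_text = normalize(1.0)     # 1 / sqrt(16) = 0.25

# ...while a long lyric whose word valences add up to a large sum saturates.
long_lyric = normalize(40.0)
very_negative = normalize(-40.0)
```

Under this assumption, full song lyrics contain enough emotive words for the sum to grow large, which would push the overall score towards -1 or 1 even when the positive and negative proportions stay modest.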

Both graphs show the positive and negative scores of each song, so the data points are the same. As the legends show, in one graph the data points are coloured to be differentiated by the gender of the artist, in the other by the decade that the song was in the top 100 song chart.

BAR GRAPHS: Genre

[Figures: bar graphs of song genres for male-led and female-led songs]

There are some significant differences between male and female song genres. Pop and dance pop are by far the biggest genres in popular female music. The most populated male genres are rock, mellow gold, soft rock, and classic rock. While pop and rock music could be used to express a variety of themes, dance pop in particular should be more sentimentally positive given its target purpose. Songs classified as dance pop have quick tempos, are upbeat with catchy hooks, and are generally intended for clubs, parties, etc. (Abbott, 2021). This may explain why female-led songs showed slightly more positivity.

One may expect, given the differences between male and female song genres, that there should be greater sentimental or emotional differences. However, our dataset uses the most popular songs from 1960-2021. Since these are songs that have appealed to millions of people, there are similar trends, topics, and structures in many of the songs, regardless of genre (and gender, in many cases). Another research project, which can be found here, has pointed out that there is an increasing homogenisation of popular music.

Note that some songs are represented more than others. For example, a song classified under four genres will have influenced the graph more than a song with just two genres. The most common groupings are ['contemporary country', 'country', 'country road'] and ['pop', 'dance pop']. 317 of the 6055 songs in our dataset have no genre assigned, so about 5% of songs are excluded from genre analysis.
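The uneven representation follows directly from counting every genre tag. The sketch below contrasts that with a weighted count in which each song contributes a total of 1. The song titles and tags are hypothetical, and the weighted variant is shown only as a possible alternative, not as something the project did.

```python
from collections import Counter

# Hypothetical songs, each tagged with a list of genres (possibly empty).
songs = {
    "Song A": ["pop", "dance pop"],
    "Song B": ["rock", "mellow gold", "soft rock", "classic rock"],
    "Song C": [],
}

# Counting every tag: a four-genre song adds four counts, a two-genre song two.
tag_counts = Counter(genre for genres in songs.values() for genre in genres)

# Alternative: give each song a total weight of 1, split across its genres.
weighted = Counter()
for genres in songs.values():
    for genre in genres:
        weighted[genre] += 1 / len(genres)

# Share of songs excluded for having no genre, using the figures in the text.
excluded_share = 317 / 6055   # roughly 0.052, i.e. about 5%
```

With tag counting, Song B accounts for four of the six counts; with weighting, Songs A and B each contribute exactly one. Songs without genres drop out of both counts.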

Testing Accuracy