
ABOUT THE COURSE...

CSCI 5622 is a graduate-level computer science course on machine learning. The aim of this project is to identify a topic and explore research questions through the data science process using a variety of ML techniques and models.

UNDERSTANDING AND MODELLING MUSICAL PREFERENCES

Music plays a significant role in our daily lives. Over thousands of years and across every society, music has served as a form of expressive communication, a way to bond with others, a tool to regulate our emotions, and much more (Cross, 2001; Schäfer & Sedlmeier, 2009). Of course, as individuals we are drawn to some types of music more than others. Research suggests that our musical preferences may begin to form as early as adolescence and continue to evolve with us as we go through life (Bonneville-Roussy et al., 2017; Bonneville-Roussy & Eerola, 2018). As such, our musical preferences can become closely tied to our personality (Rentfrow & Gosling, 2003), identity, and sense of self (Hargreaves et al., 2008). We commonly use genre labels ("rock," "pop," "hip-hop," "EDM," etc.) to define our preferences. However, these labels are highly subjective: their definition and usage vary from person to person (Lamont & Greasley, 2008). Furthermore, as genres evolve over time, their definitions drift ("metal" does not mean the same thing today that it did 40 years ago) (Lex et al., 2020). To characterize individual musical preference more accurately, we therefore need methodologies that can capture preference in a genre-free way.

Several alternative models have been proposed to accomplish this. Generally, these models map various musical characteristics (such as instrumentation, emotion, acoustic descriptors, or genre labels) into some multi-dimensional, genre-free space. For example, one solution proposes the three-dimensional space of Valence, Arousal, and Depth (Greenberg et al., 2016). Greenberg and colleagues presented musical excerpts across a range of styles to study participants. Each excerpt was rated on 38 perceived psychological/emotional attributes, such as "intense," "romantic," "depressing," and "danceable." Greenberg and colleagues then analyzed the structure of this high-dimensional data using dimensionality reduction techniques (PCA). This analysis revealed an underlying three-component structure of musical attributes, which they labeled "valence," "arousal," and "depth." Valence generally refers to how pleasant (positive valence) or unpleasant (negative valence) something is. Arousal describes how stimulating (high/positive arousal) or unstimulating (low/negative arousal) something is. Depth captures how complex/sophisticated (high depth) or simplistic/shallow (low depth) something is.
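As a concrete illustration of that kind of analysis (not Greenberg et al.'s actual code), the sketch below runs PCA on a stand-in ratings matrix, with random numbers in place of real participant data, to extract a three-component structure:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data: 100 musical excerpts rated on 38 attributes
# (e.g. "intense," "romantic," ...). Real studies average ratings
# across many participants; random values are used here purely
# for illustration.
rng = np.random.default_rng(0)
ratings = rng.random((100, 38))

# Standardize, then extract three principal components, analogous
# to the valence/arousal/depth structure reported in the paper.
pca = PCA(n_components=3)
components = pca.fit_transform(StandardScaler().fit_transform(ratings))

print(components.shape)               # (100, 3): each excerpt in 3D space
print(pca.explained_variance_ratio_)  # variance captured per component
```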


Another proposal offers a five-dimensional space of Mellow, Unpretentious, Sophisticated, Intense, and Contemporary (MUSIC) (Rentfrow et al., 2011). Using a similar methodology to Greenberg et al. (2016), Rentfrow and colleagues also had study participants listen to a variety of musical excerpts. These pieces were selected to represent a broad range of possible musical preferences. However, unlike Greenberg and colleagues' study, participants in this study rated only their individual preference for each piece on a scale from 1-9. Rentfrow and colleagues then analyzed participants' responses (using similar dimensionality reduction techniques to Greenberg et al. 2016) to reveal the underlying structure of individual differences in preference. Their analysis led them to the five-factor "MUSIC" model. The Mellow (M) factor describes relaxing or low-arousal music. The Unpretentious (U) factor describes campestral or various "country" styles of music. The Sophisticated (S) factor describes complex or deep music. The Intense (I) factor describes energetic, high-arousal music. Lastly, the Contemporary (C) factor describes music with strong rhythmic qualities.
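For parity, here is a comparable sketch in the spirit of the MUSIC analysis, using scikit-learn's FactorAnalysis. The random participant-by-excerpt preference matrix and the choice of exactly five factors are illustrative assumptions, not Rentfrow and colleagues' published procedure:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Stand-in data: 200 participants' preference ratings (1-9) for 50
# musical excerpts; random values purely for illustration.
rng = np.random.default_rng(1)
prefs = rng.integers(1, 10, size=(200, 50)).astype(float)

# Extract five latent factors, analogous to the M-U-S-I-C structure.
fa = FactorAnalysis(n_components=5, random_state=0)
fa.fit(prefs)

# fa.components_ has shape (5, 50): each excerpt's loading on each of
# the five factors.
print(fa.components_.shape)
```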


With both of the proposed models, the result is that attributes commonly associated with music (ex. "relaxing," "electric"...) as well as specific instances of music (ex. songs, artists, or genres) can be defined in terms of the proposed space. Figure 1 below illustrates some examples of how the models may be used. The overall aim of this project is to investigate how such alternative models may be applied or extended to allow for the representation and estimation of individual musical preferences. These models were developed through behavioral human-subjects research: music perception and preference data were collected from participants via questionnaires, and the underlying structure/model was determined through dimensionality reduction techniques. In contrast, this project utilizes user data publicly available online and attempts to transform those data into the given model space. Doing so affords exploration of the research questions listed below.



Figure 1: Illustration of two proposed alternative models of musical preference. The 3D Valence-Arousal-Depth model (Greenberg et al., 2016) is represented using three spatial dimensions (left). The 5D MUSIC model (Rentfrow et al., 2011) is represented using three spatial dimensions plus size and color (right). Some musical attributes are represented in both spaces as examples.

In previous related work, Mehdizadeh & Leslie (2021) conducted a secondary analysis with the aim of developing ways to estimate preference that would promote reuse and further exploration of music cognition datasets. For this work, they used a small set of publicly available data from a music listening experiment (Hanke et al., 2015). These data included general music preference information from each participant, but were missing direct ratings of how much participants liked the pieces they heard during the experiment. In our everyday lives, this would be akin to having a sense for someone's music taste and then using that to estimate how much they would like country or rap music, for instance. We do this fairly often, when recommending (or not recommending) music to friends, for example. But how might this estimation process be operationalized and quantified? If this missing information could be estimated, many more research questions could be asked of this dataset.

First, Mehdizadeh & Leslie (2021) transformed each participant's data into the MUSIC five-factor model space. Next, each participant in the dataset was represented as the five-dimensional centroid point of all of their listed preferences (see Figure 2 below for an illustrated example). Lastly, they calculated degree-of-preference estimates for the genres of music participants heard during the original experiment. Degree of preference was defined, for each participant and each target genre, as the five-dimensional Euclidean distance between the participant's centroid and the genre's representation in the space. They refer to this metric as the "preference distance": a larger distance indicates that the participant is less likely to prefer that genre, and a smaller distance indicates that the participant is more likely to prefer it. This project aims to add to this body of work by addressing the two main limitations of Mehdizadeh & Leslie's (2021) analysis: a small sample size, and no ground-truth preference ratings by which to validate the estimation they proposed.
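As a concrete illustration (not the authors' actual code), the sketch below computes a centroid and a preference distance, with made-up five-dimensional coordinates standing in for the genres' true positions in the MUSIC space:

```python
import numpy as np

# Made-up 5D MUSIC coordinates for the genres a participant listed as
# preferences (placeholders, not published positions).
music_space = {
    "goth":       np.array([0.2, 0.1, 0.6, 0.8, 0.3]),
    "metal":      np.array([0.1, 0.2, 0.5, 0.9, 0.2]),
    "industrial": np.array([0.1, 0.1, 0.4, 0.9, 0.4]),
    "ska":        np.array([0.5, 0.6, 0.3, 0.6, 0.5]),
}

# Represent the participant as the centroid of their preferences.
centroid = np.mean(list(music_space.values()), axis=0)

# Preference distance: 5D Euclidean distance from the centroid to a
# target genre; smaller distance = more likely to be preferred.
target = np.array([0.7, 0.8, 0.2, 0.3, 0.6])   # e.g. a "country" point
preference_distance = np.linalg.norm(centroid - target)
print(preference_distance)
```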

Q & A

1) Can we transform the information provided in publicly available listening data into the space defined by these alternative preference models?

>> Some of the attributes provided in the data were analogous to the dimensions of the alternative preference models (ex. "energy," "valence," "acoustic"...). However, others were more difficult to relate to the models. Given more time, it would likely be possible to write functions that more directly translate songs in the listening data into the alternative model spaces using information from the literature (one hypothetical mapping is sketched below). However, due to time constraints, the attributes given in the data are used as is.
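To make the idea concrete, here is one hypothetical shape such a translation function could take; the feature-to-dimension pairings (and especially the crude "depth" proxy) are illustrative assumptions, not a validated mapping from the literature:

```python
import numpy as np

def to_vad(track):
    """Hypothetical translation of Spotify-style audio features into an
    approximate Valence-Arousal-Depth point. Illustrative only."""
    return np.array([
        track["valence"],   # valence: positivity of the track
        track["energy"],    # arousal: intensity/activity of the track
        (track["acousticness"] + track["instrumentalness"]) / 2,  # crude depth proxy
    ])

print(to_vad({"valence": 0.6, "energy": 0.8,
              "acousticness": 0.1, "instrumentalness": 0.0}))
```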

2) Given a representation of a user's musical preferences, can we classify/predict which songs they preferred (vs. others)?

>> I was able to classify user-preferred vs. not-preferred songs (as labeled by their relative playcounts) using various methods (see the decision tree, naive Bayes, SVM, and NN sections), although overall classification accuracy was fairly low. SVMs achieved the highest classification accuracy, at 61%.
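A minimal sketch of this classification setup on stand-in data; the median-playcount labeling rule and column names are assumptions for illustration, and the random values will score near chance rather than the 61% reported on the real dataset:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: six Spotify-style audio attributes per song, plus a
# user id and a playcount. Values are random and purely illustrative.
rng = np.random.default_rng(2)
features = ["acoustic", "danceable", "energy",
            "instrumentalness", "speechiness", "valence"]
df = pd.DataFrame(rng.random((500, 6)), columns=features)
df["user"] = rng.integers(0, 20, size=500)
df["playcount"] = rng.integers(1, 200, size=500)

# Assumed labeling rule for illustration: a song is "preferred" if its
# playcount exceeds that user's median playcount.
median = df.groupby("user")["playcount"].transform("median")
df["preferred"] = (df["playcount"] > median).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["preferred"], test_size=0.25, random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # near chance on random data
```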

3) Using this representation, can we further estimate the degree of preference (how much they like it) for a given item (song/artist/genre)?

>> Due to the relatively poor binary classification accuracies for preferred vs. not-preferred songs, I did not attempt any regression models. However, future work could explore linear regression or mixed modeling to predict playcounts from preference distance (as well as other data, like popularity).
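A sketch of what that future regression might look like. The synthetic data below are constructed so that playcount falls with preference distance and rises with popularity; nothing here reflects results from the real dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: per-song preference distance and popularity, with
# playcount constructed to depend on both (plus noise).
rng = np.random.default_rng(3)
distance = rng.random(300)
popularity = rng.random(300)
playcount = 50 - 30 * distance + 20 * popularity + rng.normal(0, 5, 300)

# Model playcount from preference distance and popularity.
X = np.column_stack([distance, popularity])
reg = LinearRegression().fit(X, playcount)
print(reg.coef_)   # expect roughly (-30, +20) on this synthetic data
```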

4) Which of the two genre-free multi-dimensional representations of preference yields the most accurate predictions/estimations? The three-dimensional VAL-ARO-DEP model, or the five-dimensional MUSIC model?

>> Although I wasn't able to transform the given data into these models exactly (due to time constraints), I did experiment with estimating preference using three vs. six attributes from the dataset. The six-attribute metric appeared to perform better overall (see SVM section).

5) What is the distribution of breadth of preferences in the dataset? Do the majority of recorded users have a wide or narrow stylistic range of musical preferences, or somewhere in between?

>> In the dataset overall there seems to be a wide distribution of musical styles, with more popular, "umbrella" styles appearing more frequently (ex. rock, pop, metal, dance...) (see "clustering," Figure 27). The average user in this dataset has just one song per unique artist, which suggests a breadth of artists (see "data exploration," Figures 8 & 9). Within-user variances of the Spotify attributes suggest that there is also a fairly broad distribution of styles represented in each user's data (see dataset creation code).
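A minimal sketch of that within-user variance check, on made-up attribute values (the column names follow the Spotify-style features used elsewhere in this project):

```python
import numpy as np
import pandas as pd

# Made-up data: three audio attributes per song plus a user id.
rng = np.random.default_rng(5)
df = pd.DataFrame(rng.random((300, 3)),
                  columns=["energy", "valence", "acoustic"])
df["user"] = rng.integers(0, 10, size=300)

# Higher per-user variance across attributes suggests a broader
# stylistic range within that user's listening data.
print(df.groupby("user")[["energy", "valence", "acoustic"]].var())
```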

6) Within either of the proposed models, is it more accurate to represent the user as a convex hull of their data points in the space, or as a single point (the centroid of the convex hull)? See Figure 2 below.

>> Due to time constraints, it was not feasible to attempt the more complex convex hull modelling. Centroid modelling was used throughout this project, and the convex hull model is left for future work.
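For future reference, a sketch of how the two representations might be computed with SciPy, using made-up 3D Valence-Arousal-Depth coordinates for the four genres in Figure 2; the volume-based inside-the-hull test is one possible way a hull preference check could work, not something implemented in this project:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Made-up VAD coordinates for an individual's four preferred genres.
genres = np.array([
    [-0.5, 0.6, 0.7],   # goth
    [-0.3, 0.9, 0.5],   # metal
    [-0.4, 0.8, 0.4],   # industrial
    [ 0.6, 0.7, 0.1],   # ska
])

centroid = genres.mean(axis=0)   # single-point representation
hull = ConvexHull(genres)        # region representation

print(centroid)
print(hull.volume)               # volume of the preference region

# A candidate point lies inside the hull if adding it leaves the hull
# volume unchanged.
candidate = np.array([0.0, 0.75, 0.4])
grown = ConvexHull(np.vstack([genres, candidate]))
print(np.isclose(grown.volume, hull.volume))
```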

7) How many data points (songs/artists/genres...) are needed to best represent a user?

>> This varied considerably from user to user. Because I labeled preferred vs. not-preferred songs based on within-user playcount, the range of playcounts a user had mattered more than the raw number of data points. A wider range of playcounts provided more confidence that higher-playcount songs were in fact more preferred than lower-playcount songs.

8) What level of data granularity (songs vs. artists vs. genres...) best represents a user?

>> I used songs as the level of data granularity throughout this project. Since users in this dataset tended not to have multiple listed songs per artist, it didn't make sense to use artist-level data to capture user preferences. Genres could be an interesting level to explore, although the genre information provided by Spotify would need to be simplified in some systematic way in order to do so. Then playcounts could be aggregated by genre within a user, as sketched below.
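A sketch of that genre-level aggregation on made-up data (a real implementation would first need the systematic simplification of Spotify's genre labels mentioned above):

```python
import pandas as pd

# Made-up listening data: per-user playcounts at the song level,
# tagged with already-simplified genre labels.
songs = pd.DataFrame({
    "user":      [1, 1, 1, 2, 2],
    "genre":     ["metal", "metal", "ska", "pop", "pop"],
    "playcount": [40, 25, 5, 60, 30],
})

# Aggregate playcounts by genre within each user.
genre_counts = (songs.groupby(["user", "genre"])["playcount"]
                     .sum()
                     .reset_index())
print(genre_counts)
```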

9) Which features/attributes are most important in answering the outlined questions? The mappings to the dimensions of these alternative models? Or are genre labels/other generic metadata actually sufficient?

>> Artist and track popularity metrics were found to improve preference classifications. Track duration, key, and mode were not found to be useful features for the model. The music style features "acoustic," "danceable," "energy," "instrumentalness," "speechiness," and "valence" were also found to improve classification. Some results also suggested that emphasizing "acoustic," "energy," and "valence" features provided additional classification improvement (see SVM section).

10) Given how readily available music ("old" or "new") is nowadays via streaming platforms, is "year of release" a relevant/useful attribute to consider in addition to the other descriptors when representing musical preferences?

>> I tested the usefulness of the "year of release" attribute by building models with and without this feature and comparing their classification accuracies. Year of release did not increase the classification accuracy of any of the models, suggesting that this attribute does not play a significant role in users' preferences within this data (see decision trees and naive Bayes sections).
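A sketch of that ablation on stand-in data: score the same classifier with and without a "year of release" column and compare. The random values here carry no signal; on the real data the comparison used the project's decision tree and naive Bayes models:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Stand-in data: six audio attributes, a year-of-release column, and a
# binary preferred/not-preferred label. Values are random.
rng = np.random.default_rng(4)
X = rng.random((400, 6))
year = rng.integers(1960, 2023, size=(400, 1)).astype(float)
y = rng.integers(0, 2, size=400)

# Cross-validated accuracy with vs. without the year feature.
acc_with = cross_val_score(GaussianNB(), np.hstack([X, year]), y, cv=5).mean()
acc_without = cross_val_score(GaussianNB(), X, y, cv=5).mean()
print(acc_with, acc_without)
```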



Figure 2: How an individual who prefers Goth, Metal, Industrial, and Ska music might be represented in the 3D Valence-Arousal-Depth model (the same could be accomplished using the 5D MUSIC model). First, each of the individual's preferred genres is represented as a point in the model space (black). From there, the individual can be represented as the centroid of these genres (blue), or as the convex hull volume that encompasses all of the preferred genres (yellow triangular pyramid).

BIBLIOGRAPHY

Bonneville-Roussy, A., Stillwell, D., Kosinski, M., & Rust, J. (2017). Age trends in musical preferences in adulthood: 1. Conceptualization and empirical investigation. Musicae Scientiae, 21(4), 369–389. https://doi.org/10.1177/1029864917691571

Bonneville-Roussy, A., & Eerola, T. (2018). Age trends in musical preferences in adulthood: 3. Perceived musical attributes as intrinsic determinants of preferences. Musicae Scientiae, 22(3), 394–414. https://doi.org/10.1177/1029864917718606

Cross, I. (2001). Music, mind and evolution. Psychology of Music, 29(1), 95–102. https://doi.org/10.1177/0305735601291007

Greenberg, D. M., Kosinski, M., Stillwell, D. J., Monteiro, B. L., Levitin, D. J., & Rentfrow, P. J. (2016). The song is you: Preferences for musical attribute dimensions reflect personality. Social Psychological and Personality Science, 7(6), 597–605. https://doi.org/10.1177/1948550616641473

Hanke, M., Dinga, R., Hausler, C., Guntupalli, J. S., Casey, M., Kaule, F. R., & Stadler, J. (2015). High-resolution 7-Tesla fMRI data on the perception of musical genres – an extension to the studyforrest dataset. F1000Research, 4(174). https://doi.org/10.12688/f1000research.6679.1

Hargreaves, D. J., MacDonald, R., & Miell, D. (2008). Musical identities. In S. Hallam, I. Cross, & M. Thaut (Eds.), The Oxford Handbook of Music Psychology (1st ed., pp. 759–774). Oxford University Press.

Lamont, A., & Greasley, A. (2008). Musical preferences. In S. Hallam, I. Cross, & M. Thaut (Eds.), The Oxford Handbook of Music Psychology (1st ed., pp. 263–281). Oxford University Press.

Lex, E., Kowald, D., & Schedl, M. (2020). Modeling popularity and temporal drift of music genre preferences. Transactions of the International Society for Music Information Retrieval, 3, 17–30. https://doi.org/10.5334/tismir.39

Mehdizadeh, S. K., & Leslie, G. (2021). Novel methodologies for secondary analyses of physiological and musical preference data. International Conference on Music Perception and Cognition (ICMPC). https://youtu.be/3p1wVwXEl0Y

Rentfrow, P. J., & Gosling, S. D. (2003). The do re mi's of everyday life: The structure and personality correlates of music preferences. Journal of Personality and Social Psychology, 84(6), 1236–1256. https://doi.org/10.1037/0022-3514.84.6.1236

Rentfrow, P. J., Goldberg, L. R., & Levitin, D. J. (2011). The structure of musical preferences: A five-factor model. Journal of Personality and Social Psychology, 100(6), 1139–1157. https://doi.org/10.1037/a0022406

Schäfer, T., & Sedlmeier, P. (2009). From the functions of music to music preference. Psychology of Music, 37(3), 279–300. https://doi.org/10.1177/0305735608097247
