2019 CS109a at Harvard University

Team #33 Members: Seeam Noor, Christina Park, Yinyu Ji



Spotify uses a music recommender system to find songs to recommend to listeners, with the goals of automating playlist generation and boosting user engagement by extending listening beyond the current playlist. In this project, we use methods from CS109a to build and evaluate models for automatic playlist generation. Our models draw on two data sources: the Spotify “Million Playlist Dataset,” a set of playlists (collections of songs on Spotify generated by both humans and algorithms), and the Spotify API, which provides audio feature information about songs on the platform. We also seek to overcome the “cold start problem,” that is, making recommendations for newer playlists that contain few songs. To make song recommendations, we use patterns within the Playlist Dataset together with the audio features available through the Spotify API.



The development of the internet radically changed the music industry. Some sources claim that by 2008, 95% of all digital music was downloaded illegally, which motivated the development of a new business model: online music streaming, also known as “music as a service” or the “open music model.” Today, a number of music streaming services exist, including Spotify, Apple Music, Amazon Music, YouTube, Pandora, and SoundCloud. Among these, Spotify pioneered the streaming model as we know it today.

The rise of music streaming services also popularized music recommender systems. These systems use data to recommend similar songs to add to an existing playlist, or even to create a playlist from a single song. On the Spotify platform, playlists can be human-generated or computer-generated and can be built by any user or by Spotify itself. Because services such as Spotify collect large amounts of user data and song data, they can use this data to provide good song recommendations to their listeners. For example, Spotify provides a feature called the Discover Weekly playlist, which is based on a user’s listening habits. Our goal is to develop an algorithm that can be trained on data describing an existing playlist in order to boost user engagement and improve user experience: by making playlist generation easier, by introducing the user to new but similar songs, and by extending a playlist beyond its last song. Optimizing user experience is a critical task for services like Spotify, both to retain their users and to distinguish themselves from competitors.



We used Python to implement our models and worked with the Spotify Million Playlist Dataset and the Spotify API (see EDA for a description of the data).
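As a rough illustration of how audio features could drive recommendations for a playlist with very few songs (the cold start setting), the sketch below ranks candidate tracks by cosine similarity to the average feature vector of the seed tracks. It is a minimal example under stated assumptions, not the team's actual model: the track IDs, the three feature names, the feature values, and the helper functions `centroid`, `cosine_similarity`, and `recommend` are all hypothetical and were made up for illustration.

```python
import math

# Hypothetical audio-feature vectors: (danceability, energy, valence), each in [0, 1].
# These values are invented for illustration; they do not come from the
# Million Playlist Dataset or the Spotify API.
TRACK_FEATURES = {
    "track_a": [0.80, 0.90, 0.70],
    "track_b": [0.75, 0.85, 0.65],
    "track_c": [0.20, 0.10, 0.30],
    "track_d": [0.82, 0.88, 0.72],
}


def centroid(vectors):
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Return the cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(seed_tracks, features, k=2):
    """Rank tracks not already in the playlist by similarity to the playlist profile."""
    profile = centroid([features[t] for t in seed_tracks])
    candidates = [t for t in features if t not in seed_tracks]
    ranked = sorted(
        candidates,
        key=lambda t: cosine_similarity(profile, features[t]),
        reverse=True,
    )
    return ranked[:k]


# A one-song playlist: the extreme cold start case.
recommendations = recommend(["track_a"], TRACK_FEATURES, k=2)
print(recommendations)
```

Because this approach relies only on song-level features, it needs no co-occurrence history for the playlist, which is why content-based methods are a common fallback for cold start. In practice, features on different scales (tempo or loudness, for example) would need to be normalized before similarities are computed.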



Before deciding which models to implement, we performed preliminary research on approaches previously taken to similar problems.

Andric, Andreja, and Goffredo Haus. "Automatic playlist generation based on tracking user’s listening habits." Multimedia Tools and Applications 29, no. 2 (2006): 127-151.

Chen, Ching-Wei, Paul Lamere, Markus Schedl, and Hamed Zamani. "Recsys challenge 2018: Automatic music playlist continuation." In Proceedings of the 12th ACM Conference on Recommender Systems, pp. 527-528. ACM, 2018.

Liu, Ning-Han, Shu-Ju Hsieh, and Cheng-Fa Tsai. "An intelligent music playlist generator based on the time parameter with artificial neural networks." Expert Systems with Applications 37, no. 4 (2010): 2815-2825.

Shakirova, Elena. "Collaborative filtering for music recommender system." In 2017 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). IEEE, 2017.

Ungar, Lyle H., and Dean P. Foster. "Clustering methods for collaborative filtering." In AAAI Workshop on Recommendation Systems, vol. 1. 1998.



Baker, Jack. “Practical Introduction to Recommender Systems.” Last modified November 28, 2019.

Behrens, Nick. “Making Your Own Spotify Discover Weekly Playlist.” Last modified November 8, 2017.

Bernhardsson, Erik. “Collaborative Filtering at Spotify.” Last modified January 25, 2013.

Lee, Sammy. “How We Built a Content-Based Filtering Recommender System For Music with Python.” Last modified May 24, 2019.