Spotify’s Music Recommendations Lambda Architecture

Slide 0

Spotify’s Music Recommendations Lambda Architecture
Esh Kumar (@eshvk), Emily Samuels (@emilymsa)

Slide 1

Overview
- Why Lambda?
- Use Case: Discover Recommendations
- Batch Architecture
- Real-time Architecture
- Challenges
- Future Work

Slide 2

Why Lambda?
- One new user every 3 seconds.
- Contextual, time-based recs are becoming more and more important.

Slide 3

Discover Recs

Slide 4

The Discover Page
Algorithmically generated, fresh recs for users.

Slide 5

The Discover Batch Pipeline

Slide 6

Machine Learning Deep Dive

Slide 7

Word2Vec
Words that appear in similar contexts have similar meanings.

Slide 8

Word2Vec
King - Man + Woman ≈ Queen
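
The vector arithmetic on this slide can be sketched in a few lines. This is a toy illustration only: the three-dimensional vectors below are hand-picked assumptions, not trained Word2Vec embeddings, and the `analogy` helper is a hypothetical name introduced here. It computes vec(king) - vec(man) + vec(woman) and returns the remaining vocabulary word with the highest cosine similarity to the result.

```python
import math

# Hand-picked toy vectors (assumption, for illustration only).
# Dimensions loosely stand for (royalty, maleness, femaleness).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vocab):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = (w for w in vocab if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vocab[w]))

result = analogy("king", "man", "woman", vectors)
print(result)
```

With real embeddings the match is approximate, which is why the nearest remaining word is taken rather than an exact equality test.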

Slide 9

Annoy
Approximate Nearest Neighbors Oh Yeah!
https://github.com/spotify/annoy
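
To give a feel for how a tree-based approximate nearest-neighbor index works, here is a simplified sketch in pure Python. It is not Annoy's implementation or API: it only borrows the general idea of building several trees by repeatedly splitting the points with a hyperplane equidistant from two randomly sampled points, then searching only the leaves a query falls into. All function names, the Euclidean metric, the leaf size, and the random data are assumptions made for this example.

```python
import math
import random

def dist(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def side(v, normal, offset):
    """True if v lies on the positive side of the hyperplane."""
    return sum(n * x for n, x in zip(normal, v)) > offset

def build_tree(ids, points, rng, leaf_size=4):
    """Recursively split ids with a hyperplane between two random points."""
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    p, q = (points[i] for i in rng.sample(ids, 2))
    normal = [x - y for x, y in zip(p, q)]
    midpoint = [(x + y) / 2 for x, y in zip(p, q)]
    offset = sum(n * m for n, m in zip(normal, midpoint))
    left = [i for i in ids if side(points[i], normal, offset)]
    right = [i for i in ids if not side(points[i], normal, offset)]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return ("leaf", ids)
    return ("node", normal, offset,
            build_tree(left, points, rng, leaf_size),
            build_tree(right, points, rng, leaf_size))

def leaf_for(tree, v):
    """Walk down one tree and return the ids in the leaf that v falls into."""
    while tree[0] == "node":
        _, normal, offset, left, right = tree
        tree = left if side(v, normal, offset) else right
    return tree[1]

def query(forest, points, v, k):
    """Collect candidates from one leaf per tree, then rank them exactly."""
    candidates = set()
    for tree in forest:
        candidates.update(leaf_for(tree, v))
    return sorted(candidates, key=lambda i: dist(points[i], v))[:k]

rng = random.Random(0)
points = [[rng.random() for _ in range(8)] for _ in range(200)]
forest = [build_tree(list(range(len(points))), points, rng) for _ in range(10)]
neighbors = query(forest, points, points[17], 5)
print(neighbors)
```

More trees increase the chance that a true neighbor shares a leaf with the query, at the cost of a larger index. The result is approximate because only a small set of candidates is ever compared against the query.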

Slide 10

Batch Architecture
Strengths
- Recs based on complete user history.
Weaknesses
- User vector generation time increases with the number of users.
- Not reflective of the user's current mood.
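
The trade-off above can be illustrated with a small sketch. The slides do not say how user vectors are computed or how batch and real-time results are combined, so everything here is an assumption: a user vector is taken to be the mean of the track vectors the user listened to, and a hypothetical `blend` function mixes a batch vector (full history) with a vector built from recent listens only.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def blend(batch_vec, recent_vec, recent_weight=0.5):
    """Weighted mix of the batch view and the recent view (assumed scheme)."""
    return [(1 - recent_weight) * b + recent_weight * r
            for b, r in zip(batch_vec, recent_vec)]

# Toy 2-d track vectors: first axis ~ one genre, second axis ~ another.
history = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # complete history
recent = [[0.0, 1.0], [0.0, 1.0]]                           # latest listens

batch_user_vec = mean_vector(history)   # reflects long-term taste
recent_user_vec = mean_vector(recent)   # reflects current mood
serving_vec = blend(batch_user_vec, recent_user_vec, recent_weight=0.5)
print(batch_user_vec, recent_user_vec, serving_vec)
```

The batch vector stays balanced between the two genres, while the blended vector leans toward what the user is playing right now.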

Slide 11

Intro to Storm

Slide 12

Storm
Distributed real-time computation system

Slide 13

Storm @ Spotify

Slide 14

Real-time Architecture

Slide 15

Challenges
- Workers die -> cascading JVM process death
- Memcache flakiness
- Cassandra JVM problems due to write/overwrite pattern

Slide 16

Future/Ongoing Work
- Simplify the topology.
- Keep listens for 24 hours.
- Ongoing work on other real-time personalization features.
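
As a rough illustration of what keeping listens for 24 hours could look like, here is a minimal in-memory sliding window. The slides do not describe the storage design, so the `RecentListens` class, its method names, and the assumption that events arrive in non-decreasing timestamp order are all hypothetical.

```python
from collections import deque

WINDOW_SECONDS = 24 * 60 * 60  # 24 hours

class RecentListens:
    """Keeps only the listens that fall inside the trailing time window."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.events = deque()  # (timestamp, track_id), oldest first

    def _expire(self, now):
        # Drop events that are a full window old or older.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()

    def add(self, timestamp, track_id):
        # Assumes timestamps arrive in non-decreasing order.
        self.events.append((timestamp, track_id))
        self._expire(timestamp)

    def tracks(self, now):
        """Track ids listened to within the window ending at `now`."""
        self._expire(now)
        return [track for _, track in self.events]

listens = RecentListens()
listens.add(0, "track_a")
listens.add(3600, "track_b")
listens.add(86400, "track_c")
print(listens.tracks(86400))
```

A production system would more likely rely on expiry in the backing store than on an in-process queue. The sketch only shows the windowing logic.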

Slide 17

Questions
Esh Kumar: eshvk@spotify.com
Emily Samuels: esamuels@spotify.com