Personalized TV Recommendation with Mixture Probabilistic Matrix Factorization

Huayu Li, Hengshu Zhu #, Yong Ge, Yanjie Fu +, Yuan Ge
Computer Science Department, UNC Charlotte
# Baidu Research-Big Data Lab
+ Rutgers University
Anhui Polytechnic University
5/1/2015

Outline: Introduction, Challenges of TV Recommendation, Data, Methods, Experiments, Conclusion

Introduction: Smart TVs are now very prevalent.

Introduction: However, which TV program should we watch?

Introduction: This is why a TV recommender system is very important!

Watching Groups: A watching group is a set of users in front of a television who have similar preferences for TV programs.
[Figure: a television, its watching groups, and TV programs]

Challenges of TV Recommendation
1. How can the preferences of different watching groups be inferred from such a large number of individual watching records?
2. How should implicit user feedback, e.g. watching time, be handled?

Data
1. Each watching record includes: Television ID, Program ID, Time Information. For example:

TV ID   Program ID             Watching Duration   Start Time                   Total Time
2       ba000000000018817163   740                 2014-03-12T00:00:00.000Z1    800

Dataset size:

# Televisions   # TV Programs   # Watching Records
230,196         4,289           14,159,678

2. Each TV program includes: Title, and two types of genres (first-level genre and sub-genre).

Methods: Basic Framework
Step 1: Discover Watching Groups
Step 2: Learn Preference of Television

Methods: Discovery of Watching Groups
Television clustering (K-means). Feature: watching frequency of TV programs.
Estimating watching groups (Markov clustering). Features: first-level genre, sub-genre, watching time in a day, weekday or weekend.
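
As a rough illustration of the television clustering step, the sketch below runs plain K-means (Lloyd's algorithm) on per-television watching-frequency vectors. The function name `kmeans`, the first-k initialization, and the toy vectors are assumptions made for this example; the slides do not state the actual settings, and the Markov clustering step is not shown.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on equal-length feature vectors.

    Initial centroids are simply the first k points (an assumption chosen
    for reproducibility; random or k-means++ seeding is more common).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical data: each row is one television, each column counts how
# often that television watched one of three TV programs.
frequencies = [[5, 0, 1], [0, 4, 0], [6, 0, 0], [0, 5, 1]]
labels, centroids = kmeans(frequencies, k=2)
print(labels)  # televisions 0 and 2 share one TV group, 1 and 3 the other
```

Televisions that end up in the same cluster would then form one TV group, within which watching groups are estimated.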

Methods: Discovery of Watching Groups
In each TV group, televisions have similar watching groups.
[Figure: TV Group 1 and TV Group 2]

Methods: Discovery of Watching Groups
[Figure: TV groups (TV Group 1, TV Group 2), labeled with the hidden watching group number]

Methods: mpmf
Basic framework
Step 1: Discover Watching Groups
Step 2: Learn Preference of Television, using Mixture Probabilistic Matrix Factorization (mpmf)

Methods: mpmf
Assumption: The preference of a television for TV programs can be decomposed into a mixture of the preferences of its hidden watching groups.
[Figure: Preference of TV as a mixture of the Preferences of Watching Groups]

Methods: mpmf
Given: the learned number of watching groups for each television group.
[Figure: the Television x Program matrix R written as the product of a Television x K matrix T and a K x Program matrix V]

Methods: mpmf
Given: the learned number of watching groups for each television group.
1. Draw each television-specific latent factor from a mixture of Gaussian distributions.
2. The number of mixture components is the number of watching groups.
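
To make the generative step concrete, here is a minimal sketch of drawing one television-specific latent factor from a Gaussian mixture with one component per watching group. The function name `draw_tv_factor`, the shared scalar standard deviation (a spherical covariance), and all numeric values are assumptions for illustration; the slides do not specify the covariance structure or the priors.

```python
import random


def draw_tv_factor(weights, means, std, rng):
    """Draw one television latent vector from a mixture of Gaussians.

    weights[g] is the mixing weight of watching group g and means[g] is
    that group's mean vector. Using a single scalar `std` for every
    dimension is a simplifying assumption of this sketch.
    """
    # Pick a watching group according to the mixing weights.
    group = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Sample each latent dimension around the chosen group's mean.
    factor = [rng.gauss(mu, std) for mu in means[group]]
    return group, factor


# Hypothetical television group with three watching groups and K = 2.
rng = random.Random(0)
weights = [0.5, 0.3, 0.2]
means = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
group, factor = draw_tv_factor(weights, means, std=0.1, rng=rng)
print(group, factor)
```

The number of entries in `weights` plays the role of the mixture number, so a television group with more hidden watching groups gets more components.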

Methods: mpmf
Alternating least squares (ALS) is used for parameter estimation.
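
The slides do not show the mpmf update equations, so the following is only a sketch of the general ALS idea on an ordinary L2-regularized matrix factorization: fix the program factors and solve a small ridge regression per television, then swap roles. It is not the authors' mpmf update, which would also involve the Gaussian mixture prior. The names `solve`, `update_side`, `als`, `objective`, the regularization weight `lam`, and the toy ratings are all assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def update_side(ratings, fixed, n_rows, k, lam, tv_side):
    """Re-solve every vector on one side while the other side stays fixed."""
    updated = []
    for i in range(n_rows):
        a = [[lam if x == y else 0.0 for y in range(k)] for x in range(k)]
        b = [0.0] * k
        for (t, p), r in ratings.items():
            own, other = (t, p) if tv_side else (p, t)
            if own != i:
                continue
            v = fixed[other]
            for x in range(k):
                b[x] += r * v[x]
                for y in range(k):
                    a[x][y] += v[x] * v[y]
        updated.append(solve(a, b))  # (sum v v^T + lam I) u = sum r v
    return updated


def als(ratings, n_tv, n_prog, k=2, lam=0.1, sweeps=20):
    """Alternate between television factors and program factors."""
    # Deterministic, non-zero start for the program factors (arbitrary choice).
    prog = [[0.1 * (i + j + 1) for j in range(k)] for i in range(n_prog)]
    tv = [[0.0] * k for _ in range(n_tv)]
    for _ in range(sweeps):
        tv = update_side(ratings, prog, n_tv, k, lam, tv_side=True)
        prog = update_side(ratings, tv, n_prog, k, lam, tv_side=False)
    return tv, prog


def objective(ratings, tv, prog, lam):
    """Squared error on observed entries plus L2 regularization."""
    err = sum((r - sum(x * y for x, y in zip(tv[t], prog[p]))) ** 2
              for (t, p), r in ratings.items())
    return err + lam * sum(x * x for vec in tv + prog for x in vec)


# Hypothetical ratings: (television index, program index) -> rating in [0, 1].
ratings = {(0, 0): 1.0, (0, 1): 0.2, (1, 0): 0.9, (1, 2): 0.4, (2, 1): 0.8, (2, 2): 0.7}
tv, prog = als(ratings, n_tv=3, n_prog=3)
print(objective(ratings, tv, prog, 0.1))
```

Each half-step minimizes the objective exactly over one side, so the objective never increases from one sweep to the next.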

Experiments
Show an example of clustering.
Evaluate the proposed method's performance: prediction accuracy, ranking accuracy, top-K recommendation.
Compare different data conversion methods.

Experiments: An Example of Clustering
[Figure: the clustering result (left) and the corresponding program names and main genres (right)]

Experiments: Prediction Accuracy
Rating conversion: cumulative ratio of watching time to the total time of a program played.
Baselines: PMF; mpmf with a random number of watching groups; mpmf with the number of watching groups set to 1; mpmf with the number of watching groups set to 3.
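
One way to read the rating conversion is sketched below: sum the watching durations of a television for a program and divide by the program's total time. The function name `cumulative_ratio_ratings`, the tuple layout of a record, the cap at 1.0, and the sample values are assumptions; the slides only name the conversion.

```python
from collections import defaultdict


def cumulative_ratio_ratings(records):
    """Map (tv_id, program_id) to summed watching time / program total time.

    `records` is an iterable of (tv_id, program_id, watch_duration, total_time).
    Capping the ratio at 1.0 is an assumption, not something the slides state.
    """
    watched = defaultdict(float)
    total = {}
    for tv_id, program_id, duration, total_time in records:
        key = (tv_id, program_id)
        watched[key] += duration   # accumulate over repeated viewings
        total[key] = total_time
    return {
        key: min(watched[key] / total[key], 1.0)
        for key in watched
        if total[key] > 0          # skip programs with unknown length
    }


# Hypothetical records: television 2 watches program "p1" twice.
records = [(2, "p1", 200, 800), (2, "p1", 200, 800), (3, "p1", 1000, 800)]
print(cumulative_ratio_ratings(records))  # {(2, 'p1'): 0.5, (3, 'p1'): 1.0}
```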

Experiments: Ranking Accuracy

Experiments: Top-K Recommendation

Experiments: Data Conversion Methods
Methods compared: cumulative ratios, frequency, binary, confidence level.
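
For comparison, two of the listed conversions are simple enough to sketch directly from watching records: frequency (how many times a television watched a program) and binary (whether it watched the program at all). The function names and the record layout are assumptions for this example. The confidence-level conversion is left out because the slides do not define it.

```python
from collections import Counter


def frequency_ratings(records):
    """Count the watching records for each (tv_id, program_id) pair."""
    return Counter((tv_id, program_id) for tv_id, program_id, *_ in records)


def binary_ratings(records):
    """Mark every (tv_id, program_id) pair that appears at least once with 1."""
    return {key: 1 for key in frequency_ratings(records)}


# Hypothetical records: (tv_id, program_id, watch_duration, total_time).
records = [(2, "p1", 200, 800), (2, "p1", 300, 800), (3, "p2", 50, 600)]
print(frequency_ratings(records)[(2, "p1")])  # 2
print(binary_ratings(records))                # {(2, 'p1'): 1, (3, 'p2'): 1}
```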

Conclusion
Designed a two-stage framework.
Employed clustering to discover watching groups.
Developed a probabilistic model, based on Gaussian mixture distributions, to learn the preference of a television for TV programs.
Evaluated the proposed model on real-world data with various metrics.

Thank you! Questions?