A comparison of Google SlateQ algorithm with traditional Reinforcement Learning algorithms

NancoChow/SlateQ

Reinforcement Learning for Recommender Systems

Summary

Most practical recommender systems focus on estimating immediate user engagement without considering the long-term effects of recommendations on user behaviour. Reinforcement learning (RL) methods offer the potential to optimize recommendations for long-term user engagement. However, since users are often presented with slates of multiple items—which may have interacting effects on user choice—methods are required to deal with the combinatorics of the RL action space.
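To give a feel for the combinatorics mentioned above, here is a minimal sketch that counts how many distinct ordered slates exist for a given number of candidate items. The function name and the example sizes are illustrative assumptions, not values taken from this repo's notebooks.

```python
from math import factorial


def num_ordered_slates(num_candidates, slate_size):
    """Count ordered slates of `slate_size` distinct items drawn from
    `num_candidates` candidates (hypothetical helper for illustration)."""
    if slate_size > num_candidates:
        return 0
    # Number of k-permutations of n items: n! / (n - k)!
    return factorial(num_candidates) // factorial(num_candidates - slate_size)


if __name__ == "__main__":
    # Illustrative sizes only; a flat Q-learner would need one action per slate,
    # while an item-wise method needs only one value per candidate item.
    for n, k in [(10, 2), (100, 3), (1000, 5)]:
        print(f"{n} candidates, slate size {k}: "
              f"{num_ordered_slates(n, k)} slates vs {n} items")
```

The slate count grows roughly as the number of candidates raised to the slate size, which is why treating each slate as a single atomic action quickly becomes impractical.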

Google’s SlateQ algorithm addresses this challenge by decomposing the long-term value (LTV) of a slate into a tractable function of its component item-wise LTVs. In this repo, we compare the efficiency of SlateQ with that of other RL methods, such as Q-learning, that do not decompose the LTV of a slate into item-wise LTVs.
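As a rough illustration of the decomposition, the sketch below computes a slate's value as the choice-probability-weighted sum of item-wise values, Q(s, A) = Σ_{i∈A} P(i | s, A) · Q̄(s, i). The choice model used here (each item's probability proportional to a positive score, with a fixed "no click" score) is an assumption made for the example, as are all function names and numbers; the notebooks in this repo may use a different model. The brute-force search over slates is only there to keep the example short.

```python
from itertools import combinations


def choice_probabilities(scores, no_click_score=1.0):
    """Assumed choice model: P(i | slate) is proportional to the item's score,
    with `no_click_score` reserved for the user choosing nothing."""
    total = no_click_score + sum(scores)
    return [s / total for s in scores]


def slate_q_value(scores, item_q_values, no_click_score=1.0):
    """Slate value as the sum of P(i | slate) * item-wise LTV over slate items."""
    probs = choice_probabilities(scores, no_click_score)
    return sum(p * q for p, q in zip(probs, item_q_values))


def best_slate(all_scores, all_item_q, slate_size, no_click_score=1.0):
    """Brute-force search over unordered slates (fine for tiny examples only)."""
    def value(slate):
        return slate_q_value([all_scores[i] for i in slate],
                             [all_item_q[i] for i in slate],
                             no_click_score)
    return max(combinations(range(len(all_scores)), slate_size), key=value)


if __name__ == "__main__":
    scores = [1.0, 3.0, 2.0]   # hypothetical user-item affinity scores
    item_q = [10.0, 5.0, 8.0]  # hypothetical item-wise LTV estimates
    slate = best_slate(scores, item_q, slate_size=2)
    print("best slate:", slate)
    print("slate value:", slate_q_value([scores[i] for i in slate],
                                        [item_q[i] for i in slate]))
```

Because only item-wise values are learned, the number of quantities to estimate scales with the number of candidate items rather than with the number of possible slates.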

Results

Empirically, we've shown that the SlateQ algorithm outperforms traditional Q-learning approaches across multiple metrics in our simulated environment.

[Figure: results]

Environment

Here, we use the interest evolution environment from the RecSim library (GitHub repo) to train RL agents.

Important Links

  1. Problem Formulation Document
  2. Exploratory Notebook on the interest evolution environment
  3. Notebook comparing RL techniques
  4. Presentation

Contributors

Collin Prather and Shishir Kumar are Master's students in Data Science at the University of San Francisco.

Thanks to Prof. Brian Spiering for introducing us to the wonderful world of RL.


As governed by the recsim library, this repo uses Python 3.6.
