
prashanth-prakash/Multi-arm-bandit-problems-UCB1

Multi-arm-bandit-problems-UCB1

This code applies the upper confidence bound (UCB1) method to a synthetic dataset named "data.mat". The data is organized as a matrix of timesteps × number of ads. The objective is to minimize the regret, that is, the gap between the best possible reward and the reward that the algorithm actually collects. This is one type of deterministic bandit problem with partial feedback: at each timestep only the reward of the chosen ad is observed. Before running the code, download "data.mat" and place it in the same folder as the code.
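For orientation, here is a minimal sketch of the textbook UCB1 rule on a timesteps × ads reward matrix. It is not the repository's own code. The function names `ucb1` and `regret`, the exploration bonus `sqrt(2 ln t / n)`, and the synthetic click probabilities standing in for "data.mat" are all assumptions made for illustration, and regret is measured here against the best single ad in hindsight.

```python
import math
import random


def ucb1(rewards):
    """Run UCB1 over a reward matrix (list of rows, one row per timestep).

    rewards[t][a] is the reward ad `a` would pay at timestep `t`;
    only the chosen ad's entry is used (partial feedback).
    Returns (list of chosen ads, total reward collected).
    """
    n_arms = len(rewards[0])
    counts = [0] * n_arms    # times each ad has been shown
    sums = [0.0] * n_arms    # cumulative observed reward per ad
    chosen, total = [], 0.0

    for t, row in enumerate(rewards):
        if t < n_arms:
            arm = t  # show every ad once to initialise the estimates
        else:
            # empirical mean plus exploration bonus sqrt(2 ln t / n_a),
            # where t is the number of pulls made so far
            scores = [
                sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)
        reward = row[arm]
        counts[arm] += 1
        sums[arm] += reward
        chosen.append(arm)
        total += reward
    return chosen, total


def regret(rewards, total):
    """Gap between the best single ad in hindsight and the collected reward."""
    n_arms = len(rewards[0])
    best = max(sum(row[a] for row in rewards) for a in range(n_arms))
    return best - total


# Synthetic stand-in for data.mat: 2000 timesteps x 5 ads, 0/1 clicks.
# The click probabilities below are hypothetical, not taken from the dataset.
rng = random.Random(0)
click_probs = [0.05, 0.10, 0.20, 0.30, 0.60]
data = [[1.0 if rng.random() < p else 0.0 for p in click_probs]
        for _ in range(2000)]

chosen, total = ucb1(data)
print("total reward:", total, "regret:", regret(data, total))
```

To run this on the real dataset, the matrix would first have to be read from "data.mat" (for example with `scipy.io.loadmat`) and converted to rows; the variable name stored inside the file is not stated here, so it has to be looked up in the file itself.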
