Implementation of a Q-learning, ϵ-greedy agent that learns how to play the game with the other agents it is connected to.
Updated Sep 11, 2023 - Python
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
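As a rough illustration of the two algorithms named above, here is a minimal sketch on a Bernoulli bandit. It is not taken from the listed repository: the function names (`epsilon_greedy`, `ucb1`, `run_bandit`), the arm probabilities, and the exploration constant in the UCB1 bonus are all assumptions chosen for the example.

```python
import math
import random
from functools import partial


def epsilon_greedy(counts, values, t, rng, epsilon=0.1):
    """With probability epsilon pick a random arm, otherwise the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])


def ucb1(counts, values, t, rng):
    """Pick the arm with the highest estimate plus exploration bonus (UCB1)."""
    # Pull every arm once first so the bonus term is well defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_bandit(arm_probs, select, steps=5000, seed=0):
    """Simulate a Bernoulli bandit (hypothetical setup) with a selection rule."""
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)    # number of pulls per arm
    values = [0.0] * len(arm_probs)  # running mean reward per arm
    for t in range(1, steps + 1):
        arm = select(counts, values, t, rng)
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.8]  # assumed reward probabilities, unknown to the agent
    print("epsilon-greedy pulls:", run_bandit(probs, partial(epsilon_greedy, epsilon=0.1))[0])
    print("UCB1 pulls:          ", run_bandit(probs, ucb1)[0])
```

Both rules share one signature so they can be swapped into the same simulation loop; in this setup each should concentrate most pulls on the arm with the highest reward probability.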
DQN agent with e-greedy / softmax policy, experience replay and target network.
Multi-armed bandit implementation using the Jester dataset.
Analysis of various multi-armed bandit algorithms over normal and heavy-tailed distributions.
A reinforcement learning project with two environments. The first is the taxi driver problem in a 4x4 space, solved with the simple Q-learning update rule; in this task, we compared the performance of the e-greedy policy and the Boltzmann policy. For the second environment, we chose LunarLander from OpenAI Gym. Fo…
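The Q-learning update rule and the two policies compared in that description can be sketched as follows. This is an illustrative sketch only, not the project's code: the helper names, the list-of-lists Q-table layout, and the default values for `alpha`, `gamma`, `epsilon`, and `temperature` are assumptions.

```python
import math
import random


def epsilon_greedy_action(q_row, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def boltzmann_action(q_row, temperature, rng):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    q_max = max(q_row)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_row]
    return rng.choices(range(len(q_row)), weights=weights)[0]


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = [[0.0, 0.0], [1.0, 2.0]]  # hypothetical 2-state, 2-action table
    action = boltzmann_action(q_table[1], temperature=1.0, rng=rng)
    q_update(q_table, state=0, action=action, reward=1.0, next_state=1, done=False)
    print(q_table)
```

A lower temperature makes the Boltzmann policy behave more greedily, while a higher one pushes it toward uniform random choice; epsilon plays the analogous role for the e-greedy policy.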