recohut/drl-recsys

Deep Reinforcement Learning in Recommender Systems

Reports

Reinforcement Learning 101 (TensorFlow)

Click here to read the report

Read in Notion instead

Medium post

Tree

.
├── [1.8M]  data
│   └── [1.8M]  bronze
│       ├── [140K]  Gemini_BTCUSD_d.csv
│       ├── [117K]  Gemini_ETHUSD_d.csv
│       ├── [ 89K]  MSFT.csv
│       ├── [520K]  rsc.txt
│       ├── [452K]  tb.txt
│       ├── [ 87K]  TSLA.csv
│       └── [443K]  yelp.txt
├── [9.7M]  docs
│   ├── [ 472]  _config.yml
│   ├── [6.2M]  _images
│   ├── [5.9K]  L268705_Offline_Reinforcement_Learning.ipynb
│   ├── [ 12K]  L732057_Markov_Decision_Process.ipynb
│   ├── [ 38K]  R984600_DRL_in_RecSys.ipynb
│   ├── [ 43K]  T000348_Multi_armed_Bandit_for_Banner_Ad.ipynb
│   ├── [167K]  T035236_MDP_with_Dynamic_Programming_in_PyTorch.ipynb
│   ├── [ 68K]  T046728_n_step_algorithms_and_eligibility_traces.ipynb
│   ├── [263K]  T079222_Solving_Multi_armed_Bandit_Problems.ipynb
│   ├── [ 26K]  T079716_Importance_sampling.ipynb
│   ├── [280K]  T119194_Contextual_RL_Product_Recommender.ipynb
│   ├── [138K]  T159137_MDP_Basics_with_Inventory_Control.ipynb
│   ├── [ 18K]  T163940_FrozenLake_using_Cross_Entropy.ipynb
│   ├── [ 31K]  T219174_Recsim_Catalyst.ipynb
│   ├── [154K]  T239645_Neural_Interactive_Collaborative_Filtering.ipynb
│   ├── [119K]  T256744_Real_Time_Bidding_in_Advertising.ipynb
│   ├── [101K]  T257798_Off_Policy_Learning_in_Two_stage_Recommender_Systems.ipynb
│   ├── [ 83K]  T294930_Cartpole_in_PyTorch.ipynb
│   ├── [ 64K]  T365137_REINFORCE_in_PyTorch.ipynb
│   ├── [ 34K]  T373316_Top_K_Off_Policy_Correction_for_a_REINFORCE_Recommender_System.ipynb
│   ├── [290K]  T441700_REINFORCE.ipynb
│   ├── [9.6K]  T471382_FrozenLake_using_Value_Iteration.ipynb
│   ├── [ 14K]  T532530_Predicting_rewards_with_the_state_value_and_action_value_function.ipynb
│   ├── [ 14K]  T587798_FrozenLake_using_Q_Learning.ipynb
│   ├── [ 31K]  T589782_Code_Driven_Introduction_to_Reinforcement_Learning.ipynb
│   ├── [306K]  T616640_Pydeep_Recsys.ipynb
│   ├── [ 85K]  T635579_Q_Learning_vs_SARSA_and_Q_Learning_extensions.ipynb
│   ├── [ 19K]  T705437_CartPole_using_Cross_Entropy.ipynb
│   ├── [461K]  T726861_Introduction_to_Gym_toolkit.ipynb
│   ├── [ 98K]  T729495_GAN_User_Model_for_RL_based_Recommendation_System.ipynb
│   ├── [237K]  T734685_Deep_Reinforcement_Learning_in_Large_Discrete_Action_Spaces.ipynb
│   ├── [ 49K]  T752494_CartPole_using_REINFORCE_in_PyTorch.ipynb
│   ├── [ 27K]  T759314_Kullback_Leibler_Divergence.ipynb
│   ├── [111K]  T798984_Comparing_Simple_Exploration_Techniques:_ε_Greedy,_Annealing,_and_UCB.ipynb
│   ├── [ 64K]  T859183_Q_Learning_on_Lunar_Lander_and_Frozen_Lake.ipynb
│   ├── [ 82K]  T985223_Batch_Constrained_Deep_Q_Learning.ipynb
│   └── [1.9K]  _toc.yml
├── [ 93K]  images
│   └── [ 89K]  S990517_process_flow.svg
├── [197K]  modules
│   ├── [7.6K]  M053518_Builds_a_Gridworld_v2_Environment.ipynb
│   ├── [ 59K]  M253973_Builds_Cryptocurrency_Trading_RL_Environment.ipynb
│   ├── [ 55K]  M346094_Builds_a_Stock_Trading_RL_Environment.ipynb
│   ├── [ 43K]  M445261_Builds_a_Stochastic_Maze_Environment.ipynb
│   ├── [ 12K]  M620717_RL_Gridworld_Visualization_Functions.ipynb
│   └── [ 17K]  M998022_Builds_a_Gridworld_Environment.ipynb
├── [  65]  README.md
├── [2.8M]  reports
│   └── [2.8M]  S990517
│       ├── [2.7M]  images
│       └── [ 68K]  S990517.html
├── [1.7M]  tools
│   ├── [ 20K]  tradegym.zip
│   └── [1.7M]  webgym.zip
└── [5.0M]  tutorials
    ├── [9.5K]  T043789_Training_RL_Agent_in_CartPole_Environment_with_Actor_Critic_method.ipynb
    ├── [343K]  T098537_Building_an_RL_Agent_to_manage_social_media_accounts_on_the_web.ipynb
    ├── [ 23K]  T122762_Training_RL_Agent_in_Gridworld_with_Temporal_Difference_learning_method.ipynb
    ├── [7.6K]  T195475_Building_a_simple_Gridworld_v2_Environment.ipynb
    ├── [216K]  T219631_Training_Stock_Trading_RL_Agent_using_SAC_and_Deploying_as_a_Service.ipynb
    ├── [ 12K]  T244614_Training_RL_Agent_in_CartPole_Environment_with_DRQN_method.ipynb
    ├── [ 62K]  T303629_Training_RL_Agent_in_Gridworld_with_Monte_Carlo_Prediction_and_Control_method.ipynb
    ├── [ 18K]  T307891_Training_RL_Agent_in_Mountain_Car_Environment_with_A3C_Continuous_method.ipynb
    ├── [235K]  T344654_Building_Stock_Trading_RL_Environment.ipynb
    ├── [202K]  T350011_Building_Bitcoin_and_Ethereum_Cryptocurrency_based_Trading_RL_Environment.ipynb
    ├── [ 11K]  T432381_Training_RL_Agent_in_CartPole_Environment_with_Dueling_DQN_method.ipynb
    ├── [ 39K]  T453493_Training_RL_Agent_in_Gridworld_with_Q_learning_method.ipynb
    ├── [126K]  T462163_Building_an_RL_Agent_to_book_flights_on_the_web.ipynb
    ├── [ 15K]  T473399_Training_RL_Agent_in_CartPole_Environment_with_DQN_method.ipynb
    ├── [ 35K]  T490651_Training_RL_Agent_in_Gridworld_Environment_with_MLP_Model.ipynb
    ├── [ 43K]  T495794_Building_a_Stochastic_Maze_Gridworld_Environment.ipynb
    ├── [123K]  T515244_Building_an_RL_Agent_to_manage_emails_on_the_web.ipynb
    ├── [ 42K]  T515396_Training_RL_Agent_in_Gridworld_with_SARSA_method.ipynb
    ├── [ 17K]  T533231_Building_a_simple_Gridworld_Environment.ipynb
    ├── [ 16K]  T559464_Training_RL_Agent_in_Pendulum_Environment_with_DDPG_method.ipynb
    ├── [3.1M]  t608854
    │   ├── [1.2M]  datasets_states
    │   │   ├── [ 31K]  policyV1.npy
    │   │   ├── [ 928]  rewardsV1.npy
    │   │   └── [1.2M]  statesV1.npy
    │   ├── [8.1K]  eval_baseline.py
    │   ├── [100K]  eval_results
    │   ├── [ 27K]  ExpertRecEval.py
    │   ├── [1.8K]  febr_al_irl.py
    │   ├── [9.3K]  irl_agent.py
    │   ├── [1.0K]  LICENSE
    │   ├── [8.1K]  maxEnt_irl.py
    │   ├── [165K]  notebooks
    │   │   └── [161K]  models_eval.ipynb
    │   ├── [4.3K]  README.md
    │   ├── [1.5M]  recsim
    │   ├── [4.2K]  rl.py
    │   ├── [5.3K]  test1.py
    │   ├── [2.4K]  test_expertEnv.py
    │   ├── [3.6K]  test_policyAgent.py
    │   └── [5.3K]  utils.py
    ├── [9.2K]  T611861_Training_RL_Agent_in_Mountain_Car_Environment_with_Policy_gradient_method.ipynb
    ├── [ 17K]  T626473_Training_RL_Agent_in_Pendulum_Environment_with_PPO_Continuous_method.ipynb
    ├── [117K]  T702798_Building_an_RL_Agent_to_complete_tasks_on_the_web_–_Call_to_Action.ipynb
    ├── [121K]  T769395_Building_an_RL_Agent_to_auto_login_on_the_web.ipynb
    ├── [ 52K]  T778350_Training_an_RL_Agent_for_Trading_Cryptocurrencies_using_SAC_method.ipynb
    ├── [ 51K]  T836251_Training_an_RL_Agent_for_Trading_Stocks_using_SAC_method.ipynb
    └── [ 46K]  T920001_Training_RL_Agent_in_Maze_Gridworld_with_Value_iteration_method.ipynb

  21M used in 25 directories, 218 files