This project contains the code to reproduce the results reported in the paper "Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning" (https://arxiv.org/abs/2405.13609).
-
The code used to compute the data in Table A1 is in dynamic programming.
-
Code related to "3.1 Classical Control" is in lunar_lander_env.
-
Code related to "3.2 Portfolio Optimization with Sharpe Ratio as Objective" is in portfolio_opt_env.
-
Code related to "3.3 Discrete Optimization Problems" for the peak environment is in peak_env.
-
For the other results related to "3.3 Discrete Optimization Problems", see https://github.com/MaxNaeg/ZXreinforce, https://github.com/remmyzen/rlftqc, and https://github.com/jolle-ag/qdx.