git clone https://github.com/kkugosu/RL_BASIC.git
cd RL_BASIC
python executable.py
You can choose the following options:
- environments
- policy
- hidden layer size
- batch size
- memory capacity
- memory reset time
- train time per memory
- learning rate
- eligibility trace
- done penalty
- whether to load a previous model (0 = no, 1 = yes)
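
The options above could be grouped into a single settings object, as in the minimal sketch below. This is not taken from executable.py: the class name `TrainingOptions`, every field name, and every default value are hypothetical placeholders chosen for illustration, and the checks in `validate` are assumptions about what sensible values look like.

```python
from dataclasses import dataclass


@dataclass
class TrainingOptions:
    """Hypothetical container for the options listed above.

    Field names and defaults are illustrative assumptions, not the
    names or values used by executable.py.
    """
    environment: str = "cartpole"
    policy: str = "example_policy"   # placeholder, not a real policy name
    hidden_layer_size: int = 32
    batch_size: int = 64
    memory_capacity: int = 10000
    memory_reset_time: int = 100
    train_time_per_memory: int = 10
    learning_rate: float = 1e-3
    eligibility_trace: float = 0.0
    done_penalty: float = 0.0
    load_previous_model: int = 0     # 0 = no, 1 = yes

    def validate(self) -> None:
        """Raise ValueError on values that cannot work together."""
        if self.load_previous_model not in (0, 1):
            raise ValueError("load_previous_model must be 0 or 1")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.batch_size > self.memory_capacity:
            raise ValueError("batch_size cannot exceed memory_capacity")


if __name__ == "__main__":
    options = TrainingOptions(environment="hopper", load_previous_model=1)
    options.validate()
    print(options)
```

Validating the values once, before training starts, reports a bad combination (for example a batch size larger than the memory capacity) immediately rather than partway through a run.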
Results (reward > 500 reached):
- cartpole
- hopper
Requirements:
- Gym
- MuJoCo
- Python >= 3.8
- PyTorch >= 1.12.0
- NumPy
References:
- Deep deterministic policy gradient
- papers
- CS285
- Hui reinforcement learning blog