Fix Gym Env and Implement RL Training #203

tztsai · 2024-10-23T13:40:53Z

Fixes gym.py in psl so that BuildingEnv wraps any BuildingEnvelope system. The original version accepted any instance of ODE_NonAutonomous, but its implementation appears to assume that the system is a BuildingEnvelope.
Implements two DRL algorithms, ppo.py and sac.py (largely adapted from https://github.com/vwxyzjn/cleanrl), in the rl folder. Both run successfully in the BuildingEnv environment.
Implements gym_dpc.py and gym_nssm.py, in which a DPCTrainer or NSSMTrainer accepts a gym environment directly as input and uses a neuromancer.Trainer to train a neural network in that environment.
Drafts hybrid_control.py as a first attempt at the technical proposal in README.md, illustrated by diagram.svg. The program can currently train a DPC policy in a BuildingEnv, insert the policy model as the actor network of an actor-critic PPO agent, and then continue training in an RL workflow. This hybrid approach may improve the DPC policy by learning from long-term cumulative reward, and may accelerate DRL training by starting from a pre-trained DPC policy model.
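
To make the warm-start idea in the hybrid workflow concrete, here is a minimal standard-library sketch; it is not the code in hybrid_control.py. Every name in it (ToyRoomEnv, LinearPolicy, pretrain, ActorCritic) is hypothetical, the one-room thermal model is invented for illustration, and a greedy random search stands in for gradient-based DPC training. The sketch stops at the handoff: it only shows a pretrained policy being copied into the actor slot of an actor-critic agent, and omits the PPO update itself.

```python
import copy
import random


class ToyRoomEnv:
    """Hypothetical stand-in for BuildingEnv: one room temperature, one heater."""

    def __init__(self, target=21.0, horizon=50, seed=0):
        self.target = target
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.temp = self.rng.uniform(15.0, 18.0)
        return self.temp, {}

    def step(self, action):
        u = min(max(action, 0.0), 1.0)  # clip heater power to [0, 1]
        # Invented first-order model: loss toward 10 degC outdoors plus heater gain.
        self.temp += 0.1 * (10.0 - self.temp) + 2.0 * u
        self.t += 1
        reward = -abs(self.temp - self.target)  # penalize tracking error
        truncated = self.t >= self.horizon
        return self.temp, reward, False, truncated, {}


class LinearPolicy:
    """u = w * (target - temp) + b; stands in for the DPC policy network."""

    def __init__(self, w=0.0, b=0.0):
        self.w, self.b = w, b

    def act(self, temp, target=21.0):
        return self.w * (target - temp) + self.b


def episode_return(policy, seed=0):
    """Run one episode in a freshly seeded env and sum the rewards."""
    env = ToyRoomEnv(seed=seed)
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy.act(obs))
        total += reward
        done = terminated or truncated
    return total


def pretrain(iters=200, seed=1):
    """Greedy random search over (w, b); a placeholder for DPC training."""
    rng = random.Random(seed)
    best = LinearPolicy()
    best_ret = episode_return(best)
    for _ in range(iters):
        cand = LinearPolicy(best.w + rng.gauss(0, 0.1), best.b + rng.gauss(0, 0.1))
        ret = episode_return(cand)
        if ret > best_ret:  # keep a candidate only if it improves the return
            best, best_ret = cand, ret
    return best


class ActorCritic:
    """Agent whose actor is initialized from a pretrained policy."""

    def __init__(self, actor):
        self.actor = copy.deepcopy(actor)  # warm start from the pretrained policy
        self.critic_weights = [0.0, 0.0]   # the critic still starts untrained


pretrained = pretrain()
agent = ActorCritic(pretrained)
cold_return = episode_return(LinearPolicy())   # untrained actor, heater always off
warm_return = episode_return(agent.actor)      # actor copied from pretraining
print(f"cold actor: {cold_return:.1f}, warm-started actor: {warm_return:.1f}")
```

The actor is deep-copied so that later RL updates would not modify the pretrained policy, which keeps the original available as a baseline for comparison.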

tztsai added 10 commits October 22, 2024 18:40

update gym env to make it work

ee8284a

update gym env, todo, requirements

19bc8dc

running ppo in BuildingEnv

b70aafe

refactor tqdm usage

d18b6ac

refactor

0fcfb56

remove trash

a1958ea

add import in PPO run() to register the envs

8cd7e1c

update reward func

b5091f7

update reward func

636d68c

Add SAC, project scheme and diagram

79cbc83

drgona self-requested a review October 25, 2024 20:14

tztsai added 7 commits October 26, 2024 21:02

replace activation function

14e40f4

implement NSSM trainer for gym env

21cb903

add init draft of hybrid control

e4fc47e

Implement gym_dpc and train DPC, NSSM, and PPO for hybrid control

5737b41

fix dpc trainer

27f4759

fix gym_dpc testing

9bcd854

modularize PPO and make gym env obs compatible with DPC

994b844

tztsai changed the title ~~Fix Gym Env and Implement PPO RL Training~~ Fix Gym Env and Implement RL Training Oct 30, 2024

tztsai marked this pull request as ready for review October 30, 2024 22:42
