Current state of experimental code #912

porta-logica · 2024-01-05T12:08:59Z


As of commit #27b851f there are two versions of train_eval_lib.py in the master branch. In
(1) experimental/examples/ppo/train_eval_lib.py
rb_observer is instantiated from the reverb_utils.ReverbTrajectorySequenceObserver class, whereas in
(2) examples/ppo/schulman17/train_eval_lib.py
the observer class is defined in the file itself. Most other differences are negligible. Version (1) avoids some code duplication and seems preferable. However, some later improvements were made only in (2).

Q: What is the rationale for keeping experimental code in the master branch? What is the current state of the two files above?

Partly related to the question above: both (1) and (2) instantiate
(3) PPOClipAgent,
formally a subclass of PPOAgent. (3) contains no logic of its own. Its sole purpose is to set some arguments in the call to super(...).__init__, arguments that are apparently meant for the only other subclass, PPOKLPenaltyAgent. However, the latter class is not used at all. According to the introductory docstring in (3), the agent aims to implement an OpenAI baseline for PPO, and the work is still in progress. Since (3) has not changed for almost 4 years, its current state is unclear.
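
To make the pattern concrete, here is a minimal sketch of a "thin" subclass that adds no logic and only pins constructor arguments of its parent. The classes BaseAgentSketch and ClipAgentSketch and all parameter names are hypothetical stand-ins chosen for illustration; they are not the library's actual code or signatures.

```python
class BaseAgentSketch:
    """Hypothetical stand-in for a general PPO agent (not the real PPOAgent)."""

    def __init__(self, importance_ratio_clipping=0.0,
                 kl_penalty_coef=0.0, adaptive_kl=False):
        # Illustrative parameter names; the real constructor may differ.
        self.importance_ratio_clipping = importance_ratio_clipping
        self.kl_penalty_coef = kl_penalty_coef
        self.adaptive_kl = adaptive_kl

    def active_loss_terms(self):
        """Report which loss variants the chosen arguments switch on."""
        terms = []
        if self.importance_ratio_clipping > 0:
            terms.append("clip")
        if self.kl_penalty_coef > 0 or self.adaptive_kl:
            terms.append("kl_penalty")
        return terms


class ClipAgentSketch(BaseAgentSketch):
    """Thin subclass: no logic of its own, it only fixes parent arguments."""

    def __init__(self, importance_ratio_clipping=0.2):
        # Explicitly disable the KL-penalty arguments, which would only
        # matter for a (hypothetical) KL-penalty sibling subclass.
        super().__init__(
            importance_ratio_clipping=importance_ratio_clipping,
            kl_penalty_coef=0.0,
            adaptive_kl=False,
        )


agent = ClipAgentSketch()
print(agent.active_loss_terms())
```

In a layout like this, the subclass carries no behavior that the parent could not provide through its own arguments, which is why an unused sibling subclass reads as dead code.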

Please note that the superclass, PPOAgent, has been maintained, and agents/ppo/examples/v2/train_eval_clip_agent.py seems to work fine. The issue, if any, is dead code that may hinder integrating additional agents into what should ideally be a solid framework.
