Cart Pole example in documentation #43

Merged · 2 commits merged into main from doc-gym-ext on Feb 11, 2024

Conversation

Wotaker (Collaborator) commented Feb 9, 2024:

Editorial changes to the examples section in docs

TODO:

  • Delete recommender system example ✅
  • Add Cart Pole example ✅
  • Add MAB example

Integration with Gymnasium
**************************

`OpenAI Gymnasium <https://gymnasium.farama.org/>`_, also known as Gym, is a popular toolkit for developing and comparing reinforcement learning algorithms by providing a standardized interface for environments. Gym offers a variety of environments, from simple classic control tasks like balancing a pole, which is described below in detail, to complex games like Atari and MuJoCo. It even supports creating custom environments, making it a versatile tool for all things reinforcement learning research.
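
To illustrate the standardized interface mentioned above, here is a minimal, self-contained sketch (assuming only the `gymnasium` package) that creates an environment, resets it, and performs a single step:

```python
import gymnasium as gym

# Every Gymnasium environment exposes the same reset/step interface.
env = gym.make("CartPole-v1")

observation, info = env.reset(seed=42)  # start a new episode
action = env.action_space.sample()      # pick a random valid action
observation, reward, terminated, truncated, info = env.step(action)

env.close()
```
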
Owner:

  1. "also known" -> "formerly known"
  2. I think we should talk about environments directly and avoid the term "RL algorithms" to avoid confusion (like reviewers who wondered what the difference is between Reinforced-lib and gymnasium).

Wotaker (Collaborator, Author) · Feb 11, 2024:

I've changed these two points. Thanks


The Cart Pole environment is a classic control task in which the goal is to balance a pole on a cart. The environment is described by a 4-dimensional state space, which consists of the cart's position, the cart's velocity, the pole's angle, and the pole's angular velocity. The agent can take one of two actions: push the cart to the left or push the cart to the right. The episode ends when the pole tilts beyond a certain angle or the cart moves outside the environment's boundaries. The goal is to keep the pole balanced for as long as possible.
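
A short sketch (again assuming only the `gymnasium` package) confirms the 4-dimensional state space and the two discrete actions described above:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")

# 4-dimensional state: cart position, cart velocity, pole angle, pole angular velocity.
print(env.observation_space.shape)  # (4,)

# Two discrete actions: 0 = push the cart to the left, 1 = push it to the right.
print(env.action_space)             # Discrete(2)
```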

The following example demonstrates how to train a reinforcement learning agent using Reinforced-lib and OpenAI Gym. The agent uses the Deep Q-Network (DQN) algorithm to learn how to balance the pole. The DQN algorithm is implemented in Reinforced-lib and the Cart Pole environment is provided by Gym.
Owner:

In the documentation we have "Deep Q-Learning (DQN)", here it is "Deep Q-Network (DQN)". We have to make up our mind and change the description here or there so as not to be misleading.

Contributor:

I think the easiest solution (i.e., requiring least explaining) would be to only refer to the "Deep Q-Network (DQN) algorithm".

Owner:

I was reading about it today to find out what the correct form is and I found this comparison - I think it best illustrates the difference. Now, however, I don't know whether it is more appropriate to refer to an algorithm or a model in the context of our library?

Wotaker (Collaborator, Author) · Feb 11, 2024:

I do not think it is that important. My solution is to leave it as it is already described in the agents' description, which is Deep Q-learning (DQN). I know it is kind of fishy, but it is all over the internet. I was having a headache about it some time in the past :)

Collaborator (Author):

Moreover, we explicitly explain the abbreviation both here and in the DQN agent class description, so I think it is OK, since we tell the user what we mean by DQN. Let's not be part of the Mathematical Inquisition.

logger_types=[StdoutLogger, TensorboardLogger]
)

We than start the training loop, where we iterate over the number of epochs and for each epoch we run the agent in the environment. We start by resetting the environment and sampling the agent's initial action. Than we run the agent int he environment by updating the environment state with the action and sampling the next action. We continue this loop until the environment reaches a terminal state. We log the length of the epoch and continue to the next epoch.
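
For reference, the training loop described in this paragraph can be sketched as follows. The `RandomAgent` class and its `sample` method are hypothetical placeholders standing in for the trained agent's interface (not the exact Reinforced-lib API); only the `gymnasium` calls are taken from the library itself:

```python
import gymnasium as gym

# Stand-in for the trained agent; 'sample' is a hypothetical placeholder
# interface, implemented here as a random policy so the sketch runs on its own.
class RandomAgent:
    def __init__(self, action_space):
        self.action_space = action_space

    def sample(self, env_state):
        return self.action_space.sample()

env = gym.make("CartPole-v1")
agent = RandomAgent(env.action_space)
num_epochs = 10  # assumed value, for illustration only

for epoch in range(num_epochs):
    # Reset the environment and sample the agent's initial action.
    env_state, info = env.reset()
    action = agent.sample(env_state)
    terminated = truncated = False
    epoch_len = 0

    # Run the agent until the environment reaches a terminal state.
    while not (terminated or truncated):
        env_state, reward, terminated, truncated, info = env.step(action)
        action = agent.sample(env_state)
        epoch_len += 1

    # Log the length of the epoch before moving on to the next one.
    print(f"Epoch {epoch}: length = {epoch_len}")

env.close()
```
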
m-wojnar (Owner) · Feb 9, 2024:

Suggested change: "Than we run the agent int he environment by updating the environment state with the action" → "Then we run the agent in the environment by performing the action".

Contributor:

than != then

"We then start the training loop"

"Then, we run the agent in the environment"
or
"Next, we run the agent in the environment"

Owner:

Yes, of course - I missed this typo and did not correct it in the comment.

Collaborator (Author):

Thanks, I've corrected these spelling mistakes. Next time I will double check with language tools.

Wotaker (Collaborator, Author) commented Feb 11, 2024:

I have also replaced Gym with Gymnasium where possible to further emphasise the transition away from OpenAI Gym.

@Wotaker Wotaker marked this pull request as ready for review February 11, 2024 11:21
@Wotaker Wotaker merged commit e92d0c6 into main Feb 11, 2024
5 checks passed
@Wotaker Wotaker deleted the doc-gym-ext branch February 11, 2024 11:25