Update docs
* Documentation now mentions PettingZoo instead of Gymnasium.
* Updated examples to follow the PettingZoo API (e.g., `obs, _ = env.reset()`).
* Removed obsolete parts of documentation.
* Fixed a few typos.
* Improved appearance of some paragraphs.
rchaput committed Jul 18, 2024
1 parent 16070b2 commit bb52e75
Showing 13 changed files with 121 additions and 163 deletions.
67 changes: 32 additions & 35 deletions docs/source/adding_model.rst
@@ -2,7 +2,7 @@ Adding a new model
==================

One of the principal goals of this simulator is to be able to compare various
-learning algorithms (similarly to Gymnasium's environments).
+learning algorithms (similarly to PettingZoo's environments).
This page describes how to implement another learning algorithm (i.e., *model*).

Models interact with the :py:class:`SmartGrid <smartgrid.environment.SmartGrid>`
@@ -13,7 +13,7 @@ through the *interaction loop*:
from smartgrid import make_basic_smartgrid
env = make_basic_smartgrid()
-obs = env.reset()
+obs, _ = env.reset()
max_step = 10 # Can also be 10_000, ...
for step in range(max_step):
    actions = model.forward(obs) # Add your model here!
@@ -49,38 +49,35 @@ used for different agents.
    self.env = env

def forward(obs):
-    # `obs` is a dict containing:
-    # - `global`: an instance of GlobalObservation;
-    # - `local`: a list of instances of LocalObservations, one per agent.
-    # To reconstruct the observations per agent, a for loop can be used:
-    obs_per_agent = [
-        np.concatenate((
-            obs['local'][i],
-            obs['global'],
-        ))
-        for i in range(self.env.n_agent)
-    ]
-    # Then, each element of `obs_per_agent` can be used for the specific agent.
-    # Here, we simply use random.
-    agent_actions = []
-    for i in range(self.env.n_agent):
-        # We need the number of dimensions of the action. It should be 6, but
-        # it's better to avoid hard-coding it.
-        agent_action_space = self.env.action_space[i]
+    # `obs` is a dict mapping each agent name to its observations.
+    # Agent observations are namedtuples that can be printed for
+    # easier human readability and debugging, or transformed to
+    # numpy arrays (with `np.asarray`) for easier handling by Neural
+    # Networks.
+    # The env expects a dict mapping each agent name to its desired action.
+    # Here, we simply create a random action for each agent, with Numpy.
+    agent_actions = {}
+    for agent_name in self.env.agents:
+        # `obs[agent_name]` are the agent's observations
+        # We need the action's number of dimensions. It should be 6,
+        # but the SmartGrid can be extended and so it's better to avoid
+        # hard-coding it.
+        agent_action_space = self.env.action_space(agent_name)
        agent_action_nb_dimensions = agent_action_space.shape[0]
        action = np.random.random(agent_action_nb_dimensions)
        # `action` is a ndarray of 6 values in [0,1].
-        # Most learning algorithms will handle values in [0,1], but the
-        # SmartGrid env actually expects actions in a different space,
-        # depending on the agent's profile. We can use `interpolate`
-        # to make the transformation.
+        # Most learning algorithms will handle values in [0, 1], but the
+        # SmartGrid env may expect actions in a different space, depending
+        # on the agent's profile. We can use `interpolate` to transform.
        action = interpolate(
            value=action,
            old_bounds=[(0,1)] * agent_action_nb_dimensions,
            new_bounds=list(zip(agent_action_space.low, agent_action_space.high))
        )
-        agent_actions.append(action)
-    # At this point, `agent_actions` is a list of actions (ndarrays), one
+        agent_actions[agent_name] = action
+    # At this point, `agent_actions` is a dict of actions (ndarrays), one
    # element for each agent.
    return agent_actions
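
As the new comments note, each agent's observations can be converted to a numpy
array with ``np.asarray`` before feeding a neural network. A minimal sketch of
such a preprocessing step (this helper is not part of the commit; its name is
hypothetical):

.. code-block:: Python

    import numpy as np

    def observations_to_arrays(env, obs):
        # `obs` maps each agent name to a namedtuple-like observation;
        # `np.asarray` flattens it into a vector usable by a neural network.
        return {
            agent_name: np.asarray(obs[agent_name])
            for agent_name in env.agents
        }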
@@ -104,15 +101,15 @@ but we will illustrate the ``backward`` method anyway:
# (...) code from previous section

def backward(self, new_obs, rewards):
-    # `new_obs` has the same shape as `obs` in `forward`: `global` and `local`.
-    new_obs_per_agent = [
-        np.concatenate((
-            new_obs['local'][i],
-            new_obs['global'],
-        ))
-        for i in range(self.env.n_agent)
-    ]
-    # `rewards` will be usually a list of scalar values, one per agent
+    for agent_name in self.env.agents:
+        # `new_obs` is a dict of observations, one element for each agent.
+        agent_obs = new_obs[agent_name]
+        # `rewards` is also a dict; each element can be:
+        # - a scalar (single value) if the SmartGrid env has a single reward
+        #   function (single-objective);
+        # - a dict mapping reward names to their values, if the env has
+        #   multiple reward functions (multi-objective).
+        agent_reward = rewards[agent_name]

.. warning::
    If you do not use a :py:class:`~smartgrid.wrappers.reward_aggregator.RewardAggregator`
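
Since ``rewards`` may hold either scalars or dicts of named objectives, a
single-objective learner needs a scalarization step. A minimal sketch, assuming
a plain average is an acceptable aggregation (the helper is ours, not part of
the docs):

.. code-block:: Python

    def scalarize(agent_reward):
        # Multi-objective case: average the named reward values.
        if isinstance(agent_reward, dict):
            return sum(agent_reward.values()) / len(agent_reward)
        # Single-objective case: already a scalar.
        return agent_reward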
2 changes: 1 addition & 1 deletion docs/source/argumentation.rst
@@ -21,7 +21,7 @@ You can use argumentation:
Using the existing argumentation reward functions
-------------------------------------------------

-You can import these reward functions from the:py:mod:`smartgrid.reward.argumentation`
+You can import these reward functions from the :py:mod:`smartgrid.rewards.argumentation`
package; accessing this package *requires* the `AJAR`_ library, which you can
install with ``pip install git+https://github.com/ethicsai/ajar.git@v1.0.0``.
Trying to import anything from this package without having `AJAR`_ will raise
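
Code that should degrade gracefully when `AJAR`_ is absent can guard the
import; a hedged sketch (the fallback behaviour is an assumption, not
prescribed by the docs):

.. code-block:: Python

    try:
        from smartgrid.rewards import argumentation
    except ImportError:
        # AJAR is not installed; see the pip command above.
        argumentation = None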
68 changes: 19 additions & 49 deletions docs/source/custom_scenario.rst
@@ -64,45 +64,15 @@ profiles to instantiate :py:class:`~smartgrid.agents.agent.Agent`\ s.
.. note::
    If the package was installed through ``pip`` instead of cloning the repository,
    accessing the files through a relative path will not work. Instead, the files
-    must be accessed from the installed package itself. In this case, the
-    :py:mod:`importlib.resources` module can be used.
-
-    To access files from an installed package:
-
-    .. code-block:: Python
-
-        converter = DataOpenEIConversion()
-        # Before Python 3.9:
-        from importlib_resources import path
-        # `path` returns a context manager that must be used in a `with`.
-        # The first argument is the path of the dataset, using `.` instead of `/`.
-        # The `data/` folder is moved within the `smartgrid` package when installing.
-        # The second argument is the name of the requested file, within the dataset.
-        with path('smartgrid.data.openei', 'profile_office_annually.npz') as f:
-            converter.load(
-                'Office',
-                f,
-                comfort.neutral_comfort_profile
-            )
-        # Since Python3.9:
-        from importlib_resources import files, as_file
-        # `as_file` returns a context manager that must be used in a `with`.
-        # You may use the `smartgrid` module directly as an argument, or `'smartgrid'`
-        # (i.e., a string).
-        with as_file(files(smartgrid).joinpath('data/openei/profile_office_annually.npz')) as f:
-            converter.load(
-                'Office',
-                f,
-                comfort.neutral_comfort_profile
-            )
+    must be accessed from the installed package itself.

To simplify getting the path to data files, the :py:func:`~smartgrid.make_env.find_profile_data`
function may be used, although it has some limitations. In particular, it
only works with a single level of nesting (e.g., ``data/dataset/sub-dataset/file``
-will not work), and it relies on the :py:func:`importlib.resources.path` function,
-which is deprecated since Python3.11 (but still usable, for now).
+will not work). Yet, this function will work whether you have cloned the
+repository (as long as the current working directory is at the project root),
+or installed as a package; it is the recommended way to specify which data file
+to use.

.. code-block:: Python
@@ -313,14 +283,14 @@ can be used instead. To use *multi-objective* learning algorithms, which
receive several rewards each step, simply avoid wrapping the base environment.

When the environment is wrapped, the base environment can be obtained through
-the :py:obj:`~gymnasium.Wrapper.unwrapped` property. Gymnasium
-wrappers should allow access to any (public) attribute automatically:
+the :py:obj:`~gymnasium.Wrapper.unwrapped` property. The wrapper allows access
+to any public attribute of the environment automatically:

.. code-block:: Python
smartgrid = env.unwrapped
-n_agent = env.n_agent # Note that `n_agent` is not defined in the wrapper!
-assert n_agent == smartgrid.n_agent
+num_agents = env.num_agents # Note that `num_agents` is not defined in the wrapper!
+assert num_agents == smartgrid.num_agents
The interaction loop
^^^^^^^^^^^^^^^^^^^^
@@ -333,13 +303,13 @@ can be used:
.. code-block:: Python
done = False
-obs_n = env.reset()
+obs_n, _ = env.reset()
while not done:
    # Implement your decision algorithm here
-    actions = [
-        agent.profile.action_space.sample()
-        for agent in env.agents
-    ]
+    actions = {
+        agent_name: env.action_space(agent_name).sample()
+        for agent_name in env.agents
+    }
    obs_n, rewards_n, terminated_n, truncated_n, info_n = env.step(actions)
    done = all(terminated_n.values()) or all(truncated_n.values())
env.close()
@@ -349,13 +319,13 @@ Otherwise, the env termination must be handled by the interaction loop itself:
.. code-block:: Python
max_step = 50
-obs_n = env.reset()
+obs_n, _ = env.reset()
for _ in range(max_step):
    # Implement your decision algorithm here
-    actions = [
-        agent.profile.action_space.sample()
-        for agent in env.agents
-    ]
+    actions = {
+        agent_name: env.action_space(agent_name).sample()
+        for agent_name in env.agents
+    }
    # Note that we do not need the `terminated` nor `truncated` values here.
    obs_n, rewards_n, _, _, info_n = env.step(actions)
env.close()
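
For completeness, a hedged sketch of loading a data file through
``find_profile_data`` (its exact signature is collapsed in this diff; we assume
it takes the dataset name and file name, as in the removed ``importlib``
example, and that ``DataOpenEIConversion`` and ``comfort`` are imported as
earlier on this page):

.. code-block:: Python

    from smartgrid.make_env import find_profile_data

    converter = DataOpenEIConversion()
    converter.load(
        'Office',
        find_profile_data('openei', 'profile_office_annually.npz'),
        comfort.neutral_comfort_profile
    )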
54 changes: 31 additions & 23 deletions docs/source/extending/observations.rst
@@ -22,7 +22,7 @@ GlobalObservation
-----------------

Creating a completely new way to compute observations is easy: simply define
-a new class (ideally a :py:func:`collections.namedtuple`), and implement its
+a new :py:func:`dataclasses.dataclass`, and implement its
:py:meth:`~.GlobalObservation.compute` class method (not instance method!), as
well as :py:meth:`~.GlobalObservation.reset`.

Expand All @@ -31,11 +31,14 @@ For example, let us create a global observation class that only contains the

.. code-block:: Python
-from collections import namedtuple
+import dataclasses
from smartgrid.observation.base_observation import BaseObservation

-fields = ['hour']
-class OnlyHourGlobalObservation(namedtuple('OnlyHourGlobalObservation', fields)):
+@dataclasses.dataclass(frozen=True)
+class OnlyHourGlobalObservation(BaseObservation):
+    # Dataclasses require defining their attributes, which helps readability.
+    hour: float

    @classmethod
    def compute(cls, world):
@@ -46,38 +49,38 @@ For example, let us create a global observation class that only contains the
    def reset(cls):
        pass
-It is a little bit trickier to retain the existing fields of the global
-observation, because of the way Python handles namedtuples. For another example,
-let us create new *global* observations that include the current day in addition
-to the existing fields.
+The existing global observation fields can also be retained, by extending the
+:py:class:`~smartgrid.observation.global_observation.GlobalObservation` dataclass.
+For another example, let us create new *global* observations that include the
+current day in addition to the existing fields.

.. code-block:: Python
-from smartgrid.observation import GlobalObservation
-from collections import namedtuple
+import dataclasses
+from smartgrid.observation.base_observation import GlobalObservation

-# `GlobalObservation._fields` is a tuple, we cannot concatenate a list to it.
-fields = ('day',) + GlobalObservation._fields
-class GlobalObservationAndDay(namedtuple('GlobalObservationAndDay', fields)):
+@dataclasses.dataclass(frozen=True)
+class GlobalObservationAndDay(GlobalObservation):
+    # Dataclasses require defining their attributes, which helps readability.
+    # These attributes are added to the ones defined in parent classes.
+    day: float
    @classmethod
    def compute(cls, world):
        obs = GlobalObservation.compute(world)
-        # `obs` is an instance (tuple) of GlobalObservation that contains
-        # all the other fields we want.
+        # `obs` is an instance of GlobalObservation containing all other fields.
        # We need to compute `day` now.
        day = world.current_step // 24
        # Now, we need to combine `day` with the other fields. To avoid
        # potential errors in the order of arguments, we will use keyworded
        # arguments (transforming `obs` into a dict and using the `**` operator).
-        existing_fields = obs._asdict()
+        existing_fields = obs.asdict()
        return cls(day=day, **existing_fields)

    @classmethod
    def reset(cls):
-        GlobalObservation.reset()
+        super().reset()
LocalObservation
----------------
@@ -88,12 +91,14 @@ difference between the agents' comfort and the average of others' comfort.

.. code-block:: Python
-from collections import namedtuple
import numpy as np
+import dataclasses
from smartgrid.observation.base_observation import BaseObservation

-fields = ['comfort_diff']
-class ComfortDiffLocalObservation(namedtuple('ComfortDiffLocalObservation', fields)):
+@dataclasses.dataclass(frozen=True)
+class ComfortDiffLocalObservation(BaseObservation):
+    # Dataclasses require defining their attributes, which helps readability.
+    comfort_diff: float

    @classmethod
    def compute(cls, world, agent):
@@ -109,6 +114,9 @@ difference between the agents' comfort and the average of others' comfort.
        # But it is provided, to allow for more complex local observations.
        pass
+Similarly to global observations, existing fields can be retained by inheriting
+from :py:class:`~smartgrid.observation.local_observation.LocalObservation`
+rather than :py:class:`~smartgrid.observation.base_observation.BaseObservation`.
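
A hedged sketch of that pattern, mirroring ``GlobalObservationAndDay`` above
(the comfort computation is collapsed in this diff, so it is left as a
placeholder here; the class name is ours):

.. code-block:: Python

    import dataclasses
    from smartgrid.observation.local_observation import LocalObservation

    @dataclasses.dataclass(frozen=True)
    class RicherLocalObservation(LocalObservation):
        comfort_diff: float

        @classmethod
        def compute(cls, world, agent):
            obs = LocalObservation.compute(world, agent)
            comfort_diff = ...  # placeholder: compute as in the example above
            return cls(comfort_diff=comfort_diff, **obs.asdict())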

ObservationManager
------------------
@@ -133,7 +141,7 @@ For example, assuming that we want to use our ``GlobalObservationAndDay``:
    global_observation=GlobalObservationAndDay
)
-Both *global* and *local* observations can be overriden at the same time, by
+Both *global* and *local* observations can be overridden at the same time, by
specifying both arguments:

.. code-block:: Python
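
The collapsed code block presumably passes both arguments; a hedged
reconstruction (the import path of ``ObservationManager`` is an assumption):

.. code-block:: Python

    from smartgrid.observation import ObservationManager

    manager = ObservationManager(
        local_observation=ComfortDiffLocalObservation,
        global_observation=GlobalObservationAndDay
    )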
2 changes: 1 addition & 1 deletion docs/source/extending/rewards.rst
@@ -57,7 +57,7 @@ to gain money by rewarding the difference with the previous step.
        super().__init__()
        self.previous_payoffs = {}

-    def calculate(self, world, agents):
+    def calculate(self, world, agent):
        # Get (or use default) the payoff at the last step.
        previous_payoff = self.previous_payoffs.get(agent)
        if previous_payoff is None:
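
Assembled from the fragments above, a hedged sketch of the complete reward
(the class name, base class import, and the ``agent.payoff`` attribute are
assumptions, since those parts are collapsed in this diff):

.. code-block:: Python

    from smartgrid.rewards import Reward  # import path assumed

    class MoneyGainReward(Reward):
        def __init__(self):
            super().__init__()
            self.previous_payoffs = {}

        def calculate(self, world, agent):
            # Get (or use default) the payoff at the last step.
            previous_payoff = self.previous_payoffs.get(agent)
            if previous_payoff is None:
                previous_payoff = agent.payoff  # attribute name assumed
            # Reward the gain relative to the previous step.
            reward = agent.payoff - previous_payoff
            self.previous_payoffs[agent] = agent.payoff
            return reward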
5 changes: 3 additions & 2 deletions docs/source/index.rst
@@ -2,8 +2,9 @@ Documentation of |project_name|
===============================

This project aims to provide a (simplified) multi-agent simulator of a
-**Smart Grid**, using the `Gymnasium <https://gymnasium.farama.org/>`_
-(formerly OpenAI Gym) framework.
+**Smart Grid**, using the `PettingZoo <https://pettingzoo.farama.org/>`_
+(a multi-agent equivalent to `Gymnasium <https://gymnasium.farama.org/>`_)
+framework.

This simulator has a strong focus on **ethical considerations**: in this
environment, the learning agents must decide how to consume and distribute
(The remaining 7 changed files are not rendered in this view.)