
v2.0.0 PettingZoo multi-agent API

@rchaput rchaput released this 18 Jul 17:19
e15067f

What's Changed

🆕 In short: starting from v2.0.0 onwards, EthicalSmartGrid uses the PettingZoo API, the multi-agent counterpart of the previously used Gymnasium API. This change ensures that most multi-agent learning algorithms will work with our environment.

⚠️ Breaking changes

Switching to the PettingZoo API breaks the previous API in several ways. Please read the following items to adapt your code. (#10)

  • env.reset() now returns both observations and infos (instead of only observations).

New code should use:

obs, infos = env.reset()
# Or, if infos are not used:
obs, _ = env.reset()

instead of:

obs = env.reset()
  • actions must be a dictionary mapping each agent name to its desired action when calling env.step(actions).

New code should use the following structure:

actions = {
  agent_name: (...) # Your action here
  for agent_name in env.agents
}
env.step(actions)

instead of:

actions = [
  (...) # Your action here
  for agent_idx in range(env.n_agent)
]
env.step(actions)
  • obs is now a dictionary, mapping each agent name to its observations (both global and local).

For example, assuming agents ['Household1', 'Household2']:

obs, _ = env.reset()
print(obs)
> {
>  'Household1': Observation(personal_storage=0.052, comfort=0.021, payoff=0.5, hour=0.0, available_energy=0.25, equity=0.84, energy_loss=0.0, autonomy=1.0, exclusion=0.23, well_being=0.021, over_consumption=0.0), 
>  'Household2': Observation(personal_storage=0.052, comfort=0.021, payoff=0.5, hour=0.0, available_energy=0.25, equity=0.84, energy_loss=0.0, autonomy=1.0, exclusion=0.23, well_being=0.021, over_consumption=0.0)
> }

If the learning algorithm requires accessing the global and local observations differently: global observations can still be obtained with obs[agent_name].get_global_observation(); similarly, local observations with obs[agent_name].get_local_observation().
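To illustrate the split between global and local observations, here is a minimal sketch; the FakeObservation class below is a made-up stand-in for the environment's real Observation object (with only a few of its fields), used solely to show the access pattern:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the environment's Observation; the real class,
# with its full set of fields, lives in EthicalSmartGrid.
@dataclass
class FakeObservation:
    hour: float      # global: shared by all agents
    equity: float    # global: shared by all agents
    comfort: float   # local: specific to one agent
    payoff: float    # local: specific to one agent

    def get_global_observation(self):
        return {'hour': self.hour, 'equity': self.equity}

    def get_local_observation(self):
        return {'comfort': self.comfort, 'payoff': self.payoff}

# obs has the same shape as the dict returned by env.reset():
obs = {
    'Household1': FakeObservation(hour=0.0, equity=0.84, comfort=0.021, payoff=0.5),
    'Household2': FakeObservation(hour=0.0, equity=0.84, comfort=0.3, payoff=0.4),
}

# Split the observations per agent, as an algorithm with a centralized
# critic (global obs) and decentralized actors (local obs) might do:
global_obs = {name: o.get_global_observation() for name, o in obs.items()}
local_obs = {name: o.get_local_observation() for name, o in obs.items()}
```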

  • The number of agents is accessed through env.num_agents rather than env.n_agent. This follows the PettingZoo API.

  • env.agents allows iterating over the agents' names. Agents themselves can be obtained with env.get_agent(agent_name). This also follows the PettingZoo API, even though the name env.agents is somewhat misleading (it yields names, not agent objects).

New code should use:

for agent_name in env.agents:
  agent = env.get_agent(agent_name)
  print(agent.state)

instead of:

for agent in env.agents:
  print(agent.state)
  • infos is a dictionary mapping each agent name to a dictionary containing additional information, such as the original reward (before, e.g., scalarization).

New code should use:

_, _, _, _, infos = env.step(actions)
for agent_name in env.agents:
  print(infos[agent_name]['reward'])

instead of:

_, _, _, _, infos = env.step(actions)
for agent in env.agents:
  print(infos['reward'][agent.name])
  • world.agents is now a view over the values of the world.agents_by_name dictionary, instead of a list. Most code should not use world directly anyway, and should instead access agents through the env itself. Iterating over agents still works as before, but a dict_values object cannot be indexed: world.agents[0] no longer works, although for agent in world.agents still does.
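The behaviour of such a view can be demonstrated with a plain dict; this minimal sketch uses made-up names and stand-in values, not the real world object:

```python
# world.agents is now a dict view rather than a list; a plain dict's
# .values() returns the same kind of object.
agents_by_name = {'Household1': 'agent1', 'Household2': 'agent2'}
agents = agents_by_name.values()

# Iterating still works as before:
names = [a for a in agents]

# But indexing no longer does (it raises a TypeError);
# convert to a list when an index is truly needed:
first = list(agents)[0]
```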

Various improvements

  • Improve the find_profile_data function for Python 3.9+. It will no longer raise warnings on these versions. (#9)
  • Improve some parts of the documentation.

Full Changelog: v1.2.0...v2.0.0