MuJoCo environment

A Python environment for multi-agent training in MuJoCo simulations.
Explore the wiki docs »

Report Bug · Request Feature


Table of Contents

  1. About The Project
  2. Getting Started
  3. Usage
  4. Contributing
  5. License
  6. Contact

(back to top)

About The Project

In this repository, we publish a Python wrapper that turns MuJoCo XML files into (multi-agent) reinforcement learning environments, so a large variety of simulations can be used for training.

(back to top)

Getting Started

Clone this repository, navigate into it with your terminal, and execute the following steps.

Prerequisites

Install the required dependencies:

pip install -r requirements.txt

Installation

To use the environment, you have to install this repository as a pip package. Alternatively, you can create a branch of this repository and implement changes directly in this repo.

  1. Navigate to the repository with your terminal.
  2. Install the repository as a pip package:
    pip install .
  3. Check whether the installation was successful:
    python -c "import MuJoCo_Gym"

(back to top)

Usage

Environment Setup

The basic multi-agent environment can be imported and used as follows.

First, the path to the environment has to be set. Additionally, you need to provide a list of agent names within the environment. Those names correspond to the top-level bodies of your agents within the XML file. The JSON file containing additional information is optional.

from MuJoCo_Gym.mujoco_rl import MuJoCoRL

environment_path = "Examples/Environment/MultiEnvs.xml"  # File containing the MuJoCo environment
info_path = "Examples/Environment/info_example.json"  # File containing additional environment information
agents = ["agent1", "agent2"]  # List of agents (body names) within the environment

This information has to be stored in a dictionary, which makes the environment compatible with Ray.

config_dict = {"xmlPath": environment_path, "infoJson": info_path, "agents": agents}
environment = MuJoCoRL(config_dict)

Reset the environment to start the simulation.

observation, infos = environment.reset()

Store the action of each agent in a dictionary with the agent names as keys. Each array has to match the shape of the agent's action space, and all values have to lie within the action range.

import numpy as np

actions = {"agent1": np.array([]), "agent2": np.array([])}  # Fill the arrays with values matching each agent's action space
observations, rewards, terminations, truncations, infos = environment.step(actions)
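
Putting these pieces together, a minimal interaction loop could look like the sketch below. The zero actions with an assumed shape of (3,) per agent and the step count of 1000 are placeholders, not part of the package; substitute actions that match your agents' actual action spaces.

import numpy as np

observations, infos = environment.reset()

for _ in range(1000):
    # Placeholder policy: zero actions of an assumed shape (3,) per agent;
    # replace with actions sampled from your policy or action space
    actions = {agent: np.zeros(3) for agent in agents}
    observations, rewards, terminations, truncations, infos = environment.step(actions)

    # Restart the episode once any agent terminates or truncates
    if any(terminations.values()) or any(truncations.values()):
        observations, infos = environment.reset()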

(back to top)

Language channel

To use a language channel, you have to implement it as an environment dynamic. Each environment dynamic has its own observation and action space, which are forwarded to the agents. Note that at the moment every agent receives all environment dynamics, and each dynamic is executed once per agent during every timestep.

A basic implementation of a language channel in the environment is shown below. Note that every environment dynamic needs to implement an __init__(self, mujoco_gym) and a dynamic(self, agent, actions) method.

import numpy as np

class Language():

    def __init__(self, mujoco_gym):
        self.mujoco_gym = mujoco_gym
        self.observation_space = {"low": [0], "high": [3]}
        self.action_space = {"low": [0], "high": [3]}
        # The data store is used to store and preserve data over one or multiple timesteps
        self.dataStore = {}

    def dynamic(self, agent, actions):
        # At timestep 0, the utterance field has to be initialized
        if "utterance" not in self.mujoco_gym.data_store[agent].keys():
            self.mujoco_gym.data_store[agent]["utterance"] = 0

        # Extract the utterance from the agent's action
        utterance = int(actions[0])

        # Store the utterance in the data store for the environment
        self.mujoco_gym.data_store[agent]["utterance"] = utterance
        other_agent = [other for other in self.mujoco_gym.agents if other != agent][0]

        # Check whether the other agent has "spoken" yet (not at timestep 0)
        if "utterance" in self.mujoco_gym.data_store[other_agent]:
            utterance_other_agent = self.mujoco_gym.data_store[other_agent]["utterance"]
            return 0, np.array([utterance_other_agent])
        else:
            return 0, np.array([0])
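
For reference, a minimal skeleton that satisfies the same interface might look like this. The ConstantSignal class is a hypothetical example, not part of the package; it ignores its action and always observes the value 1.

import numpy as np

class ConstantSignal():

    def __init__(self, mujoco_gym):
        self.mujoco_gym = mujoco_gym
        # One-dimensional observation and action spaces with bounds [0, 1]
        self.observation_space = {"low": [0], "high": [1]}
        self.action_space = {"low": [0], "high": [1]}
        self.dataStore = {}

    def dynamic(self, agent, actions):
        # Every dynamic returns a (reward, observation) tuple for the given agent
        return 0, np.array([1])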

The environment dynamic has to be added to the environment config.

config_dict = {"xmlPath": environment_path, "infoJson": info_path, "agents": agents, "environmentDynamics": [Language]}
environment = MuJoCoRL(config_dict)

(back to top)

Reward and done functions

A reference implementation of a reward function that returns a positive reward if the agent gets closer to a target object. All possible target objects are filtered by tag at timestep 0; those tags are set in the info JSON file that is handed over in the config dict.

import random

def reward_function(mujoco_gym, agent):
    # At timestep 0, create the fields needed to store data within the data store
    if "targets" not in mujoco_gym.data_store[agent].keys():
        mujoco_gym.data_store[agent]["targets"] = mujoco_gym.filter_by_tag("target")
        # Pick a random target object for this agent
        mujoco_gym.data_store[agent]["current_target"] = random.choice(mujoco_gym.data_store[agent]["targets"])["name"]
        distance = mujoco_gym.distance(agent, mujoco_gym.data_store[agent]["current_target"])
        mujoco_gym.data_store[agent]["distance"] = distance
        new_reward = 0
    else:  # Calculate the distance between the agent and the current target
        distance = mujoco_gym.distance(agent, mujoco_gym.data_store[agent]["current_target"])
        # Positive if the agent moved closer to the target since the last timestep
        new_reward = mujoco_gym.data_store[agent]["distance"] - distance
        mujoco_gym.data_store[agent]["distance"] = distance
    reward = new_reward * 10
    return reward

The done function ends the current episode if the agent gets closer than 1 distance unit to the target.

def done_function(mujoco_gym, agent):
    # Episode ends once the agent is within 1 distance unit of its target
    return mujoco_gym.data_store[agent]["distance"] <= 1

Both of them have to be included in the config dictionary.

config_dict = {"xmlPath": environment_path, "infoJson": info_path, "agents": agents, "rewardFunctions": [reward_function], "doneFunctions": [done_function]}
environment = MuJoCoRL(config_dict)
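
All of the options above can also be combined in a single config. A sketch using the language channel, reward function, and done function defined earlier:

config_dict = {
    "xmlPath": environment_path,
    "infoJson": info_path,
    "agents": agents,
    "environmentDynamics": [Language],
    "rewardFunctions": [reward_function],
    "doneFunctions": [done_function],
}
environment = MuJoCoRL(config_dict)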

For more examples, please refer to the Wiki.

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Cornelius Wolff - cowolff@uos.de

(back to top)
