Minecraft, a globally popular sandbox video game, provides a dynamic, block-based universe where players build, explore, and interact with a variety of entities and landscapes. This project trains a tree-chopping agent in the Minecraft environment using Deep Q-learning from Demonstrations (DQfD).
The code depends on the following libraries:
- Python 3.7
- PyTorch 1.8.1
- CUDA 11.2
- MineRL
- PFRL
The Minecraft environment is wrapped by MineRL, so please follow the MineRL documentation to install that library first. Make sure you can successfully run the following test code from the documentation:
import minerl
import gym

env = gym.make('MineRLNavigateDense-v0')

obs = env.reset()
done = False
net_reward = 0

while not done:
    action = env.action_space.noop()
    # Steer the camera towards the compass target while running,
    # jumping, and attacking.
    action['camera'] = [0, 0.03 * obs["compassAngle"]]
    action['back'] = 0
    action['forward'] = 1
    action['jump'] = 1
    action['attack'] = 1

    obs, reward, done, info = env.step(action)
    net_reward += reward

print("Total reward: ", net_reward)
PFRL is a library implementing several state-of-the-art deep reinforcement learning algorithms. Our project uses it for the prioritized replay buffer. You can find more details about their work here.
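As a minimal sketch of how the buffer is used (the capacity and annealing values below are illustrative, not the settings hard-coded in train.py):

```python
import numpy as np
import pfrl

# Prioritized experience replay (Schaul et al., 2016) from PFRL.
# capacity/alpha/beta values are illustrative, not the ones used by train.py.
replay_buffer = pfrl.replay_buffers.PrioritizedReplayBuffer(
    capacity=100_000,   # maximum number of stored transitions
    alpha=0.6,          # how strongly TD error shapes sampling priority
    beta0=0.4,          # initial importance-sampling correction
    betasteps=200_000,  # anneal beta towards 1 over this many updates
)

# Store a dummy transition; PFRL manages the priorities internally.
obs = np.zeros((3, 64, 64), dtype=np.float32)
replay_buffer.append(
    state=obs,
    action=0,
    reward=1.0,
    next_state=obs,
    is_state_terminal=False,
)

# Sampled transitions carry importance-sampling weights for the Q-learning update.
batch = replay_buffer.sample(1)
```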
To download the human demonstration data, follow the guidance in the MineRL documentation and point the data root at <YOUR LOCAL REPO PATH>/data/rawdata. More specifically:

sudo gedit ~/.bashrc

Add the line

export MINERL_DATA_ROOT=<YOUR LOCAL REPO PATH>/data/rawdata

at the end of the file and save it, then:

source ~/.bashrc
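Alternatively, you can set the variable for the current process only from Python (replace the placeholder with your actual repository path):

```python
import os

# Point MineRL at the demonstration directory for this process only.
os.environ['MINERL_DATA_ROOT'] = '<YOUR LOCAL REPO PATH>/data/rawdata'
```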
Then download the MineRLTreechopVectorObf-v0 dataset:
python3 -m minerl.data.download "MineRLTreechopVectorObf-v0"
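Once the download finishes, you can sanity-check the demonstrations with MineRL's data API; this minimal snippet assumes MINERL_DATA_ROOT is set as above:

```python
import minerl

# Load the Treechop demonstrations from MINERL_DATA_ROOT.
data = minerl.data.make('MineRLTreechopVectorObf-v0')

# Iterate over a few demonstration transitions. In the *VectorObf*
# environments, observations contain a 'pov' image and actions are
# obfuscated 64-dimensional vectors.
for obs, action, reward, next_obs, done in data.batch_iter(
        batch_size=1, seq_len=32, num_epochs=1):
    print(obs['pov'].shape, action['vector'].shape, reward.sum())
    break
```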
Then preprocess the dataset to extract the frames and compute the action space:
python3 -u preprocess.py \
    --ROOT "./" \
    --DATASET_LOC "./data/rawdata/MineRLTreechopVectorObf-v0" \
    --actionNum 32 \
    --PREPARE_DATASET True \
    --n 25
It generates the output frames in <YOUR LOCAL REPO PATH>/data/processdata and the action space in <YOUR LOCAL REPO PATH>/actionspace.
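preprocess.py is the authoritative implementation; as a rough sketch, a common way to turn the obfuscated 64-dimensional action vectors into a discrete action space of --actionNum actions is k-means clustering over the demonstration actions. This illustration assumes scikit-learn, which is not in the dependency list above:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_action_space(demo_action_vectors: np.ndarray, n_actions: int = 32):
    """Cluster the 64-d obfuscated demonstration actions into a discrete set.

    demo_action_vectors: array of shape (N, 64) collected from the dataset.
    Returns the (n_actions, 64) cluster centers, each of which becomes one
    discrete action, plus the nearest-center label of every demo action
    (used as the imitation target for that transition).
    """
    kmeans = KMeans(n_clusters=n_actions, random_state=0).fit(demo_action_vectors)
    return kmeans.cluster_centers_, kmeans.labels_
```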
If you would like to train your own agent, make sure your machine has at least 32 GB of RAM:
python3 -u train.py \
    --ROOT "./" \
    --DATASET_LOC "./data/rawdata/MineRLTreechopVectorObf-v0" \
    --MODEL_SAVE "./saved_network" \
    --actionNum 32 \
    --n 25
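train.py contains the full training loop; the component that distinguishes DQfD from plain deep Q-learning is the supervised large-margin loss on demonstration transitions (Hester et al., 2018). A self-contained sketch of that term, with illustrative names, in PyTorch:

```python
import torch

def margin_loss(q_values: torch.Tensor,
                expert_actions: torch.Tensor,
                margin: float = 0.8) -> torch.Tensor:
    """Large-margin classification loss from DQfD (Hester et al., 2018).

    q_values:       (batch, n_actions) Q(s, a) for every discrete action.
    expert_actions: (batch,) long tensor of the demonstrator's action indices.

    Computes J_E = max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E), where the
    penalty l(a_E, a) is `margin` for a != a_E and 0 otherwise; it pushes
    the expert action's Q-value at least `margin` above every other action.
    """
    penalties = torch.full_like(q_values, margin)
    penalties.scatter_(1, expert_actions.unsqueeze(1), 0.0)  # no penalty on a_E
    q_expert = q_values.gather(1, expert_actions.unsqueeze(1)).squeeze(1)
    return ((q_values + penalties).max(dim=1).values - q_expert).mean()
```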
We also provide our best trained agent at <YOUR LOCAL REPO PATH>/saved_network/best_model.pt; you can run it with:
python3 -u evaluate.py \
    --ROOT "./" \
    --DATASET_LOC "./data/rawdata/MineRLTreechopVectorObf-v0" \
    --MODEL_SAVE "./saved_network" \
    --agentname best_model.pt \
    --actionNum 32 \
    --n 25
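Under the hood, evaluation amounts to rolling out the greedy policy: load the network, and at each step pick the discrete action with the highest Q-value and map it back to the 64-dimensional action vector via the saved action space. A conceptual sketch (the model-loading details and the action_centers.npy file name are illustrative assumptions, not the exact artifacts produced by this repo):

```python
import gym
import minerl  # noqa: F401 -- registers the MineRL environments
import numpy as np
import torch

# Illustrative artifacts: assumes best_model.pt stores the whole module and
# that preprocessing saved the cluster centers as action_centers.npy.
model = torch.load('./saved_network/best_model.pt', map_location='cpu')
model.eval()
action_centers = np.load('./actionspace/action_centers.npy')  # hypothetical name

env = gym.make('MineRLTreechopVectorObf-v0')
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    # Network input: normalized CHW image from the agent's point of view.
    pov = torch.as_tensor(obs['pov'].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        action_index = model(pov.unsqueeze(0)).argmax(dim=1).item()
    # Map the discrete index back to the obfuscated action vector.
    obs, reward, done, _ = env.step({'vector': action_centers[action_index]})
    total_reward += reward
print('Episode reward:', total_reward)
```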
The results of the different architectures are shown in the table below: