Deep reinforcement learning for UAV in Gazebo simulation environment
YouTube:
Manual control: https://www.youtube.com/watch?v=9zLjYLtHdPQ
Training: https://www.youtube.com/watch?v=zbejm5uHPt8
Gazebo, Pixhawk, and ROS SITL (software-in-the-loop) simulation:
-
state = [Pos_z, Vel_z, Thrust]
-
action = {0, 1, -1}
// 0: decrease thrust;
// 1: increase thrust;
// -1: the environment needs to be restarted (selected manually)
-
reward:
if(19.7 < Pos_z < 20.3) reward = 1
else reward = 0
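The reward rule above can be written as a small Python function. This is only a minimal sketch: the name `hover_reward` and the constants `TARGET_LOW` and `TARGET_HIGH` are illustrative and are not taken from talker.py.

```python
# Illustrative names; the band limits come from the reward rule above.
TARGET_LOW = 19.7   # lower edge of the target altitude band, in metres
TARGET_HIGH = 20.3  # upper edge of the target altitude band, in metres


def hover_reward(pos_z):
    """Return 1 when the altitude is strictly inside the target band, else 0."""
    return 1 if TARGET_LOW < pos_z < TARGET_HIGH else 0


if __name__ == "__main__":
    for altitude in (19.5, 20.0, 20.3):
        print(altitude, hover_reward(altitude))
```

Both comparisons are strict, so an altitude of exactly 19.7 or 20.3 earns no reward, matching the rule as written.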
-
Deep Network: 3 fully connected layers
UAV hovering at an altitude of 20 m.
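To make the network description concrete, here is a rough sketch of a forward pass through three fully connected layers in plain Python. Several things are assumptions and are not stated in this README: the hidden width of 16, the ReLU activations, the random initialisation, and the idea that the network outputs one value per thrust action (0 and 1), as in a DQN-style setup. The restart action -1 is selected manually, so it is not modelled as an output. All function names are hypothetical.

```python
import random


def make_layer(n_in, n_out, rng):
    """Create one fully connected layer as (weights, biases)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, inputs, relu):
    """Apply one fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    outputs = []
    for row, b in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(max(0.0, value) if relu else value)
    return outputs


def build_network(hidden=16, seed=0):
    """Three layers: 3 inputs -> hidden -> hidden -> 2 outputs (assumed sizes)."""
    rng = random.Random(seed)
    return [
        make_layer(3, hidden, rng),
        make_layer(hidden, hidden, rng),
        make_layer(hidden, 2, rng),
    ]


def q_values(network, state):
    """Forward pass: ReLU on the two hidden layers, linear output layer."""
    x = list(state)
    for i, layer in enumerate(network):
        x = dense(layer, x, relu=(i < len(network) - 1))
    return x


def greedy_action(network, state):
    """Pick the action index (0 or 1) with the larger estimated value."""
    q = q_values(network, state)
    return 0 if q[0] >= q[1] else 1


if __name__ == "__main__":
    net = build_network()
    state = [20.0, 0.0, 0.5]  # [Pos_z, Vel_z, Thrust]
    print(q_values(net, state), greedy_action(net, state))
```

The input size of 3 follows from the state definition [Pos_z, Vel_z, Thrust]; everything else would need to be checked against the actual node.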
cd $HOME
mkdir src
cd ~/src
git clone https://github.com/PX4-Gazebo-Simulation/Firmware.git
cd Firmware
make px4fmu-v4_default
cd ~/src
mkdir -p mavros_ws/src
cd mavros_ws
catkin_init_workspace
cd src
git clone -b uavcomp https://github.com/PX4-Gazebo-Simulation/mavros.git
git clone -b uavcomp https://github.com/PX4-Gazebo-Simulation/mavlink
cd ..
catkin build
cd ~/src
mkdir -p attitude_controller/src
cd attitude_controller
catkin_init_workspace
cd src
git clone -b flight_test https://github.com/PX4-Gazebo-Simulation/state_machine.git
cd ..
catkin build
cd ~/src
mkdir -p DRL_node_ROS/src
cd DRL_node_ROS
catkin_init_workspace
cd src
git clone https://github.com/PX4-Gazebo-Simulation/drl_uav.git
cd ..
catkin build
(talker.py)
source ~/src/mavros_ws/devel/setup.bash
roslaunch mavros px4.launch fcu_url:="udp://:14540@127.0.0.1:14557"
cd ~/src/Firmware
make posix_sitl_default gazebo
source ~/src/attitude_controller/devel/setup.bash
rosrun state_machine offb_simulation_test
source ~/src/mavros_ws/devel/setup.bash
rosrun mavros mavsafety arm
rosrun mavros mavsys mode -c OFFBOARD
source ~/src/DRL_node_ROS/devel/setup.bash
rosrun drl_uav talker.py
Thrust: [0.40, 0.78]
Vel_z: [-3.0, 3.0]
Pos_z: [10, 30] (for training; restart the system if the current altitude is out of range)
if vel_z > 3.0 => force action=0 (decrease thrust)
if vel_z < -3.0 => force action=1 (increase thrust)
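The limits above could be enforced with a few helper functions like the ones below. This is a sketch under assumptions, not the code in talker.py: the function names are hypothetical, the thrust increment `step=0.01` is invented for illustration (the README does not give one), and the Pos_z range is treated as inclusive. The action encoding follows the definition given earlier in this README (0 decreases thrust, 1 increases it).

```python
THRUST_RANGE = (0.40, 0.78)   # allowed thrust command
VEL_Z_LIMIT = 3.0             # vertical speed limit, applied in both directions
POS_Z_RANGE = (10.0, 30.0)    # altitude range allowed during training

DECREASE_THRUST = 0
INCREASE_THRUST = 1


def needs_restart(pos_z):
    """True when the altitude has left the training range."""
    return not (POS_Z_RANGE[0] <= pos_z <= POS_Z_RANGE[1])


def limit_action(action, vel_z):
    """Override the chosen action when the vertical speed limit is exceeded."""
    if vel_z > VEL_Z_LIMIT:
        return DECREASE_THRUST  # climbing too fast
    if vel_z < -VEL_Z_LIMIT:
        return INCREASE_THRUST  # descending too fast
    return action


def apply_action(thrust, action, step=0.01):
    """Change thrust by one (assumed) step and clamp it to THRUST_RANGE."""
    delta = step if action == INCREASE_THRUST else -step
    return min(THRUST_RANGE[1], max(THRUST_RANGE[0], thrust + delta))


if __name__ == "__main__":
    action = limit_action(INCREASE_THRUST, vel_z=3.4)
    print(action, apply_action(0.60, action), needs_restart(31.0))
```

Applying the velocity override before the thrust update keeps the agent's choice from pushing the vehicle further past the speed limit.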
~1.14s