TC-Driver: Trajectory Conditioned Driving for Robust Autonomous Racing - A Reinforcement Learning Approach
A model-free RL approach to tackle model mismatch and improve track generalisation in autonomous racing. Instead of an end-to-end RL architecture (control output learned directly from sensory input), TC-Driver leverages the reliability of traditional planning methods, which interact with a low-level RL agent that has learned to drive and race under varying model parameters. This provides robustness against model mismatch and better generalisation to unseen tracks, and it demonstrates zero-shot Sim2Real capabilities on a physical F1TENTH race car.
This repository contains the code to reproduce the proposed TC-Driver, as well as the benchmark MPCC and End2End architectures, in the F1TENTH Gym environment.
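To make the idea of trajectory conditioning more concrete, the sketch below shows one way a low-level agent's observation could be built from a planner's output. It is an illustration only and not the observation layout used in this repository: the function names, the choice of inputs (pose, speed, a handful of upcoming waypoints) and the flat list format are all assumptions.

```python
import math


def waypoints_in_vehicle_frame(pose, waypoints):
    """Express global (x, y) waypoints in the car's own frame.

    pose is a hypothetical (x, y, yaw) tuple in the global frame.
    """
    x, y, yaw = pose
    cos_yaw, sin_yaw = math.cos(yaw), math.sin(yaw)
    local = []
    for wx, wy in waypoints:
        dx, dy = wx - x, wy - y
        # Rotate the offset by -yaw so that +x points ahead of the car.
        local.append((cos_yaw * dx + sin_yaw * dy,
                      -sin_yaw * dx + cos_yaw * dy))
    return local


def build_observation(pose, speed, waypoints):
    """Flatten speed and the local waypoints into one observation vector.

    Assumed layout: [speed, x1, y1, x2, y2, ...].
    """
    obs = [speed]
    for lx, ly in waypoints_in_vehicle_frame(pose, waypoints):
        obs.extend((lx, ly))
    return obs


# Car at (1, 1) facing +y, with one planner waypoint 2 m straight ahead.
observation = build_observation((1.0, 1.0, math.pi / 2), 3.0, [(1.0, 3.0)])
```

Expressing the waypoints relative to the car means the agent sees the same kind of input on every track, which is one plausible reason a trajectory-conditioned policy could transfer to unseen layouts.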
The code is tested on Ubuntu 20.04 with ROS Noetic, so a working ROS Noetic installation is required.
cd ~/catkin_ws/src/
git clone https://github.com/ETH-PBL/TC-Driver.git
cd ~/catkin_ws/src/TC-Driver
# Install dependencies
pip install -r requirements.txt
# Install the custom gym environment
pip install -e Gym/gym/
# Install the custom splinify package
pip install -e Gym/splinify_package/
# Build the catkin workspace
catkin build
# Launch the simulator with the End2End agent
roslaunch f1tenth_simulator pbl_sim_e2e.launch map_name:=f
# Launch the simulator with TC-Driver
roslaunch f1tenth_simulator pbl_sim_tc_driver.launch map_name:=f
Here, map_name can be any map within the map directory.
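If you want to script runs over several maps, a small helper like the following can assemble the launch commands shown above. This is a sketch under assumptions: the helper names are hypothetical, and `available_maps` assumes that each map appears as a file or folder whose name (without extension) is the value expected by `map_name`, which may not match the actual layout of the map directory.

```python
from pathlib import Path


def available_maps(map_dir):
    """Return candidate map names found in map_dir.

    Assumption: each entry's stem is a valid map_name value.
    """
    return sorted({entry.stem for entry in Path(map_dir).iterdir()})


def roslaunch_command(launch_file, map_name):
    """Build the roslaunch argument list for one simulator launch file."""
    return [
        "roslaunch",
        "f1tenth_simulator",
        launch_file,
        f"map_name:={map_name}",
    ]


# Same invocation as the TC-Driver launch line above, as an argument list.
command = roslaunch_command("pbl_sim_tc_driver.launch", "f")
```

The resulting list can be passed to `subprocess.run(command)` from a shell in which the catkin workspace has been sourced.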
Tire Generalisation Results
Results for the experiment with tire friction lower than the nominal value and outside the training range. Statistics are computed over 200 runs.
Driver | avg Lap time [s] | std Lap time [s] | Crashes | avg Advancement | std Advancement |
---|---|---|---|---|---|
MPC | 10.094 | 0.501 | 80.50% | 32.67% | 28.26% |
end-to-end | 11.148 | 0.302 | 73.50% | 52.51% | 28.69% |
TC-Driver | 10.798 | 0.143 | 2.50% | 99.37% | 4.90% |
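For readers who want to compute comparable statistics from their own runs, the sketch below shows one way to aggregate per-run records into the columns of the table above. It is not the evaluation script used for these results. The record format is hypothetical, and three assumptions are made: lap-time statistics cover only runs that finished without a crash, advancement is the fraction of the lap covered before the run ended, and the standard deviation is the population one.

```python
from statistics import mean, pstdev


def summarise(runs):
    """Aggregate runs into table-style metrics.

    Each run is a dict with hypothetical keys:
      'lap_time'    -- seconds, or None if the car crashed
      'crashed'     -- True if the run ended in a crash
      'advancement' -- fraction of the lap covered, in [0, 1]
    """
    # Assumption: only crash-free runs contribute to lap-time statistics.
    lap_times = [r["lap_time"] for r in runs if not r["crashed"]]
    advancement = [r["advancement"] for r in runs]
    return {
        "avg_lap_time": mean(lap_times) if lap_times else None,
        "std_lap_time": pstdev(lap_times) if lap_times else None,
        "crash_ratio": sum(r["crashed"] for r in runs) / len(runs),
        "avg_advancement": mean(advancement),
        "std_advancement": pstdev(advancement),
    }


example_runs = [
    {"lap_time": 10.0, "crashed": False, "advancement": 1.0},
    {"lap_time": 11.0, "crashed": False, "advancement": 1.0},
    {"lap_time": None, "crashed": True, "advancement": 0.5},
    {"lap_time": None, "crashed": True, "advancement": 0.25},
]
summary = summarise(example_runs)
```

When every run crashes, the lap-time entries are `None`, which corresponds to an "n.a." cell in a results table.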
Track Generalisation Results
Results for the experiment on tracks unseen during training. Statistics are computed over 200 runs.
Track | Driver | avg Lap time [s] | std Lap time [s] | Crashes | avg Advancement | std Advancement |
---|---|---|---|---|---|---|
Autodrome | MPC | 46.461 | 0.029 | 0.00% | 100.00% | 0.00% |
Autodrome | end-to-end | 52.5527 | 0.234 | 96.00% | 35.09% | 27.06% |
Autodrome | TC-Driver | 59.020 | 0.307 | 8.00% | 95.32% | 17.88% |
Catalunya | MPC | 41.475 | 0.036 | 0.00% | 100.00% | 0.00% |
Catalunya | end-to-end | 46.878 | 0.207 | 95.50% | 44.16% | 30.33% |
Catalunya | TC-Driver | 52.978 | 0.321 | 59.50% | 65.27% | 37.03% |
Oschersleben | MPC | 25.915 | 0.022 | 0.00% | 100.00% | 0.00% |
Oschersleben | end-to-end | n.a. | n.a. | 100.00% | 19.27% | 19.93% |
Oschersleben | TC-Driver | 34.603 | 0.415 | 94.00% | 46.95% | 31.23% |
The proposed TC-Driver RL agent is trained in simulation only, yet it can be deployed on a physical car on an unseen track and complete laps with a crash ratio similar to the one observed in simulation. In the image below, you can see the physical 1:10 scale F1TENTH car, along with an example track on which it was deployed.
Here you can see the RViz visualisation of the TC-Driver and End2End learned architectures. TC-Driver completes 10 laps with a single crash, while the End2End architecture fails to complete a single lap without crashing. TC-Driver therefore shows a 10% crash ratio on this track, while End2End has a 100% crash ratio. The rosbag recordings are uploaded here.
End2End | TC-Driver |
---|---|
Note: The recordings that follow are from a different track.
Lastly, here is a GIF of TC-Driver completing its track :)
If this work has been helpful in an academic or industrial context, please consider citing our publication:
@article{Ghignone2023,
doi = {10.55417/fr.2023020},
url = {https://doi.org/10.55417/fr.2023020},
year = {2023},
month = jan,
publisher = {Field Robotics Publication Society},
volume = {3},
number = {1},
pages = {637--651},
author = {Edoardo Ghignone and Nicolas Baumann and Michele Magno},
title = {{TC}-Driver: A Trajectory Conditioned Reinforcement Learning Approach to Zero-Shot Autonomous Racing},
journal = {Field Robotics}
}