This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our code or paper useful, please cite:
@inproceedings{Prakash2021CVPR,
author = {Prakash, Aditya and Chitta, Kashyap and Geiger, Andreas},
title = {Multi-Modal Fusion Transformer for End-to-End Autonomous Driving},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021}
}
Install Anaconda
wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
bash Anaconda3-2020.11-Linux-x86_64.sh
source ~/.profile
Clone the repo and build the environment
git clone https://github.com/autonomousvision/transfuser
cd transfuser
conda create -n transfuser python=3.7
conda activate transfuser
pip3 install -r requirements.txt
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
Download and set up CARLA 0.9.10.1
chmod +x setup_carla.sh
./setup_carla.sh
The training data is generated using leaderboard/team_code/auto_pilot.py in 8 CARLA towns and 14 weather conditions. The routes and scenarios files to be used for data generation are provided at leaderboard/data.
./CarlaUE4.sh --world-port=2000 -opengl
Without Docker:
SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=0 ./CarlaUE4.sh --world-port=2000 -opengl
With Docker:
Instructions for setting up Docker are available here. Pull the Docker image of CARLA 0.9.10.1:
docker pull carlasim/carla:0.9.10.1
Docker 18:
docker run -it --rm -p 2000-2002:2000-2002 --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 carlasim/carla:0.9.10.1 ./CarlaUE4.sh --world-port=2000 -opengl
Docker 19:
docker run -it --rm --net=host --gpus '"device=0"' carlasim/carla:0.9.10.1 ./CarlaUE4.sh --world-port=2000 -opengl
If the Docker container doesn't start properly, add another environment variable: -e SDL_AUDIODRIVER=dsp
Once the CARLA server is running, roll out the autopilot to start data generation.
./leaderboard/scripts/run_evaluation.sh
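Before launching the script above, it can help to confirm that something is actually listening on the world port. The following is a minimal sketch using only the Python standard library; the function name carla_port_open is hypothetical and not part of this repository, and the defaults assume a server started with --world-port=2000 on the local machine. An open port only shows that the server process accepts TCP connections, not that the simulator has finished loading a town.

```python
import socket


def carla_port_open(host="localhost", port=2000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it does not talk to the CARLA API, it only checks
    that the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

Calling carla_port_open() with no arguments checks localhost:2000; pass a different port if the server was started with another --world-port value.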
The expert agent used for data generation is defined in leaderboard/team_code/auto_pilot.py. The variables that need to be set are specified in leaderboard/scripts/run_evaluation.sh. The expert agent is based on the autopilot from this codebase.
Each route is defined by a sequence of waypoints (and optionally a weather condition) that the agent needs to follow. Each scenario is defined by a trigger transform (location and orientation) and other actors present in that scenario (optional). The leaderboard repository provides a set of routes and scenarios files. To generate additional routes, spin up a CARLA server and follow the procedure below.
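The definitions of routes and scenarios above can be pictured as simple records. The sketch below is illustrative only: the class and field names (Transform, Route, Scenario, other_actors, and so on) are assumptions chosen to mirror the prose, not the schema of the leaderboard's routes or scenarios files.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Transform:
    """Location plus orientation (hypothetical, simplified to yaw only)."""
    x: float
    y: float
    z: float
    yaw: float = 0.0


@dataclass
class Route:
    """A sequence of waypoints the agent must follow, with optional weather."""
    route_id: int
    town: str
    waypoints: List[Transform]
    weather: Optional[str] = None  # None means no weather condition is specified


@dataclass
class Scenario:
    """A trigger transform and, optionally, the other actors involved."""
    scenario_type: str
    trigger: Transform
    other_actors: List[dict] = field(default_factory=list)


# Example: a two-waypoint route and a scenario triggered at its first waypoint.
start = Transform(x=10.0, y=5.0, z=0.0, yaw=90.0)
end = Transform(x=10.0, y=55.0, z=0.0, yaw=90.0)
route = Route(route_id=0, town="Town01", waypoints=[start, end])
scenario = Scenario(scenario_type="Scenario1", trigger=start)
```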
The position of traffic lights is used to localize intersections and (start_wp, end_wp) pairs are sampled in a grid centered at these points.
python3 tools/generate_intersection_routes.py --save_file <path_of_generated_routes_file> --town <town_to_be_used>
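The idea of sampling (start_wp, end_wp) pairs in a grid centered at an intersection can be sketched in plain Python. This is not the logic of tools/generate_intersection_routes.py: it works on bare 2D coordinates without a CARLA map, and the function names and the half_extent, spacing, and min_dist parameters are assumptions made for illustration.

```python
import math
from itertools import product


def grid_points(center, half_extent=1, spacing=10.0):
    """Return the points of a square grid centered at `center`.

    The grid has (2 * half_extent + 1) points per side, `spacing` meters apart.
    """
    cx, cy = center
    steps = range(-half_extent, half_extent + 1)
    return [(cx + i * spacing, cy + j * spacing) for i, j in product(steps, steps)]


def sample_route_pairs(center, half_extent=1, spacing=10.0, min_dist=15.0):
    """Return ordered (start, end) pairs of grid points at least `min_dist` apart."""
    points = grid_points(center, half_extent, spacing)
    pairs = []
    for start in points:
        for end in points:
            distance = math.hypot(end[0] - start[0], end[1] - start[1])
            if distance >= min_dist:
                pairs.append((start, end))
    return pairs


# Example: a 3 x 3 grid around a traffic light assumed to be at (100, 200).
pairs = sample_route_pairs((100.0, 200.0))
```

In a real pipeline each grid point would still have to be mapped to a drivable waypoint before it could serve as a route endpoint; that step is left out here.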
Each route in the provided routes file is interpolated into a dense sequence of waypoints and individual junctions are sampled from these based on change in navigational commands.
python3 tools/sample_junctions.py --routes_file <xml_file_containing_routes> --save_file <path_of_generated_file>
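Sampling junctions based on a change in navigational commands can be illustrated as follows. This sketch assumes the dense route is already available as a list of command names, one per waypoint, and that "LANEFOLLOW" marks ordinary driving between junctions; the function name junction_segments is hypothetical and the code is not taken from tools/sample_junctions.py.

```python
from itertools import groupby


def junction_segments(commands, follow="LANEFOLLOW"):
    """Return (first_index, last_index, command) for each run of non-follow commands.

    Consecutive waypoints sharing the same command are grouped into one run;
    runs whose command equals `follow` are skipped.
    """
    segments = []
    index = 0
    for command, run in groupby(commands):
        length = len(list(run))
        if command != follow:
            segments.append((index, index + length - 1, command))
        index += length
    return segments


# Example: one command per interpolated waypoint along a route.
commands = ["LANEFOLLOW"] * 3 + ["LEFT"] * 2 + ["LANEFOLLOW"] * 2 + ["RIGHT"]
segments = junction_segments(commands)
```

Each returned index range could then be cut out of the dense waypoint list and saved as a short route of its own.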
Additional scenarios are densely sampled in a grid centered at the locations from the reference scenarios file. More scenario files can be found here.
python3 tools/generate_scenarios.py --scenarios_file <scenarios_file_to_be_used_as_reference> --save_file <path_of_generated_json_file> --towns <town_to_be_used>
The training code and pretrained models are provided below.
mkdir model_ckpt
wget https://s3.eu-central-1.amazonaws.com/avg-projects/transfuser/models.zip -P model_ckpt
unzip model_ckpt/models.zip -d model_ckpt/
rm model_ckpt/models.zip
Spin up a CARLA server (described above) and run the required agent. The appropriate routes and scenarios files are provided in leaderboard/data, and the required variables need to be set in leaderboard/scripts/run_evaluation.sh.
CUDA_VISIBLE_DEVICES=0 ./leaderboard/scripts/run_evaluation.sh
This implementation is based on code from several repositories.