gym-rotor

OpenAI Gym environments for quadrotor UAV control

Learn by Doing

This repository contains OpenAI Gym environments and PyTorch implementations of TD3 and MATD3 for low-level control of quadrotor unmanned aerial vehicles. To better understand what deep RL can do, see OpenAI Spinning Up. Feel free to open issues or pull requests with suggestions and corrections.

  • We have recently switched from Gym to Gymnasium, but our previous Gym-based environments are still available here.

Installation

Requirements

The repo was written with Python 3.11.3, Gymnasium 0.28.1, PyTorch 2.0.1, and NumPy 1.25.1. We recommend creating an Anaconda environment with Python 3; the official installation guide is available here. Visual Studio Code in Anaconda Navigator is highly recommended.

  1. Open Anaconda Prompt and install the major packages.
conda install -c conda-forge gymnasium
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c anaconda numpy
conda install -c conda-forge vpython

Check out Gymnasium, PyTorch, NumPy, and VPython.

  2. Clone the repository.
git clone https://github.com/fdcl-gwu/gym-rotor.git
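After installing, it can help to confirm which package versions actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the `EXPECTED` table are illustrative and not part of this repository, and the distribution name `torch` for PyTorch is an assumption about how the package registers its metadata.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions quoted in the Requirements paragraph above.
# The distribution names (e.g. "torch" for PyTorch) are assumptions.
EXPECTED = {"gymnasium": "0.28.1", "torch": "2.0.1", "numpy": "1.25.1"}


def installed_versions(packages):
    """Return {name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(EXPECTED)
for name, want in EXPECTED.items():
    have = report[name]
    if have is None:
        status = "missing"
    elif have == want:
        status = "matches"
    else:
        status = "differs"
    print(f"{name}: README lists {want}, found {have} ({status})")
```

A different version is not necessarily a problem; the check only shows where your environment departs from the one the repo was written with.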

Environments

Consider a quadrotor UAV below. The equations of motion are given by

The position and the velocity of the quadrotor are represented by $x \in \mathbb{R}^3$ and $v \in \mathbb{R}^3$, respectively. The attitude is defined by the rotation matrix $R \in SO(3) = \lbrace R \in \mathbb{R}^{3\times 3} | R^T R=I_{3\times 3}, \mathrm{det}[R]=1 \rbrace$, which is the linear transformation of the representation of a vector from the body-fixed frame $\lbrace \vec b_{1},\vec b_{2},\vec b_{3} \rbrace$ to the inertial frame $\lbrace \vec e_1,\vec e_2,\vec e_3 \rbrace$. The angular velocity vector is denoted by $\Omega \in \mathbb{R}^3$. Given the total thrust $f = \sum{}_{i=1}^{4} T_i \in \mathbb{R}$ and the moment $M = [M_1, M_2, M_3]^T \in \mathbb{R}^3$ resolved in the body-fixed frame, the thrust of each motor $(T_1,T_2,T_3,T_4)$ is determined by

$$ \begin{gather} \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{bmatrix} = \frac{1}{4} \begin{bmatrix} 1 & 0 & 2/d & -1/c_{\tau f} \\ 1 & -2/d & 0 & 1/c_{\tau f} \\ 1 & 0 & -2/d & -1/c_{\tau f} \\ 1 & 2/d & 0 & 1/c_{\tau f} \end{bmatrix} \begin{bmatrix} f \\ M_1 \\ M_2 \\ M_3 \end{bmatrix}. \end{gather} $$
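The allocation above can be written out row by row. The sketch below is plain Python with no dependencies and is not taken from the repository: the function names are hypothetical, and the numeric values of `d` and `c_tf` (standing for $d$ and $c_{\tau f}$, presumably an arm length and a torque-to-thrust coefficient) are made up for illustration. The inverse map is derived from the same matrix and serves as a round-trip check.

```python
def thrust_allocation(f, M, d, c_tf):
    """Map total thrust f and moment M = (M1, M2, M3) to (T1, T2, T3, T4).

    Each line is one row of the 4x4 matrix above, scaled by 1/4.
    """
    M1, M2, M3 = M
    T1 = 0.25 * (f + 2.0 * M2 / d - M3 / c_tf)
    T2 = 0.25 * (f - 2.0 * M1 / d + M3 / c_tf)
    T3 = 0.25 * (f - 2.0 * M2 / d - M3 / c_tf)
    T4 = 0.25 * (f + 2.0 * M1 / d + M3 / c_tf)
    return (T1, T2, T3, T4)


def wrench_from_thrusts(T, d, c_tf):
    """Inverse map: recover f and (M1, M2, M3) from the four motor thrusts."""
    T1, T2, T3, T4 = T
    f = T1 + T2 + T3 + T4
    M1 = d * (T4 - T2)
    M2 = d * (T1 - T3)
    M3 = c_tf * (-T1 + T2 - T3 + T4)
    return f, (M1, M2, M3)


# Illustrative values only; not the parameters used in this repository.
d, c_tf = 0.23, 0.0135
f, M = 19.0, (0.1, -0.05, 0.02)

T = thrust_allocation(f, M, d, c_tf)
f_back, M_back = wrench_from_thrusts(T, d, c_tf)
print("motor thrusts:", T)
print("recovered f, M:", f_back, M_back)
```

The four thrusts always sum to $f$, because the first column of the matrix is all ones and the other three columns each sum to zero.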

| Env IDs | Description |
| --- | --- |
| Quad-v0 | This serves as the foundational env for wrappers, where the state and action are represented as $s = (x, v, R, \Omega)$ and $a = (T_1, T_2, T_3, T_4)$. |
| CoupledWrapper | For single-agent RL frameworks; the observation and action are given by $o = (e_x, e_v, R, e_\Omega, e_{I_x}, e_{b_1}, e_{I_{b_1}})$ and $a = (f, M_1, M_2, M_3)$. |
| DecoupledWrapper | For multi-agent RL frameworks; the observation and action for each agent are defined as $o_1 = (e_x, e_v, b_3, e_{\omega_{12}}, e_{I_x})$, $a_1 = (f, \tau)$ and $o_2 = (b_1, e_{\Omega_3}, e_{b_1}, e_{I_{b_1}})$, $a_2 = M_3$, respectively. |

where the error terms $e_x, e_v$, and $e_\Omega$ represent the errors in position, velocity, and angular velocity, respectively. To eliminate steady-state errors, we add the integral terms $e_{I_x}$ and $e_{I_{b_1}}$. More details can be found here.
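To illustrate how an integral term such as $e_{I_x}$ can build up from a persistent error, here is a minimal sketch. It assumes the sign convention $e_x = x - x_d$ and a plain Euler accumulation with a saturation limit; the class name `IntegralError`, the time step, and the limit are hypothetical, and the repository's actual definition may differ.

```python
class IntegralError:
    """Accumulate an error signal over time with a symmetric saturation limit."""

    def __init__(self, dim, dt, limit):
        self.dt = dt              # integration time step in seconds (assumed)
        self.limit = limit        # clamp on each component (assumed)
        self.value = [0.0] * dim  # running integral, starts at zero

    def update(self, error):
        # Euler step per component, then clamp to [-limit, limit].
        self.value = [
            max(-self.limit, min(self.limit, acc + e * self.dt))
            for acc, e in zip(self.value, error)
        ]
        return self.value


# Illustrative numbers only.
x_d = [0.0, 0.0, 0.0]    # desired position
x = [0.1, -0.2, 0.05]    # current position, held constant here
e_x = [xi - xdi for xi, xdi in zip(x, x_d)]

e_Ix = IntegralError(dim=3, dt=0.005, limit=1.0)
for _ in range(100):
    e_Ix.update(e_x)
print("position error:", e_x)
print("integral term after 100 steps:", e_Ix.value)
```

Because the integral keeps growing while any offset remains, a policy that observes it has a signal that distinguishes a small constant offset from zero error, which is what allows steady-state errors to be removed.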

Examples

Hyperparameters can be adjusted in args_parse.py. For example, training with the CTDE framework can be started with

python3 main.py --framework CTDE --seed 789
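For readers unfamiliar with how such flags are typically wired up, the following is a hypothetical sketch using the standard `argparse` module. Only the flag names `--framework` and `--seed` come from the command above; the defaults, help strings, and the `build_parser` function are assumptions and do not reflect the actual contents of args_parse.py.

```python
import argparse
import random


def build_parser():
    """Build a parser for the two flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    # The default value is an assumption; the repository may use another one.
    parser.add_argument("--framework", type=str, default="CTDE",
                        help="name of the training framework")
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed for reproducibility")
    return parser


# Parse the same arguments as the example command, passed explicitly
# so the sketch runs without a real command line.
args = build_parser().parse_args(["--framework", "CTDE", "--seed", "789"])

# Seed Python's RNG; a real training script would also seed NumPy and PyTorch.
random.seed(args.seed)
print(args.framework, args.seed)
```

Fixing the seed makes runs repeatable, which helps when comparing the effect of a hyperparameter change.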

Citation

If you find this repository useful in your own work, please cite:

@article{yu2023multi,
  title={Multi-Agent Reinforcement Learning for the Low-Level Control of a Quadrotor UAV},
  author={Yu, Beomyeol and Lee, Taeyoung},
  journal={arXiv preprint arXiv:2311.06144},
  year={2023}
}

@inproceedings{yu2023equivariant,
  title={Equivariant Reinforcement Learning for Quadrotor UAV},
  author={Yu, Beomyeol and Lee, Taeyoung},
  booktitle={2023 American Control Conference (ACC)},
  pages={2842--2847},
  year={2023},
  organization={IEEE}
}

Reference: