diff --git a/README.md b/README.md
index ea44028..c06fa92 100755
--- a/README.md
+++ b/README.md
@@ -40,6 +40,13 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [OpenVLA: An Open-Source Vision-Language-Action Model](https://arxiv.org/abs/2406.09246) [open source RT-2]
 - [Parting with Misconceptions about Learning-based Vehicle Motion Planning](https://arxiv.org/abs/2306.07962) CoRL 2023 [Simple non-learning based baseline]
 - [QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving](https://arxiv.org/abs/2404.01486) [Waabi]
+- [MPDM: Multipolicy decision-making in dynamic, uncertain environments for autonomous driving](https://ieeexplore.ieee.org/document/7139412) ICRA 2015 [Behavior planning]
+- [MPDM2: Multipolicy Decision-Making for Autonomous Driving via Changepoint-based Behavior Prediction](https://www.roboticsproceedings.org/rss11/p43.pdf) RSS 2015 [Behavior planning]
+- [MPDM3: Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment](https://link.springer.com/article/10.1007/s10514-017-9619-z) Autonomous Robots 2017 [Behavior planning]
+- [EUDM: Efficient Uncertainty-aware Decision-making for Automated Driving Using Guided Branching](https://arxiv.org/abs/2003.02746) ICRA 2020 [Wenchao Ding, Shaojie Shen, Behavior planning]
+- [TPP: Tree-structured Policy Planning with Learned Behavior Models](https://arxiv.org/abs/2301.11902) ICRA 2023 [Marco Pavone, Nvidia, Behavior planning]
+- [MARC: Multipolicy and Risk-aware Contingency Planning for Autonomous Driving](https://arxiv.org/abs/2308.12021) [[Notes](paper_notes/marc.md)] RAL 2023 [Shaojie Shen, Behavior planning]
+- [trajdata: A Unified Interface to Multiple Human Trajectory Datasets](https://arxiv.org/abs/2307.13924) NeurIPS 2023 [Marco Pavone, Nvidia]
 - [Optimal Vehicle Trajectory Planning for Static Obstacle Avoidance using Nonlinear Optimization](https://arxiv.org/abs/2307.09466) [Xpeng]
 - [Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles](https://arxiv.org/abs/1910.04586) [[Notes](paper_notes/joint_learned_bptp.md)] IROS 2019 Oral [Uber ATG, behavioral planning, motion planning]
 - [Enhancing End-to-End Autonomous Driving with Latent World Model](https://arxiv.org/abs/2406.08481)
@@ -59,11 +66,7 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [基于改进混合A*的智能汽车时空联合规划方法](https://www.qichegongcheng.com/CN/abstract/abstract1500.shtml) 汽车工程: 规划&决策2023年 [Joint optimization, search]
 - [Enable Faster and Smoother Spatio-temporal Trajectory Planning for Autonomous Vehicles in Constrained Dynamic Environment](https://journals.sagepub.com/doi/abs/10.1177/0954407020906627) JAE 2020 [Joint optimization, search]
 - [Focused Trajectory Planning for Autonomous On-Road Driving](https://www.ri.cmu.edu/pub_files/2013/6/IV2013-Tianyu.pdf) IV 2013 [Joint optimization, Iteration]
-- [SSC: Safe Trajectory Generation for Complex Urban Environments Using Spatio-Temporal Semantic Corridor](https://arxiv.org/abs/1906.09788) RAL 2019 [Joint optimization, SSC, Wenchao Ding]
-- [MPDM: Multipolicy decision-making in dynamic, uncertain environments for autonomous driving](https://ieeexplore.ieee.org/document/7139412) ICRA 2015
-- [MPDM2: Multipolicy Decision-Making for Autonomous Driving via Changepoint-based Behavior Prediction](https://www.roboticsproceedings.org/rss11/p43.pdf) RSS 2015
-- [MPDM3: Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment](https://link.springer.com/article/10.1007/s10514-017-9619-z) RSS 2017
-- [EUDM: Efficient Uncertainty-aware Decision-making for Automated Driving Using Guided Branching](https://arxiv.org/abs/2003.02746) ICRA 2020 [Wenchao Ding]
+- [SSC: Safe Trajectory Generation for Complex Urban Environments Using Spatio-Temporal Semantic Corridor](https://arxiv.org/abs/1906.09788) RAL 2019 [Joint optimization, SSC, Wenchao Ding, Motion planning]
 - [AlphaGo: Mastering the game of Go with deep neural networks and tree search](https://www.nature.com/articles/nature16961) Nature 2016 [DeepMind, MTCS]
 - [AlphaZero: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play](https://www.science.org/doi/full/10.1126/science.aar6404) Science 2017 [DeepMind]
 - [MuZero: Mastering Atari, Go, chess and shogi by planning with a learned model](https://www.nature.com/articles/s41586-020-03051-4) Nature 2020 [DeepMind]
diff --git a/paper_notes/marc.md b/paper_notes/marc.md
new file mode 100644
index 0000000..38eb54d
--- /dev/null
+++ b/paper_notes/marc.md
@@ -0,0 +1,142 @@
+# [MARC: Multipolicy and Risk-aware Contingency Planning for Autonomous Driving](https://arxiv.org/abs/2308.12021)
+
+_June 2024_
+
+tl;dr: Generating safe and non-conservative behaviors in dense dynamic environments by combining multipolicy decision-making and contingency planning.
+
+#### Overall impression
+This is a continuation of the work in [MPDM](mpdm.md) and [EUDM](eudm.md). It introduces dynamic branching based on scene-level divergence, and risk-aware contingency planning based on user-defined risk tolerance.
+
+POMDP provides a theoretically sound framework to handle dynamic interaction, but it suffers from the curse of dimensionality, making it infeasible to solve in real time.
+
+* [MPDM](mpdm.md) prunes the belief tree heavily and decomposes the POMDP into a limited number of closed-loop policy evaluations (see the sketch after this list). MPDM has only one ego policy over the planning horizon (8 s). Mainly BP.
+* EUDM improves on this by allowing multiple (two) policies within the planning horizon, and uses DCP-Tree (domain-specific closed-loop policy tree) and CFB (conditioned focused branching) to bring domain-specific knowledge into branching in both action and intention space. Mainly BP.
+* MARC performs risk-aware contingency planning based on multiple scenarios, and it combines BP (behavior planning) and MP (motion planning).
+    * All previous MPDM-like methods select a single optimal policy and generate a single trajectory over all scenarios, which neither guarantees policy consistency nor preserves multimodality information.
+
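+To make the decomposition concrete, here is a minimal, runnable sketch of MPDM-style closed-loop policy evaluation on a 1D toy scene: each semantic ego policy is rolled out in forward simulation against a reacting lead car, and the cheapest rollout wins. The policy set, agent model, and cost terms are my own illustrative assumptions, not the models from the papers.
+
+```python
+# Minimal sketch of MPDM-style closed-loop policy evaluation on a 1D toy
+# scene. The policies, agent model, and cost terms are illustrative
+# assumptions, not the actual models from the paper.
+from dataclasses import dataclass
+
+@dataclass
+class Car:
+    s: float  # longitudinal position [m]
+    v: float  # speed [m/s]
+
+def forward_simulate(policy, ego, lead, horizon=8.0, dt=0.2):
+    """Closed-loop rollout: ego follows a semantic policy while the lead
+    car brakes gently; returns (progress, min_gap) over the horizon."""
+    ego, lead = Car(ego.s, ego.v), Car(lead.s, lead.v)  # work on copies
+    min_gap, s0 = lead.s - ego.s, ego.s
+    for _ in range(int(horizon / dt)):
+        ego.v = max(0.0, ego.v + policy(ego, lead) * dt)
+        ego.s += ego.v * dt
+        lead.v = max(0.0, lead.v - 0.5 * dt)  # lead car brakes gently
+        lead.s += lead.v * dt
+        min_gap = min(min_gap, lead.s - ego.s)
+    return ego.s - s0, min_gap
+
+def keep_speed(ego, lead):
+    return 0.0
+
+def follow(ego, lead):  # crude gap keeping: aim for roughly a 2 s headway
+    return 0.5 if lead.s - ego.s > 2.0 * ego.v else -2.0
+
+def mpdm_decide(ego, lead):
+    """Evaluate each semantic policy in closed loop and pick the cheapest:
+    a large penalty for near-collision, otherwise reward progress."""
+    policies = {"keep_speed": keep_speed, "follow": follow}
+    def cost(name):
+        progress, min_gap = forward_simulate(policies[name], ego, lead)
+        return (1e4 if min_gap < 2.0 else 0.0) - progress
+    return min(policies, key=cost)
+
+print(mpdm_decide(Car(s=0.0, v=15.0), Car(s=30.0, v=10.0)))  # -> "follow"
+```
+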
+#### Key ideas
+- Planning is hard because of uncertainty and interaction (inherently multimodal intentions).
+    - For interactive decision making, MDP or POMDP are mathematically rigorous formulations for decision processes in stochastic environments.
+    - For static (non-interactive) decision making, the usual troika of planning (sampling, searching, optimization) suffices.
+- *Contingency planning* generates deterministic behavior for multiple future scenarios. In other words, it plans a short-term trajectory that ensures safety for all potential scenarios.
+- Scenario tree construction
+    - Generate policy-conditioned critical scenario sets via closed-loop forward simulation (similar to CFB in EUDM?).
+    - Build the scenario tree with scene-level divergence assessment: determine the latest timestamp at which the scenarios diverge, delaying the branching time as much as possible (see the branching-time sketch under Notes below).
+        - The number of state variables in trajectory optimization decreases.
+        - Smooth handling of different potential outcomes, and more robustness to disturbances (more mature, driver-like behavior).
+- Trajectory tree generation with RCP
+    - RCP (risk-aware contingency planning) trades off conservativeness against efficiency.
+    - RCP generates trajectories that are optimal across multiple future scenarios under user-defined risk-averse levels (see the risk-measure sketch under Notes below). --> This can mimic human preference.
+- Evaluation
+    - Selection is based on both the policy tree and the trajectory tree (new!), ensuring policy consistency.
+- MARC is more robust under uncertain interactions, with fewer unexpected policy switches.
+    - It can handle cut-ins with smoother deceleration and can handle disturbances (prediction noise, etc.).
+    - It achieves better efficiency (average speed) and riding comfort (max deceleration/acceleration).
+
+#### Technical details
+- Summary of technical details, such as important training details, or bugs of previous benchmarks.
+
+#### Notes
+- Questions and notes on how to improve/revise the current work
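+
+- A minimal sketch of the scene-level divergence assessment in the scenario tree construction above: find the latest timestep at which the predicted scenarios still agree, and share a single trajectory-tree trunk until then. The displacement metric and the threshold `eps` are my own illustrative choices.
+
+```python
+# Sketch of the scene-level divergence assessment for scenario tree
+# construction: keep one shared trunk and branch at the latest timestep at
+# which the predicted scenarios still agree. The displacement metric and
+# the threshold eps are illustrative assumptions.
+import math
+
+def scene_divergence(scene_a, scene_b):
+    """Max displacement between matched agent predictions at one timestep;
+    each scene is a dict {agent_id: (x, y)}."""
+    shared = scene_a.keys() & scene_b.keys()
+    return max(math.dist(scene_a[k], scene_b[k]) for k in shared)
+
+def branch_time(scenarios, eps=0.5):
+    """Latest timestep index up to which all scenario pairs agree within
+    eps meters; the trajectory tree shares a single trunk until then."""
+    horizon = min(len(s) for s in scenarios)
+    for t in range(horizon):
+        for i in range(len(scenarios)):
+            for j in range(i + 1, len(scenarios)):
+                if scene_divergence(scenarios[i][t], scenarios[j][t]) > eps:
+                    return t  # scenarios diverge here -> branch the tree
+    return horizon  # no divergence in horizon: a single trajectory suffices
+
+# Toy example: agent 1 either keeps straight or starts a cut-in after 1 s.
+straight = [{1: (5.0 + 2.0 * t, 0.0)} for t in range(25)]
+cut_in = [{1: (5.0 + 2.0 * t, 0.3 * max(0, t - 5))} for t in range(25)]
+print(branch_time([straight, cut_in]))  # -> 7 (trunk shared for 7 steps)
+```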
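+
+- A sketch of how a user-defined risk-averse level can trade conservativeness against efficiency over the trajectory tree. I use CVaR over the per-scenario costs as the risk measure, a standard choice for risk-aware planning; the paper's exact formulation may differ.
+
+```python
+# Sketch of a risk-aware objective over the per-scenario costs of one
+# candidate trajectory tree: a user-defined level alpha interpolates
+# between expected cost (risk-neutral) and worst case (fully risk-averse).
+# CVaR is my choice of risk measure here; the paper's may differ.
+def cvar(costs, probs, alpha):
+    """CVaR_alpha: expected cost over the worst (1 - alpha) probability
+    mass. alpha=0 gives the plain expectation; alpha -> 1 the worst case."""
+    tail, remaining = 0.0, 1.0 - alpha
+    for i in sorted(range(len(costs)), key=lambda i: -costs[i]):
+        take = min(probs[i], remaining)  # worst scenarios fill the tail first
+        tail += take * costs[i]
+        remaining -= take
+        if remaining <= 0.0:
+            break
+    return tail / (1.0 - alpha)
+
+# Three predicted outcomes of a cut-in, e.g. yield / nudge in / hard cut.
+costs, probs = [1.0, 4.0, 9.0], [0.6, 0.3, 0.1]
+print(cvar(costs, probs, alpha=0.0))  # ~2.7: risk-neutral expectation
+print(cvar(costs, probs, alpha=0.9))  # ~9.0: conservative, worst 10% tail
+```
+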