Add MARC
patrick-llgc committed Jun 18, 2024
1 parent 90ae2cb commit 7c3eab9
Showing 2 changed files with 49 additions and 5 deletions.
13 changes: 8 additions & 5 deletions README.md
@@ -40,6 +40,13 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
- [OpenVLA: An Open-Source Vision-Language-Action Model](https://arxiv.org/abs/2406.09246) [open source RT-2]
- [Parting with Misconceptions about Learning-based Vehicle Motion Planning](https://arxiv.org/abs/2306.07962) <kbd>CoRL 2023</kbd> [Simple non-learning based baseline]
- [QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving](https://arxiv.org/abs/2404.01486) [Waabi]
- [MPDM: Multipolicy decision-making in dynamic, uncertain environments for autonomous driving](https://ieeexplore.ieee.org/document/7139412) <kbd>ICRA 2015</kbd> [Behavior planning]
- [MPDM2: Multipolicy Decision-Making for Autonomous Driving via Changepoint-based Behavior Prediction](https://www.roboticsproceedings.org/rss11/p43.pdf) <kbd>RSS 2015</kbd> [Behavior planning]
- [MPDM3: Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment](https://link.springer.com/article/10.1007/s10514-017-9619-z) <kbd>RSS 2017</kbd> [Behavior planning]
- [EUDM: Efficient Uncertainty-aware Decision-making for Automated Driving Using Guided Branching](https://arxiv.org/abs/2003.02746) <kbd>ICRA 2020</kbd> [Wenchao Ding, Shaojie Shen, Behavior planning]
- [TPP: Tree-structured Policy Planning with Learned Behavior Models](https://arxiv.org/abs/2301.11902) <kbd>ICRA 2023</kbd> [Marco Pavone, Nvidia, Behavior planning]
- [MARC: Multipolicy and Risk-aware Contingency Planning for Autonomous Driving](https://arxiv.org/abs/2308.12021) [[Notes](paper_notes/marc.md)] <kbd>RAL 2023</kbd> [Shaojie Shen, Behavior planning]
- [trajdata: A Unified Interface to Multiple Human Trajectory Datasets](https://arxiv.org/abs/2307.13924) <kbd>NeurIPS 2023</kbd> [Marco Pavone, Nvidia]
- [Optimal Vehicle Trajectory Planning for Static Obstacle Avoidance using Nonlinear Optimization](https://arxiv.org/abs/2307.09466) [Xpeng]
- [Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles](https://arxiv.org/abs/1910.04586) [[Notes](paper_notes/joint_learned_bptp.md)] <kbd>IROS 2019 Oral</kbd> [Uber ATG, behavioral planning, motion planning]
- [Enhancing End-to-End Autonomous Driving with Latent World Model](https://arxiv.org/abs/2406.08481)
@@ -59,11 +66,7 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
- [A Spatio-Temporal Joint Planning Method for Intelligent Vehicles Based on Improved Hybrid A*](https://www.qichegongcheng.com/CN/abstract/abstract1500.shtml) <kbd>Automotive Engineering: Planning & Decision 2023</kbd> [Joint optimization, search]
- [Enable Faster and Smoother Spatio-temporal Trajectory Planning for Autonomous Vehicles in Constrained Dynamic Environment](https://journals.sagepub.com/doi/abs/10.1177/0954407020906627) <kbd>JAE 2020</kbd> [Joint optimization, search]
- [Focused Trajectory Planning for Autonomous On-Road Driving](https://www.ri.cmu.edu/pub_files/2013/6/IV2013-Tianyu.pdf) <kbd>IV 2013</kbd> [Joint optimization, Iteration]
- [SSC: Safe Trajectory Generation for Complex Urban Environments Using Spatio-Temporal Semantic Corridor](https://arxiv.org/abs/1906.09788) <kbd>RAL 2019</kbd> [Joint optimization, SSC, Wenchao Ding]
- [MPDM: Multipolicy decision-making in dynamic, uncertain environments for autonomous driving](https://ieeexplore.ieee.org/document/7139412) <kbd>ICRA 2015</kbd>
- [MPDM2: Multipolicy Decision-Making for Autonomous Driving via Changepoint-based Behavior Prediction](https://www.roboticsproceedings.org/rss11/p43.pdf) <kbd>RSS 2015</kbd>
- [MPDM3: Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment](https://link.springer.com/article/10.1007/s10514-017-9619-z) <kbd>RSS 2017</kbd>
- [EUDM: Efficient Uncertainty-aware Decision-making for Automated Driving Using Guided Branching](https://arxiv.org/abs/2003.02746) <kbd>ICRA 2020</kbd> [Wenchao Ding]
- [SSC: Safe Trajectory Generation for Complex Urban Environments Using Spatio-Temporal Semantic Corridor](https://arxiv.org/abs/1906.09788) <kbd>RAL 2019</kbd> [Joint optimization, SSC, Wenchao Ding, Motion planning]
- [AlphaGo: Mastering the game of Go with deep neural networks and tree search](https://www.nature.com/articles/nature16961) <kbd>Nature 2016</kbd> [DeepMind, MCTS]
- [AlphaZero: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play](https://www.science.org/doi/full/10.1126/science.aar6404) <kbd>Science 2018</kbd> [DeepMind]
- [MuZero: Mastering Atari, Go, chess and shogi by planning with a learned model](https://www.nature.com/articles/s41586-020-03051-4) <kbd>Nature 2020</kbd> [DeepMind]
41 changes: 41 additions & 0 deletions paper_notes/marc.md
@@ -0,0 +1,41 @@
# [MARC: Multipolicy and Risk-aware Contingency Planning for Autonomous Driving](https://arxiv.org/abs/2308.12021)

_June 2024_

tl;dr: Generates safe yet non-conservative behaviors in dense, dynamic environments by combining multipolicy decision-making with contingency planning.

#### Overall impression
This is a continuation of work in [MPDM](mpdm.md) and [EUDM](eudm.md). It introduces dynamic branching based on scene-level divergence, and risk-aware contingency planning based on user-defined risk tolerance.

POMDPs provide a theoretically sound framework for handling dynamic interaction, but they suffer from the curse of dimensionality, making them infeasible to solve in real time.

* [MPDM](mpdm.md) prunes the belief tree heavily and decomposes the POMDP into a limited number of closed-loop policy evaluations. MPDM uses only one ego policy over the planning horizon (8 s). Mainly BP.
* EUDM improves on this by allowing multiple (2) policies within the planning horizon, and uses a DCP-Tree and CFB (conditional focused branching) to apply domain-specific knowledge to guide branching in both action and intention space. Mainly BP.
* MARC performs risk-aware contingency planning over multiple scenarios, and it combines BP and MP.
* All previous MPDM-like methods select a single optimal policy and generate a single trajectory over all scenarios, so they lack a guarantee of policy consistency and lose multimodality information.
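
The MPDM-style decomposition above (forward-simulate a handful of semantic ego policies in closed loop, score the rollouts, pick the best) can be sketched roughly as below. The policy names, the one-lead-vehicle world model, and the cost terms are illustrative assumptions, not the papers' actual formulation.

```python
def forward_simulate(policy, ego_speed, lead_speed, horizon_s=8.0, dt=0.5):
    """Roll out one ego policy against a trivial lead-vehicle model.

    Returns (avg_speed, min_gap) over the planning horizon.
    """
    gap = 30.0  # initial gap to the lead vehicle (m); illustrative
    v = ego_speed
    speeds, min_gap = [], gap
    t = 0.0
    while t < horizon_s:
        if policy == "lane_keep":
            # close in on the lead vehicle, then match its speed
            v = lead_speed if gap < 15.0 else v + 0.5 * dt
        elif policy == "lane_change":
            v = v + 1.0 * dt        # free lane: accelerate
            gap = float("inf")      # no longer behind the lead vehicle
        if gap != float("inf"):
            gap += (lead_speed - v) * dt
            min_gap = min(min_gap, gap)
        speeds.append(v)
        t += dt
    return sum(speeds) / len(speeds), min_gap

def evaluate_policies(policies, ego_speed, lead_speed):
    """Closed-loop evaluation: score each policy rollout, keep the best."""
    best, best_cost = None, float("inf")
    for p in policies:
        avg_speed, min_gap = forward_simulate(p, ego_speed, lead_speed)
        # cost trades progress against a hard safety-margin penalty
        cost = -avg_speed + (1000.0 if min_gap < 5.0 else 0.0)
        if cost < best_cost:
            best, best_cost = p, cost
    return best

best = evaluate_policies(["lane_keep", "lane_change"],
                         ego_speed=10.0, lead_speed=8.0)
```

With a slower lead vehicle and a free adjacent lane, the lane-change rollout scores better on progress, so it is selected; the point is that each policy is judged on its whole closed-loop rollout, not on a single-step action.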

#### Key ideas
- Planning is hard because of uncertainty and interaction (inherently multimodal intentions).
- For interactive decision-making, MDPs and POMDPs are mathematically rigorous formulations of decision processes in stochastic environments.
- For static (non-interactive) decision-making, the usual troika of planning (sampling, searching, optimization) suffices.
- *Contingency planning* generates deterministic behavior for multiple future scenarios. In other words, it plans a short-term trajectory that ensures safety in all potential scenarios.
- Scenario tree construction
- generating policy-conditioned critical scenario sets via closed-loop forward simulation (similar to CFB in EUDM?).
    - building the scenario tree with scene-level divergence assessment: determine the latest timestamp at which the scenarios diverge, delaying the branching time as much as possible.
        - This reduces the number of state variables in the trajectory optimization.
        - Smooth handling of different potential outcomes, and more robustness to disturbances (more like a mature driver).
- Trajectory tree generation with RCP
    - RCP (risk-aware contingency planning) considers the tradeoff between conservativeness and efficiency.
    - RCP generates trajectories that are optimal across multiple future scenarios under user-defined risk-averse levels. --> This can mimic human preferences.
- Evaluation
    - Selection is based on both the policy tree and the trajectory tree (new!), ensuring consistency of policies.
    - MARC is more robust under uncertain interactions, with fewer unexpected policy switches.
        - It can handle cut-ins with smoother deceleration, and can handle disturbances (prediction noise, etc.)
        - with better efficiency (avg speed) and riding comfort (max decel/accel).
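
Two of the ingredients above can be sketched in a few lines: (1) scene-level divergence assessment to find the latest feasible branching time, and (2) risk-aware aggregation of per-scenario costs under a user-defined risk tolerance. The L1 distance metric, the threshold, and the linear blend of expected and worst-case cost are illustrative assumptions, not MARC's exact formulation.

```python
def latest_branch_time(scenarios, threshold=2.0, dt=0.5):
    """scenarios: equal-length position traces [(x, y), ...], one per scenario.

    Returns the first time the traces spread beyond `threshold`, i.e. the
    latest time up to which a single shared trajectory trunk can be kept.
    """
    horizon = len(scenarios[0])
    for k in range(horizon):
        pts = [s[k] for s in scenarios]
        # scene-level spread: max pairwise L1 distance at this timestep
        spread = max(abs(a[0] - b[0]) + abs(a[1] - b[1])
                     for a in pts for b in pts)
        if spread > threshold:
            return k * dt
    return horizon * dt  # scenarios never diverge: no branching needed

def risk_aware_cost(costs, probs, risk_tolerance=0.5):
    """Blend expected cost with worst-case cost across scenarios.

    risk_tolerance = 0 -> purely expected cost (risk-neutral, efficient);
    risk_tolerance = 1 -> purely worst case (maximally conservative).
    """
    expected = sum(c * p for c, p in zip(costs, probs))
    worst = max(costs)
    return (1 - risk_tolerance) * expected + risk_tolerance * worst
```

Delaying the branch point shrinks the trajectory tree (fewer optimization variables), while the single scalar `risk_tolerance` is where a user-defined preference between conservativeness and efficiency enters the cost.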

#### Technical details
- Summary of technical details, such as important training details, or bugs of previous benchmarks.

#### Notes
- Questions and notes on how to improve/revise the current work
