forked from RussTedrake/underactuated
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathacrobot.html
1195 lines (964 loc) · 60.6 KB
/
acrobot.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!DOCTYPE html>
<html>
<head>
<title>Ch. 3 - Acrobots, Cart-Poles, and Quadrotors</title>
<meta name="Ch. 3 - Acrobots, Cart-Poles, and Quadrotors" content="text/html; charset=utf-8;" />
<link rel="canonical" href="http://underactuated.mit.edu/acrobot.html" />
<script src="https://hypothes.is/embed.js" async></script>
<script type="text/javascript" src="chapters.js"></script>
<script type="text/javascript" src="htmlbook/book.js"></script>
<script src="htmlbook/mathjax-config.js" defer></script>
<script type="text/javascript" id="MathJax-script" defer
src="htmlbook/MathJax/es5/tex-chtml.js">
</script>
<script>window.MathJax || document.write('<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js" defer><\/script>')</script>
<link rel="stylesheet" href="htmlbook/highlight/styles/default.css">
<script src="htmlbook/highlight/highlight.pack.js"></script> <!-- http://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#language-names-and-aliases -->
<script>hljs.initHighlightingOnLoad();</script>
<link rel="stylesheet" type="text/css" href="htmlbook/book.css" />
</head>
<body onload="loadChapter('underactuated');">
<div data-type="titlepage">
<header>
<h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a></h1>
<p data-type="subtitle">Algorithms for Walking, Running, Swimming, Flying, and Manipulation</p>
<p style="font-size: 18px;"><a href="http://people.csail.mit.edu/russt/">Russ Tedrake</a></p>
<p style="font-size: 14px; text-align: right;">
© Russ Tedrake, 2023<br/>
Last modified <span id="last_modified"></span>.</br>
<script>
var d = new Date(document.lastModified);
document.getElementById("last_modified").innerHTML = d.getFullYear() + "-" + (d.getMonth()+1) + "-" + d.getDate();</script>
<a href="misc.html">How to cite these notes, use annotations, and give feedback.</a><br/>
</p>
</header>
</div>
<p><b>Note:</b> These are working notes used for <a
href="https://underactuated.csail.mit.edu/Spring2023/">a course being taught
at MIT</a>. They will be updated throughout the Spring 2023 semester. <a
href="https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg">Lecture videos are available on YouTube</a>.</p>
<table style="width:100%;"><tr style="width:100%">
<td style="width:33%;text-align:left;"><a class="previous_chapter" href=pend.html>Previous Chapter</a></td>
<td style="width:33%;text-align:center;"><a href=index.html>Table of contents</a></td>
<td style="width:33%;text-align:right;"><a class="next_chapter" href=simple_legs.html>Next Chapter</a></td>
</tr></table>
<script type="text/javascript">document.write(notebook_header('acrobot'))
</script>
<!-- EVERYTHING ABOVE THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->
<chapter style="counter-reset: chapter 2">
<h1>Acrobots, Cart-Poles, and Quadrotors</h1>
<p> A great deal of work in the control of underactuated systems has been
done in the context of low-dimensional model systems. These model systems
capture the essence of the problem without introducing all of the complexity
that is often involved in more real-world examples. In this chapter we will
focus on two of the most well-known and well-studied model systems--the
Acrobot and the Cart-Pole. After we have developed some tools, we will see
that they can be applied directly to other model systems; we will give a
number of examples using Quadrotors. All of these systems are trivially
underactuated, having less actuators than degrees of freedom.</p>
<section><h1>The Acrobot</h1>
<p> The Acrobot is a planar two-link robotic arm in the vertical plane
(working against gravity), with an actuator at the elbow, but no actuator at
the shoulder. It was first described in detail in <elib>Murray91</elib>.
The companion system, with an actuator at the shoulder but not at the elbow,
is known as the Pendubot<elib>Spong97</elib>. The Acrobot is so named
because of its resemblance to a gymnast (or acrobat) on a parallel bar, who
controls his/her motion predominantly by effort at the waist (and not effort
at the wrist). The most common control task studied for the Acrobot is the
swing-up task, in which the system must use the elbow (or waist) torque to
move the system into a vertical configuration then balance. </p>
<figure>
<img width="40%" src="figures/acrobot.svg"/>
<figcaption>The Acrobot. <a href="http://youtu.be/FeCwtvrD76I" target="_blank">Click here to see a physical Acrobot swing up and balance</a>.</figcaption>
</figure>
<p> The Acrobot is representative of the primary challenge in underactuated
robots. In order to swing up and balance the entire system, the controller
must reason about and exploit the state-dependent coupling between the
actuated degree of freedom and the unactuated degree of freedom. It is also
an important system because, as we will see, it closely resembles one of the
simplest models of a walking robot.</p>
<subsection><h1>Equations of motion</h1>
<p> Figure 3.1 illustrates the model parameters used in our analysis.
$\theta_1$ is the shoulder joint angle, $\theta_2$ is the elbow (relative)
joint angle, and we will use $\bq = [\theta_1,\theta_2]^T$, $\bx =
[\bq,\dot\bq]^T$. The zero configuration is with both links pointed
directly down. The moments of inertia, $I_1,I_2$ are taken about the
pivots<!--<elib>Spong95</elib> uses the center of mass, which differs only
by an extra term in each inertia from the parallel axis theorem.-->. The
task is to stabilize the unstable fixed point $\bx = [\pi,0,0,0]^T$.</p>
<p>
We will derive the equations of motion for the Acrobot using the method
of Lagrange. The locations of the center of mass of each link,
$\bp_{c1}, \bp_{c2},$ are given by the kinematics: \begin{equation}
\bp_{c1} = \begin{bmatrix} l_{c1} s_1 \\ -l_{c1} c_1 \end{bmatrix}, \quad
\bp_{c2} = \begin{bmatrix} l_1 s_1 + l_{c2} s_{1+2} \\ -l_1 c_1 - l_{c2}
c_{1+2} \end{bmatrix}, \end{equation} where $s_1$ is shorthand for
$\sin(\theta_1)$, $c_{1+2}$ is shorthand for $\cos(\theta_1+\theta_2)$,
etc. The energy
<!--\footnote{The complicated expression for $T_2$ comes from $T_2 =
\frac{1}{2} m_2 v_{c2}^2 + \frac{1}{2}I_{c2} (\dot{q}_1 + \dot{q}_2)^2$,
where $v_{c2}$ is the velocity of the center of mass of the second link
and $I_{c2} = I_2 - m_2 l_{c2}^2$ is the moment of inertia about that
point. Use $c_1 c_{1+2} + s_1 s_{1+2} = c_2$ to simplify.}--> is given
by: \begin{gather} T = T_1 + T_2, \quad T_1 = \frac{1}{2} I_1 \dot{q}_1^2
\\ T_2 = \frac{1}{2} ( m_2 l_1^2 + I_2 + 2 m_2 l_1 l_{c2} c_2 )
\dot{q}_1^2 + \frac{1}{2} I_2 \dot{q}_2^2 + (I_2 + m_2 l_1 l_{c2} c_2)
\dot{q}_1 \dot{q}_2 \\
<!-- from expanding sum of point masses --> U = -m_1 g l_{c1} c_1 - m_2 g (l_1 c_1 +
l_{c2} c_{1+2}) \end{gather} Entering these quantities into the Lagrangian
yields the equations of motion: \begin{gather} (I_1 + I_2 + m_2 l_1^2 +
2m_2 l_1 l_{c2} c_2) \ddot{q}_1 + (I_2 + m_2 l_1 l_{c2} c_2)\ddot{q}_2 -
2m_2 l_1 l_{c2} s_2 \dot{q}_1 \dot{q}_2 \\ \quad -m_2 l_1 l_{c2} s_2
\dot{q}_2^2 + m_1 g l_{c1}s_1 + m_2 g (l_1 s_1 + l_{c2} s_{1+2}) = 0 \\
(I_2 + m_2 l_1 l_{c2} c_2) \ddot{q}_1 + I_2 \ddot{q}_2 + m_2 l_1 l_{c2}
s_2 \dot{q}_1^2 + m_2 g l_{c2} s_{1+2} = \tau \end{gather} In standard,
manipulator equation form: $$\bM(\bq)\ddot\bq + \bC(\bq,\dot\bq)\dot\bq =
\btau_g(\bq) + \bB\bu,$$ using $\bq = [\theta_1,\theta_2]^T$, $\bu = \tau$
we have: \begin{gather} \bM(\bq) = \begin{bmatrix} I_1 + I_2 + m_2 l_1^2 +
2m_2 l_1 l_{c2} c_2 & I_2 + m_2 l_1 l_{c2} c_2 \\ I_2 + m_2 l_1 l_{c2} c_2
& I_2 \end{bmatrix},\label{eq:Hacrobot}\\ \bC(\bq,\dot{\bq}) =
\begin{bmatrix} -2 m_2 l_1 l_{c2} s_2 \dot{q}_2 & -m_2 l_1 l_{c2} s_2
\dot{q}_2 \\ m_2 l_1 l_{c2} s_2 \dot{q}_1 & 0 \end{bmatrix}, \\
\btau_g(\bq) = \begin{bmatrix} -m_1 g l_{c1}s_1 - m_2 g (l_1 s_1 +
l_{c2}s_{1+2}) \\ -m_2 g l_{c2} s_{1+2} \end{bmatrix}, \quad \bB =
\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \end{gather}
</p>
<!--
Solving directly for the accelerations, we have
\begin{gather}
\det(\bM) = I_1 I_2 + m_2 l_1^2 (1 - m_2 l_{c2}^2 c_2^2)
\\
\ddot{q}_1 = \frac{1}{det(\bM)} \left[ I_2 \left( m_2 l_1 l_{c2} s_2
(2 \dot{q}_1 + \dot{q}_2) \dot{q}_2 - (m_1 l_{c1} + m_2 l_1) g s_1 -
m_2 g l_{c2} s_{1+2} \right) + (I_{c2} + m_2 l_1 l_{c2} c_2) \left( -m_2
l_1 l_{c2} s_2 \dot{q}_1^2 - m_2 g l_{c2} s_{1+2} \right)
\right]
\end{gather} -->
<example><h1>Dynamics of the Acrobot</h1>
<p>You can experiment with the Acrobot dynamics in <drake></drake>
using, e.g.</p>
<script>document.write(notebook_link('acrobot'))</script>
</example>
</subsection> <!-- acrobot eqs of motion -->
</section> <!-- acrobot -->
<section id="cart_pole"><h1>The Cart-Pole system</h1>
<p> The other model system that we will investigate here is the cart-pole
system, in which the task is to balance a simple pendulum around its
unstable equilibrium, using only horizontal forces on the cart. Balancing
the cart-pole system is used in many introductory courses in control,
including 6.003 at MIT, because it can be accomplished with simple linear
control (e.g. pole placement) techniques. In this chapter we will consider
the full swing-up and balance control problem, which requires a full
nonlinear control treatment.</p>
<figure>
<img width="50%" src="figures/cartpole.svg"/>
<figcaption>The Cart-Pole system. Click <a href="http://youtu.be/Bzq96V1yN5k">here to see a real robot</a>.</figcaption>
<todo> add swing-up + balance swf</todo>
</figure>
<p> The figure shows our parameterization of the system. $x$ is the
horizontal position of the cart, $\theta$ is the counter-clockwise angle
of the pendulum (zero is hanging straight down). We will use $\bq =
[x,\theta]^T$, and $\bx = [\bq,\dot\bq]^T$. The task is to stabilize the
unstable fixed point at $\bx = [0,\pi,0,0]^T.$ </p>
<subsection><h1>Equations of motion</h1>
<p>
The kinematics of the system are given by
\begin{equation}\bx_1 = \begin{bmatrix} x \\ 0
\end{bmatrix}, \quad \bx_2 = \begin{bmatrix} x + l\sin\theta \\
-l\cos\theta \end{bmatrix}. \end{equation}
The energy is given by
\begin{align} T=& \frac{1}{2} (m_c + m_p)\dot{x}^2 + m_p
\dot{x}\dot\theta l \cos{\theta} + \frac{1}{2}m_p l^2 \dot\theta^2 \\
U =& -m_p g l \cos\theta. \end{align}
The Lagrangian yields the equations of motion:
\begin{gather}
(m_c + m_p)\ddot{x} + m_p l \ddot\theta \cos\theta - m_p l \dot\theta^2 \sin\theta = f_x \\
m_p l \ddot{x} \cos\theta + m_p l^2 \ddot\theta + m_p g l \sin\theta = 0
\end{gather}
In standard, manipulator equation form:
$$\bM(\bq)\ddot\bq + \bC(\bq,\dot\bq)\dot\bq = \btau_g(\bq) +
\bB\bu,$$ using $\bq = [x,\theta]^T$, $\bu = f_x$, we have:
\begin{gather*}
\bM(\bq) = \begin{bmatrix} m_c + m_p & m_p l \cos\theta \\ m_p l
\cos\theta & m_p l^2 \end{bmatrix}, \quad \bC(\bq,\dot{\bq}) =
\begin{bmatrix} 0 & -m_p l \dot\theta \sin\theta \\ 0 & 0
\end{bmatrix}, \\
\btau_g(\bq) = \begin{bmatrix} 0 \\ - m_p gl \sin\theta \end{bmatrix},
\quad \bB = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\end{gather*}
In this case, it is particularly easy to solve directly for the accelerations:
\begin{align}
\ddot{x} =& \frac{1}{m_c + m_p \sin^2\theta}\left[ f_x + m_p \sin\theta (l \dot\theta^2 + g\cos\theta)\right] \label{eq:ddot_x}\\
\ddot{\theta} =& \frac{1}{l(m_c + m_p \sin^2\theta)} \left[ -f_x
\cos\theta - m_p l \dot\theta^2 \cos\theta \sin\theta - (m_c + m_p) g \sin\theta \right] \label{eq:ddot_theta}
\end{align}
In some of the analysis that follows, we will study the form of
the equations of motion, ignoring the details, by arbitrarily setting
all constants to 1:
\begin{gather}
2\ddot{x} + \ddot\theta \cos\theta - \dot\theta^2 \sin\theta = f_x \label{eq:simple}\\
\ddot{x}\cos\theta + \ddot\theta + \sin\theta = 0. \label{eq:simple2}
\end{gather}
</p>
<example><h1>Dynamics of the Cart-Pole System</h1>
<p>You can experiment with the Cart-Pole dynamics in <drake></drake>
using, e.g.</p>
<script>document.write(notebook_link('acrobot'))</script>
</example>
</subsection> <!-- end eq of motion -->
</section> <!-- end cart-pole -->
<section><h1>Quadrotors</h1>
<p>Quadrotors have become immensely popular over the last few years --
advances in outrunner motors from the hobby community made them powerful,
light-weight, and inexpensive! They are strong enough to carry an
interesting payload (e.g. of sensors for mapping / photographing the
environment), but dynamic enough to move relatively quickly. The most
interesting dynamics start showing up at higher speeds, when the propellors
start achieving lift from the airflow across them due to horizontal motion,
or when they are close enough to the ground to experience significant
ground-effect, but we won't try to capture those effects (yet) here.</p>
<p>When the quadrotor revolution started to happen, I predicted that it
would be followed quickly by a realization that fixed-wing vehicles are
better for most applications. Propellors are almost optimally efficient
for producing thrust -- making quadrotors very efficient for hovering --
but to be efficient in forward flight you probably want to have an airfoil.
Wings are a really good idea! But I was wrong -- quadrotors have
completely dominated fixed-wings for commercial UAVs. Perhaps it's only
because they are easier to control? Maybe there is still hope...</p>
<subsection id="planar_quadrotor"><h1>The Planar Quadrotor</h1>
<p>We can get started with an extremely simple model of a quadrotor that
is restricted to live in the plane. In that case, it actually only needs
two propellors, but calling it a "birotor" doesn't have the same ring to
it. The equations of motion are almost trivial, since it is only a
single rigid body, and certainly fit into our standard manipulator
equations:
\begin{gather}
m \ddot{x} = -(u_1 + u_2)\sin\theta, \label{eq:quad_x}\\
m \ddot{y} = (u_1 + u_2)\cos\theta - mg, \label{eq:quad_y}\\
I \ddot\theta = r (u_1 - u_2) \label{eq:quad_theta}
\end{gather}
</p>
<figure> <img width="50%" src="figures/quadrotor2d.svg"/>
<figcaption>The Planar Quadrotor System (which we also refer to in the
code as "Quadrotor2D", to keep it next to "Quadrotor" in the directory
listing). The model parameters are mass, $m$, moment of inertia, $I$, and
the distance from the center to the base of the propellor, $r$.
</figcaption> </figure>
</subsection>
<subsection><h1>The Full 3D Quadrotor</h1>
<p>The dynamics of the 3D Quadrotor follow similarly. The only
complexity comes from handling the 3D rotations. The code implementing
the dynamics of the <a
href="https://github.com/RobotLocomotion/drake/blob/53571e1d65d1716d8b4b72ae8d3571ee89fa36cf/examples/quadrotor/quadrotor_plant.cc#L55"><code>QuadrotorPlant</code></a>
example in Drake is almost as readable as a LaTeX derivation. The most
interesting feature of this model that we include the
<i>moment</i> produced by the rotating propellers. Surprisingly, without
these moments, the system linearized about the hovering configuration is
actually not controllable.</p>
<p>The <code>QuadrotorPlant</code> example implements the simple model.
But often you may want to add extra features to the quadrotor model. For
instance, you might wish to add collision dynamics to land on the ground
or to bounce off obstacles, or you might want to add a pendulum to the
quadrotor to balance, or a suspended payload hanging off the bottom. In
these cases, rather than implement the equations by hand, it is much more
convenient to use Drake's main physics engine (in
<code>MultibodyPlant</code>). We just need to manually wire the
<a href="https://drake.mit.edu/doxygen_cxx/classdrake_1_1multibody_1_1_propeller.html"><code>Propeller</code></a>
forces into the <code>Diagram</code> (because <code>Propeller</code> is
not a concept supported in URDF nor SDF just yet). I've implemented both
versions side-by-side in the notebook.</p>
<script>document.write(notebook_link('acrobot'))</script>
<p>Note that by default, the orientation of each floating body in
<code>MultibodyPlant</code> is represented by a unit quaternion. If you
were to try to linearize the model using the quaternion representation of
state, without accounting for unit norm constraint, the linearization of
this model is also uncontrollable. To avoid this for now, I have manually
added a "roll-pitch-yaw" floating base to the model. This keeps
linearization simple but does introduce a singularity at "gimble lock".
I've tried to mostly avoid dealing with 3D rotations throughout these
notes, but I do have a few more notes about this <a
href="https://manipulation.csail.mit.edu/pick.html#floating-base">here</a>.</p>
</subsection>
</section> <!-- end quadrotor -->
<section><h1>Balancing</h1>
<p> For both the Acrobot and the Cart-Pole systems, we will begin by
designing a linear controller which can balance the system when it begins in
the vicinity of the unstable fixed point. To accomplish this, we will
linearize the nonlinear equations about the fixed point, examine the
controllability of this linear system, then using <a
href="lqr.html">linear quadratic regulator (LQR)</a> theory to design
our feedback controller.</p>
<subsection id="linearizing_manip"><h1>Linearizing the manipulator equations</h1>
<p>Although the equations of motion of both of these model systems are
relatively tractable, the forward dynamics still involve quite a few
nonlinear terms that must be considered in any linearization. Let's
consider the general problem of linearizing a system described by the
manipulator equations.</p>
<p>We can perform linearization around a fixed point, $(\bx^*, \bu^*)$,
using a Taylor expansion:
\begin{equation}
\dot\bx = {\bf f}(\bx,\bu) \approx {\bf f}(\bx^*,\bu^*) + \left[ \pd{\bf
f}{\bx}\right]_{\bx=\bx^*,\bu=\bu^*} (\bx - \bx^*) + \left[ \pd{\bf
f}{\bu}\right]_{\bx=\bx^*,\bu=\bu^*} (\bu - \bu^*)
\end{equation}
Let us consider the specific problem of linearizing the manipulator
equations around a (stable or unstable) fixed point. In this case, ${\bf
f}(\bx^*,\bu^*)$ is zero, and we are left with the standard linear
state-space form:
\begin{align}
\dot\bx =& \begin{bmatrix} \dot\bq \\ \bM^{-1}(\bq) \left[ \btau_g(\bq) + {\bf
B}(\bq)\bu - \bC(\bq,\dot\bq)\dot\bq \right]
\end{bmatrix},\\
\approx& {\bf A}_{lin} (\bx-\bx^*) + \bB_{lin} (\bu - \bu^*),
\end{align}
where ${\bf A}_{lin}$, and $\bB_{lin}$ are constant matrices. Let us
define $\bar\bx = \bx - \bx^*, \bar\bu = \bu - \bu^*$, and write
$$\dot{\bar\bx} = {\bf A}_{lin}\bar\bx + \bB_{lin}\bar\bu.$$ Evaluation of
the Taylor expansion around a fixed point yields the following, very
simple equations, given in block form by:
\begin{align}
{\bf A}_{lin} =& \begin{bmatrix} {\bf 0} & {\bf I} \\ \bM^{-1} \pd{\btau_g}{\bq} + \sum_{j} \bM^{-1}\pd{\bB_j}{\bq} u_j & {\bf 0} \end{bmatrix}_{\bx=\bx^*,\bu=\bu^*} \\
\bB_{lin} =& \begin{bmatrix} {\bf 0} \\ \bM^{-1} \bB
\end{bmatrix}_{\bx=\bx^*, \bu=\bu^*}
\end{align}
where $\bB_j$ is the $j$th column of $\bB$. Note that the term involving
$\pd{\bM^{-1}}{q_i}$ disappears because $\btau_g + \bB\bu - \bC\dot{\bq}$
must be zero at the fixed point. All of the $\bC\dot\bq$ derivatives drop
out, too, because $\dot{\bq}^* = 0$, and any terms with $\bC$ drop out as
well, since centripetal and centrifugal forces are zero when velocity
is zero. In many cases, including both the Acrobot and Cart-Pole systems
(but not the Quadrotors), the matrix $\bB(\bq)$ is a constant, so the
$\pd\bB{\bq}$ terms also drop out. </p>
<example><h1>Linearization of the Acrobot</h1>
<p>Linearizing around the (unstable) upright point, we have:
\begin{gather}
\left[\pd{\bf \tau_g}{\bq}\right]_{\bx=\bx^*} = \begin{bmatrix} g (m_1 l_{c1}
+ m_2 l_1 + m_2 l_{c2}) & m_2 g l_{c2} \\ m_2 g l_{c2} & m_2 g l_{c2}
\end{bmatrix}
\end{gather}
The linear dynamics follow directly from these equations and the manipulator
form of the Acrobot equations.</p>
</example>
<example id="linearize_cart_pole"><h1>Linearization of the Cart-Pole System</h1>
<p>Linearizing around the unstable fixed point in this system, we have:
\begin{gather}
\left[\pd{\btau_g}{\bq}\right]_{\bx=\bx^*} = \begin{bmatrix} 0 & 0 \\ 0
& m_p g l \end{bmatrix}
\end{gather}
Again, the linear dynamics follow simply.</p>
</example>
<p>Studying the properties of the linearized system can tell us some
things about the (local) properties of the nonlinear system. For
instance, having a strictly stable linearization implies local exponential
stability of the nonlinear system <elib>Khalil01</elib> (Theorem 4.15).
It's worth noting that having an unstable linearization also implies that
the system is locally unstable, but if the linearization is marginally
stable then one cannot conclude whether the nonlinear system is
asymptotically stable, stable i.s.L., or unstable (only that it is not
exponentially stable)<elib>Slotine90</elib>.</p>
</subsection> <!-- end linearizing -->
<subsection id="controllability"><h1>Controllability of linear systems</h1>
<definition><h1>Controllability</h1> A control system
of the form $$\dot{\bx} = {\bf f}(\bx,\bu)$$ is called
<em>controllable</em> if it is possible to construct an unconstrained
input signal, $\bu(t)$, $t \in [0,t_f],$ which will move the system from
any initial state to any final state in a finite interval of time, $0 < t
< t_f$ <elib>Ogata96</elib>. </definition>
<p> For the linear system $$\dot{\bx} = {\bf A}\bx + \bB\bu,$$ we can
integrate this linear system in closed form, so it is possible to derive
the exact conditions of controllability. In particular, for linear
systems it is sufficient to demonstrate that there exists a control input
which drives any initial condition to the origin. </p>
<subsubsection><h1>The special case of non-repeated
eigenvalues</h1>
<p> Let us first examine a special case, which falls short as a general
tool but may be more useful for understanding the intuition of
controllability. Let's perform an eigenvalue analysis of the system
matrix ${\bf A}$, so that: $${\bf A}{\bf v}_i = \lambda_i {\bf v}_i,$$
where $\lambda_i$ is the $i$th eigenvalue, and ${\bf v}_i$ is the
corresponding (right) eigenvector. There will be $n$ eigenvalues for the
$n \times n$ matrix ${\bf A}$. Collecting the (column) eigenvectors into
the matrix ${\bf V}$ and the eigenvalues into a diagonal matrix ${\bf
\Lambda}$, we have $${\bf A}{\bf V} = {\bf V}{\bf \Lambda}.$$ Here comes
our primary assumption: let us assume that each of these $n$ eigenvalues
takes on a distinct value (no repeats). With this assumption, it can be
shown that the eigenvectors ${\bf v}_i$ form a linearly independent
basis set, and therefore ${\bf V}^{-1}$ is well-defined.</p>
<p> We can continue our eigenmodal analysis of the linear system by
defining the modal coordinates, ${\bf r}$, with: $$\bx = {\bf V}{\bf
r},\quad \text{or}\quad {\bf r} = {\bf V}^{-1}\bx.$$ In modal
coordinates, the dynamics of the linear system are given by $$\dot{\bf
r} = {\bf V}^{-1} {\bf A} {\bf V} {\bf r} + {\bf V}^{-1} \bB \bu = {\bf
\Lambda} {\bf r} + {\bf V}^{-1}\bB \bu.$$ This illustrates the power of
modal analysis; in modal coordinates, the dynamics diagonalize yielding
independent linear equations: $$\dot{r}_i = \lambda_i r_i + \sum_j
\beta_{ij} u_j,\quad {\bf \beta} = {\bf V}^{-1} \bB.$$</p>
<p> Now the concept of controllability becomes clear. Input $j$ can
influence the dynamics in modal coordinate $i$ if and only if ${\bf
\beta}_{ij} \neq 0$. In the special case of non-repeated eigenvalues,
having control over each individual eigenmode is sufficient to (in
finite time) regulate all of the eigenmodes<elib>Ogata96</elib>.
Therefore, we say that the system is controllable if and only if
$$\forall i, \exists j \text{ such that }\beta_{ij} \neq 0.$$</p>
</subsubsection>
<subsubsection><h1>A general solution</h1>
<details><summary>Included only for completeness. Click to expand the details.</summary>
<p> A more general solution to the controllability issue, which removes
our assumption about the eigenvalues, can be obtained by examining the
time-domain solution of the linear equations. The solution of this
system is $$\bx(t) = e^{{\bf A}t} \bx(0) + \int_0^{t} e^{{\bf A}(t -
\tau)} \bB \bu(\tau) d\tau.$$ Without loss of generality, lets consider
the that the final state of the system is zero. Then we have: $$\bx(0)
= - \int_0^{t_f} e^{-{\bf A}\tau}\bB \bu(\tau) d\tau.$$ You might be
wondering what we mean by $e^{{\bf A}t}$; a scalar raised to the power
of a matrix..? Recall that $e^{z}$ is actually defined by a convergent
infinite sum: $$e^{z} =1 + z + \frac{1}{2} z^2 + \frac{1}{6} z^3 + ...
.$$ The notation $e^{{\bf A}t}$ uses the same definition: $$e^{{\bf A}t}
= {\bf I} + {\bf A}t + \frac{1}{2}({\bf A}t)^2 + \frac{1}{6}({\bf A}t)^3
+ ... .$$ Not surprisingly, this has many special forms. For instance,
if ${\bf A}$ is diagonalizable, $e^{{\bf A}t} = {\bf
V}e^{{\bf\Lambda}t}{\bf V}^{-1},$ where ${\bf A} = {\bf V \Lambda
V}^{-1}$ is the eigenvalue decomposition of ${\bf A}$
<elib>Strang98</elib>. The particular form we will use here is
$$e^{-{\bf A}\tau} = \sum_{k=0}^{n-1} \alpha_k(\tau) {\bf A}^k.$$ This
is a particularly surprising form, because the infinite sum above is
represented by this finite sum; the derivation uses Sylvester's Theorem
<elib>Ogata96</elib>, <elib>Chen98a</elib>. Then we have,
\begin{align*} \bx(0) =& - \sum_{k=0}^{n-1} {\bf A}^k \bB \int_0^{t_f}
\alpha_k(\tau) \bu(\tau) d\tau \\ =& -\sum_{k=0}^{n-1} {\bf A}^k \bB w_k
\text{, where } \bw_k = \int_0^{t_f} \alpha_k(\tau) \bu(\tau) d\tau \\ =& -
\begin{bmatrix} \bB & {\bf AB} & {\bf A}^2\bB & \cdots & {\bf
A}^{n-1}\bB \end{bmatrix}_{n \times (nm)} \begin{bmatrix} \bw_0 \\ \bw_1 \\
\bw_2 \\ \vdots \\ \bw_{n-1} \end{bmatrix} \end{align*} The matrix
containing the vectors $\bB$, ${\bf AB}$, ... ${\bf A}^{n-1}\bB$ is
called the controllability matrix. In order for the system to be
controllable, for every initial condition $\bx(0)$ we must be able to
find the corresponding vector ${\bf w}$. This is only possible when the
rows of the controllability matrix are linearly independent. Therefore,
the condition of controllability is that this controllability matrix is
full rank. In <drake></drake> you can obtain the controllability matrix for a linear system using <a href="https://drake.mit.edu/doxygen_cxx/group__control__systems.html#gaedff9f0a8ddce5d29888ed611ce8bee2"><code>ControllabilityMatrix</code></a>, or check the rank condition directly using <a href="https://drake.mit.edu/doxygen_cxx/group__control__systems.html#ga8bc6169b50b1b127ed5f1b70afcb64ca"><code>IsControllable</code></a>.</p>
<p>Note that a linear feedback to change the eigenvalues of the
eigenmodes is not sufficient to accomplish our goal of getting to the
goal in finite time. In fact, the open-loop control to reach the goal
is easily obtained with a final-value LQR problem, and (for ${\bf
R}={\bf I}$) is actually a simple function of the controllability
Grammian<elib>Chen98a</elib>.</p>
</details>
<p></p>
</subsubsection>
<subsubsection><h1>Controllability vs. underactuated</h1>
<p> Analysis of the controllability of both the Acrobot and Cart-Pole
systems reveals that the linearized dynamics about the upright are, in
fact, controllable. This implies that the linearized system, if started
away from the zero state, can be returned to the zero state in finite
time. This is potentially surprising - after all the systems are
underactuated. For example, it is interesting and surprising that the
Acrobot can balance itself in the upright position without having a
shoulder motor.</p>
<p> The controllability of these model systems demonstrates an extremely
important, point: <em>An underactuated system is not necessarily an
uncontrollable system.</em> Underactuated systems cannot follow
arbitrary trajectories, but that does not imply that they cannot arrive
at arbitrary points in state space. However, the trajectory required to
place the system into a particular state may be arbitrarily complex.</p>
<p> The controllability analysis presented here is for linear
time-invariant (LTI) systems. A comparable analysis exists for linear
time-varying (LTV) systems. We will even see extensions to nonlinear
systems; although it will often be referred to by the synonym of
"reachability" analysis.</p>
</subsubsection>
</subsection> <!-- end controllability -->
<subsection><h1>LQR feedback</h1>
<p> Controllability tells us that a trajectory to the fixed point exists,
but does not tell us which one we should take or what control inputs cause
it to occur. Why not? There are potentially infinitely many solutions.
We have to pick one.</p>
<p> The tools for controller design in linear systems are very advanced.
In particular, as we describe in the <a href="lqr.html">linear
optimal control chapter</a>, one can easily design an optimal feedback
controller for a regulation task like balancing, so long as we are
willing to linearize the system around the operating point and define
optimality in terms of a quadratic cost function: $$J(\bx_0) =
\int_0^\infty \left[ \bx^T(t) \bQ \bx(t) + \bu^T(t) \bR \bu(t) \right]dt,
\quad \bx(0)=\bx_0, \bQ=\bQ^T>0, \bR=\bR^T>0.$$ The linear feedback
matrix $\bK$ used as $$\bu(t) = - \bK\bx(t),$$ is the so-called optimal
linear quadratic regulator (LQR). Even without understanding the
detailed derivation, we can quickly become practitioners of LQR.
<drake></drake> provides a function, <blockquote><code>K =
LinearQuadraticRegulator(A, B, Q, R)</code></blockquote> for linear
systems. It also provides a version <blockquote><code>controller =
LinearQuadraticRegulator(system, context, Q, R)</code></blockquote> that
will linearize the system for you around an equilibrium and return the
controller (in the original coordinates). Therefore, to use LQR, one
simply needs to define the symmetric positive-definite cost matrices,
${\bf Q}$ and ${\bf R}$. In their most common form, ${\bf Q}$ and ${\bf
R}$ are positive diagonal matrices, where the entries $Q_{ii}$ penalize
the relative errors in state variable $x_i$ compared to the other state
variables, and the entries $R_{ii}$ penalize actions in $u_i$.</p>
<example><h1>LQR for the Acrobot and Cart-Pole</h1>
<p>Take a minute to play around with the LQR controller for the
Acrobot and the Cart-Pole</p>
<script>document.write(notebook_link('acrobot'))</script>
<p>Make sure that you take a minute to look at the code which runs
during these examples. Can you set the ${\bf Q}$ and ${\bf R}$ matrices
differently, to improve the performance?</p>
</example>
<p> Simulation of the closed-loop response with LQR feedback shows that
the task is indeed completed - and in an impressive manner. Oftentimes
the state of the system has to move violently away from the origin in
order to ultimately reach the origin. Further inspection reveals the
(linearized) closed-loop dynamics are in fact non-minimum phase (acrobot
has 3 right-half zeros, cart-pole has 1).</p>
<example><h1>LQR for Quadrotors</h1>
<p>LQR works essentially out of the box for Quadrotors, if linearized
around a nominal fixed point (where the non-zero thrust from the
propellers is balancing gravity).</p>
<script>document.write(notebook_link('acrobot'))</script>
<p>or <a target="_blank" href="data/quadrotor2d_lqr.html">Click here for
the animation</a>.</p>
</example>
<p> Note that LQR, although it is optimal for the linearized system, is
not necessarily the best linear control solution for maximizing basin of
attraction of the fixed-point. The theory of <em>robust
control</em>(e.g., <elib>Zhou97</elib>), which explicitly takes into
account the differences between the linearized model and the nonlinear
model, will produce controllers which outperform our LQR solution in this
regard.</p>
</subsection>
</section> <!-- end balancing -->
<section id="partial_feedback_linearization"><h1>Partial feedback linearization</h1>
<p> In the introductory chapters, we made the point that the underactuated
systems are not feedback equivalent to $\ddot{q} = \bu$. Although we
cannot always simplify the full dynamics of the system, it is still possible
to linearize a portion of the system dynamics. The technique is called
<em>partial feedback linearization</em>.</p>
<p> Consider the cart-pole example. The dynamics of the cart are affected
by the motions of the pendulum. If we know the model, then it seems quite
reasonable to think that we could create a feedback controller which would
push the cart in exactly the way necessary to counter-act the dynamic
contributions from the pendulum - thereby linearizing the cart dynamics.
What we will see, which is potentially more surprising, is that we can also
use a feedback controller for the cart to feedback linearize the dynamics
of the passive pendulum joint.</p>
<p> We'll use the term <em>collocated</em> partial feedback linearization to
describe a controller which linearizes the dynamics of the actuated joints.
What's more surprising is that it is often possible to achieve
<em>non-collocated</em> partial feedback linearization - a controller which
linearizes the dynamics of the unactuated joints. The treatment presented
here follows from <elib>Spong94a</elib>.</p>
<subsection><h1>PFL for the Cart-Pole System</h1>
<subsubsection><h1>Collocated</h1>
<p> Starting from the equations \ref{eq:simple} and \ref{eq:simple2}, we
have
\begin{gather*}
\ddot\theta = -\ddot{x}c - s \\
% \ddot\theta = -\frac{1}{l} (\ddot{x} c + g s)
\ddot{x}(2-c^2) - sc - \dot\theta^2 s = f_x
\end{gather*}
Therefore, applying the feedback control
\begin{equation}
f_x = (2 - c^2) \ddot{x}^d - sc - \dot\theta^2 s
% f = (m_c + m_p) u + m_p (u c + g s) c - m_p l \dot\theta^2 s
\end{equation}
results in
\begin{align*}
\ddot{x} =& \ddot{x}^d \\ \ddot{\theta} =& -\ddot{x}^dc - s,
\end{align*}
which are valid globally.</p>
<p>Look carefully at the resulting equations. Of course, it says that we can impose whatever accelerations we like on the cart. But even the resulting equations of the pole happen to take a nice form here: they have been reduced to the equations of the simple pendulum (without a cart), where the torque input is now given instead by $\ddot{x}c$. It's as if we have a simple pendulum with torque control, except our command is modulated by a $\cos\theta$ term, and this $\cos\theta$ term is fundamental -- it's true that our control authority goes to zero when the pole is horizontal, because no amount of force on the cart in that configuration can act like a torque on the pole.</p>
</subsubsection>
<subsubsection><h1>Non-collocated</h1>
<p> Starting again from equations \ref{eq:simple} and \ref{eq:simple2},
we have
\begin{gather*}
\ddot{x} = -\frac{\ddot\theta + s}{c} \\
\ddot\theta(c - \frac{2}{c}) - 2 \tan\theta - \dot\theta^2 s = f_x
\end{gather*}
Applying the feedback control
\begin{equation}
f_x = (c - \frac{2}{c}) \ddot\theta^d - 2 \tan\theta - \dot\theta^2 s
\end{equation} results in
\begin{align*}
\ddot\theta =& \ddot\theta^d \\ \ddot{x} =&
-\frac{1}{c} \ddot\theta^d - \tan\theta.
\end{align*}
Note that this expression is only valid when $\cos\theta \neq 0$. Once again, we know that the force cannot create a torque when
the pole is perfectly horizontal. In fact, the controller we have written will "blow-up" -- requesting infinite force at $\cos\theta = 0$; so make sure you saturate the command before you implement it on hardware (or even in simulation). Although it may come as a small consolation, at least we have that $(c - \frac{2}{c})$ never goes to zero; in fact you can check for yourself that $|c - \frac{2}{c}| \ge 1$.</p>
</subsubsection>
</subsection> <!-- end PFL for the cart-pole -->
<subsection><h1>General form</h1>
<p> For systems that are trivially underactuated (torques on some joints,
no torques on other joints), we can, without loss of generality,
reorganize the joint coordinates in any underactuated system described by
the manipulator equations into the form:
\begin{align}
\bM_{11} \ddot{\bq}_1 + \bM_{12} \ddot{\bq}_2 &= \btau_1, \label{eq:passive_dyn}\\
\bM_{21} \ddot{\bq}_1 + \bM_{22} \ddot{\bq}_2 &= \btau_2 + \bu, \label{eq:active_dyn}
\end{align}
with $\bq \in \Re^n$, $\bq_1 \in \Re^l$, $\bq_2 \in \Re^m$, $l=n-m$.
$\bq_1$ represents all of the passive joints, and $\bq_2$ represents all
of the actuated joints, and the $\btau = \btau_g - \bC\dot\bq$ terms
capture all of the Coriolis and gravitational terms, and $$\bM(\bq) =
\begin{bmatrix} \bM_{11} & \bM_{12} \\ \bM_{21} & \bM_{22}
\end{bmatrix}.$$ Fortunately, because $\bM$ is uniformly (e.g. for all
$\bq$) positive definite, $\bM_{11}$ and $\bM_{22}$ are also positive
definite, by the <a href="https://en.wikipedia.org/w/index.php?title=Schur_complement">Schur complement condition for positive definiteness</a>.</p>
<subsubsection><h1>Collocated linearization</h1>
<p> Performing the same substitutions into the full manipulator
equations, we get:
\begin{gather}
\ddot\bq_1 = \bM_{11}^{-1} \left[ \btau_1 - \bM_{12} \ddot\bq_2
\right] \\
(\bM_{22} - \bM_{21} \bM_{11}^{-1} \bM_{12})
\ddot\bq_2 - \btau_2 + \bM_{21} \bM_{11}^{-1} \btau_1 = \bu
\end{gather}
It can be easily shown that the matrix $(\bM_{22} - \bM_{21}
\bM_{11}^{-1} \bM_{12})$ is invertible<elib>Spong94a</elib>; we can see
from inspection that it is symmetric. PFL follows naturally, and is
valid globally.</p>
</subsubsection>
<subsubsection><h1>Non-collocated linearization</h1>
<p>
\begin{gather}
\ddot\bq_2 = \bM_{12}^+ \left[ \btau_1 - \bM_{11} \ddot\bq_1
\right] \\
(\bM_{21} - \bM_{22} \bM_{12}^+ \bM_{11})
\ddot\bq_1 - \btau_2 + \bM_{22} \bM_{12}^+ \btau_1 = \bu
\end{gather}
where $\bM_{12}^+$ is a Moore-Penrose pseudo-inverse. This inverse
provides a unique solution when the rank of $\bM_{12}$ equals $l$, the
number of passive degrees of freedom in the system (it cannot be more,
since the matrix only has $l$ rows). The rank condition in this
context is sometimes called the property of "Strong Inertial Coupling". It is state
dependent; in the cart-pole example above $\bM_{12} = \cos\theta$ and drops rank exactly when $\cos\theta = 0$. A system has Global Strong Inertial Coupling if it exhibits
Strong Inertial Coupling in every state.</p>
</subsubsection>
<subsubsection id="task_space_pfl"><h1>Task-space partial feedback linearization</h1>
<p> In general, we can define some combination of active and passive
joints that we would like to control. This combination is sometimes
called a "task space". <todo>cite some task space refs here?</todo>
Consider an output function of the form, $$\by = \bh(\bq),$$ with ${\bf
y} \in \Re^p$, which defines the task space. Define $\bH_1 =
\frac{\partial \bh}{\partial \bq_1}$, $\bH_2 = \frac{\partial
\bh}{\partial \bq_2}$, $\bH = [\bH_1,\bH_2]$.</p>
<theorem><h1>Task Space PFL</h1>
<p> If the actuated joints are commanded so that \begin{equation}
\ddot\bq_2 = \bar\bH^+ \left [\ddot\by^d - \dot\bH\dot\bq - \bH_1
\bM_{11}^{-1}\btau_1 \right], \label{eq:q2cmd} \end{equation} where
$\bar{\bH} = \bH_2 - \bH_1 \bM_{11}^{-1} \bM_{12}.$ and $\bar\bH^+$ is
the right Moore-Penrose pseudo-inverse, $$\bar\bH^+ = \bar\bH^T (\bar\bH
\bar\bH^T)^{-1},$$ then we have \begin{equation} \ddot\by =
\ddot\by^d.\end{equation} subject to \begin{equation}\text{rank}\left(\bar{\bH}
\right) = p, \label{eq:rank_condition}\end{equation} </p>
<p><b>Proof Sketch.</b>
Differentiating the output function we have \begin{gather*} \dot\by =
\bH \dot\bq \\ \ddot\by = \dot\bH \dot\bq + \bH_1 \ddot\bq_1 + \bH_2
\ddot\bq_2. \end{gather*} Solving \ref{eq:passive_dyn} for the dynamics
of the unactuated joints we have: \begin{equation} \ddot\bq_1 =
\bM_{11}^{-1} (\btau_1 - \bM_{12} \ddot\bq_2) \label{eq:q1cmd}
\end{equation} Substituting, we have \begin{align} \ddot\by =& \dot\bH
\dot\bq + \bH_1 \bM_{11}^{-1}(\btau_1 - \bM_{12}\ddot\bq_2) + \bH_2
\ddot\bq_2 \\ =& \dot\bH \dot\bq + \bar{\bH} \ddot\bq_2 + \bH_1
\bM_{11}^{-1}\btau_1 \\ =& \ddot\by^d \end{align} Note that the last
line required the rank condition ($\ref{eq:rank_condition}$) on
$\bar\bH$ to ensure that the rows of $\bar{\bH}$ are linearly
independent, allowing $\bar\bH \bar\bH^+ = \bI$.</p>
</theorem>
<p> In order to execute a task space trajectory one could command
$$\ddot\by^d = \ddot{\by}^d + \bK_d (\dot{\by}^d - \dot\by)
+ \bK_p (\by^d -\by).$$ Assuming the internal dynamics are stable,
this yields converging error dynamics when
$\bK_p,\bK_d > 0$<elib>Slotine90</elib>, which implies $y(t) \rightarrow y^d(t)$. For a position control robot,
the acceleration command of ($\ref{eq:q2cmd}$) suffices. Alternatively,
a torque command follows by substituting ($\ref{eq:q2cmd}$) and
($\ref{eq:q1cmd}$) into ($\ref{eq:active_dyn}$).</p>
<example><h1>End-point trajectory following with the Cart-Pole system</h1>
<p> Consider the task of trying to track a desired kinematic trajectory
with the endpoint of pendulum in the cart-pole system. With one
actuator and kinematic constraints, we might be hard-pressed to track a
trajectory in both the horizontal and vertical coordinates. But we can
at least try to track a trajectory in the vertical position of the
end-effector.</p>
<p>Using the task-space PFL derivation, we have:
\begin{gather*}
y = h(\bq) = -l \cos\theta \\
\dot{y} = l \dot\theta \sin\theta
\end{gather*}
If we define a desired trajectory:
$$y^d(t) = \frac{l}{2} + \frac{l}{4} \sin(t),$$
then the task-space controller is easily implemented using the
derivation above.</p>
<todo>add results here</todo>
</example>
<p> The task space derivation above provides a convenient generalization
of the partial feedback linearization (PFL) <elib>Spong94a</elib>, which
encompasses both the collocated and non-collocated results. If we
choose $\by = \bq_2$ (collocated), then we have $$\bH_1 = 0, \bH_2 =
\bI, \dot\bH = 0, \bar\bH = \bI, \bar\bH^+ = \bI.$$ From this, the
command in ($\ref{eq:q2cmd}$) reduces to $\ddot\bq_2 = \ddot\bq_2^d$.
The actuator command is then $$ \bu = \bM_{21} \bM_{11}^{-1} (\btau_1 -
\bM_{12} \ddot\bq_2^d) + \bM_{22} \ddot\bq_2^d - \btau_2,$$ and the rank
condition ($\ref{eq:rank_condition}$) is always met.</p>
<p> If we choose ${\bf y} = \bq_1$ (non-collocated), we have $$\bH_1 =
\bI, \bH_2 = 0, \dot\bH = 0, \bar{\bH}=-\bM_{11}^{-1}\bM_{12}.$$ The
rank condition ($\ref{eq:rank_condition}$) requires that
$\text{rank}(\bM_{12}) = l$, in which case we can write
$\bar{\bH}^+=-\bM_{12}^+\bM_{11}$, reducing the rank condition to
precisely the "Strong Inertial Coupling" condition described in
<elib>Spong94a</elib>. Now the command in ($\ref{eq:q2cmd}$) reduces to
\begin{equation} \ddot\bq_2 = \bM_{12}^+ \left [ \btau_1 -
\bM_{11}\ddot\bq_1^d \right] \label{eq:q2ddnonco} \end{equation} The
actuator command is found by substituting ($\ref{eq:q2ddnonco}$) into
($\ref{eq:active_dyn}$), yielding the same results as in
<elib>Spong94a</elib>.</p>
</subsubsection> <!-- task space PFL -->
</subsection>
</section>
<section><h1>Swing-up control</h1>
<subsection><h1>Energy shaping</h1>
<p> Recall in the last chapter, <a href="pend.html#energy_shaping">we used
energy shaping to swing up the simple pendulum</a>. This idea turns out
to be a bit more general than just for the simple pendulum. As we will
see, we can use similar concepts of "energy shaping" to produce swing-up
controllers for the acrobot and cart-pole systems. It's important to note
that it only takes one actuator to change the total energy of a
system.</p>
<p> Although a large variety of swing-up controllers have been proposed
for these model systems (c.f. <elib>Fantoni02</elib>,
<elib>Araki05</elib>, <elib>Xin04</elib>, <elib>Spong94</elib>,
<elib>Mahindrakar05</elib>, <elib>Berkemeier99</elib>,
<elib>Murray91</elib>, <elib>Lai06</elib>), the energy shaping controllers
tend to be the most natural to derive and perhaps the most well-known.</p>
</subsection>
<subsection id="cartpole_swingup"><h1>Cart-Pole</h1>
<p> Let's try to apply the energy-shaping ideas to the cart-pole system.
The basic idea, from <elib>Chung95</elib>, is to use collocated PFL to
simplify the dynamics, use energy shaping to regulate the pendulum to its
homoclinic orbit, then to add a few terms to make sure that the cart stays
near the origin. This is a bit surprising... if we want to control the
pendulum, shouldn't we use the non-collocated version? Actually, we
ultimately want to control both the cart and the pole, and we will build
on the collocated version both because it avoids inverting the
$\cos\theta$ term that can go to zero and because (when all parameters are
set to 1) we were left with a particularly simple form of the equations:
\begin{gather} \ddot{x} = u \\ \ddot\theta = - u c - s. %\ddot\theta =
-\frac{1}{l}( u c + g s ) \end{gather} The first equation is clearly
simple, but even the second equation is exactly the equations of a
<i>decoupled</i> pendulum, just with a slightly odd control input that is
modulated by $\cos\theta$.</p>
<p>Let us regulate the energy of this decoupled pendulum using energy
shaping. The energy of the pendulum (a unit mass, unit length, simple
pendulum in unit gravity) is given by: $$E(\bx) = \frac{1}{2} \dot\theta^2
- \cos\theta.$$ <!-- E = \frac{1}{2}ml^2\dot\theta^2 - mgl\cos\theta -->
The desired energy, equivalent to the energy at the desired fixed-point,
is $$E^d = 1.$$ <!-- E^d = m_p g l --> Again defining $\tilde{E}(\bx) =
E(\bx) - E^d$, we now observe that
\begin{align*}
\dot{\tilde{E}}(\bx) =& \dot{E}(\bx) = \dot\theta \ddot\theta +
\dot\theta s \\
% \dot\tilde{E} = ml^2 \dot\theta \ddot\theta + mgl s
=& \dot\theta [ -uc - s] + \dot\theta s \\
% - ml \dot\theta [ u c + g s ] + mgl s
=& - u\dot\theta \cos\theta.
% - ml u \dot\theta c
\end{align*}
</p>
<p>Therefore, if we design a controller of the form $$u = k
\dot\theta\cos\theta \tilde{E},\quad k>0$$ the result is $$\dot{\tilde{E}}
= - k \dot\theta^2 \cos^2\theta \tilde{E}.$$ <!-- times ml --> This
guarantees that $| \tilde{E} |$ is non-increasing, but isn't quite enough
to guarantee that it will go to zero. For example, if $\theta =
\dot\theta = 0$, the system will never move. However, if we have that
$$\int_0^t \dot\theta^2(t') \cos^2 \theta(t') dt' \rightarrow \infty,\quad
\text{as } t \rightarrow \infty,$$ then we have $\tilde{E}(t) \rightarrow
0$. This condition, a version of the LaSalle's theorem that we will
develop in our notes on Lyapunov methods, is satisfied for all but the
trivial constant trajectories at fixed points.</p>
<p> Now we return to the full system dynamics (which includes the cart).
<elib>Chung95</elib> and <elib>Spong96</elib> use the simple pendulum
energy controller with an addition PD controller designed to regulate the
cart: $$\ddot{x}^d = k_E \dot\theta \cos\theta \tilde{E} - k_p x - k_d
\dot{x}.$$ <elib>Chung95</elib> provides a proof of convergence for this
controller with some nominal parameters. <elib>Fantoni02</elib> provides
a slightly different controller derived directly from a Lyapunov
argument.</p>
<figure>
<img width="70%" src="figures/cartpole_swingup_phase.svg"/>
<figcaption>Cart-Pole Swingup: Example phase plot of the pendulum
subsystem using energy shaping control. The controller drives the system
to the homoclinic orbit, then switches to an LQR balancing controller near
the top.</figcaption>
</figure>
</subsection>
<subsection><h1>Acrobot</h1>
<p> Swing-up control for the acrobot can be accomplished in much the same
way. <elib>Spong94</elib> - pump energy. Clean and simple. No proof.
Slightly modified version (uses arctan instead of sat) in
<elib>Spong95</elib>. Clearest presentation in <elib>Spong96</elib>.</p>
<p> Use collocated PFL. ($\ddot{q}_2 = \ddot{q}_2^d$). $$E(\bx) =
\frac{1}{2}\dot\bq^T\bM\dot{\bq} + U(\bx).$$ $$ E_d = U(\bx^*).$$
$$\bar{u} = \dot{q}_1 \tilde{E}.$$ $$\ddot{q}_2^d = - k_1 q_2 - k_2
\dot{q}_2 + k_3 \bar{u},$$</p>
<p> Extra PD terms prevent proof of asymptotic convergence to homoclinic
orbit. Proof of another energy-based controller in
<elib>Xin04</elib>.</p>
</subsection>
<subsection><h1>Discussion</h1>
<p> The energy shaping controller for swing-up presented here is a pretty
faithful representative from the field of nonlinear underactuated control.
Typically these control derivations require some clever tricks for
simplifying or canceling out terms in the nonlinear equations, then some
clever Lyapunov function to prove stability. In these cases, PFL was used
to simplify the equations, and therefore the controller design.</p>
<p>We will see another nice example of changing coordinates in the
nonlinear equations to make the problem easier when we study "<a
href="trajopt.html#differential_flatness">differential flatness</a>" for
trajectory optimization. This approach is perhaps most famous these days
for very dynamic trajectory planning with quadrotors.</p>
<p>These controllers are important, representative, and relevant. But
clever tricks with nonlinear equations seem to be fundamentally limited.
Most of the rest of the material presented in this book will emphasize