<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc='yes'?>
<?rfc tocdepth='5'?>
<?rfc symrefs='yes'?>
<?rfc compact='yes'?>
<?rfc subcompact='no'?>
<rfc ipr="pre5378Trust200902" category="std" obsoletes="5245"
docName="draft-ietf-ice-rfc5245bis-latest"
>
<front>
<title abbrev="ICE">Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal</title>
<author fullname="Ari Keranen" initials="A." surname="Keranen">
<organization abbrev="Ericsson">Ericsson</organization>
<address>
<postal>
<street>Hirsalantie 11</street>
<city>02420 Jorvas</city>
<country>Finland</country>
</postal>
<email>ari.keranen@ericsson.com</email>
</address>
</author>
<author fullname="Christer Holmberg" initials="C." surname="Holmberg">
<organization abbrev="Ericsson">Ericsson</organization>
<address>
<postal>
<street>Hirsalantie 11</street>
<city>02420 Jorvas</city>
<country>Finland</country>
</postal>
<email>christer.holmberg@ericsson.com</email>
</address>
</author>
<author initials="J.R." surname="Rosenberg"
fullname="Jonathan Rosenberg">
<organization>jdrosen.net</organization>
<address>
<postal>
<street/>
<city>Monmouth</city> <region>NJ</region>
<country>US</country>
</postal>
<email>jdrosen@jdrosen.net</email>
<uri>http://www.jdrosen.net</uri>
</address>
</author>
<date year="2017" />
<area>ART</area>
<workgroup>ICE</workgroup>
<keyword>NAT</keyword>
<abstract>
<t>This document describes a protocol for Network Address
Translator (NAT) traversal for UDP-based multimedia. This
protocol is called Interactive Connectivity Establishment
(ICE). ICE makes use of the Session Traversal Utilities
for NAT (STUN) protocol and its extension, Traversal Using
Relay NAT (TURN).</t>
<t> This document obsoletes RFC 5245. </t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>Protocols establishing multimedia sessions between peers typically
involve exchanging IP addresses and ports for the media sources and
sinks. However, this poses challenges when operated through Network Address
Translators (NATs) <xref target="RFC3235"/>. These protocols also seek to
create a media flow directly between participants, so that there is
no application layer intermediary between them. This is done to
reduce media latency, decrease packet loss, and reduce the
operational costs of deploying the application. However, this is
difficult to accomplish through NAT. A full treatment of the reasons
for this is beyond the scope of this specification. </t>
<t> Numerous solutions have been defined for allowing these protocols
to operate through NAT. These include Application Layer Gateways
(ALGs), the <xref target="RFC3303"> Middlebox Control Protocol</xref>,
the original <xref target="RFC3489">Simple Traversal of UDP Through
NAT (STUN)</xref> specification, and <xref target="RFC3102">Realm
Specific IP</xref> <xref target="RFC3103"/> along with session
description extensions needed to make them work, such as the Session
Description Protocol (SDP) <xref target="RFC4566"/> attribute for the
Real Time Control Protocol (RTCP) <xref
target="RFC3605"/>. Unfortunately, these techniques all have pros and
cons, which make each one optimal in some network topologies, but a
poor choice in others. The result is that administrators and
implementors are making assumptions about the topologies of the
networks in which their solutions will be deployed. This introduces
complexity and brittleness into the system. What is needed is a single
solution that is flexible enough to work well in all situations. </t>
<t> This specification defines Interactive Connectivity Establishment
(ICE) as a technique for NAT traversal for UDP-based media streams
(though ICE has been extended to handle other transport protocols,
such as TCP <xref target="RFC6544"/>). ICE works by exchanging a
multiplicity of IP addresses and ports which are then tested for
connectivity by peer-to-peer connectivity checks. The IP addresses and
ports are exchanged via mechanisms (for example, as part of an
offer/answer exchange) and the connectivity checks are performed using
the Session Traversal Utilities for NAT (STUN) specification <xref
target="RFC5389"/>. ICE also makes use of Traversal Using Relays
around NAT (TURN) <xref target="RFC5766"/>, an extension to STUN.
Because ICE exchanges a multiplicity of IP addresses and ports for
each media stream, it also allows for address selection for multihomed
and dual-stack hosts, and for this reason it deprecates <xref
target="RFC4091"/> and <xref target="RFC4092"/>.
</t>
</section>
<section title="Overview of ICE">
<t> In a typical ICE deployment, we have two endpoints (known as ICE
AGENTS) that want to communicate. They are able to communicate
indirectly via some signaling protocol (such as SIP), by which they
can exchange ICE candidates. Note that ICE is not intended for NAT
traversal for the signaling protocol, which is assumed to be provided
via another mechanism. At the beginning of the ICE process, the agents
are ignorant of their own topologies. In particular, they might or
might not be behind a NAT (or multiple tiers of NATs). ICE allows the
agents to discover enough information about their topologies to
potentially find one or more paths by which they can communicate.
</t>
<t>
<xref target="fig-ice-ref-arch"/> shows a typical environment for ICE
deployment. The two endpoints are labelled L and R (for left and
right, which helps visualize call flows). Both L and R are behind
their own respective NATs, though they may not be aware of it. The type
of NAT and its properties are also unknown. Agents L and R are capable
of engaging in a candidate exchange process, whose purpose is to set
up a media session between L and R. Typically, this exchange will
occur through a signaling (e.g., SIP) server.
</t>
<t>
In addition to the agents, a signaling server and NATs, ICE is
typically used in concert with STUN or TURN servers in the
network. Each agent can have its own STUN or TURN server, or they can
be the same.
</t>
<figure title="ICE Deployment Scenario" anchor="fig-ice-ref-arch"
align="center"><artwork>
<![CDATA[
+---------+
+--------+ |Signaling| +--------+
| STUN | |Server | | STUN |
| Server | +---------+ | Server |
+--------+ / \ +--------+
/ \
/ \
/ <- Signaling -> \
/ \
+--------+ +--------+
| NAT | | NAT |
+--------+ +--------+
/ \
/ \
+-------+ +-------+
| Agent | | Agent |
| L | | R |
+-------+ +-------+
]]></artwork></figure>
<t>The basic idea behind ICE is as follows: each agent has a variety
of candidate TRANSPORT ADDRESSES (combination of IP address and port
for a particular transport protocol, which is always UDP in this
specification) it could use to communicate with the other agent. These
might include:
<list style="symbols">
<t>A transport address on a directly attached network interface</t>
<t>A translated transport address on the public side of a NAT (a "server
reflexive" address)</t>
<t>A transport address allocated from a TURN server (a "relayed
address")</t>
</list>
</t>
<t>
Potentially, any of L's candidate transport addresses can be used to
communicate with any of R's candidate transport addresses. In
practice, however, many combinations will not work. For instance, if L
and R are both behind NATs, their directly attached interface
addresses are unlikely to be able to communicate directly (this is why
ICE is needed, after all!). The purpose of ICE is to discover which
pairs of addresses will work. The way that ICE does this is to
systematically try all possible pairs (in a carefully sorted order)
until it finds one or more that work.
</t>
<section title="Gathering Candidate Addresses">
<t>
In order to execute ICE, an agent has to identify all of its address
candidates. A CANDIDATE is a transport address -- a combination of IP
address and port for a particular transport protocol (with only UDP
specified here). This document defines three types of candidates, some
derived from physical or logical network interfaces, others
discoverable via STUN and TURN. Naturally, one viable candidate is a
transport address obtained directly from a local interface. Such a
candidate is called a HOST CANDIDATE. The local interface could be
Ethernet or WiFi, or it could be one that is obtained through a tunnel
mechanism, such as a Virtual Private Network (VPN) or Mobile IP
(MIP). In all cases, such a network interface appears to the agent as
a local interface from which ports (and thus candidates) can be
allocated.
</t>
<t> If an agent is multihomed, it obtains a candidate from each IP
address. Depending on the location of the PEER (the other agent in the
session) on the IP network relative to the agent, the agent may be
reachable by the peer through one or more of those IP
addresses. Consider, for example, an agent that has a local IP address
on a private net 10 network (I1), and a second connected to the public
Internet (I2). A candidate from I1 will be directly reachable when
communicating with a peer on the same private net 10 network, while a
candidate from I2 will be directly reachable when communicating with a
peer on the public Internet. Rather than trying to guess which IP
address will work, the initiating agent sends both candidates to its
peer.
</t>
<t> Next, the agent uses STUN or TURN to obtain additional
candidates. These come in two flavors: translated addresses on the
public side of a NAT (SERVER REFLEXIVE CANDIDATES) and addresses on
TURN servers (RELAYED CANDIDATES). When TURN servers are utilized,
both types of candidates are obtained from the TURN server. If only
STUN servers are utilized, only server reflexive candidates are
obtained from them. The relationship of these candidates to the host
candidate is shown in <xref target="fig-address-types"/>. In this
figure, both types of candidates are discovered using TURN. In the
figure, the notation X:x means IP address X and UDP port x.
</t>
<figure title="Candidate Relationships" anchor="fig-address-types">
<artwork>
<![CDATA[
To Internet
|
|
| /------------ Relayed
Y:y | / Address
+--------+
| |
| TURN |
| Server |
| |
+--------+
|
|
| /------------ Server
X1':x1'|/ Reflexive
+------------+ Address
| NAT |
+------------+
|
| /------------ Local
X:x |/ Address
+--------+
| |
| Agent |
| |
+--------+
]]></artwork></figure>
<t> When the agent sends the TURN Allocate request from IP address and
port X:x, the NAT (assuming there is one) will create a binding
X1':x1', mapping this server reflexive candidate to the host candidate
X:x. Outgoing packets sent from the host candidate will be translated
by the NAT to the server reflexive candidate. Incoming packets sent
to the server reflexive candidate will be translated by the NAT to the
host candidate and forwarded to the agent. We call the host candidate
associated with a given server reflexive candidate the BASE.
</t>
<t><list style="empty">
<t>Note: "Base" refers to the address an agent sends from for a particular
candidate. Thus, as a degenerate case host candidates also have a base,
but it's the same as the host candidate.
</t></list></t>
<t>
When there are multiple NATs between the agent and the TURN server,
the TURN request will create a binding on each NAT, but only the
outermost server reflexive candidate (the one nearest the TURN server)
will be discovered by the agent. If the agent is not behind a NAT,
then the base candidate will be the same as the server reflexive
candidate and the server reflexive candidate is redundant and will be
eliminated.
</t>
<t>
The Allocate request then arrives at the TURN server. The TURN server
allocates a port y from its local IP address Y, and generates an
Allocate response, informing the agent of this relayed candidate. The
TURN server also informs the agent of the server reflexive candidate,
X1':x1' by copying the source transport address of the Allocate
request into the Allocate response. The TURN server acts as a packet
relay, forwarding traffic between L and R. In order to send traffic to
L, R sends traffic to the TURN server at Y:y, and the TURN server
forwards that to X1':x1', which passes through the NAT where it is
mapped to X:x and delivered to L.
</t>
<t>
When only STUN servers are utilized, the agent sends a STUN Binding
request <xref target="RFC5389"/> to its STUN server. The STUN server
will inform the agent of the server reflexive candidate X1':x1' by
copying the source transport address of the Binding request into the
Binding response.
</t>
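<t>
The relationship between candidate types and their bases can be
sketched in code. The following Python fragment is only an
illustration: the Candidate class, its field names, the
prune_redundant helper, and the example addresses are hypothetical and
are not defined by this specification. It models the rule described
above that a server reflexive candidate identical to its base (the
agent is not behind a NAT) is redundant and is eliminated.
</t>
<figure><artwork>
<![CDATA[
```python
from dataclasses import dataclass
from typing import List, Tuple

Addr = Tuple[str, int]  # (IP address, UDP port), i.e. the X:x notation

@dataclass(frozen=True)
class Candidate:
    kind: str   # "host", "srflx" (server reflexive), or "relay"
    addr: Addr  # the candidate's own transport address
    base: Addr  # the transport address the agent sends from

def prune_redundant(cands: List[Candidate]) -> List[Candidate]:
    """Drop server reflexive candidates identical to their base."""
    return [c for c in cands
            if not (c.kind == "srflx" and c.addr == c.base)]

# Agent behind a NAT: X:x maps to X1':x1'; the TURN server offers Y:y.
host = Candidate("host", ("10.0.1.1", 5000), ("10.0.1.1", 5000))
srflx = Candidate("srflx", ("192.0.2.3", 62000), host.addr)
relay = Candidate("relay", ("198.51.100.7", 49152),
                  ("198.51.100.7", 49152))
behind_nat = prune_redundant([host, srflx, relay])

# Agent not behind a NAT: the server sees the host address itself.
pub = Candidate("host", ("203.0.113.9", 5000), ("203.0.113.9", 5000))
pub_srflx = Candidate("srflx", pub.addr, pub.addr)
no_nat = prune_redundant([pub, pub_srflx])
```
]]></artwork></figure>
<t>
In the first case all three candidates survive; in the second, only
the host candidate remains.
</t>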
</section>
<section title="Connectivity Checks">
<t>
Once L has gathered all of its candidates, it orders them from highest
to lowest priority and sends them to R over the signaling
channel. When R receives the candidates from L, it performs the same
gathering process and responds with its own list of candidates. At the
end of this process, each agent has a complete list of both its
candidates and its peer's candidates. It pairs them up, resulting in
CANDIDATE PAIRS. To see which pairs work, each agent schedules a
series of CHECKS. Each check is a STUN request/response transaction
that the client will perform on a particular candidate pair by sending
a STUN request from the local candidate to the remote candidate.
</t>
<t>
The basic principle of the connectivity checks is simple:
<list style="numbers">
<t>Sort the candidate pairs in priority order.</t>
<t>Send checks on each candidate pair in priority order.</t>
<t>Acknowledge checks received from the other agent.</t>
</list>
With both agents performing a check on a candidate pair, the result is
a 4-way handshake:
</t>
<figure title="Basic Connectivity Check"
anchor="fig:connectivity-checks" align="center"><artwork>
<![CDATA[
L R
- -
STUN request -> \ L's
<- STUN response / check
<- STUN request \ R's
STUN response -> / check
]]></artwork></figure>
<t>
It is important to note that the STUN requests are sent to and from
the exact same IP addresses and ports that will be used for media
(e.g., RTP and RTCP). Consequently, agents demultiplex STUN and
RTP/RTCP using contents of the packets, rather than the port on which
they are received. Fortunately, this demultiplexing is easy to do,
especially for RTP and RTCP.
</t>
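<t>
A minimal sketch of such content-based demultiplexing follows. It
assumes the STUN header layout from RFC 5389 (the two most significant
bits of the first byte are zero, and bytes 4 through 7 carry the magic
cookie 0x2112A442) and the version value 2 carried in the top two bits
of RTP and RTCP packets. The function name classify_packet and its
return labels are hypothetical.
</t>
<figure><artwork>
<![CDATA[
```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined in RFC 5389

def classify_packet(data: bytes) -> str:
    """Classify a datagram by its contents, not by its port."""
    if len(data) >= 20 and (data[0] & 0xC0) == 0:
        # STUN: top two bits are zero, bytes 4..7 hold the cookie.
        (cookie,) = struct.unpack_from("!I", data, 4)
        if cookie == STUN_MAGIC_COOKIE:
            return "stun"
    if len(data) >= 8 and (data[0] >> 6) == 2:
        # RTP and RTCP both carry version 2 in the top two bits.
        return "rtp_or_rtcp"
    return "unknown"

# STUN Binding request header: type 0x0001, length 0, cookie,
# followed by a 12-byte transaction ID (all zeros here).
stun_pkt = struct.pack("!HHI", 0x0001, 0, STUN_MAGIC_COOKIE) + bytes(12)
# Bare 12-byte RTP header: V=2, remaining fields zeroed.
rtp_pkt = bytes([0x80, 0x00]) + bytes(10)
```
]]></artwork></figure>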
<t>
Because a STUN Binding request is used for the connectivity check, the
STUN Binding response will contain the agent's translated transport
address on the public side of any NATs between the agent and its
peer. If this transport address is different from other candidates the
agent already learned, it represents a new candidate, called a PEER
REFLEXIVE CANDIDATE, which then gets tested by ICE just the same as
any other candidate.
</t>
<t>
As an optimization, as soon as R gets L's check message, R schedules a
connectivity check message to be sent to L on the same candidate
pair. This accelerates the process of finding a valid candidate, and
is called a TRIGGERED CHECK.
</t>
<t>
At the end of this handshake, both L and R know that they can
send (and receive) messages end-to-end in both directions.
</t>
</section>
<section title="Sorting Candidates">
<t>
Because the algorithm above searches all candidate pairs, if a working
pair exists it will eventually find it no matter what order the
candidates are tried in. In order to produce faster (and better)
results, the candidates are sorted in a specified order. The resulting
list of sorted candidate pairs is called the CHECK LIST. The algorithm
is described in <xref target="sec-prioritizing"/> but follows two
general principles:
<list style="symbols">
<t>Each agent gives its candidates a numeric priority, which is sent
along with the candidate to the peer.</t>
<t>The local and remote priorities are combined so that each
agent has the same ordering for the candidate pairs.</t>
</list>
</t>
<t>
The second property is important for getting ICE to work when there
are NATs in front of L and R. Frequently, NATs will not allow packets
in from a host until the agent behind the NAT has sent a packet
towards that host. Consequently, ICE checks in each direction will not
succeed until both sides have sent a check through their respective
NATs.
</t>
<t>
The agent works through this CHECK LIST by sending a STUN request for
the next candidate pair on the list periodically. These are called
ORDINARY CHECKS.
</t>
<t>
In general, the priority algorithm is designed so that candidates of
similar type get similar priorities and so that more direct routes
(that is, through fewer media relays and through fewer NATs) are
preferred over indirect ones (ones with more media relays and more
NATs). Within those guidelines, however, agents have a fair amount of
discretion about how to tune their algorithms.
</t>
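<t>
As an illustration of the two principles above, the following sketch
uses the priority formulas and recommended type preferences published
in RFC 5245. It is not a substitute for the normative algorithm in the
prioritization section of this document; the function names and the
local preference value 65535 are assumptions made for the example.
</t>
<figure><artwork>
<![CDATA[
```python
# Recommended type preferences from RFC 5245 (higher is preferred).
TYPE_PREF = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}

def candidate_priority(kind: str, local_pref: int,
                       component_id: int) -> int:
    """2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return ((TYPE_PREF[kind] << 24) + (local_pref << 8)
            + (256 - component_id))

def pair_priority(g: int, d: int) -> int:
    """g: controlling agent's candidate priority, d: controlled's.

    Both agents plug in the same g and d, so both obtain the same
    value and therefore the same ordering of candidate pairs.
    """
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)

host = candidate_priority("host", 65535, 1)
relay = candidate_priority("relay", 65535, 1)

# A host/host pair outranks a host/relay pair.
direct = pair_priority(host, host)
relayed = pair_priority(host, relay)
```
]]></artwork></figure>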
</section>
<section title="Frozen Candidates">
<t> The previous description only addresses the case where the agents
wish to establish a media session with one COMPONENT (a piece of a
media stream requiring a single transport address; a media stream may
require multiple components, each of which has to work for the media
stream as a whole to work). Sometimes (e.g., with RTP and RTCP in
separate components), the agents actually need to establish
connectivity for more than one flow. </t>
<t>
The network properties are likely to be very similar for each
component (especially because RTP and RTCP are sent and received
from the same IP address). It is usually possible to leverage
information from one media component in order to determine the best
candidates for another. ICE does this with a mechanism called "frozen
candidates".
</t>
<t>
Each candidate is associated with a property called its
FOUNDATION. Two candidates have the same foundation when they are
"similar" -- of the same type and obtained from the same host
candidate and STUN/TURN server using the same protocol. Otherwise,
their foundation is different. A candidate pair has a foundation too,
which is just the concatenation of the foundations of its two
candidates. Initially, only the candidate pairs with unique
foundations are tested. The other candidate pairs are marked
"frozen". When the connectivity checks for a candidate pair succeed,
the other candidate pairs with the same foundation are unfrozen. This
avoids repeated checking of components that are superficially more
attractive but in fact are likely to fail.
</t>
<t>
While we've described "frozen" here as a separate mechanism for
expository purposes, in fact it is an integral part of ICE and the ICE
prioritization algorithm automatically ensures that the right
candidates are unfrozen and checked in the right order. However, if
the ICE usage does not utilize multiple components or media streams,
it does not need to implement this algorithm.
</t>
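<t>
The following simplified sketch illustrates the frozen mechanism as
described in this overview only. The state names, the helper
functions, and the example foundation strings are hypothetical, and
the complete algorithm involves more states and rules than are shown
here.
</t>
<figure><artwork>
<![CDATA[
```python
from typing import List

def initial_states(pair_foundations: List[str]) -> List[str]:
    """First pair seen per foundation is tested; the rest are frozen.

    pair_foundations is assumed to be in check list (priority) order.
    """
    seen = set()
    states = []
    for f in pair_foundations:
        states.append("frozen" if f in seen else "waiting")
        seen.add(f)
    return states

def on_success(pair_foundations: List[str], states: List[str],
               idx: int) -> None:
    """Mark pair idx as succeeded; unfreeze pairs sharing its foundation."""
    states[idx] = "succeeded"
    for i, f in enumerate(pair_foundations):
        if f == pair_foundations[idx] and states[i] == "frozen":
            states[i] = "waiting"

# Pair foundation = local foundation + remote foundation.
foundations = ["1:1", "1:1", "2:1", "2:1"]
states = initial_states(foundations)
before = list(states)
on_success(foundations, states, 0)
```
]]></artwork></figure>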
</section>
<section title="Security for Checks">
<t>
Because ICE is used to discover which addresses can be used to send
media between two agents, it is important to ensure that the process
cannot be hijacked to send media to the wrong location. Each STUN
connectivity check is covered by a message authentication code (MAC)
computed using a key exchanged in the signaling channel. This MAC
provides message integrity and data origin authentication, thus
stopping an attacker from forging or modifying connectivity check
messages. Furthermore, if for example a SIP <xref target="RFC3261"/>
caller is using ICE, and their call forks, the ICE exchanges happen
independently with each forked recipient. In such a case, the keys
exchanged in the signaling help associate each ICE exchange with each
forked recipient.
</t>
</section>
<section title="Concluding ICE">
<t>
ICE checks are performed in a specific sequence, so that high-priority
candidate pairs are checked first, followed by lower-priority
ones. One way to conclude ICE is to declare victory as soon as a check
for each component of each media stream completes
successfully. Indeed, this is a reasonable algorithm, and details for
it are provided below. However, it is possible that a packet loss will
cause a higher-priority check to take longer to complete. In that
case, allowing ICE to run a little longer might produce better
results. More fundamentally, however, the prioritization defined by
this specification may not yield "optimal" results. As an example, if
the aim is to select low-latency media paths, usage of a relay is a
hint that latencies may be higher, but it is nothing more than a
hint. An actual round-trip time (RTT) measurement could be made, and
it might demonstrate that a pair with lower priority is actually
better than one with higher priority.
</t>
<t>
Consequently, ICE assigns one of the agents the role of the
CONTROLLING AGENT, and the other the role of the CONTROLLED AGENT. The
controlling agent gets to nominate which candidate pairs will get used
for media amongst the ones that are valid.
</t>
<t>
When nominating, the controlling agent lets the checks continue
until at least one valid candidate pair for each media stream is
found. Then, it picks amongst those that are valid, and sends a second
STUN request on its NOMINATED candidate pair, but this time with a
flag set to tell the peer that this pair has been nominated for use.
This is shown in <xref target="fig-regular-select"/>.
</t>
<figure title="Nomination"
anchor="fig-regular-select" align="center"><artwork>
<![CDATA[
L R
- -
STUN request -> \ L's
<- STUN response / check
<- STUN request \ R's
STUN response -> / check
STUN request + flag -> \ L's
<- STUN response / check
]]></artwork></figure>
<t>
Once the STUN transaction with the flag completes, both sides cancel
any future checks for that media stream. ICE will now send media using
this pair. The pair an ICE agent is using for media is called the
SELECTED PAIR.
</t>
<t>
Once ICE is concluded, it can be restarted at any time for one or all
of the media streams by either agent. This is done by sending updated
candidate information indicating a restart.
</t>
</section>
<section title="Lite Implementations">
<t>
In order for ICE to be used in a call, both agents need to support it.
However, certain agents will always be connected to the public
Internet and have a public IP address at which they can receive packets
from any correspondent. To make it easier for these devices to support
ICE, ICE defines a special type of implementation called LITE (in
contrast to the normal FULL implementation). A lite implementation
doesn't gather candidates; it includes only host candidates for any
media stream. Lite agents do not generate connectivity checks or run
the state machines, though they need to be able to respond to
connectivity checks. When a lite implementation connects with a full
implementation, the full agent takes the role of the controlling
agent, and the lite agent takes on the controlled role. When two lite
implementations connect, no checks are sent.
</t>
<t>For guidance on when a
lite implementation is appropriate, see the discussion in <xref
target="sec-liteandfull"/>.
</t>
<t>
It is important to note that the lite implementation was added to this
specification to provide a stepping stone to full implementation. Even
for devices that are always connected to the public Internet, a full
implementation is preferable if achievable.
</t>
</section>
<section title="Usages of ICE">
<t>This document specifies generic use of ICE with protocols that
provide means to exchange candidate information between the ICE Peers.
The specific details (i.e., how to encode candidate information and
the actual candidate exchange process) for different protocols using
ICE are described in separate usage documents. One possible way the
agents can exchange the candidate information is to use <xref
target="RFC3264"/> based Offer/Answer semantics as part of the SIP
<xref target="RFC3261"/> protocol <xref
target="I-D.ietf-mmusic-ice-sip-sdp"/>.
</t>
</section>
</section>
<section title="Terminology">
<t> The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY",
and "OPTIONAL" in this document are to be interpreted as described in
<xref target="RFC2119">RFC 2119</xref>. </t>
<t>
Readers should be familiar with the terminology defined in STUN <xref
target="RFC5389"/> and in NAT Behavioral Requirements
for UDP <xref target="RFC4787"/>.
</t>
<t>
This specification makes use of the following additional terminology:
</t>
<t><list style="hanging">
<t hangText="ICE Session:">
An ICE session consists of all ICE-related actions starting with the
candidate gathering, followed by the interactions (candidate exchange,
connectivity checks, nominations and keep-alives) between the ICE agents
until all the candidates are released or ICE-restart is triggered.</t>
<t hangText="ICE Agent:">
An agent is the protocol implementation involved in the
ICE candidate exchange. There are two agents involved in a typical
candidate exchange. </t>
<t hangText= "Initiating Peer, Initiating Agent, Initiator:">
An initiating agent is the protocol implementation involved in the ICE
candidate exchange that initiates the ICE candidate exchange
process. </t>
<t hangText="Responding Peer, Responding Agent, Responder:">
A responding agent is the protocol implementation involved in the ICE
candidate exchange that receives
and responds to the candidate exchange process initiated by the
Initiator. </t>
<t hangText="ICE Candidate Exchange, Candidate Exchange:">
The process where the ICE agents exchange information (e.g.,
candidates and passwords) that is needed to perform ICE. <xref
target="RFC3264"/> Offer/Answer with SDP encoding is one example of a
protocol that can be used for exchanging the candidate
information. </t>
<t hangText="Peer:">
From the perspective of one of the agents in a session, its peer is
the other agent. Specifically, from the perspective of the initiating
agent, the peer is the responding agent. From the perspective of the
responding agent, the peer is the initiating agent. </t>
<t hangText="Transport Address:"> The combination of an IP address and
transport protocol (such as UDP or TCP) port.</t>
<t hangText="Media, Media Stream, Media Session:"> When ICE is used to
set up multimedia sessions, the media is usually transported over RTP,
and a media stream consists of a stream of RTP packets. When ICE is
used for purposes other than multimedia sessions, the terms "media", "media
stream", and "media session" are still used in this specification to
refer to the IP data packets that are exchanged between the peers on
the path created and tested with ICE. </t>
<t hangText="Candidate, Candidate Information:"> A transport address
that is a potential point of contact for receipt of media. Candidates
also have properties -- their type (server reflexive, relayed, or
host), priority, foundation, and base.
</t>
<t hangText="Component:"> A component is a piece of a media stream
requiring a single transport address; a media stream may require
multiple components, each of which has to work for the media stream as
a whole to work. For media streams based on RTP, unless RTP and RTCP
are multiplexed in the same port, there are two components per media
stream -- one for RTP, and one for RTCP. </t>
<t hangText="Host Candidate:"> A candidate obtained by binding to a
specific port from an IP address on the host. This includes IP
addresses on physical interfaces and logical ones, such as ones
obtained through Virtual Private Networks (VPNs) and Realm Specific IP
(RSIP) <xref target="RFC3102"/> (which lives at the operating system
level).
</t>
<t hangText="Server Reflexive Candidate:"> A candidate whose IP
address and port are a binding allocated by a NAT for an agent when it
sent a packet through the NAT to a server. Server reflexive candidates
can be learned from STUN servers using the Binding request, or from TURN
servers, which provide both a relayed and a server reflexive candidate.
</t>
<t hangText="Peer Reflexive Candidate:"> A candidate whose IP
address and port are a binding allocated by a NAT for an agent when it
sent a STUN Binding request through the NAT to its peer.
</t>
<t hangText="Relayed Candidate:"> A candidate obtained by sending a
TURN Allocate request from a host candidate to a TURN server. The
relayed candidate is resident on the TURN server, and the TURN server
relays packets back towards the agent.
</t>
<t hangText="Base:"> The transport address that an agent sends from
for a particular candidate. For host, server reflexive, and peer reflexive
candidates, the base is the same as the host candidate. For relayed
candidates, the base is the same as the relayed candidate (i.e., the
transport address used by the TURN server to send from).
</t>
<t hangText="Foundation:"> An arbitrary string that is the same for
two candidates that have the same type, base IP address, protocol
(UDP, TCP, etc.), and STUN or TURN server. If any of these are
different, then the foundation will be different. Two candidate pairs
with the same pair of foundations are likely to have similar network
characteristics. Foundations are used in the frozen algorithm.
</t>
<t hangText="Local Candidate:">A candidate that an agent has obtained
and shared with the peer.
</t>
<t hangText="Remote Candidate:">A candidate that an agent received
from its peer.
</t>
<t hangText="Default Destination/Candidate:"> The default destination
for a component of a media stream is the transport address that would
be used by an agent that is not ICE aware. A default candidate for a
component is one whose transport address matches the default
destination for that component.
</t>
<t hangText="Candidate Pair:"> A pairing containing a local candidate
and a remote candidate.
</t>
<t hangText="Check, Connectivity Check, STUN Check:"> A STUN Binding
request transaction for the purposes of verifying connectivity. A
check is sent from the local candidate to the remote candidate of a
candidate pair.
</t>
<t hangText="Check List:"> An ordered set of candidate pairs that an
agent will use to generate checks.
</t>
<t hangText="Ordinary Check:"> A connectivity check generated by an
agent as a consequence of a timer that fires periodically, instructing
it to send a check.
</t>
<t hangText="Triggered Check:"> A connectivity check generated as a
consequence of the receipt of a connectivity check from the peer.
</t>
<t hangText="VALID LIST:"> An ordered set of candidate pairs for a
media stream that have been validated by a successful STUN
transaction.
</t>
<t hangText="Check List Set:"> An ordered list of CHECK LISTs.
</t>
<t hangText="Full:"> An ICE implementation that performs the complete
set of functionality defined by this specification.
</t>
<t hangText="Lite:"> An ICE implementation that omits certain
functions, implementing only as much as is necessary for a peer
implementation that is full to gain the benefits of ICE. Lite
implementations do not maintain any of the state machines and do not
generate connectivity checks.
</t>
<t hangText="Controlling Agent:"> The ICE agent that is responsible
for selecting the final choice of candidate pairs and signaling them
through STUN. In any session, one agent is always controlling. The
other is the controlled agent.
</t>
<t hangText="Controlled Agent:"> An ICE agent that waits for the
controlling agent to select the final choice of candidate pairs.
</t>
<t hangText="Nomination, Regular Nomination:"> The process of picking a valid
candidate pair for media traffic by validating the pair with one
STUN request, and then picking it by sending a second STUN request
with a flag indicating its nomination.
</t>
<t hangText="Nominated:"> If a valid candidate pair has its nominated
flag set, it means that it may be selected by ICE for sending and
receiving media.
</t>
<t hangText="Selected Pair, Selected Candidate:"> The candidate pair
selected by ICE for sending and receiving media is called the selected
pair, and each of its candidates is called the selected candidate.
Before a pair has been selected, any valid candidate pair
can be used for sending and receiving media (only one candidate pair
at any given time).
</t>
<t hangText="Using Protocol, ICE Usage:"> The protocol that uses ICE
for NAT traversal. A usage specification defines the protocol specific
details on how the procedures defined here are applied to that
protocol. </t>
</list></t>
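<t>
As an informal illustration only (it is not part of the protocol
definition), the relationship between a candidate, its type, its base,
and its foundation can be sketched as follows. The class, field, and
method names are hypothetical, the addresses come from documentation
ranges, and the foundation strings are arbitrary counters chosen for
this sketch.
</t>
<figure><artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Illustrative only: all names here are hypothetical, not defined by ICE.
TransportAddress = Tuple[str, int, str]  # (IP address, port, protocol)


@dataclass(frozen=True)
class Candidate:
    address: TransportAddress   # potential point of contact for media
    type: str                   # "host", "server reflexive",
                                # "peer reflexive", or "relayed"
    component_id: int
    # Host candidate that a reflexive candidate was learned from.
    host_address: Optional[TransportAddress] = None

    @property
    def base(self) -> TransportAddress:
        # Host and relayed candidates are their own base.
        if self.type in ("host", "relayed"):
            return self.address
        # Reflexive candidates: the base is the host candidate.
        if self.host_address is None:
            raise ValueError("reflexive candidate needs a host address")
        return self.host_address


class FoundationTable:
    """Hands out one arbitrary string per distinct combination of
    type, base IP address, protocol, and STUN or TURN server."""

    def __init__(self) -> None:
        self._ids: Dict[tuple, str] = {}

    def foundation(self, cand: Candidate,
                   server: Optional[str] = None) -> str:
        key = (cand.type, cand.base[0], cand.base[2], server)
        # Reuse the string for a known combination, else mint a new one.
        return self._ids.setdefault(key, str(len(self._ids) + 1))
```
]]></artwork></figure>
<t>
In this sketch, RTP and RTCP host candidates bound on the same IP
address receive the same foundation, while a server reflexive
candidate derived from one of them receives a different one because
its type and server differ.
</t>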
</section>
<section anchor="sec-gathering_exchange" title="ICE Candidate Gathering and Exchange">
<t>
As part of ICE processing, both the initiating and responding agents
exchange encoded candidate information as defined by the using
protocol (ICE usage). The specifics of the encoding mechanism and the
semantics of the candidate information exchange are out of scope of
this specification.
</t>
<t>
At a high level, however, the diagram below captures the ICE processing
sequence that the agents (initiator and responder) follow to exchange
their respective candidate information.
</t>
<figure title="Candidate Gathering and Exchange Sequence"
anchor="fig:basic-cand-exchange" align="left">
<artwork>
<![CDATA[
Initiating Responding
Agent Agent
(I) (R)
Gather, | |
prioritize, | |
eliminate | |
redundant | |
candidates, | |
Encode | |
candidates | |
| I's Candidate Information |
|------------------------------>|
| | Gather,
| | prioritize,
| | eliminate
| | redundant
| | candidates,
| | Encode
| | candidates
| R's Candidate Information |
|<------------------------------|
| |
]]></artwork></figure>
<t>
As shown, the agents involved in the candidate exchange (1) gather
candidates, (2) prioritize them, (3) eliminate redundant candidates,
(4) (possibly) choose default candidates, and then (5) formulate and
send the candidates to the peer ICE agent. All but the last of these
five steps differ for full and lite implementations.
</t>
<section anchor="sec-full-impl-reqs" title="Procedures for Full Implementation">
<section anchor="sec-gathering" title="Gathering Candidates">
<t>
An agent gathers candidates when it believes that communication is
imminent. An initiating agent can do this based on a user interface
cue, or based on an explicit request to initiate a session. Every
candidate is a transport address. It also has a type and a base.
Four types are defined and gathered by this specification -- host
candidates, server reflexive candidates, peer reflexive candidates,
and relayed candidates. The server reflexive candidates are gathered
using STUN or TURN, and relayed candidates are obtained through TURN.
Peer reflexive candidates are obtained in later phases of ICE, as a
consequence of connectivity checks.
</t>
<t>
The process for gathering candidates at the responding agent is
identical to the process for the initiating agent. It is RECOMMENDED
that the responding agent begin this process immediately on receipt
of the candidate information, prior to alerting the user. Such
gathering MAY begin when an agent starts.
</t>
<section title="Host Candidates">
<t> The first step is to gather host candidates. Host candidates are
obtained by binding to ports (typically ephemeral) on an IP address
attached to an interface (physical or virtual, including VPN
interfaces) on the host.
</t>
<t>For each UDP media stream the agent wishes to use, the agent SHOULD
obtain a candidate for each component of the media stream on each IP
address that the host has, with the exceptions listed below. The agent
obtains each candidate by binding to a UDP port on the specific IP
address. A host candidate (and indeed every candidate) is always
associated with a specific component for which it is a candidate. </t>
<t> Each component has an ID assigned to it, called the component ID.
For RTP-based media streams, unless both RTP and RTCP are multiplexed
in the same UDP port (RTP/RTCP multiplexing), RTP itself has a
component ID of 1, and RTCP a component ID of 2. In case of RTP/RTCP
multiplexing, a component ID of 1 is used for both RTP and RTCP.</t>
<t>When candidates are obtained, unless the agent knows for sure that
RTP/RTCP multiplexing will be used (i.e., the agent knows that the
other agent also supports, and is willing to use, RTP/RTCP
multiplexing), or unless the agent only supports RTP/RTCP
multiplexing, the agent MUST obtain a separate candidate for RTCP. If
an agent has obtained a candidate for RTCP, and ends up using RTP/RTCP
multiplexing, the agent does not need to perform connectivity checks
on the RTCP candidate.</t>
<t>If an agent that has K IP addresses uses separate candidates for RTP
and RTCP, it will end up with 2*K host candidates.</t>
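<t>
As an informal sketch of the procedure above, the following code binds
one UDP socket per IP address and component. The function names, the
rtcp_mux and bind parameters, and the dictionary layout are
assumptions made for this illustration, and the filtering of addresses
that must not be used is omitted.
</t>
<figure><artwork><![CDATA[
```python
import socket


def bind_udp(ip):
    """Bind a UDP socket to an ephemeral port on `ip`.

    Returns the socket and the port chosen by the operating system.
    """
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_DGRAM)
    sock.bind((ip, 0))  # port 0 asks the OS for an ephemeral port
    return sock, sock.getsockname()[1]


def gather_host_candidates(addresses, rtcp_mux=False, bind=bind_udp):
    """Return one host candidate per (IP address, component) pair.

    Hypothetical helper: with RTP/RTCP multiplexing only component 1
    is gathered; otherwise component 1 (RTP) and component 2 (RTCP).
    """
    component_ids = (1,) if rtcp_mux else (1, 2)
    candidates = []
    for ip in addresses:
        for component_id in component_ids:
            sock, port = bind(ip)
            address = (ip, port, "UDP")
            candidates.append({
                "type": "host",
                "component_id": component_id,
                "address": address,
                "base": address,  # a host candidate is its own base
                "socket": sock,
            })
    return candidates
```
]]></artwork></figure>
<t>
With K addresses and separate RTP and RTCP components, the returned
list has 2*K entries; with rtcp_mux set, it has K entries.
</t>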
<t>Note that the responding agent, when obtaining its candidates, will
typically know if the other agent supports RTP/RTCP multiplexing, in
which case it will not need to obtain a separate candidate for
RTCP. However, the absence of component ID 2 does not by itself imply
use of RTP/RTCP multiplexing, as it could also mean that RTCP is not
used. </t>
<t> For streams that are not RTP-based, use of multiple components is
discouraged since using them increases the complexity of ICE
processing. If multiple components are needed, the component IDs
SHOULD start with 1 and increase by 1 for each component.
</t>
<t>
The base for each host candidate is set to the candidate itself.
</t>
<t> The host candidates are gathered from all IP addresses with the
following exceptions:
<list style="symbols">
<t> Addresses from a loopback interface MUST NOT be included in
the candidate addresses. </t>
<t> Deprecated IPv4-compatible IPv6 addresses <xref
target="RFC4291"/> and IPv6 site-local unicast addresses <xref
target="RFC3879"/> MUST NOT be included in the candidate
addresses. </t>
<t> IPv4-mapped IPv6 addresses SHOULD NOT be included in the
offered candidates unless the application using ICE does not
support IPv4 (i.e., is an IPv6-only application <xref
target="RFC4038"/>). </t>
<t> If one or more host candidates corresponding to an IPv6
address generated using a mechanism that prevents location
tracking <xref target="RFC7721"/> are
gathered, host candidates corresponding to IPv6 addresses that do
allow location tracking, that are configured on the same
interface, and are part of the same network prefix MUST NOT be
gathered; and host candidates corresponding to IPv6 link-local
addresses MUST NOT be gathered.</t>
</list>
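As a rough, non-normative sketch, the first three exceptions above
could be checked as shown below. The function name and the
ipv6_only_app parameter are hypothetical. The rule for IPv4-mapped
addresses is a SHOULD NOT that this sketch treats as an exclusion, and
the last exception (addresses that allow location tracking, and
link-local addresses) is not covered because it depends on which other
addresses have been gathered.
<figure><artwork><![CDATA[
```python
import ipaddress


def is_excluded_host_address(addr, ipv6_only_app=False):
    """Return True if `addr` should not be used for a host candidate.

    Hypothetical helper covering loopback, IPv4-compatible IPv6,
    site-local IPv6, and IPv4-mapped IPv6 addresses only.
    """
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return True
    if ip.version == 6:
        # Deprecated site-local unicast (fec0::/10).
        if ip.is_site_local:
            return True
        # IPv4-mapped (::ffff:a.b.c.d): allowed only when the
        # application does not support IPv4.
        if ip.ipv4_mapped is not None:
            return not ipv6_only_app
        # Deprecated IPv4-compatible (::a.b.c.d): the upper 96 bits
        # are zero; :: and ::1 are not in this category.
        if 1 < int(ip) < 2**32:
            return True
    return False
```
]]></artwork></figure>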