4 Protocol Examples

This protocol example illustrates the establishment of a media session between two endpoints based on the sample topology that is shown in the following figure.

Figure 3: ICE implementations

The figure shows Endpoint L and Endpoint R using ICE. Both endpoints are full ICE implementations and use Regular Nominations to select the candidates to be used for media flow. Endpoint L is behind a NAT device in a private address space (192.168.2.1), with the public edge of the NAT device at 10.107.0.71, and Endpoint R is on the public Internet at 10.104.0.68. Both endpoints are configured with the same User Datagram Protocol (UDP) TURN server, which listens on IP address 10.101.0.57 and port 3478.

Transport addresses follow a naming convention similar to the one used in the example in [IETFDRAFT-ICENAT-19] section 17.

Transport addresses are referred to by using the mnemonic names with the format entity-type-seqno, where entity refers to the entity whose IP address the transport address is on and is either "L", "R", "NAT", or "TURN". The type is either "PUB" for transport addresses that are publicly reachable on the Internet or "PRIV" for transport addresses that are not reachable from the Internet. The seqno is a number that is different for transport addresses of the same type on an entity. The TURN server has the transport address TURN-PUB-1 (10.101.0.57 and port 3478).
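The naming convention above can be checked mechanically. The following is a minimal sketch, assuming only the entity-type-seqno format described in this section; the names Mnemonic and parse_mnemonic are illustrative and are not part of the protocol.

```python
from typing import NamedTuple

# Allowed values, taken from the naming convention described above.
ENTITIES = {"L", "R", "NAT", "TURN"}
TYPES = {"PUB", "PRIV"}


class Mnemonic(NamedTuple):
    entity: str     # "L", "R", "NAT", or "TURN"
    addr_type: str  # "PUB" or "PRIV"
    seqno: int      # distinguishes addresses of the same type on an entity


def parse_mnemonic(name: str) -> Mnemonic:
    """Split a name such as 'TURN-PUB-1' into its three parts and validate them."""
    parts = name.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected entity-type-seqno, got {name!r}")
    entity, addr_type, seqno = parts
    if entity not in ENTITIES:
        raise ValueError(f"unknown entity {entity!r}")
    if addr_type not in TYPES:
        raise ValueError(f"unknown type {addr_type!r}")
    if not seqno.isdigit():
        raise ValueError(f"seqno is not a number: {seqno!r}")
    return Mnemonic(entity, addr_type, int(seqno))


if __name__ == "__main__":
    for name in ("TURN-PUB-1", "L-PRIV-2", "NAT-PUB-2", "R-PUB-1"):
        print(name, "->", parse_mnemonic(name))
```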

For the call flow:

  • "S=" refers to the source transport address.

  • "D=" refers to the destination transport address.

  • "MA=" refers to the mapped address in the Simple Traversal of UDP through NAT (STUN) binding response.

  • "RA=" refers to the reflexive address.

  • "TA=" refers to the relay transport address.

For clarity, the example does not show the Traversal Using Relay NAT (TURN) authentication mechanisms and the Real-Time Transport Control Protocol (RTCP) component.

The example focuses on the Real-Time Transport Protocol (RTP) component for establishing a media session between Endpoint L and Endpoint R. Endpoint L initiates the media session and, being a full ICE implementation, becomes the controlling agent. Endpoint L gathers its UDP Host Candidate by binding to its local interface and then gathers UDP Relayed Candidates and UDP Server Reflexive Candidates from the configured TURN server. Because no Transmission Control Protocol (TCP) TURN servers are configured, Endpoint L creates an active TCP (TCP-ACT) Server Reflexive Candidate based on the UDP Server Reflexive Candidate. After gathering the candidates, Endpoint L sends the INVITE to Endpoint R. A sample INVITE Session Description Protocol (SDP) for Endpoint L's topology is as follows:

 v=0
 o=- 0 0 IN IP4 10.101.0.57
 s=session
 c=IN IP4 10.101.0.57
 b=CT:99980
 t=0 0
 m=audio 52732 RTP/AVP 114 111 112 115 116 4 8 0 97 13 118 101
 a=ice-ufrag:qkEP
 a=ice-pwd:ed6f9GuHjLcoCN6sC/Eh7fVl
 a=candidate:1 1 UDP 2130706431 192.168.2.1 50005 typ host
 a=candidate:2 1 UDP 16648703 10.101.0.57 52732 typ relay raddr 10.107.0.71 rport 50033
 a=candidate:3 1 UDP 1694234623 10.107.0.71 50033 typ srflx raddr 192.168.2.1 rport 50033
 a=candidate:4 1 TCP-ACT 1684797951 10.107.0.71 50033 typ srflx raddr 192.168.2.1 rport 50033
 a=rtpmap:114 x-msrta/16000
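To make the structure of the candidate attributes easier to follow, the sketch below splits an a=candidate line into named fields. It assumes the field order visible in the sample (foundation, component, transport, priority, address, port, "typ", candidate type, then optional raddr and rport pairs). The Candidate class and the parse_candidate function are hypothetical helpers for illustration, not an API defined by this protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    foundation: str
    component: int
    transport: str        # "UDP" or "TCP-ACT" in the sample
    priority: int
    address: str
    port: int
    cand_type: str        # "host", "srflx", "relay", or "prflx"
    raddr: Optional[str] = None
    rport: Optional[int] = None


def parse_candidate(line: str) -> Candidate:
    """Parse one 'a=candidate:' SDP attribute line into a Candidate."""
    prefix = "a=candidate:"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError(f"not a candidate line: {line!r}")
    fields = line[len(prefix):].split()
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError(f"malformed candidate line: {line!r}")
    # Everything after the candidate type is a sequence of name/value pairs.
    extras = dict(zip(fields[8::2], fields[9::2]))
    rport = extras.get("rport")
    return Candidate(
        foundation=fields[0],
        component=int(fields[1]),
        transport=fields[2],
        priority=int(fields[3]),
        address=fields[4],
        port=int(fields[5]),
        cand_type=fields[7],
        raddr=extras.get("raddr"),
        rport=int(rport) if rport is not None else None,
    )


if __name__ == "__main__":
    sample = ("a=candidate:3 1 UDP 1694234623 10.107.0.71 50033 "
              "typ srflx raddr 192.168.2.1 rport 50033")
    print(parse_candidate(sample))
```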

The following diagrams illustrate the ICE request and response sequence.

Figure 4: ICE request and response sequence

Figure 5: ICE request and response sequence (continued)

Endpoint R, upon receiving the offer, gathers its candidates. It gathers its UDP Host Candidate by binding to its local interface and then gathers UDP Relayed Candidates from the configured TURN server. Endpoint R is not behind a NAT device, so no UDP Server Reflexive Candidates are created. Because no TCP TURN servers are configured, Endpoint R creates a TCP-ACT Server Reflexive Candidate based on the UDP Host Candidate. Endpoint R sends its candidates to Endpoint L in the answer. Endpoint R pairs its local candidates with Endpoint L's remote candidates and starts connectivity checks. A sample answer SDP for Endpoint R's topology is as follows:

 v=0
 o=- 0 0 IN IP4 10.101.0.57
 s=session
 c=IN IP4 10.101.0.57
 b=CT:99980
 t=0 0
 m=audio 52714 RTP/AVP 114 111 112 115 116 4 8 0 97 13 118 101
 a=ice-ufrag:qkEP
 a=ice-pwd:ed6f9GuHjLcoCN6sC/Eh7fVl
 a=candidate:1 1 UDP 2130706431 10.104.0.68 50025 typ host
 a=candidate:2 1 UDP 16648703 10.101.0.57 52714 typ relay raddr 10.104.0.68 rport 50036
 a=candidate:3 1 TCP-ACT 1684797951 10.104.0.68 50025 typ srflx raddr 10.104.0.68 rport 50025
 a=rtpmap:114 x-msrta/16000
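The priority values in the sample offer and answer are consistent with the candidate priority formula from the ICE specification: priority = 2^24 * (type preference) + 2^8 * (local preference) + (256 - component ID). The sketch below decomposes the sample values under the assumption that this formula applies; the function names are illustrative only.

```python
from typing import Tuple


def compose_priority(type_pref: int, local_pref: int, component_id: int) -> int:
    """Build a candidate priority from its three parts (ICE priority formula)."""
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id)


def decompose_priority(priority: int) -> Tuple[int, int, int]:
    """Recover (type preference, local preference, component ID) from a priority."""
    type_pref = priority >> 24
    local_pref = (priority >> 8) & 0xFFFF
    component_id = 256 - (priority & 0xFF)
    return type_pref, local_pref, component_id


# Priority values copied from the sample SDP in this section.
SAMPLES = {
    "UDP host": 2130706431,
    "UDP relay": 16648703,
    "UDP srflx": 1694234623,
    "TCP-ACT srflx": 1684797951,
}

if __name__ == "__main__":
    for label, priority in SAMPLES.items():
        type_pref, local_pref, component = decompose_priority(priority)
        print(f"{label:14} type_pref={type_pref:3} "
              f"local_pref={local_pref:5} component={component}")
```

Under this assumption the host candidates decompose to a type preference of 126, the server reflexive candidates to 100, and the relayed candidates to 0, with component ID 1 (RTP) in every case.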

Endpoint L, upon receiving the answer from Endpoint R, pairs its local candidates with the candidates received in the answer and starts connectivity checks. Both endpoints perform connectivity checks with the highest priority candidate pairs.
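The pairing and ordering step can be sketched as follows. This example assumes the pair priority formula from the ICE specification, 2^32 * MIN(G, D) + 2 * MAX(G, D) + (1 if G > D else 0), where G is the priority of the controlling agent's candidate (Endpoint L) and D is the priority of the controlled agent's candidate (Endpoint R). It uses only the UDP candidate priorities from the sample offer and answer and, as a simplification, omits pruning and the TCP candidates.

```python
from itertools import product
from typing import Dict, List, Tuple


def pair_priority(g: int, d: int) -> int:
    """Pair priority; g is the controlling agent's candidate priority, d the controlled agent's."""
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


# UDP candidate priorities from the sample offer (Endpoint L, controlling)
# and the sample answer (Endpoint R, controlled).
L_CANDIDATES = {"host": 2130706431, "srflx": 1694234623, "relay": 16648703}
R_CANDIDATES = {"host": 2130706431, "relay": 16648703}


def ordered_pairs(controlling: Dict[str, int],
                  controlled: Dict[str, int]) -> List[Tuple[Tuple[str, str], int]]:
    """Form every pair and sort it by descending pair priority."""
    pairs = [((c_name, d_name), pair_priority(c_prio, d_prio))
             for (c_name, c_prio), (d_name, d_prio)
             in product(controlling.items(), controlled.items())]
    return sorted(pairs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (l_type, r_type), prio in ordered_pairs(L_CANDIDATES, R_CANDIDATES):
        print(f"L {l_type:5} <-> R {r_type:5} priority={prio}")
```

With these inputs the host-to-host pair sorts first, which matches the narrative in which the check between Endpoint L's host address and Endpoint R's host address is attempted before the relayed paths.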

The preceding sequence diagram shows that Endpoint R sends a STUN binding request from R-PUB-1 to L-PRIV-2. The request does not reach L-PRIV-2 because L-PRIV-2 is not directly reachable from R-PUB-1. At this point, Endpoint L sends a STUN binding request from L-PRIV-2 to R-PUB-1. This request goes through the NAT device, and Endpoint R eventually receives the packet at R-PUB-1 with NAT-PUB-2 as the source. Endpoint R sends a STUN binding response with the mapped address set to NAT-PUB-2. Endpoint L eventually gets the packet from the NAT device and discovers a new peer-derived candidate, because the mapped address is different from the address from which the STUN binding request was sent. Endpoint L validates this candidate pair and disables all lower-priority candidate pairs. Because this is the highest-priority candidate pair, Endpoint L nominates this candidate pair and sends a STUN binding request to R-PUB-1 with the USE-CANDIDATE flag set. Endpoint R, upon getting the request with the USE-CANDIDATE flag, responds with a STUN binding response. Upon receiving the response, Endpoint L stops its connectivity checks because it has found the candidate pair that has to be used for media flow.
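The discovery of the peer-derived candidate comes down to comparing the mapped address in the binding response with the address the request was sent from. The sketch below illustrates that comparison. It assumes that L-PRIV-2 corresponds to 192.168.2.1:50005 and NAT-PUB-2 to 10.107.0.71:50005, which is inferred from the sample SDP rather than stated in the call flow, and the function name is hypothetical.

```python
from typing import Optional, Set, Tuple

Address = Tuple[str, int]  # (IP address, port)


def discover_peer_derived(sent_from: Address,
                          mapped: Address,
                          known_local: Set[Address]) -> Optional[Address]:
    """Return the mapped address if it is a new local candidate, else None."""
    if mapped != sent_from and mapped not in known_local:
        return mapped
    return None


# Assumed address values for the mnemonics used in the call flow.
L_PRIV_2: Address = ("192.168.2.1", 50005)
NAT_PUB_2: Address = ("10.107.0.71", 50005)

# Endpoint L's UDP candidates from the sample offer: host, relay, srflx.
KNOWN_LOCAL: Set[Address] = {
    ("192.168.2.1", 50005),
    ("10.101.0.57", 52732),
    ("10.107.0.71", 50033),
}

if __name__ == "__main__":
    # MA= in the binding response carries NAT-PUB-2; the request left from L-PRIV-2.
    new_candidate = discover_peer_derived(L_PRIV_2, NAT_PUB_2, KNOWN_LOCAL)
    print("new peer-derived candidate:", new_candidate)
```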

Endpoint L sends the final offer to Endpoint R, with the final local and remote candidate to be used for media flow. A sample final offer is as follows:

 v=0
 o=- 0 0 IN IP4 10.107.0.71
 s=session
 c=IN IP4 10.107.0.71
 b=CT:99980
 t=0 0
 m=audio 50005 RTP/SAVP 114 111 112 115 116 4 8 0 97 13 118 101
 a=ice-ufrag:32sD
 a=ice-pwd:YF9/OwRcN/pXUglBv1c+5QMu
 a=candidate:7 1 UDP 1862270719 10.107.0.71 50005 typ prflx raddr 192.168.2.1 rport 50005
 a=remote-candidates:1 10.104.0.68 50025
 a=rtpmap:114 x-msrta/16000

Endpoint R, upon receiving the final offer, stops its connectivity checks and sends its answer to the final offer:

 v=0
 o=- 0 0 IN IP4 10.104.0.68
 s=session
 c=IN IP4 10.104.0.68
 b=CT:99980
 t=0 0
 m=audio 50025 RTP/SAVP 114 111 112 115 116 4 8 0 97 13 118 101
 a=ice-ufrag:32sD
 a=ice-pwd:YF9/OwRcN/pXUglBv1c+5QMu
 a=candidate:7 1 UDP 1862270719 10.104.0.68 50025 typ host
 a=remote-candidates:1 10.107.0.71 50005
 a=rtpmap:114 x-msrta/16000
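The final offer and the final answer mirror each other: the local candidate in one is the remote candidate in the other. The sketch below checks that symmetry using the candidate and remote-candidates lines from the two samples, trimmed to the fields the check needs. The selected_pair helper is illustrative and assumes a single candidate line per SDP, as in the samples.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]  # (IP address, port)

# Relevant lines from the sample final offer and final answer.
FINAL_OFFER = """\
a=candidate:7 1 UDP 1862270719 10.107.0.71 50005 typ prflx
a=remote-candidates:1 10.104.0.68 50025
"""

FINAL_ANSWER = """\
a=candidate:7 1 UDP 1862270719 10.104.0.68 50025 typ host
a=remote-candidates:1 10.107.0.71 50005
"""


def selected_pair(sdp: str) -> Tuple[Optional[Address], Optional[Address]]:
    """Return (local, remote) transport addresses announced in a final SDP."""
    local: Optional[Address] = None
    remote: Optional[Address] = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=candidate:"):
            fields = line.split()
            local = (fields[4], int(fields[5]))
        elif line.startswith("a=remote-candidates:"):
            # Layout in the samples: component ID, address, port.
            fields = line[len("a=remote-candidates:"):].split()
            remote = (fields[1], int(fields[2]))
    return local, remote


if __name__ == "__main__":
    offer_local, offer_remote = selected_pair(FINAL_OFFER)
    answer_local, answer_remote = selected_pair(FINAL_ANSWER)
    print("offer :", offer_local, "->", offer_remote)
    print("answer:", answer_local, "->", answer_remote)
    print("consistent:", offer_local == answer_remote and offer_remote == answer_local)
```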

With the receipt of the final answer, the connectivity checks phase ends and both ends stream media using the final candidates selected by the connectivity checks.