1.1 Glossary

This document uses the following terms:

202 Accepted: A response that indicates that a request was accepted for processing.

Audio/Video Edge Server (A/V Edge Server): A protocol server that implements the Traversal Using Relay NAT (TURN) Extensions Protocol, as described in [MS-TURN]. The protocol server provides connectivity to a protocol client that is behind a network entity, if the network entity provides network address translation (NAT).

B-frame: A bidirectional video frame that references both the previous frame and the next frame.

call: A communication between peers that is configured for a multimedia conversation.

candidate: A set of transport addresses that form an atomic unit for use with a media session. For example, in the case of Real-Time Transport Protocol (RTP) there are two transport addresses for each candidate, one for RTP and another for the Real-Time Transport Control Protocol (RTCP). A candidate has properties such as type, priority, foundation, and base.

codec: An algorithm that is used to convert media between digital formats, especially between raw media data and a format that is more suitable for a specific purpose. Encoding converts the raw data to a digital format. Decoding reverses the process.

Common Intermediate Format (CIF): A picture format, described in the H.263 standard, that is used to specify the horizontal and vertical resolutions of pixels in YCbCr sequences in video signals.

conference: A Real-Time Transport Protocol (RTP) session that includes more than one participant.

connectivity check: A Simple Traversal of UDP through NAT (STUN) binding request that is sent to validate connectivity between the local and remote candidates in a candidate pair.

Coordinated Universal Time (UTC): A high-precision atomic time standard that approximately tracks Universal Time (UT). It is the basis for legal, civil time all over the Earth. Time zones around the world are expressed as positive and negative offsets from UTC. In this role, it is also referred to as Zulu time (Z) and Greenwich Mean Time (GMT). In these specifications, all references to UTC refer to the time at UTC-0 (or GMT).

dialog: A peer-to-peer Session Initiation Protocol (SIP) relationship that exists between two user agents and persists for a period of time. A dialog is established by SIP messages, such as a 2xx response to an INVITE request, and is identified by a call identifier, a local tag, and a remote tag.

endpoint: A device that is connected to a computer network.

forward error correction (FEC): A process in which a sender uses redundancy to enable a receiver to recover from packet loss.

fully qualified domain name (FQDN): An unambiguous domain name that gives an absolute location in the Domain Name System's (DNS) hierarchy tree, as defined in [RFC1035] section 3.1 and [RFC2181] section 11.

I-frame: A video frame that is encoded as a single image, such that it can be decoded without any dependencies on previous frames. Also referred to as Intra-Coded frame, Intra frame, and key frame.

Interactive Connectivity Establishment (ICE): A methodology that was established by the Internet Engineering Task Force (IETF) to facilitate the traversal of network address translation (NAT) by media.

jitter: A variation in a network delay that is perceived by the receiver of each packet.

mean opinion score (MOS): A numerical indication of the perceived quality of media. It is expressed as a single number in the range of 1 to 5, where 1 is the lowest perceived quality and 5 is the highest perceived quality.

Multipurpose Internet Mail Extensions (MIME): A set of extensions that redefines and expands support for various types of content in email messages, as described in [RFC2045], [RFC2046], and [RFC2047].

network address translation (NAT): The process of converting between IP addresses used within an intranet, or other private network, and Internet IP addresses.

P-frame: A predicative video frame that references a previous frame. Also referred to as inter-coded frame or inter-frame.

proxy: A computer, or the software that runs on it, that acts as a barrier between a network and the Internet by presenting only a single network address to external sites. By acting as a go-between that represents all internal computers, the proxy helps protects network identities while also providing access to the Internet.

public switched telephone network (PSTN): Public switched telephone network is the voice-oriented public switched telephone network. It is circuit-switched, as opposed to the packet-switched networks.

QoE Monitoring Agent: A service running on a front-end server that collects and processes Quality of Experience (QoE) reports from clients in the form of a SIP message, sends a 202 Accepted or an error response to the client, and sends the QoE metrics to the QoE Monitoring Server.

QoE Monitoring Server: A server that collects and processes Quality of Experience (QoE) metrics.

Quality of Experience (QoE): A subjective measure of a user's experiences with a media service.

Real-Time Transport Protocol (RTP): A network transport protocol that provides end-to-end transport functions that are suitable for applications that transmit real-time data, such as audio and video, as described in [RFC3550].

remote endpoint: See peer.

reporting endpoint: A protocol client that sends Quality of Experience (QoE) metrics to a QoE Monitoring Server.

RTP packet: A data packet consisting of the fixed RTP header, a possibly empty list of contributing sources, and the payload data. Some underlying protocols may require an encapsulation of the RTP packet to be defined. Typically one packet of the underlying protocol contains a single RTP packet, but several RTP packets can be contained if permitted by the encapsulation method. See [RFC3550] section 3.

RTVideo: A video stream (2) that carries an RTVC1 bit stream.

service: A process or agent that is available on the network, offering resources or services for clients. Examples of services include file servers, web servers, and so on.

session: A collection of multimedia senders and receivers and the data streams that flow between them. A multimedia conference is an example of a multimedia session.

Session Description Protocol (SDP): A protocol that is used for session announcement, session invitation, and other forms of multimedia session initiation. For more information see [MS-SDP] and [RFC3264].

Session Initiation Protocol (SIP): An application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. SIP is defined in [RFC3261].

SIP message: The data that is exchanged between Session Initiation Protocol (SIP) elements as part of the protocol. An SIP message is either a request or a response.

SIP transaction: A SIP transaction occurs between a UAC and a UAS. The SIP transaction comprises all messages from the first request sent from the UAC to the UAS up to a final response (non-1xx) sent from the UAS to the UAC. If the request is INVITE, and the final response is a non-2xx, the SIP transaction also includes an ACK to the response. The ACK for a 2xx response to an INVITE request is a separate SIP transaction.

stream: (1) An element of a compound file, as described in [MS-CFB]. A stream contains a sequence of bytes that can be read from or written to by an application, and they can exist only in storages.

(2) A flow of data from one host to another host, or the data that flows between two hosts.

Super P-frame (SP-frame): A special P-frame that uses the previous cached frame instead of the previous P-frame or I-frame as a reference frame.

synchronization source (SSRC): The source of a stream of RTP packets, identified by a 32-bit numeric SSRC identifier carried in the RTP header so as not to be dependent upon the network address. All packets from a synchronization source form part of the same timing and sequence number space, so a receiver groups packets by synchronization source for playback. Examples of synchronization sources include the sender of a stream of packets derived from a signal source such as a microphone or a camera, or an RTP mixer. A synchronization source may change its data format (for example, audio encoding) over time. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular RTP session. A participant need not use the same SSRC identifier for all the RTP sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP. If a participant generates multiple streams in one RTP session, for example from separate video cameras, each MUST be identified as a different SSRC. See [RFC3550] section 3.

Transmission Control Protocol (TCP): A protocol used with the Internet Protocol (IP) to send data in the form of message units between computers over the Internet. TCP handles keeping track of the individual units of data (called packets) that a message is divided into for efficient routing through the Internet.

TURN server: An endpoint that receives Traversal Using Relay NAT (TURN) request messages and sends TURN response messages. The protocol server acts as a data relay, receiving data on the public address that is allocated to a protocol client and forwarding that data to the client.

Uniform Resource Identifier (URI): A string that identifies a resource. The URI is an addressing mechanism defined in Internet Engineering Task Force (IETF) Uniform Resource Identifier (URI): Generic Syntax [RFC3986].

user agent client (UAC): A logical entity that creates a new request, and then uses the client transaction state machinery to send it. The role of UAC lasts only for the duration of that transaction. In other words, if a piece of software initiates a request, it acts as a UAC for the duration of that transaction. If it receives a request later, it assumes the role of a user agent server (UAS) for the processing of that transaction.

user agent server (UAS): A logical entity that generates a response to a Session Initiation Protocol (SIP) request. The response either accepts, rejects, or redirects the request. The role of the UAS lasts only for the duration of that transaction. If a process responds to a request, it acts as a UAS for that transaction. If it initiates a request later, it assumes the role of a user agent client (UAC) for that transaction.

User Datagram Protocol (UDP): The connectionless protocol within TCP/IP that corresponds to the transport layer in the ISO/OSI reference model.

XML schema: A description of a type of XML document that is typically expressed in terms of constraints on the structure and content of documents of that type, in addition to the basic syntax constraints that are imposed by XML itself. An XML schema provides a view of a document type at a relatively high level of abstraction.

XML schema definition (XSD): The World Wide Web Consortium (W3C) standard language that is used in defining XML schemas. Schemas are useful for enforcing structure and constraining the types of data that can be used validly within other XML documents. XML schema definition refers to the fully specified and currently recommended standard for use in authoring XML schemas.

MAY, SHOULD, MUST, SHOULD NOT, MUST NOT: These terms (in all caps) are used as defined in [RFC2119]. All statements of optional behavior use either MAY, SHOULD, or SHOULD NOT.