1.4.2 Real Time Streaming Protocol (RTSP)

The Real Time Streaming Protocol (RTSP), specified in [MS-RTSP], is used for transferring real-time multimedia data (for example, audio and video) between a server and a client. It is a streaming protocol: it is designed for scenarios in which the multimedia data is rendered (that is, video is displayed and audio is played) while it is still being transferred.

RTSP typically uses a TCP connection for control of the streaming media session, although it is also possible to use UDP for this purpose. The entity that sends the RTSP request that initiates the session is referred to as the client, and the entity that responds to that request is referred to as the server. Typically, multimedia data flows from the server to the client. RTSP also allows multimedia data to flow in the opposite direction.

Clients can send RTSP requests to the server requesting information on content before a session is established. The information that the server returns is formatted by using a syntax called Session Description Protocol (SDP), as specified in [RFC4566]. Clients use RTSP requests to control the session and to request that the server perform actions, such as starting or stopping the flow of multimedia data. Each request has a corresponding RTSP response that is sent in the opposite direction. Servers can also send RTSP requests to clients, for example, to inform them that the session state has changed. If TCP is used to exchange RTSP requests and responses, the multimedia data can also be transferred over the same TCP connection. Otherwise, the multimedia data is transferred over UDP.
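The request/response exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not the [MS-RTSP] implementation: it assumes the text-based message layout of RTSP 1.0 (a request line or status line, header lines, and a blank line, each terminated by CRLF), and the helper names build_request and parse_response, the URL rtsp://example.com/media, and the sample response are hypothetical.

```python
def build_request(method, url, cseq, headers=None):
    """Serialize an RTSP request (assumed RTSP/1.0 text layout)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response(raw):
    """Split an RTSP response into status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Hypothetical exchange: the client asks for a description of the content,
# and the server answers with an SDP body.
request = build_request(
    "DESCRIBE", "rtsp://example.com/media", 1, {"Accept": "application/sdp"}
)
sample_response = (
    b"RTSP/1.0 200 OK\r\n"
    b"CSeq: 1\r\n"
    b"Content-Type: application/sdp\r\n"
    b"\r\n"
    b"v=0\r\n"
)
status, reason, headers, body = parse_response(sample_response)
```

The CSeq value in the response matches the one in the request, which is how a response is paired with its request. A real client would also need to honor Content-Length when reading the body from a TCP stream; that is omitted here for brevity.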

The multimedia data is encapsulated in Real-Time Transport Protocol (RTP) packets, as specified in [RFC3550]. For each RTP stream, the server and client can also exchange Real-Time Transport Control Protocol (RTCP) packets, as specified in [RFC3550].
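To make the RTP encapsulation concrete, the following sketch decodes the 12-byte fixed header that begins every RTP packet, using the field layout from [RFC3550]. The names RtpHeader and parse_rtp_header and the sample packet are hypothetical and chosen only for illustration; CSRC entries and header extensions are not decoded.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,              # top 2 bits
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,         # low 4 bits
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,       # low 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hypothetical packet: version 2, payload type 96, sequence number 1.
sample_packet = bytes([0x80, 0x60]) + struct.pack("!HII", 1, 1000, 0x12345678) + b"payload"
header = parse_rtp_header(sample_packet)
```

The sequence number lets a receiver detect loss and reordering, and the timestamp drives playout timing, which is what allows the data to be rendered while it is still arriving.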