2.2.12.2 Video Source Request (VSR)

The Video Source Request (VSR) feedback message is an application-layer feedback message. It is identified by PT=PSFB, FMT=15, and Type=1.

Application layer feedback message details are specified in [RFC4585] section 6.4.

The VSR message MAY be used to notify the sender that the receiver wants to watch a specific video source with the video parameters in the message. It contains one Feedback Control Information (FCI) field in addition to the application feedback message common field.

The VSR FCI field contains one VSR header and up to 20 VSR entries. The VSR feedback message can be used by the mixer to forward the VSR it receives to the real video source. In this scenario, the VSR sent by the mixer is the aggregated result of multiple VSRs the mixer receives. When these VSRs received by the mixer have different video parameters, the aggregated VSR MAY have multiple VSR entries. The algorithm used by the mixer is implementation-defined and not prescribed by this specification.

The VSR header is defined as follows:


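 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           AFB Type            |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Requested Media Source ID (MSI)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Request Id           |           Reserve1            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    |A|  Reserve2   |       B       |       C       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserve3                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+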

AFB Type (2 bytes): The application layer feedback message type. Set to 0x1.

Length (2 bytes): The FCI size, including the AFB Type and Length fields. A VSR MUST NOT have more than 20 VSR entries.

Requested Media Source ID (MSI) (4 bytes): The MSI of the video source to be watched. The following constant values have special meanings:

0xFFFFFFFF: SOURCE_NONE, meaning the receiver is not requesting a video source.

0xFFFFFFFE: SOURCE_ANY, meaning the sender selects the video source.

Request Id (2 bytes): A 16-bit unsigned number that uniquely identifies the VSR. A VSR can be sent multiple times to the sender. All the retransmitted VSRs MUST carry the same Request Id. A different VSR MUST carry a different Request Id.

Reserve1 (2 bytes): Reserved field. The sender SHOULD set it to zero. The receiver MUST ignore it.

Version (1 byte): The VSR version number. The sender SHOULD set it to zero. The receiver MUST ignore it.

A - K (1 bit): The key frame request flag. Set to 1 when a video key frame is requested.

Reserve2 (7 bits): Reserved field. The sender SHOULD set it to zero. The receiver MUST ignore it.

B - Number of Entries (1 byte): The number of VSR entries in this VSR message. It MUST be less than or equal to 20. A VSR message MAY have zero VSR entries when the Requested MSI field value is 0xFFFFFFFF.

C - Entry Length (1 byte): The size of each VSR entry. Set to 0x44.

Reserve3 (4 bytes): Reserved field. The sender SHOULD set it to zero. The receiver MUST ignore it.
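The header layout above can be sketched with Python's struct module. This is a minimal illustration, not a reference implementation. It assumes that multi-byte fields are in network byte order, that Length counts bytes, and that the K flag is the most significant bit of the byte following Version; none of these is stated in the field descriptions. The names VsrHeader, VSR_HEADER, and K_FLAG are hypothetical.

```python
import struct
from dataclasses import dataclass

# Assumptions, not stated in the field descriptions: multi-byte fields are in
# network byte order, Length counts bytes, and K is the most significant bit
# of the byte that follows Version.
VSR_HEADER = struct.Struct("!HHIHHBBBBI")  # 20 bytes in total
VSR_ENTRY_LENGTH = 0x44
MAX_VSR_ENTRIES = 20
SOURCE_NONE = 0xFFFFFFFF
SOURCE_ANY = 0xFFFFFFFE
K_FLAG = 0x80


@dataclass
class VsrHeader:
    msi: int
    request_id: int
    key_frame: bool
    num_entries: int

    def pack(self) -> bytes:
        if not 0 <= self.num_entries <= MAX_VSR_ENTRIES:
            raise ValueError("a VSR carries at most 20 entries")
        # Length covers the whole FCI: the header plus all entries.
        length = VSR_HEADER.size + self.num_entries * VSR_ENTRY_LENGTH
        return VSR_HEADER.pack(
            0x1,                              # AFB Type
            length,
            self.msi,
            self.request_id,
            0,                                # Reserve1
            0,                                # Version
            K_FLAG if self.key_frame else 0,  # K bit, Reserve2 left at zero
            self.num_entries,
            VSR_ENTRY_LENGTH,
            0,                                # Reserve3
        )

    @classmethod
    def unpack(cls, data: bytes) -> "VsrHeader":
        fields = VSR_HEADER.unpack_from(data)
        afb_type, msi, request_id = fields[0], fields[2], fields[3]
        k_and_reserve2, num_entries = fields[6], fields[7]
        if afb_type != 0x1:
            raise ValueError("AFB Type is not VSR")
        if num_entries > MAX_VSR_ENTRIES:
            raise ValueError("too many VSR entries")
        # Reserved fields and Version are ignored, as the text requires.
        return cls(msi, request_id, bool(k_and_reserve2 & K_FLAG), num_entries)
```

Under these assumptions, a header announcing two entries carries a Length of 20 + 2 * 0x44 = 156.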

The VSR entries follow the VSR header. Each VSR entry represents a set of video parameters that one or multiple receivers request. The VSR entry is defined as follows:


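 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Type  | UCConfig Mode |     Flags     | Aspect Ratio /|
|               |               |               |   Preferred   |
|               |               |               |  Resolution   |
|               |               |               |   Bit Mask    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Maximum Width         |        Maximum Height         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Minimum bit rate                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Reserve / Macroblock Processing Rate Bit Mask         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Bit rate per level                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 Bit rate histogram (20 bytes)                 |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Frame rate bit mask                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Number of MUST instances    |    Number of MAY instances    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Quality Report Histogram (16 bytes)              |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   Maximum number of pixels                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+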

Payload Type (1 byte): The encoding type of the requested video. Encoding types are specified in section 2.2.1. Only the RT Video and H264 encoding types are allowed.

UCConfig Mode (1 byte): The maximum UCConfig Mode the receiver supports. The UCConfig Mode is defined in [MSFT-H264UCConfig].

The value is defined as follows:

0: MUST NOT be used.

1: UCConfig Mode 1.

2 or larger: MUST NOT be used.

Flags (1 byte): The video flags.

Bit 0 (least significant bit): 1 means the receiver supports CGS rewrite; for H264 encoding only.

Bit 1: 1 means the receiver only supports either constrained baseline profile or scalable constrained baseline profile; for H264 encoding only.

Bit 2: 1 means the receiver doesn't support SP frames; for RT Video encoding only.

Bit 3: 1 means the receiver does not support seamless resolution change; for H264 encoding only.

Bits 4-7: Reserved. The sender SHOULD set them to 0. The receiver MUST ignore them.
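The Flags bits listed above map naturally onto a flag enumeration. The following sketch uses Python's enum.IntFlag; the member names are hypothetical labels chosen for illustration, and only the bit positions come from the text.

```python
from enum import IntFlag


class VsrFlags(IntFlag):
    """Bit 0 is the least significant bit, as stated for the Flags field."""
    CGS_REWRITE = 0x01                    # bit 0, H264 only
    CONSTRAINED_BASELINE_ONLY = 0x02      # bit 1, H264 only
    NO_SP_FRAMES = 0x04                   # bit 2, RT Video only
    NO_SEAMLESS_RESOLUTION_CHANGE = 0x08  # bit 3, H264 only


def decode_flags(value: int) -> VsrFlags:
    # Bits 4-7 are reserved, so a receiver masks them off before decoding.
    return VsrFlags(value & 0x0F)
```

For example, decode_flags(0xF5) ignores the four reserved bits and yields the combination of CGS_REWRITE and NO_SP_FRAMES.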

Aspect Ratio Bit Mask/Preferred Resolution Bit Mask (1 byte): 

For video modality

This byte represents the bit mask of aspect ratio, defined as follows:

Bit 0: 4 by 3.

Bit 1: 16 by 9.

Bit 2: 1 by 1.

Bit 3: 3 by 4.

Bit 4: 9 by 16.

Bit 5: 20 by 3.

Bits 6-7: Not used. They MUST be ignored.

For application-sharing modality (a=label:applicationsharing-video)

This byte represents the bit mask of the shorter dimension of the preferred resolution, in pixels, defined as follows:

Bit 0: 270

Bit 1: 540

Bit 2: 810

Bit 3: 1080

Bit 4: 1350

Bit 5: 1620

Bit 6: 1890

Bit 7: 2160

Maximum Width (2 bytes): The maximum width of the video resolution in pixels.

Maximum Height (2 bytes): The maximum height of the video resolution in pixels.

Minimum bit rate (4 bytes): The minimum video bit rate. The unit is in bits per second.

Reserve / Macroblock Processing Rate Bit Mask (4 bytes):

For video modality

These bytes are reserved. The sender SHOULD set them to 0. The receiver MUST ignore them.

For application-sharing modality (a=label:applicationsharing-video)

These bytes represent the bit mask of the requested maximum macroblock processing rate, in macroblocks per second, defined as follows:

Bit 0: 2812

Bit 1: 4218

Bit 2: 8437

Bit 3: 16875

Bit 4: 33750

Bit 5: 67500

Bit 6: 108000

Bit 7: 135000

Bit 8: 198000

Bit 9: 244800

Bit 10: 270000

Bits 11-31: Not used. The sender SHOULD set them to 0. The receiver MUST ignore them.

Bit rate per level (4 bytes): The bit rate difference for one level in the bit rate histogram.

Bit rate histogram (20 bytes): The requested bit rate histogram of the receivers represented in this VSR entry. It is an array of ten 16-bit values. The i-th element in the array counts the number of receivers requesting a bit rate ranging from [Minimum bit rate + (i-1) * Bit rate per level] to [Minimum bit rate + i * Bit rate per level].
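The bucket arithmetic for the bit rate histogram can be illustrated as follows. This is a sketch under stated assumptions: the text gives each range with both end points, so the choice to count a boundary value in the higher bucket, and to clamp out-of-range requests into the first or last bucket, is an assumption made here. The function names are hypothetical.

```python
NUM_LEVELS = 10  # the histogram holds ten 16-bit counters


def bucket_range(i, minimum_bit_rate, bit_rate_per_level):
    """Return the (low, high) bit rate range of the i-th element, i = 1..10."""
    if not 1 <= i <= NUM_LEVELS:
        raise ValueError("i must be between 1 and 10")
    low = minimum_bit_rate + (i - 1) * bit_rate_per_level
    high = minimum_bit_rate + i * bit_rate_per_level
    return low, high


def build_histogram(requested_bit_rates, minimum_bit_rate, bit_rate_per_level):
    """Count one receiver per requested bit rate into the ten buckets."""
    if bit_rate_per_level <= 0:
        raise ValueError("bit_rate_per_level must be positive")
    histogram = [0] * NUM_LEVELS
    for rate in requested_bit_rates:
        # Assumption: a rate on a boundary goes to the higher bucket, and
        # rates outside the covered range are clamped to bucket 1 or 10.
        i = (rate - minimum_bit_rate) // bit_rate_per_level + 1
        i = max(1, min(NUM_LEVELS, i))
        histogram[i - 1] += 1
    return histogram
```

With a Minimum bit rate of 100000 and a Bit rate per level of 50000, the first element covers 100000 to 150000 bits per second and the tenth covers 550000 to 600000.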

Frame rate bit mask (4 bytes): The bit mask of frame rate requested, defined as follows:

Bit 0: 7.5 frames per second

Bit 1: 12.5 frames per second

Bit 2: 15 frames per second

Bit 3: 25 frames per second

Bit 4: 30 frames per second

Bit 5: 50 frames per second

Bit 6: 60 frames per second

For video modality

Bits 7-31: Not used. The sender SHOULD set them to 0. The receiver MUST ignore them.

For application-sharing modality (a=label:applicationsharing-video)

Bit 7: 1.875 frames per second

Bit 8: 3.75 frames per second

Bits 9-31: Not used. The sender SHOULD set them to 0. The receiver MUST ignore them.
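Decoding the frame rate bit mask amounts to a table lookup per bit. The sketch below assumes that bit 0 is the least significant bit, as the Flags field states for its own numbering; the function and table names are hypothetical.

```python
# Bits 0-6 apply to both modalities.
FRAME_RATES = (7.5, 12.5, 15.0, 25.0, 30.0, 50.0, 60.0)
# Bits 7 and 8 are defined only for the application-sharing modality.
APP_SHARING_FRAME_RATES = FRAME_RATES + (1.875, 3.75)


def decode_frame_rates(mask, app_sharing=False):
    """Return the requested frame rates, in bit order, ignoring unused bits."""
    table = APP_SHARING_FRAME_RATES if app_sharing else FRAME_RATES
    return [rate for bit, rate in enumerate(table) if mask & (1 << bit)]
```

A mask of 0x14 (bits 2 and 4) decodes to 15 and 30 frames per second. A mask of 0x180 decodes to an empty list for video modality, because bits 7 and 8 are unused there, and to 1.875 and 3.75 frames per second for application sharing.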

Number of MUST instances (2 bytes): The number of receivers that accept only the video payload type defined in this VSR.

Number of MAY instances (2 bytes): The number of receivers that can accept other video payload types.

Quality Report Histogram (16 bytes): The quality report histogram. This is an array of eight 16-bit values. The i-th element in the array counts the number of receivers reporting a quality level of value i, where level 1 is the best quality. Quality reports are generated by the receivers; the mixer aggregates the quality reports it receives and forwards the aggregate to the video source.

Maximum number of pixels (4 bytes): The maximum number of pixels the receivers can receive in one video frame.
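Taken together, the entry fields add up to 68 bytes, which matches the Entry Length value of 0x44 in the VSR header. The following sketch parses one entry with Python's struct module. It assumes network byte order, which the field descriptions do not state, and the dictionary keys are hypothetical names chosen for illustration.

```python
import struct

# Assumption: multi-byte fields are in network byte order.
# 4 x 1 byte, 2 x 2 bytes, 3 x 4 bytes, ten 16-bit counters, 4 bytes,
# 2 x 2 bytes, eight 16-bit counters, 4 bytes.
VSR_ENTRY = struct.Struct("!BBBBHHIII10HIHH8HI")
assert VSR_ENTRY.size == 0x44  # matches the Entry Length header field


def parse_vsr_entry(data, offset=0):
    """Unpack one VSR entry starting at the given byte offset."""
    f = VSR_ENTRY.unpack_from(data, offset)
    return {
        "payload_type": f[0],
        "ucconfig_mode": f[1],
        "flags": f[2],
        "aspect_ratio_or_resolution_mask": f[3],
        "max_width": f[4],
        "max_height": f[5],
        "min_bit_rate": f[6],
        "reserve_or_macroblock_rate_mask": f[7],
        "bit_rate_per_level": f[8],
        "bit_rate_histogram": list(f[9:19]),
        "frame_rate_mask": f[19],
        "num_must_instances": f[20],
        "num_may_instances": f[21],
        "quality_report_histogram": list(f[22:30]),
        "max_pixels": f[30],
    }
```

Because every entry has the same size, the n-th entry of a VSR (counting from zero) would start at byte 20 + n * 0x44 of the FCI field, assuming the 20-byte header described earlier.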