

Depth Stream

Kinect for Windows 1.5, 1.6, 1.7, 1.8

Each frame of the depth data stream is made up of pixels that contain the distance (in millimeters) from the camera plane to the nearest object. An application can use depth data to track a person's motion or identify background objects to ignore.

The depth data stream merges two separate types of data:

  • Depth data, in millimeters.
  • Player segmentation data. Each player segmentation value is an integer indicating the index of a unique player detected in the scene.

The depth data is the distance, in millimeters, to the nearest object at that particular (x, y) coordinate in the depth sensor's field of view. The depth image is available in three resolutions: 640x480 (the default), 320x240, and 80x60, as specified using the DepthImageFormat Enumeration. The range setting, specified using the DepthRange Enumeration, determines the range of distances from the sensor for which depth values are reported.
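As an illustration of addressing a depth value by its (x, y) coordinate, the following minimal Python sketch computes the position of a pixel in a flat buffer for each of the three resolutions. It assumes the frame is stored row by row (row-major), which the text above does not state; the names RESOLUTIONS and pixel_offset are hypothetical and are not part of the Kinect SDK.

```python
# Resolutions listed in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "640x480": (640, 480),  # the default
    "320x240": (320, 240),
    "80x60": (80, 60),
}


def pixel_offset(x, y, width, height):
    """Return the offset of pixel (x, y) in a flat buffer.

    Assumption: the buffer is row-major, so each row of `width`
    pixels is stored contiguously, top row first.
    """
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("coordinate lies outside the depth image")
    return y * width + x


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        # Offset of the last pixel equals the pixel count minus one.
        print(name, "pixels:", w * h, "last offset:", pixel_offset(w - 1, h - 1, w, h))
```

If the actual buffer layout differs (for example, padded rows), only the body of pixel_offset would need to change.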

Player Segmentation Data

The Kinect runtime processes depth data to identify up to six human figures in a segmentation map. The segmentation map is a bitmap with pixel values corresponding to the index of the person in the field of view who is closest to the camera at that pixel position. Player segmentation data is only available in the depth stream when skeletal tracking is enabled.

The player segmentation data is also commonly referred to as player index data. Although the player segmentation data is a separate logical stream, the depth data and the segmentation data are merged into a single pixel value for each frame. The value "0" indicates that no person was found at that location; the values "1" through "6" identify players and map to elements 0 through 5 in the array of skeleton data.
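The mapping from player index values to skeleton array elements amounts to subtracting one, with "0" treated as "no player". A minimal Python sketch follows; the function name skeleton_element is hypothetical and is not an SDK call.

```python
def skeleton_element(player_index):
    """Map a player index (1-6) to its skeleton array element (0-5).

    Returns None for player index 0, which means no person was found
    at that pixel. Hypothetical helper, not part of the Kinect SDK.
    """
    if player_index == 0:
        return None
    if not 1 <= player_index <= 6:
        raise ValueError("player index must be in the range 0-6")
    return player_index - 1
```

For example, a pixel with player index 3 refers to element 2 of the skeleton data array.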

One common use of the segmentation data is as a mask to isolate a specific user or region of interest from the color and depth images.
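A sketch of the masking idea in Python, assuming the depth values and player indices for a frame have already been separated into two equal-length sequences. The function name isolate_player and the use of None for masked-out pixels are illustrative choices, not SDK behavior.

```python
def isolate_player(depths_mm, player_indices, target_player):
    """Keep depth values only where the pixel belongs to target_player.

    depths_mm and player_indices are parallel sequences, one entry
    per pixel. Pixels belonging to anyone else (or to no one) are
    replaced with None. Hypothetical helper, not part of the SDK.
    """
    if len(depths_mm) != len(player_indices):
        raise ValueError("depth and player index sequences differ in length")
    return [
        depth if player == target_player else None
        for depth, player in zip(depths_mm, player_indices)
    ]


if __name__ == "__main__":
    depths = [1200, 1250, 3000, 0]
    players = [1, 1, 0, 2]
    print(isolate_player(depths, players, target_player=1))
```

None is used for masked-out pixels rather than 0 so that they are not confused with pixels whose depth is reported as 0.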

Extended Depth Data

Some applications need depth data beyond the “too far” limit or closer than the “too near” limit, even at the cost of reduced resolution or accuracy. Starting with version 1.6.0, the depth and segmentation data can be retrieved in either of two formats:

  • Full depth information - Each pixel is represented by a structure with two fields: a 16-bit depth and a 16-bit player index. All detected depth values, including those outside the reliable range, are reported. Pixels whose depth is unknown (could not be detected) are reported with a depth value of "0". Introduced in 1.6.
  • Packed depth information - Each pixel is represented by one 16-bit value. The 13 high-order bits contain the depth value; the 3 low-order bits contain the player index. Any depth value outside the reliable range is replaced with a special value to indicate that it was too near, too far, or unknown.
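The packed layout described above can be decoded with a shift and a mask. The following Python sketch splits a 16-bit packed value into its 13-bit depth and 3-bit player index, and packs them back. The constant and function names are hypothetical and are not SDK identifiers. The sketch does not interpret the special too near, too far, or unknown values, since the text does not give them.

```python
PLAYER_INDEX_BITS = 3                               # low-order bits: player index
PLAYER_INDEX_MASK = (1 << PLAYER_INDEX_BITS) - 1    # 0b111
MAX_PACKED_DEPTH = (1 << 13) - 1                    # largest 13-bit depth value


def unpack_depth_pixel(packed):
    """Split a 16-bit packed pixel into (depth, player_index)."""
    if not 0 <= packed <= 0xFFFF:
        raise ValueError("packed pixel must fit in 16 bits")
    depth = packed >> PLAYER_INDEX_BITS       # 13 high-order bits
    player_index = packed & PLAYER_INDEX_MASK  # 3 low-order bits
    return depth, player_index


def pack_depth_pixel(depth, player_index):
    """Combine a 13-bit depth and a 3-bit player index into one value."""
    if not 0 <= depth <= MAX_PACKED_DEPTH:
        raise ValueError("depth must fit in 13 bits")
    if not 0 <= player_index <= PLAYER_INDEX_MASK:
        raise ValueError("player index must fit in 3 bits")
    return (depth << PLAYER_INDEX_BITS) | player_index


if __name__ == "__main__":
    packed = pack_depth_pixel(1500, 2)
    print(packed, unpack_depth_pixel(packed))
```

In the full depth format no bit manipulation is needed, because the depth and player index are stored in separate 16-bit fields.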

In This Section