KinectInteraction Concepts

Kinect for Windows 1.7, 1.8

There are many concepts in the new KinectInteraction features that you may be encountering for the first time. A good grasp of these concepts is important to understanding what can and cannot be done with the new features.

The KinectInteraction Controls have been designed to be compatible with keyboard and mouse control of a Kinect-enabled application as well.

Hand Tracking

The first concept is hand tracking. If you've used the Kinect for Windows SDK before, you may be familiar with skeletal tracking, where the SDK identifies humans in its visual range and creates a skeleton for them. While the skeletal tracking itself is not enhanced by KinectInteraction, the combination of depth information and skeletal tracking information allows KinectInteraction to track a user's hands.

In addition to tracking hands, KinectInteraction can also detect and report hand and arm state, allowing natural gestures such as gripping, releasing, and pressing to be identified. Therefore, a user can now interact with a Kinect-enabled application in a touch-free manner, and at the normal operating range of the sensor (0.4 meters in near mode, up to 3-4 meters in normal mode).

The Kinect sensor must be able to see a hand to track it. The interaction features work best when the user is facing the sensor and has their palm turned toward the sensor.

The Physical Interaction Zone (PhIZ)

The Physical Interaction Zone is a mapping from a 3D space in front of the user in the physical world to the 3D coordinate space that developers build on. There is one PhIZ for each hand of each user currently being tracked (up to two users). The PhIZ is the active volume where most interactions are enabled.
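
In code, this mapping is what makes the coordinates reported for a hand usable directly by the UI. A minimal sketch, assuming the `InteractionHandPointer` type from the Microsoft.Kinect.Toolkit.Interaction assembly; `regionWidth` and `regionHeight` are hypothetical names for the size of your Kinect Region, and the exact coordinate convention should be verified against the toolkit documentation:

```csharp
// Sketch: consuming PhIZ-mapped hand coordinates.
// The X/Y values on a hand pointer are reported relative to the
// interaction zone, so scaling by the region's size yields a
// position inside the Kinect Region.
foreach (UserInfo user in userInfos)          // filled from an interaction frame
{
    foreach (InteractionHandPointer hand in user.HandPointers)
    {
        if (!hand.IsInteractive)
            continue;                          // hand is outside its PhIZ

        double screenX = hand.X * regionWidth;   // regionWidth/regionHeight:
        double screenY = hand.Y * regionHeight;  // Kinect Region size in pixels
    }
}
```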

What Gets Tracked?

The interaction layer is capable of tracking both hands of one or two users. One of these users is designated as the primary user, typically the first of the two to interact with the system. The primary user keeps control of the interaction until the system determines that the user is no longer engaged with the application. The primary user is assigned a primary hand (the one the user is using to control the experience), although both hands are tracked. Only the primary hand can be used to control the experience. Because both hands are tracked, the user can switch their primary hand by disengaging the current primary hand (say, by lowering it to their side) and raising the other hand into the PhIZ.

The application can also tell the interaction layer which of the tracked users is primary.

The user is assigned a user ID based on the input from the skeleton tracking stream.

Hand State

For each hand, a hand state is maintained. This specifies the user the hand belongs to, whether the hand is primary for that user, whether the hand is interactive (see below), and whether the hand is gripping, pressing, or neither.

Tracked vs. Interactive

A tracked hand is one that KinectInteraction is observing, looking for possible interactivity. An interactive hand is a tracked hand that is in the PhIZ and being observed directly for interaction.
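
In the interaction stream API, this per-hand state surfaces as properties on `InteractionHandPointer` (Microsoft.Kinect.Toolkit.Interaction). A minimal sketch, assuming the property names from the 1.7/1.8 toolkit; note that grip is reported as transition events rather than a continuous flag, so the application latches it:

```csharp
// Sketch: reading the hand state described above.
void InspectHand(UserInfo user, InteractionHandPointer hand)
{
    int  userId      = user.SkeletonTrackingId;  // which user the hand belongs to
    bool isPrimary   = hand.IsPrimaryForUser;    // primary hand for that user?
    bool tracked     = hand.IsTracked;           // hand is being observed
    bool interactive = hand.IsInteractive;       // tracked AND inside the PhIZ
    bool pressing    = hand.IsPressed;           // press threshold crossed

    // Grip arrives as one-shot events (Grip / GripRelease), so the app
    // remembers the current grip state between frames.
    if (hand.HandEventType == InteractionHandEventType.Grip)
    {
        // hand just closed into a fist
    }
}
```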

The User Viewer

One of the controls available in the KinectInteraction Controls set is the User Viewer. This is a small window that shows the Kinect sensor's view of the user. This view is taken from the depth stream, and is available automatically when a Kinect Region is created.
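
A typical WPF layout places the User Viewer at the top of the Kinect Region. This is a sketch only; the schema URI and the attached `KinectRegion` property follow the 1.7 toolkit samples, and the exact names should be verified against the toolkit:

```xaml
<!-- Sketch: a User Viewer docked at the top of a Kinect Region (WPF). -->
<Grid xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
      xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
      xmlns:k="http://schemas.microsoft.com/kinect/2013">
    <k:KinectRegion x:Name="kinectRegion">
        <Grid>
            <!-- application content and KinectInteraction controls go here -->
            <k:KinectUserViewer Height="100" VerticalAlignment="Top"
                k:KinectRegion.KinectRegion="{Binding ElementName=kinectRegion}" />
        </Grid>
    </k:KinectRegion>
</Grid>
```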

The Hand Pointer

The hand pointer is an object marking the current location of a user's hand within the PhIZ. This can be made visible in the application by use of the Kinect Cursor. If the application has turned cursor visibility on (this is the default when using the C#/WPF controls), the cursor is visible in the Kinect Region(s) defined for the application and is in the shape of a hand. In addition, the visualization of the Kinect Cursor changes shape to a closed fist when the user is gripping a control.

The Hand Pointer and Other Controls

To provide a smooth interaction experience, once the hand pointer has spent some time within the bounds of a given Kinect Interaction Control, the hand pointer is considered captured by that control until the user moves their hand outside the control's boundaries. Capture is exposed both on the hand pointer and on the control that has captured it.

Interaction Types

KinectInteraction provides several ways to interact with a Kinect-enabled application. The provided interactions are:

  • Grip and release
  • Press
  • Scroll

Grip and Release

In the grip interaction, the user has their hand open, palm facing towards the sensor (ideally), and then makes a fist with their hand. This is recognized as a grip, and binds the hand tracking to whichever control is indicated, until the user releases the grip. This can be associated with the Kinect Scroll Viewer control as well, allowing a natural direct-manipulation experience for scrolling through large lists. The Control Basics sample shows grip scrolling in operation; that sample provides paged scroll capabilities as well.

The release interaction is the opening of the closed fist.
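
Because grip and release arrive as one-shot events on the hand pointer, an application typically latches them into a current state. A minimal sketch, assuming the `InteractionHandEventType` enumeration from the Microsoft.Kinect.Toolkit.Interaction assembly; the handler name is hypothetical:

```csharp
// Sketch: latching grip state from interaction frames.
bool isGripping = false;

void OnHandPointerUpdated(InteractionHandPointer hand)
{
    switch (hand.HandEventType)
    {
        case InteractionHandEventType.Grip:
            isGripping = true;    // fist closed: begin direct manipulation
            break;
        case InteractionHandEventType.GripRelease:
            isGripping = false;   // fist opened: release the bound control
            break;
    }
}
```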

Press

In the press interaction, the user has their hand open, palm facing towards the sensor (ideally), and arm not fully extended towards the sensor. The user then extends the hand toward the sensor, indicating a press. This can also be used with the Scroll Viewer, as shown in the InteractionGallery sample.
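
At the API level, a press can be observed on the hand pointer. A minimal sketch, assuming the 1.7/1.8 toolkit property names; `PressExtent` grows as the hand extends toward the sensor, and `IsPressed` reports when the press threshold is crossed:

```csharp
// Sketch: detecting a press on the targeted control.
void OnHandPointerUpdated(InteractionHandPointer hand)
{
    if (hand.IsInteractive && hand.IsPressed)
    {
        // press threshold crossed: trigger the targeted control,
        // e.g. invoke the button under the hand pointer
    }

    // hand.PressExtent can also drive visual feedback as the
    // press progresses (0 = no press, rising toward the threshold)
}
```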

Scroll

The Kinect Scroll Viewer control allows scrolling through data that is too large to fit on a single screen. You can design a grip scroll experience, as shown in the Control Basics sample, or a paged scroll experience, as shown in the InteractionGallery sample.
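
A grip-scrollable list can be declared directly in XAML. This fragment is a sketch; it assumes the `k:` namespace mapping used in the toolkit samples, and the control and property names should be checked against the Control Basics sample:

```xaml
<!-- Sketch: a horizontally scrolling row of tiles. Users grip anywhere
     in the viewer and drag to scroll. -->
<k:KinectScrollViewer HorizontalScrollBarVisibility="Hidden"
                      VerticalScrollBarVisibility="Disabled">
    <StackPanel Orientation="Horizontal">
        <k:KinectTileButton Label="One" />
        <k:KinectTileButton Label="Two" />
        <k:KinectTileButton Label="Three" />
    </StackPanel>
</k:KinectScrollViewer>
```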

The Interaction Stream

The Interaction Stream provides a stream of interaction frames, similar to the stream model of the other data sources (audio stream, depth stream, skeleton stream, etc.). Interaction frames are processed to provide information on the user's interaction with the application, such as hand position, whether the hand is pressing, gripping, or releasing, and the control the user is targeting.
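
Wiring the interaction stream up follows the same pattern as the other streams: the application forwards raw depth and skeleton data into it and consumes the interaction frames it produces. A minimal sketch, assuming a started `KinectSensor`, pre-allocated `depthPixels`, `skeletons`, and `userInfos` buffers, and an `IInteractionClient` implementation (the part of the app that tells the stream which on-screen regions are grip or press targets); names and signatures follow the 1.7/1.8 toolkit and should be verified there:

```csharp
// Sketch: creating the interaction stream and feeding it.
var interactionStream = new InteractionStream(sensor, interactionClient);

interactionStream.InteractionFrameReady += (s, e) =>
{
    using (InteractionFrame frame = e.OpenInteractionFrame())
    {
        if (frame == null) return;
        frame.CopyInteractionDataTo(userInfos);  // UserInfo[] buffer
        // inspect userInfos[i].HandPointers here: position, grip,
        // press, and the control being targeted
    }
};

// Forward raw data from the sensor's frame-ready events:
sensor.DepthFrameReady += (s, e) =>
{
    using (var depthFrame = e.OpenDepthImageFrame())
    {
        if (depthFrame == null) return;
        depthFrame.CopyDepthImagePixelDataTo(depthPixels);
        interactionStream.ProcessDepth(depthPixels, depthFrame.Timestamp);
    }
};

sensor.SkeletonFrameReady += (s, e) =>
{
    using (var skeletonFrame = e.OpenSkeletonFrame())
    {
        if (skeletonFrame == null) return;
        skeletonFrame.CopySkeletonDataTo(skeletons);
        interactionStream.ProcessSkeleton(skeletons,
            sensor.AccelerometerGetCurrentReading(), skeletonFrame.Timestamp);
    }
};
```

Note that the C#/WPF controls (Kinect Region and friends) manage an interaction stream internally; direct use of the stream as sketched here is only needed when building a custom interaction layer.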