Writing a gesture service with the Kinect for Windows SDK
After further experimenting with the Kinect SDK, it became obvious what needed to come next. If you create an application using the Kinect SDK, you will want to be able to control it using gestures (e.g. waving, swiping, or motions to access menus). We therefore decided to write a gesture service in C# that would analyse these gestures. This blog post outlines how we did this and how you can implement the same functionality.
To be able to recognize gestures, it is first important to understand what makes a gesture.
We concluded that gestures are made up of parts. Each part of a gesture is a specific movement that, when combined with other gesture parts, makes up the whole gesture. For example, the diagram below shows the two parts of a simple wave gesture and how they can be identified:
However, this is not quite enough to recognize multiple gestures with any degree of accuracy. The problem occurs when you think about multiple gestures being recognized at once.
It’s not as simple as looking for the next part of the gesture. For example, consider the wave gesture shown above. If I were to drop my hand between the two parts, it would still be recognized as a wave, because both parts of the gesture were completed in the order they were defined; yet I clearly did not perform a wave. To solve this problem we came up with three results that a gesture part can return when it checks whether it has been completed. The diagram below shows these three results and the impact of returning each of them:
A result of ‘Pausing’ allows the system to identify a movement that does not fulfil the gesture but could be a result of the user moving slowly. In short the three results mean the following:
- Fail – The gesture failed. The user moved in a way that was inconsistent with the gesture and as such the gesture will start again at the beginning.
- Pausing – The user did not fail the gesture but they did not perform the next part either. The system will check again for this part after a short pause. A result of pausing can only be returned a limited number of times (the Gesture class below gives up once its frame counter reaches 50) before the gesture will fail and recognition will start again at the beginning.
- Succeed – The user performed this part of the gesture. After a short pause the system will start looking for the next part of the gesture.
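The three results map naturally onto a small enumeration and a state machine. The sketch below is a simplified illustration rather than the service's actual code: the enum member names (including the spelling Suceed) follow the listings later in this post, while SegmentTracker, its maxPauses parameter and the lack of frame-based pause timing are assumptions made to keep the example self-contained.

```csharp
using System;

/// <summary>The three results a gesture part can return.</summary>
public enum GesturePartResult
{
    Fail,     // movement was inconsistent with the gesture
    Pausing,  // not failed, but the part is not complete yet
    Suceed    // spelling matches the identifier used in this post's code
}

/// <summary>
/// Hypothetical, simplified tracker: walks through a fixed number of parts
/// and applies the Fail / Pausing / Succeed rules to each result it is given.
/// </summary>
public class SegmentTracker
{
    private readonly int partCount;
    private readonly int maxPauses;   // assumed limit on consecutive pauses
    private int pauses;

    public SegmentTracker(int partCount, int maxPauses)
    {
        this.partCount = partCount;
        this.maxPauses = maxPauses;
    }

    /// <summary>Index of the part currently being looked for.</summary>
    public int CurrentPart { get; private set; }

    /// <summary>Feeds one result in; returns true when the whole gesture completes.</summary>
    public bool Apply(GesturePartResult result)
    {
        if (result == GesturePartResult.Suceed)
        {
            this.pauses = 0;
            this.CurrentPart++;
            if (this.CurrentPart == this.partCount)
            {
                this.CurrentPart = 0;   // recognised: start again
                return true;
            }

            return false;
        }

        if (result == GesturePartResult.Pausing)
        {
            this.pauses++;
            if (this.pauses <= this.maxPauses)
            {
                return false;           // keep waiting on the same part
            }
        }

        // Fail, or too many pauses: start again at the beginning.
        this.CurrentPart = 0;
        this.pauses = 0;
        return false;
    }
}
```

Dropping the hand in the middle of a wave corresponds to feeding Fail after the first Suceed: CurrentPart returns to 0, so completing the second part on its own can no longer finish the gesture.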
The Solution
The overall gesture service is made up of three main parts, each of which is detailed below:
The Gesture Controller:
The gesture controller is a way of controlling all of the gestures that a user can perform. The code for this can be seen below:
#region using...
using System;
using System.Collections.Generic;
using Microsoft.Research.Kinect.Nui;
#endregion

/// <summary>
/// The gesture controller
/// </summary>
public class GestureControler
{
    /// <summary>
    /// The list of all gestures we are currently looking for
    /// </summary>
    private List<Gesture> gestures = new List<Gesture>();

    /// <summary>
    /// Initializes a new instance of the <see cref="GestureControler"/> class.
    /// </summary>
    public GestureControler()
    {
    }

    /// <summary>
    /// Occurs when [gesture recognized].
    /// </summary>
    public event EventHandler<GestureEventArgs> GestureRecognised;

    /// <summary>
    /// Updates all gestures.
    /// </summary>
    /// <param name="data">The skeleton data.</param>
    public void UpdateAllGestures(SkeletonData data)
    {
        foreach (Gesture gesture in this.gestures)
        {
            gesture.UpdateGesture(data);
        }
    }

    /// <summary>
    /// Adds the gesture.
    /// </summary>
    /// <param name="type">The gesture type.</param>
    /// <param name="gestureDefinition">The gesture definition.</param>
    public void AddGesture(GestureType type, IRelativeGestureSegment[] gestureDefinition)
    {
        Gesture gesture = new Gesture(type, gestureDefinition);
        gesture.GestureRecognised += new EventHandler<GestureEventArgs>(this.Gesture_GestureRecognised);
        this.gestures.Add(gesture);
    }

    /// <summary>
    /// Handles the GestureRecognised event of the g control.
    /// </summary>
    /// <param name="sender">The source of the event.</param>
    /// <param name="e">The <see cref="KinectSkeltonTracker.GestureEventArgs"/> instance containing the event data.</param>
    private void Gesture_GestureRecognised(object sender, GestureEventArgs e)
    {
        // Forward the event to anyone listening to the controller.
        if (this.GestureRecognised != null)
        {
            this.GestureRecognised(this, e);
        }

        // Once one gesture has been recognised, restart all of them.
        foreach (Gesture g in this.gestures)
        {
            g.Reset();
        }
    }
}
A Gesture:
This controls all of the parts of a gesture and tracks which one is currently being checked. It contains an array of IRelativeGestureSegment objects, each an implementation of the IRelativeGestureSegment interface (described later). When a skeleton frame arrives it is passed to each Gesture, which passes it on to its current gesture segment. When the final segment returns a result of ‘Succeed’, the Gesture raises a gesture recognized event, which is caught by the gesture controller. The code for the Gesture class can be seen below:
#region using...
using System;
using Microsoft.Research.Kinect.Nui;
#endregion

/// <summary>
/// A single gesture
/// </summary>
public class Gesture
{
    /// <summary>
    /// The parts that make up this gesture
    /// </summary>
    private IRelativeGestureSegment[] gestureParts;

    /// <summary>
    /// The current gesture part that we are matching against
    /// </summary>
    private int currentGesturePart = 0;

    /// <summary>
    /// the number of frames to pause for when a pause is initiated
    /// </summary>
    private int pausedFrameCount = 10;

    /// <summary>
    /// The current frame that we are on
    /// </summary>
    private int frameCount = 0;

    /// <summary>
    /// Are we paused?
    /// </summary>
    private bool paused = false;

    /// <summary>
    /// The type of gesture that this is
    /// </summary>
    private GestureType type;

    /// <summary>
    /// Initializes a new instance of the <see cref="Gesture"/> class.
    /// </summary>
    /// <param name="type">The type of gesture.</param>
    /// <param name="gestureParts">The gesture parts.</param>
    public Gesture(GestureType type, IRelativeGestureSegment[] gestureParts)
    {
        this.gestureParts = gestureParts;
        this.type = type;
    }

    /// <summary>
    /// Occurs when [gesture recognised].
    /// </summary>
    public event EventHandler<GestureEventArgs> GestureRecognised;

    /// <summary>
    /// Updates the gesture.
    /// </summary>
    /// <param name="data">The skeleton data.</param>
    public void UpdateGesture(SkeletonData data)
    {
        // While paused, count frames until the pause length has elapsed.
        // Note that the current part is still checked on every frame below.
        if (this.paused)
        {
            if (this.frameCount == this.pausedFrameCount)
            {
                this.paused = false;
            }

            this.frameCount++;
        }

        GesturePartResult result = this.gestureParts[this.currentGesturePart].CheckGesture(data);
        if (result == GesturePartResult.Suceed)
        {
            if (this.currentGesturePart + 1 < this.gestureParts.Length)
            {
                // Move on to the next part after a short pause.
                this.currentGesturePart++;
                this.frameCount = 0;
                this.pausedFrameCount = 10;
                this.paused = true;
            }
            else
            {
                // The final part succeeded: the whole gesture was performed.
                if (this.GestureRecognised != null)
                {
                    this.GestureRecognised(this, new GestureEventArgs(this.type, data.TrackingID, data.UserIndex));
                }

                // Reset even when nobody is subscribed, so recognition starts again.
                this.Reset();
            }
        }
        else if (result == GesturePartResult.Fail || this.frameCount == 50)
        {
            // Failed, or waited too long: start again at the first part.
            this.currentGesturePart = 0;
            this.frameCount = 0;
            this.pausedFrameCount = 5;
            this.paused = true;
        }
        else
        {
            // Pausing: stay on the current part and keep counting frames.
            this.frameCount++;
            this.pausedFrameCount = 5;
            this.paused = true;
        }
    }

    /// <summary>
    /// Resets this instance.
    /// </summary>
    public void Reset()
    {
        this.currentGesturePart = 0;
        this.frameCount = 0;
        this.pausedFrameCount = 5;
        this.paused = true;
    }
}
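The Gesture class refers to GestureType and GestureEventArgs, which are not listed in this post. A minimal sketch of what they might look like follows. Only the WaveLeft member and the three-argument constructor are taken from the listings; the other enum members, the property names and the use of int for the tracking ID and user index are assumptions.

```csharp
using System;

/// <summary>
/// Gesture kinds the service can report. Only WaveLeft appears in this post's
/// code; the remaining member names are assumed for illustration.
/// </summary>
public enum GestureType
{
    None,
    WaveLeft,
    WaveRight,
    SwipeLeft,
    SwipeRight,
    Menu
}

/// <summary>
/// Event data describing a recognised gesture (property names are assumed).
/// </summary>
public class GestureEventArgs : EventArgs
{
    public GestureEventArgs(GestureType type, int trackingId, int userIndex)
    {
        this.GestureType = type;
        this.TrackingId = trackingId;
        this.UserIndex = userIndex;
    }

    /// <summary>Which gesture was recognised.</summary>
    public GestureType GestureType { get; private set; }

    /// <summary>Tracking ID of the skeleton that performed it.</summary>
    public int TrackingId { get; private set; }

    /// <summary>Index of the user that performed it.</summary>
    public int UserIndex { get; private set; }
}
```

A handler attached to the controller's GestureRecognised event could then switch on e.GestureType to decide which action to trigger.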
The IRelativeGestureSegment:
This is the final part of the service: the individual segments that make up a gesture. Below is the IRelativeGestureSegment interface, followed by its implementations for a wave gesture:
#region using...
using Microsoft.Research.Kinect.Nui;
#endregion

/// <summary>
/// Defines a single gesture segment which uses relative positioning
/// of body parts to detect a gesture
/// </summary>
public interface IRelativeGestureSegment
{
    /// <summary>
    /// Checks the gesture.
    /// </summary>
    /// <param name="skeleton">The skeleton.</param>
    /// <returns>GesturePartResult based on if the gesture part has been completed</returns>
    GesturePartResult CheckGesture(SkeletonData skeleton);
}
Wave gesture
#region using...
using Microsoft.Research.Kinect.Nui;
#endregion

/// <summary>
/// The first part of the wave left gesture
/// </summary>
public class WaveLeftSegment1 : IRelativeGestureSegment
{
    /// <summary>
    /// Checks the gesture.
    /// </summary>
    /// <param name="skeleton">The skeleton.</param>
    /// <returns>GesturePartResult based on if the gesture part has been completed</returns>
    public GesturePartResult CheckGesture(SkeletonData skeleton)
    {
        // hand above elbow
        if (skeleton.Joints[JointID.HandLeft].Position.Y > skeleton.Joints[JointID.ElbowLeft].Position.Y)
        {
            // hand right of elbow
            if (skeleton.Joints[JointID.HandLeft].Position.X > skeleton.Joints[JointID.ElbowLeft].Position.X)
            {
                return GesturePartResult.Suceed;
            }

            // hand has not dropped but is not quite where we expect it to be, pausing till next frame
            return GesturePartResult.Pausing;
        }

        // hand dropped - the gesture fails
        return GesturePartResult.Fail;
    }
}

/// <summary>
/// The second part of the wave left gesture
/// </summary>
public class WaveLeftSegment2 : IRelativeGestureSegment
{
    /// <summary>
    /// Checks the gesture.
    /// </summary>
    /// <param name="skeleton">The skeleton.</param>
    /// <returns>GesturePartResult based on if the gesture part has been completed</returns>
    public GesturePartResult CheckGesture(SkeletonData skeleton)
    {
        // hand above elbow
        if (skeleton.Joints[JointID.HandLeft].Position.Y > skeleton.Joints[JointID.ElbowLeft].Position.Y)
        {
            // hand left of elbow
            if (skeleton.Joints[JointID.HandLeft].Position.X < skeleton.Joints[JointID.ElbowLeft].Position.X)
            {
                return GesturePartResult.Suceed;
            }

            // hand has not dropped but is not quite where we expect it to be, pausing till next frame
            return GesturePartResult.Pausing;
        }

        // hand dropped - the gesture fails
        return GesturePartResult.Fail;
    }
}
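To see the relative-position test in isolation, the sketch below applies the same comparisons as WaveLeftSegment1 and WaveLeftSegment2 to plain coordinates, so it runs without the Kinect SDK. WaveCheck, its method names and its parameters are hypothetical, and the local GesturePartResult enum simply stands in for the one used by the service.

```csharp
/// <summary>Stand-in for the service's result enum (same member names).</summary>
public enum GesturePartResult
{
    Fail,
    Pausing,
    Suceed
}

/// <summary>
/// Hypothetical helper that classifies a hand position relative to the elbow.
/// </summary>
public static class WaveCheck
{
    /// <summary>First wave part: hand above the elbow and at a greater X than the elbow.</summary>
    public static GesturePartResult FirstSegment(float handX, float handY, float elbowX, float elbowY)
    {
        if (handY > elbowY)
        {
            // Hand is raised: succeed if it is on the expected side, otherwise wait.
            return handX > elbowX ? GesturePartResult.Suceed : GesturePartResult.Pausing;
        }

        // Hand dropped below the elbow: the gesture fails.
        return GesturePartResult.Fail;
    }

    /// <summary>Second wave part: hand above the elbow and at a smaller X than the elbow.</summary>
    public static GesturePartResult SecondSegment(float handX, float handY, float elbowX, float elbowY)
    {
        if (handY > elbowY)
        {
            return handX < elbowX ? GesturePartResult.Suceed : GesturePartResult.Pausing;
        }

        return GesturePartResult.Fail;
    }
}
```

Because each segment only compares two joints, the same raised-hand position that succeeds for one segment returns Pausing for the other, which is what lets the wave alternate between them.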
NOTE: a wave gesture is made up of two parts that are repeated three times. For example, the code to create a new wave gesture would look like this (gestures is the gesture controller):
IRelativeGestureSegment[] waveLeftSegments = new IRelativeGestureSegment[6];
WaveLeftSegment1 waveLeftSegment1 = new WaveLeftSegment1();
WaveLeftSegment2 waveLeftSegment2 = new WaveLeftSegment2();
waveLeftSegments[0] = waveLeftSegment1;
waveLeftSegments[1] = waveLeftSegment2;
waveLeftSegments[2] = waveLeftSegment1;
waveLeftSegments[3] = waveLeftSegment2;
waveLeftSegments[4] = waveLeftSegment1;
waveLeftSegments[5] = waveLeftSegment2;
this.gestures.AddGesture(GestureType.WaveLeft, waveLeftSegments);
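Because the wave is the same pair of segments repeated three times, the array could also be built with a small helper. SegmentPatterns.Repeat is a hypothetical utility, not part of the downloadable source; it is written generically so that the sketch compiles on its own, but it should work unchanged with IRelativeGestureSegment.

```csharp
using System;

/// <summary>Hypothetical helper for building repeated segment sequences.</summary>
public static class SegmentPatterns
{
    /// <summary>
    /// Returns a new array containing the given pattern repeated the given number of times.
    /// </summary>
    public static T[] Repeat<T>(int times, params T[] pattern)
    {
        if (pattern == null)
        {
            throw new ArgumentNullException("pattern");
        }

        if (times < 0)
        {
            throw new ArgumentOutOfRangeException("times");
        }

        T[] result = new T[times * pattern.Length];
        for (int i = 0; i < result.Length; i++)
        {
            // Cycle through the pattern; the same instances are reused on each repeat.
            result[i] = pattern[i % pattern.Length];
        }

        return result;
    }
}
```

With the classes from this post, the wave definition would then read SegmentPatterns.Repeat<IRelativeGestureSegment>(3, new WaveLeftSegment1(), new WaveLeftSegment2()). The explicit type argument is needed because the two segment classes share only the interface.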
The full source code for this example (and for skeleton tracking) can be downloaded here. It contains wave gestures for both hands as well as swipe left, swipe right and a menu gesture. When writing your own gestures it is important to consider the amount of checking that is required and to optimize this for each of the parts. Generally, smaller segments work better as there is less checking to be done, which improves performance.
Written by Michael Tsikkos and James Glading
Comments
Anonymous
March 13, 2012
pls share the code
Anonymous
April 19, 2012
Carl Franklin (from .NET Rocks) has written a gesture recording application and recognition API for .NET. http://gesturepak.com. No code. You, the developer, pose to create the gestures. $99. Looks good.
Anonymous
July 25, 2012
I updated this example to work with v1.5 of the SDK and have moved the source into a more re-usable library structure. I also wrote a short blog post on how to use the gesture library. The blog post is at the link below; which contains a link to the new library and demo application: blog.exceptontuesdays.com/.../gestures-with-microsoft-kinect-for-windows-sdk-v1-5
Anonymous
August 03, 2012
Hi Nicholas, Thanks for this example. It is very helpful. I am trying to implement push gesture with both hands. Pushing both your hands in forward direction. Can you please provide some guidance on this?
Anonymous
October 17, 2012
Hi Bharat, Apologies for not responding sooner -- I don't come back here very often. Hopefully you haven't given up looking here! Are you wanting a gesture, or a real-time action? A gesture (in this library context) is an discrete action that is executed on after it is complete, while a real-time gesture would dynamically update as the motion continues (e.g., pinch to zoom). This library only supports the discrete gestures, and you would need to write your own tracking algorithm to do a real-time gesture. The real-time stuff isn't hard. I've written one that does what you're wanting. If you are still needing help, and hopefully have come back here, I would suggest visiting a site like "www.stackoverflow.com" to ask your question. I frequent that site and watch the "kinect" tag, as do others who can help. I don't normally come back here. :)
Anonymous
November 03, 2012
Please be aware that my blog post about updating this library to v1.5+ has moved: www.exceptontuesdays.com
Anonymous
April 02, 2013
Any one have any idea of how a jump detection could be implemented using this concept?
Anonymous
July 09, 2013
very good tutorial that help me in the process of working with Kinect sensor for robotic applications. Also, I add a link to this tutorial into an article with many more tutorial related to Kinect sensor www.intorobotics.com/working-with-kinect-3d-sensor-in-robotics-setup-tutorials-applications
Anonymous
January 03, 2014
Thanks man! this was really helpful :)