I was recently working on a project that used the Microsoft Kinect SDK. The goal was to have a user's hand drive a cursor on a large screen and let them navigate using a hover-to-click model. One thing that became immediately apparent was that the data coming from the device was very, very jumpy.

While this might not be an issue when you are trying to see if a user is waving their arms around or is standing or sitting, it is a real problem when you are trying to track fine motor movements. For example, tracking a single joint such as the right hand and using it to position a screen element is so troublesome that it is almost unusable.

From empirical observation and examination of the data, the problems intensify as one joint (say, the hand) moves in front of another. This makes sense, as the sensor is trying to determine which joint is which and tends to flip-flop between the two. This makes for a classic GIGO (garbage in, garbage out) situation. The Kinect runtime does have some smoothing built in:

    _kinectRuntime.SkeletonEngine.TransformSmooth = true;
    var parameters = new TransformSmoothParameters
    {
        Smoothing = 0.75f,
        Correction = 0.0f,
        Prediction = 0.0f,
        JitterRadius = 0.02f,
        MaxDeviationRadius = 0.04f
    };
    _kinectRuntime.SkeletonEngine.SmoothParameters = parameters;

But this seemed to have minimal effect. I decided that I needed something more substantial to control the x,y data points. What I found interesting is that this is a complex problem, seemingly too complex for my non-math background. Even so, there is a relatively simple approach: just a nice weighted average. I played with both a straight weighted average and an exponential average. The idea was that if I could smooth the data while keeping the lag low enough, it would significantly improve the user experience. Here’s what I did:

A nice, simple exponential average:

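    // Requires: using System; using System.Linq;
    // data holds the samples, oldest first; baseValue is between 0 and 1.
    public double ExponentialMovingAverage( double[] data, double baseValue )
    {
        double numerator = 0;
        double denominator = 0;

        // Plain mean of the window, folded in below as the lowest-weighted term.
        double average = data.Sum();
        average /= data.Length;

        for ( int i = 0; i < data.Length; ++i )
        {
            // The newest sample (last index) gets weight 1; older samples decay by powers of baseValue.
            numerator += data[i] * Math.Pow( baseValue, data.Length - i - 1 );
            denominator += Math.Pow( baseValue, data.Length - i - 1 );
        }

        numerator += average * Math.Pow( baseValue, data.Length );
        denominator += Math.Pow( baseValue, data.Length );

        return numerator / denominator;
    }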

And a weighted average:

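    // Requires: using System; using System.Linq;
    public double WeightedAverage( double[] data, double[] weights )
    {
        // Mismatched lengths are reported with a sentinel value rather than an exception.
        if ( data.Length != weights.Length )
        {
            return Double.MinValue;
        }

        double weightedAverage = data.Select( ( t, i ) => t * weights[i] ).Sum();
        return weightedAverage / weights.Sum();
    }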
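To show how the weighted average might be called, here is a minimal sketch. The readings and the linearly increasing weights are made up for illustration; they are not values from my project, and `WeightedAverageDemo` is just a hypothetical wrapper so the snippet runs on its own.

```csharp
using System;
using System.Linq;

public static class WeightedAverageDemo
{
    // Same calculation as WeightedAverage above, repeated so the sketch is self-contained.
    public static double WeightedAverage(double[] data, double[] weights)
    {
        if (data.Length != weights.Length)
        {
            return Double.MinValue;
        }

        double weightedSum = data.Select((t, i) => t * weights[i]).Sum();
        return weightedSum / weights.Sum();
    }

    public static void Main()
    {
        // Hypothetical x readings, oldest first; the last one is a jitter spike.
        double[] xs = { 100, 102, 101, 103, 120 };

        // Assumed weights: linearly increasing, so newer samples count more.
        double[] weights = { 1, 2, 3, 4, 5 };

        Console.WriteLine("Plain mean:       " + xs.Average());
        Console.WriteLine("Weighted average: " + WeightedAverage(xs, weights));
    }
}
```

Because the newest sample carries the largest weight here, the spike pulls the weighted result further than it pulls the plain mean. That is the trade-off with any recency weighting: less lag, but less damping of a single bad reading.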

The exponential average, in my opinion, was better: more smoothing and jitter control with less lag. Good! Here’s how I used it:

    private readonly Queue<double> _weightedX = new Queue<double>();
    private readonly Queue<double> _weightedY = new Queue<double>();

    // Called once per frame with the scaled joint:
    Point point = ExponentialWeightedAvg( scaledJoint );

    private Point ExponentialWeightedAvg( Joint joint )
    {
        _weightedX.Enqueue( joint.Position.X );
        _weightedY.Enqueue( joint.Position.Y );

        // Keep only the most recent Settings.Default.Smoothing points.
        if ( _weightedX.Count > Settings.Default.Smoothing )
        {
            _weightedX.Dequeue();
            _weightedY.Dequeue();
        }

        double x = ExponentialMovingAverage( _weightedX.ToArray(), 0.9 );
        double y = ExponentialMovingAverage( _weightedY.ToArray(), 0.9 );

        return new Point( x, y );
    }

Note: The scaledJoint comes from the Kinect SkeletonFrameReady event handler, after using the ScaleTo() extension method from the Coding4Fun Kinect Toolkit.

Once you have your Point, you can use it to place your nicely smoothed cursor (in my case, an image of a hand on a canvas) in the right location on every frame from the Kinect, and have it be nice and stable. I found that using just a few points (5-7) was enough to smooth the data and reduce jitter.
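One rough way to check the "nice and stable" claim without a sensor is to compare the largest frame-to-frame jump before and after smoothing. The sketch below does that on a made-up sequence of x readings with a 5-point window and a base of 0.9. The names `JitterDemo`, `Smooth`, and `LargestJump` are hypothetical helpers for this illustration, not part of the Kinect SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JitterDemo
{
    // Same formula as ExponentialMovingAverage above, repeated so the sketch is self-contained.
    public static double Ema(double[] data, double baseValue)
    {
        double numerator = 0;
        double denominator = 0;
        for (int i = 0; i < data.Length; ++i)
        {
            double weight = Math.Pow(baseValue, data.Length - i - 1);
            numerator += data[i] * weight;
            denominator += weight;
        }

        double tail = Math.Pow(baseValue, data.Length);
        numerator += data.Average() * tail;
        denominator += tail;
        return numerator / denominator;
    }

    // Runs the sliding-window smoothing over a whole sequence, one output per input frame.
    public static double[] Smooth(double[] raw, int windowSize, double baseValue)
    {
        var window = new Queue<double>();
        var smoothed = new double[raw.Length];
        for (int i = 0; i < raw.Length; ++i)
        {
            window.Enqueue(raw[i]);
            if (window.Count > windowSize)
            {
                window.Dequeue();
            }
            smoothed[i] = Ema(window.ToArray(), baseValue);
        }
        return smoothed;
    }

    // Largest absolute change between consecutive values.
    public static double LargestJump(double[] values)
    {
        double largest = 0;
        for (int i = 1; i < values.Length; ++i)
        {
            largest = Math.Max(largest, Math.Abs(values[i] - values[i - 1]));
        }
        return largest;
    }

    public static void Main()
    {
        // Made-up readings: a hand held near x = 200 with jitter of a few pixels.
        double[] raw = { 200, 206, 197, 204, 195, 203, 198, 205, 196, 202 };
        double[] smoothed = Smooth(raw, 5, 0.9);

        Console.WriteLine("Largest raw jump:      " + LargestJump(raw));
        Console.WriteLine("Largest smoothed jump: " + LargestJump(smoothed));
    }
}
```

This only measures jitter on a stationary hand; it says nothing about lag during real movement, so treat it as a sanity check rather than a tuning tool.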

Hi,

In your ExponentialMovingAverage method you have double[] data and double baseValue as parameters.

Where do the data array and the base value come from, and what are they?

Please Help

Thanks

Irfan

Hi Irfan, the data[] is the array that contains the points you are using for your smoothing. The baseValue helps determine the weighting of the exponential curve and should be a value between 0 and 1. The value of 0.9 that I am using places the greatest weight on the most recent data points, so the expectation is that this puts the joint in the most accurate location. Hope that helps!
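To make the effect of baseValue concrete, here is a small sketch that prints the weight each sample receives inside ExponentialMovingAverage. It mirrors the `Math.Pow( baseValue, data.Length - i - 1 )` term from the method; `EmaWeightsDemo` and `SampleWeights` are hypothetical names used only for this illustration, and the second base of 0.5 is an arbitrary comparison value.

```csharp
using System;

public static class EmaWeightsDemo
{
    // Weight applied to each sample by ExponentialMovingAverage; index 0 is the oldest sample.
    public static double[] SampleWeights(int count, double baseValue)
    {
        var weights = new double[count];
        for (int i = 0; i < count; ++i)
        {
            weights[i] = Math.Pow(baseValue, count - i - 1);
        }
        return weights;
    }

    public static void Main()
    {
        foreach (double baseValue in new[] { 0.9, 0.5 })
        {
            Console.WriteLine("baseValue = " + baseValue);
            double[] weights = SampleWeights(5, baseValue);
            for (int i = 0; i < weights.Length; ++i)
            {
                Console.WriteLine("  sample " + i + " weight: " + weights[i].ToString("F4"));
            }
        }
    }
}
```

With five samples and a base of 0.9, the newest sample has weight 1 and the oldest about 0.66, so the curve is fairly flat. A smaller base makes the curve steeper, which should track the newest reading more closely at the cost of less smoothing.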

Omg! This works veeeery nicely! I had to hardcode Settings.Default.Smoothing, because isn't that a float?

Hi Afra, glad it helped you out. The Smoothing variable is just an int; it's the number of points you want to use in your smoothing. I use 5. If you use too many, you introduce too much lag; if you use too few, the smoothing is not good.

Hi,

I was doing a project related to Kinect and also ran into the smoothing problem. Based on your method, the function ExponentialWeightedAvg will return a new joint position. Does that mean that all the values of a motion are updated?

For example, say we have five values for X and Y:

X=[9 8 7 6 5]

Y=[0 1 2 3 4]

Are all five pairs updated? What I want to ask is: after updating, the values are no longer real values from the Kinect but are derived from this method, right?

Hope my understanding is right.

BTW, don’t you need to consider the z value?

Regards

Tony

Hi Tony,

Yes, you are correct that the positions you end up using are derived values rather than raw readings. The raw readings stay in the queue; on each frame I just take the average (exponentially weighted) of those points and create a new point to use.

I definitely use the z value! In this example, I was just smoothing the 2D data points that I was using to draw to the canvas. But all of my applications now do the smoothing on the x,y,z points of the joints themselves and I just use that smoothed data however I need.
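As a rough idea of what smoothing the full x,y,z position could look like, here is a sketch that applies the same weighting to a whole vector at once. It is an assumption-laden illustration, not my production code: `System.Numerics.Vector3` stands in for the Kinect joint position type, and `JointSmoother` is a hypothetical helper you would create once per tracked joint.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper: one instance per tracked joint.
public sealed class JointSmoother
{
    private readonly Queue<Vector3> _window = new Queue<Vector3>();
    private readonly int _size;
    private readonly float _baseValue;

    public JointSmoother(int size, float baseValue)
    {
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        _size = size;
        _baseValue = baseValue;
    }

    // Adds a raw position and returns the smoothed position for the current frame.
    public Vector3 Next(Vector3 raw)
    {
        _window.Enqueue(raw);
        if (_window.Count > _size)
        {
            _window.Dequeue();
        }

        Vector3[] data = _window.ToArray(); // oldest first
        Vector3 weightedSum = Vector3.Zero;
        Vector3 plainSum = Vector3.Zero;
        float totalWeight = 0f;

        for (int i = 0; i < data.Length; ++i)
        {
            // Newest sample gets weight 1; older samples decay by powers of the base.
            float weight = (float)Math.Pow(_baseValue, data.Length - i - 1);
            weightedSum += data[i] * weight;
            plainSum += data[i];
            totalWeight += weight;
        }

        // Same extra term as ExponentialMovingAverage: the plain mean with the smallest weight.
        float tail = (float)Math.Pow(_baseValue, data.Length);
        weightedSum += (plainSum / data.Length) * tail;
        totalWeight += tail;

        return weightedSum / totalWeight;
    }
}

public static class JointSmootherDemo
{
    public static void Main()
    {
        var smoother = new JointSmoother(5, 0.9f);

        // Made-up positions in metres: a nearly still hand with one jittery frame.
        Vector3[] raw =
        {
            new Vector3(0.10f, 0.50f, 2.00f),
            new Vector3(0.11f, 0.50f, 2.01f),
            new Vector3(0.18f, 0.46f, 2.10f),
            new Vector3(0.10f, 0.51f, 2.00f),
        };

        foreach (Vector3 position in raw)
        {
            Console.WriteLine(position + " -> " + smoother.Next(position));
        }
    }
}
```

Smoothing the joint itself means every consumer (cursor, gestures, depth checks) sees the same stabilized position, instead of each one smoothing its own projection.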

Hi Bryan Coon, thanks for sharing. It works very well. Thanks again!

Tony

But what is Settings.Default.Smoothing used for?

Any suggestions?

The Settings.Default.Smoothing property provides the number of data points we are averaging, usually 5-7 or thereabouts. With too few points there is not enough smoothing; with too many there is too much lag.
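To put a rough number on the lag side of that trade-off, the sketch below counts how many frames the smoothed value needs to get within 5% of a sudden jump from 0 to 100, for a few window sizes. The jump size, the 5% threshold, and the names `WindowSizeDemo` and `FramesToSettle` are all assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowSizeDemo
{
    // Same formula as ExponentialMovingAverage in the post, repeated so the sketch is self-contained.
    public static double Ema(double[] data, double baseValue)
    {
        double numerator = 0;
        double denominator = 0;
        for (int i = 0; i < data.Length; ++i)
        {
            double weight = Math.Pow(baseValue, data.Length - i - 1);
            numerator += data[i] * weight;
            denominator += weight;
        }

        double tail = Math.Pow(baseValue, data.Length);
        numerator += data.Average() * tail;
        denominator += tail;
        return numerator / denominator;
    }

    // Frames needed for the smoothed output to reach 95 after the input jumps from 0 to 100.
    public static int FramesToSettle(int windowSize, double baseValue)
    {
        // Start with a full window of zeros, as if the hand had been resting at 0.
        var window = new Queue<double>(Enumerable.Repeat(0.0, windowSize));

        for (int frame = 1; frame <= windowSize; ++frame)
        {
            window.Enqueue(100.0);
            window.Dequeue();
            if (Ema(window.ToArray(), baseValue) >= 95.0)
            {
                return frame;
            }
        }

        // Once the window holds only the new value the output equals it, so this is the upper bound.
        return windowSize;
    }

    public static void Main()
    {
        foreach (int size in new[] { 3, 5, 7, 15 })
        {
            Console.WriteLine("window " + size + ": settles after " + FramesToSettle(size, 0.9) + " frame(s)");
        }
    }
}
```

Larger windows take more frames to catch up with a real movement, which on screen shows up as the cursor trailing the hand.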