Simple Kinect Joint Smoothing

I was recently working on a project that used the Microsoft Kinect SDK. The goal was to have a user's hand drive a cursor on a large screen, and to let them navigate with a hover-to-click model. One thing that became immediately apparent was that the data coming from the device was very, very jumpy.

While this might not be an issue when you are trying to see if a user is waving their arms around or is standing or sitting, it is a real problem when you are trying to track fine motor movements. For example, tracking a single joint, such as the right hand, and using it to position a screen element is so troublesome that it is almost unusable.

From empirical observation and examination of the data, the problems intensify as one joint (say, the hand) moves in front of another. This makes sense, as the sensor is trying to determine which joint is which and tends to flip-flop between the two. This makes for a classic GIGO (garbage in, garbage out) situation. The Kinect runtime does have some smoothing built in:

_kinectRuntime.SkeletonEngine.TransformSmooth = true;
var parameters = new TransformSmoothParameters
{
    Smoothing = 0.75f,
    Correction = 0.0f,
    Prediction = 0.0f,
    JitterRadius = 0.02f,
    MaxDeviationRadius = 0.04f
};
_kinectRuntime.SkeletonEngine.SmoothParameters = parameters;

But this seemed to have minimal effect. I decided that I needed something more substantial to control the x,y data points. What I found interesting is that this is a complex problem, seemingly too complex for my non-math background. Even so, there is a relatively simple approach: a nice weighted average. I played with both a plain weighted average and an exponentially weighted average. The idea was that if I could smooth the data while keeping the lag low enough, it would significantly improve the user experience. Here’s what I did:

A nice, simple exponential average:

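// Requires: using System; using System.Linq;
// data is ordered oldest first. The newest sample gets weight 1, the one
// before it baseValue, then baseValue^2, and so on.
public double ExponentialMovingAverage( double[] data, double baseValue )
{
    if ( data.Length == 0 )
    {
        return Double.NaN; // nothing to average
    }

    double numerator = 0;
    double denominator = 0;

    // The plain mean is used as a seed term and carries the smallest weight.
    double average = data.Sum() / data.Length;

    for ( int i = 0; i < data.Length; ++i )
    {
        double weight = Math.Pow( baseValue, data.Length - i - 1 );
        numerator += data[i] * weight;
        denominator += weight;
    }

    double seedWeight = Math.Pow( baseValue, data.Length );
    numerator += average * seedWeight;
    denominator += seedWeight;

    return numerator / denominator;
}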
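To see how much each sample actually contributes, it can help to print the normalized weights the function above applies. The following is only a sketch: `EmaWeights` and `Normalized` are hypothetical names introduced for illustration and are not part of the original project.

```csharp
using System;
using System.Linq;

public static class EmaWeights
{
    // Hypothetical helper (not from the original code): returns the
    // normalized weight each sample receives, oldest first, followed by
    // the weight given to the plain-mean seed term.
    public static double[] Normalized( int sampleCount, double baseValue )
    {
        if ( sampleCount <= 0 )
        {
            throw new ArgumentOutOfRangeException( nameof( sampleCount ) );
        }

        // Raw weights: baseValue^(n-1) for the oldest sample down to
        // baseValue^0 = 1 for the newest, then baseValue^n for the seed.
        double[] raw = Enumerable.Range( 0, sampleCount )
            .Select( i => Math.Pow( baseValue, sampleCount - i - 1 ) )
            .Concat( new[] { Math.Pow( baseValue, sampleCount ) } )
            .ToArray();

        double total = raw.Sum();
        return raw.Select( w => w / total ).ToArray();
    }
}
```

For example, `EmaWeights.Normalized( 3, 0.5 )` has raw weights 0.25, 0.5, 1 and 0.125, which sum to 1.875, so the newest sample ends up with a little over half of the total weight. The closer baseValue gets to 1, the flatter the weights become and the more the result behaves like a plain average.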

And a weighted average:

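// Requires: using System; using System.Linq;
// Each sample is multiplied by its weight, and the total is divided by
// the sum of the weights.
public double WeightedAverage( double[] data, double[] weights )
{
    if ( data.Length != weights.Length )
    {
        // Sentinel value: callers must check for it before using the result.
        return Double.MinValue;
    }

    double weightedSum = data.Select( ( t, i ) => t * weights[i] ).Sum();

    return weightedSum / weights.Sum();
}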
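The weights themselves are up to the caller. One common choice, assumed here purely for illustration, is linearly increasing weights so that the newest sample counts most. `LinearWeighting` and `LinearWeights` are hypothetical names, and this sketch throws on a length mismatch instead of returning a sentinel value:

```csharp
using System;
using System.Linq;

public static class LinearWeighting
{
    // Hypothetical helper: weights 1, 2, ..., count, so the last (newest)
    // sample in the window carries the most weight.
    public static double[] LinearWeights( int count )
    {
        return Enumerable.Range( 1, count ).Select( i => (double) i ).ToArray();
    }

    // Same calculation as WeightedAverage above, repeated so this sketch
    // compiles on its own.
    public static double WeightedAverage( double[] data, double[] weights )
    {
        if ( data.Length != weights.Length )
        {
            throw new ArgumentException( "data and weights must have the same length" );
        }

        double weightedSum = data.Select( ( t, i ) => t * weights[i] ).Sum();
        return weightedSum / weights.Sum();
    }
}
```

With the samples 0, 0, 8 and `LinearWeights( 3 )`, the result is (0*1 + 0*2 + 8*3) / (1 + 2 + 3) = 4.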

The exponential average, in my opinion, was better: more smoothing and jitter control with less lag. Good! Here’s how I used it:

// Sliding windows holding the most recent X and Y samples.
private readonly Queue<double> _weightedX = new Queue<double>();
private readonly Queue<double> _weightedY = new Queue<double>();

// Called once per frame, from inside the SkeletonFrameReady handler:
Point point = ExponentialWeightedAvg( scaledJoint );

private Point ExponentialWeightedAvg( Joint joint )
{
    _weightedX.Enqueue( joint.Position.X );
    _weightedY.Enqueue( joint.Position.Y );

    // Settings.Default.Smoothing is the window size (number of samples kept).
    if ( _weightedX.Count > Settings.Default.Smoothing )
    {
        // Drop the oldest sample from each window.
        _weightedX.Dequeue();
        _weightedY.Dequeue();
    }

    double x = ExponentialMovingAverage( _weightedX.ToArray(), 0.9 );
    double y = ExponentialMovingAverage( _weightedY.ToArray(), 0.9 );

    return new Point( x, y );
}

Note: The scaledJoint comes from the Kinect SkeletonFrameReady event handler, after using the ScaleTo() extension method from the Coding4Fun Kinect Toolkit.

Once you have your Point, you can use it to place your nicely smoothed cursor (in my case, an image of a hand on a canvas) in the right location on every frame from the Kinect, and it will be nice and stable. I found that using just a few points (5-7) was enough to smooth the motion and reduce jitter.
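If you want to try the windowing logic without a Kinect attached, the queue and the average can be bundled into one small class. This is a minimal sketch rather than the code from the project: `SlidingWindowSmoother` and its members are hypothetical names, and it uses the same weighting as ExponentialMovingAverage above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper (names are illustrative, not from the project):
// keeps the last windowSize samples of one coordinate and returns the
// exponentially weighted average each time a new sample arrives.
public sealed class SlidingWindowSmoother
{
    private readonly Queue<double> _samples = new Queue<double>();
    private readonly int _windowSize;
    private readonly double _baseValue;

    public SlidingWindowSmoother( int windowSize, double baseValue )
    {
        if ( windowSize < 1 )
        {
            throw new ArgumentOutOfRangeException( nameof( windowSize ) );
        }

        _windowSize = windowSize;
        _baseValue = baseValue;
    }

    // Number of samples currently held in the window.
    public int Count => _samples.Count;

    public double Add( double sample )
    {
        _samples.Enqueue( sample );
        if ( _samples.Count > _windowSize )
        {
            _samples.Dequeue(); // drop the oldest sample
        }

        double[] data = _samples.ToArray(); // oldest first
        double numerator = 0;
        double denominator = 0;

        for ( int i = 0; i < data.Length; ++i )
        {
            double weight = Math.Pow( _baseValue, data.Length - i - 1 );
            numerator += data[i] * weight;
            denominator += weight;
        }

        // Seed term: the plain mean, carrying the smallest weight.
        double seedWeight = Math.Pow( _baseValue, data.Length );
        numerator += data.Average() * seedWeight;
        denominator += seedWeight;

        return numerator / denominator;
    }
}
```

You would create one instance per axis, for example `new SlidingWindowSmoother( 5, 0.9 )` for X and another for Y, and call `Add` with the joint coordinate on every frame. Feeding it a constant value returns that same value, which is a quick sanity check that the weights are normalized correctly.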