Integrating Microsoft® Kinect™ with AutoCAD®

Kean Walmsley – Autodesk

CP3840: This class focuses on integrating Kinect, Microsoft's ground-breaking controller for the Xbox 360®, with AutoCAD using .NET. We will look at the various drivers, SDKs, and middleware available to .NET developers wanting to work with Kinect, as well as the kind of capabilities a device such as Kinect brings to 3D products like AutoCAD. We will cover two main uses of Kinect: as a 3D reality-capture device, feeding point clouds into AutoCAD, and as a user-input device, bringing the next generation of gesture-detection capabilities to AutoCAD .NET developers.

Learning Objectives

At the end of this class, you will be able to:

  • Implement gestures to allow users to control drawing, modeling, and navigation operations via Kinect.
  • Describe industry trends in human-computer interaction and explain how devices such as Kinect are changing the world.
  • Integrate code that imports point cloud data from Kinect into AutoCAD.
  • List the available drivers, SDKs, and middleware available for working with Kinect.

About the Speaker

Kean has been with Autodesk since 1995, providing programming support, consulting, training, and evangelism to external developers. He has worked for Autodesk in a number of countries: the U.K., Switzerland, the United States, and India. He is currently senior manager of Developer Technical Services (DevTech), the worldwide team of API gurus who provide technical services through the Autodesk Developer Network. He and his family now live in Switzerland.

Introducing Kinect

Since Kinect for Xbox 360® was launched on November 4th, 2010, the device has taken the world by storm: it became the fastest-selling consumer electronics device ever (according to the Guinness Book of World Records), selling 8 million units in the first 60 days. That record has since been surpassed, but it remains a remarkable debut.

Kinect was originally intended as an input device for the Xbox 360 gaming system – allowing you to play games without a physical controller, or, as Microsoft likes to say, you are the controller – but it soon became clear that this type of technology has much broader applications.

How Kinect Works

Active scanning technologies – laser scanners, for instance – typically work on the basis of “time of flight” measurement: they emit pulses and measure the time it takes for the reflected pulse to return, much in the same way as radar or sonar works. Kinect – based on technology licensed by Microsoft from PrimeSense – is different: it does not measure time of flight. Instead, it projects a specific (yet seemingly random) pattern of infra-red dots and checks for deformation of that pattern, effectively determining the topography of the objects (or people) onto which the pattern has been projected.

In addition to the camera used to detect the pattern of infra-red dots, Kinect contains a second camera to detect visible light (i.e. it also takes pictures :-). It’s with these two cameras – and some onboard electronics – that Kinect is able to generate depth and RGB images of the scene being captured. These images are refreshed frequently (30 times per second) and can be combined to generate point clouds of the scene.
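To give a feel for this depth data in code, here is a minimal sketch using the same Microsoft.Research.Kinect.Nui managed API as the main sample later in this document. It opens the depth stream and converts depth pixels into 3D points in Kinect’s “skeleton space” (metres, relative to the sensor). The stream parameters, the bit layout of the depth values, and the shift passed to DepthImageToSkeleton() are assumptions based on the Beta SDK, so treat this as illustrative rather than definitive:

  using System.Collections.Generic;
  using Microsoft.Research.Kinect.Nui;

  public class DepthPointSketch
  {
    static Runtime _nui;
    static List<Vector> _points = new List<Vector>();

    public static void Start()
    {
      // Initialise the first Kinect with depth + skeletal tracking
      // (the skeleton engine provides the coordinate mapping)
      _nui = Runtime.Kinects[0];
      _nui.Initialize(
        RuntimeOptions.UseDepth | RuntimeOptions.UseSkeletalTracking
      );
      _nui.DepthStream.Open(
        ImageStreamType.Depth, 2,
        ImageResolution.Resolution320x240, ImageType.Depth
      );
      _nui.DepthFrameReady += OnDepthFrameReady;
    }

    static void OnDepthFrameReady(
      object sender, ImageFrameReadyEventArgs e
    )
    {
      PlanarImage img = e.ImageFrame.Image;
      _points.Clear();

      // Each depth pixel is 16 bits: a distance in millimetres
      for (int y = 0; y < img.Height; y += 4) // sample sparsely
      {
        for (int x = 0; x < img.Width; x += 4)
        {
          int i = (y * img.Width + x) * 2;
          int depth = img.Bits[i] | (img.Bits[i + 1] << 8);
          if (depth > 0)
          {
            // DepthImageToSkeleton() takes normalized image
            // coordinates and the depth shifted up by 3 bits
            // (the player-index packing) – an assumption based
            // on the Beta SDK's documentation
            _points.Add(
              _nui.SkeletonEngine.DepthImageToSkeleton(
                x / (float)img.Width,
                y / (float)img.Height,
                (short)(depth << 3)
              )
            );
          }
        }
      }
    }
  }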

In addition to this hardware-based scene capture, higher-level capabilities – such as skeleton tracking – are provided in software by some SDKs and middleware components. It’s via these components that applications can interpret gestures, for instance.
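As a concrete (if deliberately simplistic) illustration, the snippet below checks a tracked skeleton for a “right hand raised above head” pose using the Beta SDK’s joint data. Real gesture recognition tracks poses over time, but the principle is the same; the 0.1m margin is an arbitrary assumption:

  using Microsoft.Research.Kinect.Nui;

  public static class GestureSketch
  {
    // A naive pose check: is the right hand clearly above the head?
    // (Joint positions are in metres, in Kinect's skeleton space.)
    public static bool IsRightHandRaised(SkeletonData data)
    {
      if (data.TrackingState != SkeletonTrackingState.Tracked)
        return false;

      Vector hand = data.Joints[JointID.HandRight].Position;
      Vector head = data.Joints[JointID.Head].Position;

      // 0.1m margin avoids flickering when hand and head are level
      return hand.Y > head.Y + 0.1f;
    }
  }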

The Race to Hack Kinect

On the day Kinect was released in North America, Adafruit Industries announced a competition to “hack” Kinect, initially offering a $2,000 bounty to the first person to reverse-engineer a driver – for any OS – with the primary stipulation that the implementation be shared under an open source license and posted to GitHub. The bounty increased to $3,000 and, as it later turned out, was contributed to – at least in part – by Johnny Lee, at the time an engineer on the Kinect team. Microsoft misunderstood the intent of the “hacking” competition – apparently believing something more mischievous was afoot – and spoke out against the effort. They later clarified their initial reaction, explaining (and this is paraphrased) that “we wouldn’t have used a standard USB connector if we didn’t want this to happen”.

On November 6th – two days later – the first “hack” was demonstrated, although the coder, AlexP, somewhat controversially chose not to share his code, instead using it to start a commercial venture (it became the Code Laboratories NUI Platform). He did offer to release the code as open source, should the donation fund he set up receive $10,000 from interested parties.

On November 10th, Héctor Martín uploaded his “hack” to GitHub, creating the libfreenect toolkit (now at the core of the OpenKinect community’s efforts) and effectively winning the bounty. AlexP closed his donation fund and, in turn, donated the $457 contributed so far to Héctor, to further reward and encourage his efforts.

On November 14th, Oliver Kreylos, a researcher at UC Davis, posted a couple of videos to YouTube showing 3D data being captured via Kinect and demonstrating how to measure the device’s accuracy. Oliver had written his own drivers, but had based his work on Héctor’s.

The first video went viral: within a matter of days it had received 1 million views, and for a time it was the most watched video on the whole of YouTube. At the time of writing, it has 2.3 million views.

At this stage there were a few low-level drivers providing access to the depth and color data generated by Kinect, and their availability led to a whole slew of “Kinect hacks”. KinectHacks.net, launched on November 12th, became the “go to” destination for such hacks, and – this being the Internet – a number of copycat sites eventually spawned around it. It became the place where aspiring “hackers” would post the fruits of their efforts and earn the respect of the broader public.

In early December, PrimeSense – the company that had licensed much of the 3D capture technology to Microsoft – founded the OpenNI (Open Natural Interaction) community. They released the OpenNI SDK as an open source component enabling the use of such depth-camera technology; this SDK could be used, along with a “hacked” SensorKinect module (also open source), to access the data coming from Kinect. In addition, PrimeSense released the NITE middleware, providing higher-level capabilities such as skeleton tracking and gesture recognition.

While PrimeSense licensed certain technology to Microsoft – and may well have done so at an algorithmic level, too – it’s clear that the core Xbox skeleton tracking is different from that implemented in NITE: if only because you need to strike a calibration pose for NITE-enabled applications to detect you.

It was clear Microsoft had their own software stack for Kinect, and on June 16th, 2011, they released the first Beta of their own SDK for non-commercial use. The availability of this SDK greatly simplified the configuration needed to harness Kinect in custom applications – on Windows, at least – and it is this implementation that is used by the code in this session.

The availability of technologies allowing people to harness the ground-breaking capabilities of Kinect in their own applications has led to an explosion of creativity. The variety and ingenuity of the “hacks” made available are breathtaking, and this is really just the beginning. It’s clear there are a number of very compelling – as well as downright fun – use cases for this technology, and many of them relate to the world of 3D design.

The samples shown in this session are really just a foundation for you all to take this to the next level. As you get to grips with the technology, it should also become apparent where the real opportunities lie.

Getting Started Working with Kinect and AutoCAD

The first thing you need to start integrating Kinect with AutoCAD is, of course, a Kinect. Well, that and a copy of AutoCAD installed on a Windows machine. It’s worth pointing out that this currently means Windows installed directly on the system – it will not work inside a virtual machine, such as Parallels on OS X. The latest Beta of the Microsoft Kinect SDK also allows you to create 64-bit applications: previously, while a 64-bit version of the SDK was provided, you could only actually create 32-bit modules.

Something important to note: you need to make sure you have the external USB power cable, such as the one provided with Kinect when sold separately – primarily intended to allow older Xbox 360 systems to work with Kinect, as the device needs more power than standard USB provides. If you buy a newer Xbox 360 bundle – where the included Xbox 360 “S” has a dedicated port for Kinect – this cable is not included. In some countries you can buy the adapter from Microsoft; in others you may have to rely on eBay or its equivalent.

Creating an application that accesses Kinect from inside AutoCAD is actually very straightforward with the Microsoft Kinect SDK.

  1. Install the latest SDK from
  2. Launch Visual Studio 2010 and create a new Class Library project (in this case I suggest using C#, to make it easier to copy & paste code from this document or from “Through the Interface” –
  3. Add project references to AcMgd.dll and AcDbMgd.dll (you should browse to these in the inc folder location of your ObjectARX SDK), making sure they have “Copy Local” set to False
  4. Add a project reference to Microsoft.Research.Kinect.dll. This can be found in the GAC (i.e. on the .NET tab of the Add References dialog).
  5. Copy and paste the following code into the main .cs file:
  using Autodesk.AutoCAD.ApplicationServices;
  using Autodesk.AutoCAD.DatabaseServices;
  using Autodesk.AutoCAD.EditorInput;
  using Autodesk.AutoCAD.Geometry;
  using Autodesk.AutoCAD.GraphicsInterface;
  using Autodesk.AutoCAD.Runtime;
  using Microsoft.Research.Kinect.Nui;
  using System.Collections.Generic;
  using System;

  namespace KinectSkeletons
  {
    public class KinectSkeletonJig : DrawJig, IDisposable
    {
      private Runtime _kinect = null;
      private List<Line> _lines;

      // Flags to make sure we don't end up both modifying
      // and accessing the _lines member at the same time
      private bool _drawing = false;
      private bool _capturing = false;

      // An offset value we use to move the mouse back
      // and forth by one screen unit
      private int _offset;

      public KinectSkeletonJig()
      {
        // Initialise members
        _offset = 1;
        _lines = new List<Line>();
        _kinect = Runtime.Kinects[0];
        _drawing = false;
        _capturing = false;

        // Attach the event handler
        _kinect.SkeletonFrameReady +=
          new EventHandler<SkeletonFrameReadyEventArgs>(
            OnSkeletonFrameReady
          );

        // Initialise the Kinect sensor
        _kinect.Initialize(
          RuntimeOptions.UseSkeletalTracking
        );
      }

      public void Dispose()
      {
        // Uninitialise the Kinect sensor
        _kinect.Uninitialize();

        // Detach the event handler
        _kinect.SkeletonFrameReady -=
          new EventHandler<SkeletonFrameReadyEventArgs>(
            OnSkeletonFrameReady
          );

        // Clear our line list
        ClearLines();
      }

      protected override SamplerStatus Sampler(JigPrompts prompts)
      {
        // We don't really need a point, but we do need some
        // user input event to allow us to loop, processing
        // for the Kinect input
        PromptPointResult ppr =
          prompts.AcquirePoint("\nClick to finish: ");
        if (ppr.Status == PromptStatus.OK)
        {
          // Let's move the mouse slightly to avoid having
          // to do it manually to keep the input coming
          System.Drawing.Point pt =
            System.Windows.Forms.Cursor.Position;
          System.Windows.Forms.Cursor.Position =
            new System.Drawing.Point(
              pt.X, pt.Y + _offset
            );
          _offset = -_offset;
          return SamplerStatus.OK;
        }
        return SamplerStatus.Cancel;
      }

      protected override bool WorldDraw(WorldDraw draw)
      {
        if (!_capturing)
        {
          _drawing = true;

          // Draw each of our lines
          foreach (Line ln in _lines)
          {
            // Set the colour and lineweight in the subentity
            // traits based on the original line
            if (ln != null)
            {
              draw.SubEntityTraits.Color = (short)ln.ColorIndex;
              draw.SubEntityTraits.LineWeight = ln.LineWeight;
              ln.WorldDraw(draw);
            }
          }
          _drawing = false;
        }
        return true;
      }

      void OnSkeletonFrameReady(
        object sender, SkeletonFrameReadyEventArgs e
      )
      {
        if (!_drawing)
        {
          _capturing = true;

          // Clear any previous lines
          ClearLines();

          // Get access to the SkeletonFrame
          SkeletonFrame s = e.SkeletonFrame;

          // We'll colour the skeletons from yellow, onwards
          // (red is a bit dark)
          short col = 2;

          // Loop through each of the skeletons
          for (int i = 0; i < s.Skeletons.Length; i++)
          {
            SkeletonData data = s.Skeletons[i];

            // Add skeleton vectors for tracked/positioned skeletons
            if (
              data.TrackingState == SkeletonTrackingState.Tracked ||
              data.TrackingState == SkeletonTrackingState.PositionOnly
            )
            {
              AddLinesForSkeleton(_lines, data, col++);
            }
          }
          _capturing = false;
        }
      }

      // Get a Point3d from a Kinect Vector
      private static Point3d PointFromVector(Vector v)
      {
        // Rather than just return a point, we're effectively
        // transforming it to the drawing space: flipping the
        // Y and Z axes
        return new Point3d(v.X, v.Z, v.Y);
      }

      private void AddLinesForSkeleton(
        List<Line> lines, SkeletonData sd, int idx
      )
      {
        // Hard-code lists of connections between joints
        int[][] links =
          new int[][]
          {
            // Head to left toe
            new int[] { 3, 2, 1, 0, 12, 13, 14, 15 },
            // Hips to right toe
            new int[] { 0, 16, 17, 18, 19 },
            // Left hand to right hand
            new int[] { 7, 6, 5, 4, 2, 8, 9, 10, 11 }
          };

        // Populate an array of joints
        Point3dCollection joints = new Point3dCollection();
        for (int i = 0; i < 20; i++)
        {
          joints.Add(PointFromVector(sd.Joints[(JointID)i].Position));
        }

        // For each path of joints, create a sequence of lines
        foreach (int[] link in links)
        {
          for (int i = 0; i < link.Length - 1; i++)
          {
            // Line from this vertex to the next
            Line ln =
              new Line(joints[link[i]], joints[link[i + 1]]);

            // Set the color to distinguish between skeletons
            ln.ColorIndex = idx;

            // Make tracked skeletons bolder
            ln.LineWeight =
              (sd.TrackingState == SkeletonTrackingState.Tracked ?
                LineWeight.LineWeight050 :
                LineWeight.LineWeight000
              );
            lines.Add(ln);
          }
        }
      }

      private void ClearLines()
      {
        // Dispose each of the lines and clear the list
        foreach (Line ln in _lines)
        {
          ln.Dispose();
        }
        _lines.Clear();
      }
    }

    public class Commands
    {
      [CommandMethod("ADNPLUGINS", "KINSKEL", CommandFlags.Modal)]
      public void KinectSkeletons()
      {
        Editor ed =
          Application.DocumentManager.MdiActiveDocument.Editor;
        try
        {
          // Create and use our jig, disposing afterwards
          using (KinectSkeletonJig sj = new KinectSkeletonJig())
          {
            ed.Drag(sj);
          }
        }
        catch (System.Exception ex)
        {
          ed.WriteMessage(
            "\nUnable to start Kinect sensor: " + ex.Message
          );
        }
      }
    }
  }
  6. Build the project. You should now be able to NETLOAD the resultant DLL into AutoCAD.
  7. Run the KINSKEL command – you may need to load the supplied DWG file (Kinect.dwg) to make the generated graphics visible on screen.
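If all goes well, a first run looks something like this (the DLL name and location depend on your project settings, so treat them as placeholders):

  Command: NETLOAD
  (browse to KinectSkeletons.dll in the Choose .NET Assembly dialog)
  Command: KINSKEL
  Click to finish: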

Taking a closer look at the above code:

The bulk of the code defines the KinectSkeletonJig class, which does most of the work. It’s this class that instantiates and uses the Kinect sensor to track skeleton data and display it inside AutoCAD.

Specifically, the constructor initializes a number of member variables, and the Dispose() function cleans up at the end. It’s in these two functions that we attach and detach our event handler (OnSkeletonFrameReady()), which we use to capture the lines representing our skeletons (and which we’ll talk more about later).

The Sampler() and WorldDraw() overrides implement the “DrawJig” protocol our class needs: Sampler() is called repeatedly as user-input messages from the mouse and keyboard are received, and WorldDraw() is called to allow us to draw the lines representing our skeletons. To make sure our Sampler() function gets called repeatedly – despite the fact we’re only receiving user input via Kinect and not via the mouse or keyboard – we use a little trick: we move the cursor by one pixel at the end of the Sampler() function, which results in a “mouse move” message being sent, which in turn results in Sampler() being called again…

Next come our OnSkeletonFrameReady() event handler and its helper functions: PointFromVector() (which translates a point from the Kinect “skeleton space” to drawing space, where the Z axis points upwards) and AddLinesForSkeleton() (which gets the joint information for a skeleton and links the joints together into a series of line segments for display). OnSkeletonFrameReady() gets called by the Kinect runtime whenever a new skeleton frame is ready for processing. It’s within this function that we loop through the various skeletons being tracked (actively or just for position), calling down into AddLinesForSkeleton(), which in turn calls PointFromVector() for each point it needs to map into standard WCS coordinates.

The code uses a couple of Boolean flags to protect our list of lines: it’s important to avoid the list being written to and read from at the same time (which could happen, depending on when the Kinect runtime chooses to invoke the OnSkeletonFrameReady() callback). If a method is reading the list, we don’t write to it, and if a method is writing to the list, we don’t read from it – we simply skip that cycle. As both methods get called repeatedly, we can safely assume they’ll be called again before the skipped frame becomes noticeable.
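For comparison, a more conventional alternative – not what the sample does – would be to serialize access with a lock. The generic sketch below illustrates that trade-off: either thread may block while the other holds the lock, whereas the flag-based approach never blocks, at the cost of occasionally skipping a frame:

  using System.Collections.Generic;

  // A sketch of the lock-based alternative: serialize access to a
  // shared list. Blocking AutoCAD's UI thread while the Kinect
  // callback holds the lock is the cost the sample's flags avoid.
  public class GuardedList<T>
  {
    private readonly object _sync = new object();
    private readonly List<T> _items = new List<T>();

    // Called from the Kinect callback thread
    public void Replace(IEnumerable<T> items)
    {
      lock (_sync)
      {
        _items.Clear();
        _items.AddRange(items);
      }
    }

    // Called from the drawing code: returns a private copy
    public T[] Snapshot()
    {
      lock (_sync)
      {
        return _items.ToArray();
      }
    }
  }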

It’s probably worth mentioning the approach used in AddLinesForSkeleton(): rather than getting each of the 20 vertices needed to make up a complete skeleton by ID, we loop through and populate an array based on the numerical values in the JointID enumeration. This saves us a lot of code, but it does mean we effectively depend on those enumeration values not changing. To create our three sequences of lines (from the head down through the joints to the left foot, from the middle of the hips to the right foot, and from the left hand to the right hand), we take lists of index values and use them to reference into our array of joint positions. We then create a line for each segment, assigning the colour for that skeleton and an appropriate lineweight (depending on whether the skeleton is being actively tracked or not).
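For a sense of what this saves, here is the explicit equivalent of just one segment, written against the sample’s own sd variable and PointFromVector() helper; repeat it for every pair of connected joints and the index-array approach quickly pays for itself:

  // One explicit segment: head to shoulder centre (the JointID
  // values 3 and 2 at the start of the first "links" array)
  Line segment =
    new Line(
      PointFromVector(sd.Joints[JointID.Head].Position),
      PointFromVector(sd.Joints[JointID.ShoulderCenter].Position)
    );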

And that’s really all there is to it: the remaining code makes sure we dispose of our temporary lines properly (the ClearLines() helper) and implements the KINSKEL command, which instantiates and uses our KinectSkeletonJig class to display skeleton information from Kinect inside AutoCAD.

Additional Material

The session included a number of more complicated samples, demonstrating techniques for integrating AutoCAD with Kinect:

Displaying and importing point clouds

Drawing 3D polylines

Sweeping simple solids

Sweeping segmented solids

Navigating a 3D model

These samples were written using the Microsoft Kinect SDK, but versions of many of them using OpenNI/NITE/nKinect are also available on “Through the Interface”.

For additional samples and further reading, please be sure to visit:
