How Kinect Sees in 3D: It's Not Magic, It's Trigonometry and Beams of Light

Kinect is a sophisticated piece of equipment and pretty revolutionary as a gaming platform, but the "magic" of how it sees in the dark dates back to the Pythagorean theorem and all that stuff you forgot from trig class.

The Kinect uses an infrared emitter to blast the room with IR light, the same kind of light your remote control uses. But the light isn't just a single random beam; it is much closer to a laser light display or a projector at a planetarium. Lots of dots are projected out into the room, creating a map of the room that can only be seen by the infrared sensor in the Kinect.

Remember A squared + B squared = C squared? Well, the Kinect uses that to measure how far away you are in real time, 30 times per second.

When you were in 7th grade solving for the third leg of a triangle, you were doing a calculation based on knowing either A and B or A and C. The Kinect works in a similar way: the IR emitter sits at the apex of a bunch of triangles, and the sensor looks at the size of the projected dots and the distance between neighboring dots.
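As a refresher, here is that 7th-grade calculation as a small Python sketch. The function names are made up for illustration, and this is only the textbook right-triangle math, not anything taken from the Kinect itself.

```python
import math

def hypotenuse(a, b):
    """Solve for C when legs A and B are known."""
    return math.hypot(a, b)

def missing_leg(a, c):
    """Solve for B when leg A and hypotenuse C are known."""
    if c <= a:
        raise ValueError("the hypotenuse must be longer than the leg")
    return math.sqrt(c * c - a * a)

print(hypotenuse(3, 4))   # 5.0
print(missing_leg(3, 5))  # 4.0
```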

PrimeSense, who built the technology, describes it like this:

PrimeSense's technology for acquiring the depth image is based on Light Coding™. Light Coding works by coding the scene volume with near-IR light. The IR Light Coding is invisible to the human eye. The solution then utilizes a standard off-the-shelf CMOS image sensor to read the coded light back from the scene. PrimeSense's SoC chip is connected to the CMOS image sensor, and executes a sophisticated parallel computational algorithm to decipher the received light coding and produce a depth image of the scene. The solution is immune to ambient light.

Which doesn’t tell you much, so let me illustrate.

Kinect with Night Shot

In Andrewe1's video you can see lots of dots being projected into the room, but those dots aren't as random as they look. Closer objects have smaller dots that are closer together, and objects farther away have larger dots that are farther apart.
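That relationship can be sketched with a little trig. The Python below is a simplified model of the idea, not PrimeSense's actual algorithm: it assumes two neighboring beams leave the emitter a fixed angle apart and land on a flat surface facing the emitter. The 0.2 degree angle and the function names are invented for illustration.

```python
import math

BEAM_ANGLE_DEG = 0.2  # assumed angle between neighboring beams (illustrative)

def dot_spacing(distance_m, angle_deg=BEAM_ANGLE_DEG):
    """Gap between two neighboring dots on a surface distance_m away."""
    half = math.radians(angle_deg) / 2
    return 2 * distance_m * math.tan(half)

def distance_from_spacing(spacing_m, angle_deg=BEAM_ANGLE_DEG):
    """Invert the model: how far away is a surface with this dot gap?"""
    half = math.radians(angle_deg) / 2
    return spacing_m / (2 * math.tan(half))

for d in (1.0, 2.0, 3.0):
    gap_mm = dot_spacing(d) * 1000
    print(f"{d:.0f} m away -> dots about {gap_mm:.1f} mm apart")
```

In this model the gap grows in direct proportion to distance, so doubling the distance doubles the spacing, and measuring the spacing lets you work backwards to the distance.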

How Kinect Sees In 3d

Yes, that is an ugly drawing, but it shows what is happening: the dots on close objects will be closer together than the dots on objects farther from the projector. The calculation is slightly more complex than that, but not much. Clusters of similarly distanced dots are assumed to be an object, and that is how Kinect then maps you onto a skeleton that controls your avatar.
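The clustering idea can be sketched in a few lines. This is a toy Python version that assumes we already have one distance reading per dot and groups by distance only; the 0.15 m gap threshold, the sample readings, and the function name are made up, and Kinect's real skeleton tracking is far more involved.

```python
def cluster_by_depth(depths_m, max_gap_m=0.15):
    """Group depth readings so that neighbors in a cluster differ by
    no more than max_gap_m. Returns a list of sorted clusters."""
    clusters = []
    for depth in sorted(depths_m):
        if clusters and depth - clusters[-1][-1] <= max_gap_m:
            clusters[-1].append(depth)   # close enough: same object
        else:
            clusters.append([depth])     # big jump: start a new object
    return clusters

# Hypothetical readings: a person about 2 m away, a wall about 3.5 m away.
readings = [2.05, 3.50, 1.98, 3.52, 2.10, 3.48, 2.01]
for group in cluster_by_depth(readings):
    print(f"object at about {sum(group) / len(group):.2f} m ({len(group)} dots)")
```

With these sample readings the dots split into two groups, one for the person and one for the wall behind them.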

The facial recognition part of Kinect works because the Kinect sensor has high enough resolution that, even at 6 feet away, it can do what your laptop would do at 2 feet.

To me, the really impressive part is the sound processing, which I will explain in a later article.