How Does The Xbox Kinect Work

By Robert Cong & Ryan Winters
Product Marketing Managers

Gaming Technology Takes Giant Leap

Video games are no longer just for the agile finger and thumb crowd. The gaming industry has gone full motion, first with the Wii console and now with Microsoft's Xbox Kinect. Gone is the need for controllers; this is full-body gaming. In this article we review how it works, and after extensive personal testing we'll pass judgment on whether this technology is ready for primetime.

This little black box contains multiple features that allow gamers to be the controller and feel as if they are actually in the game. When we heard about this sensor giving players the ability to transform their entire bodies into a game controller, we were convinced that we just had to investigate what really goes on inside this revolutionary device.

The Xbox Kinect


The software is what makes the Kinect a breakthrough device. Developers for the Kinect gathered an incredible amount of motion-capture data from real people moving in real-life scenarios. Processing all of this data with a machine-learning algorithm allows the Kinect to map the visual data it collects to models representing people of different backgrounds (age, height, gender, body type, clothing and more). This is just one of the ways that developers were able to help the Kinect "learn" about its surroundings and what it is actually seeing.

The Kinect's "brain" is really the secret. Stored in the system is enough intelligence to analyze what it sees and align that with a stored collection of skeletal structures to interpret your movements. Once the brain has enough data on your body parts, it outputs this reference data as a simplified 3D avatar shape. Beyond gauging player movements, the Kinect must also judge the distances of different points on your body throughout the entire game. To do this it uses a host of sensors and analyzes all of their data 30 times a second.
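To make the matching idea concrete, here is a minimal sketch in Python. It is not Microsoft's algorithm, which we only know in outline; it swaps in a simple nearest-template comparison. The joint names, the coordinates, and the functions pose_distance and closest_template are all made up for illustration.

```python
from typing import Dict, Tuple

Point = Tuple[float, float, float]
Pose = Dict[str, Point]

# Two made-up reference poses with three joints each (meters, camera frame).
TEMPLATES: Dict[str, Pose] = {
    "arms_down": {
        "head": (0.0, 1.7, 2.0),
        "left_hand": (-0.3, 0.9, 2.0),
        "right_hand": (0.3, 0.9, 2.0),
    },
    "arms_up": {
        "head": (0.0, 1.7, 2.0),
        "left_hand": (-0.3, 2.0, 2.0),
        "right_hand": (0.3, 2.0, 2.0),
    },
}


def pose_distance(a: Pose, b: Pose) -> float:
    """Sum of squared distances between matching joints of two poses."""
    return sum(
        sum((pa - pb) ** 2 for pa, pb in zip(a[joint], b[joint]))
        for joint in a
    )


def closest_template(observed: Pose) -> str:
    """Return the name of the stored pose nearest to the observed one."""
    return min(TEMPLATES, key=lambda name: pose_distance(observed, TEMPLATES[name]))


# A noisy observation of a player with both hands raised (hypothetical values).
observed = {
    "head": (0.05, 1.68, 2.1),
    "left_hand": (-0.25, 1.9, 2.1),
    "right_hand": (0.35, 1.95, 2.1),
}
print(closest_template(observed))  # arms_up
```

A real tracker would repeat a far richer version of this comparison on every frame, 30 times a second.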

Microsoft's clusters of computers are the "learning brain" that feeds all Kinects

Depth perception using the infrared camera

Transfer of information from the camera to the TV screen you see


The Kinect contains three vital pieces that work together to detect your motion and create your physical image on the screen: an RGB color VGA video camera, a depth sensor, and a multi-array microphone.

The camera detects the red, green, and blue color components as well as body-type and facial features. It has a pixel resolution of 640x480 and a frame rate of 30 fps. This helps in facial recognition and body recognition.

The depth sensor contains a monochrome CMOS sensor and an infrared projector that together create the 3D imagery throughout the room. The projector casts a pattern of invisible near-infrared dots across the scene, and the sensor measures the distance of each point of the player's body from the way that pattern shifts when it reflects off objects, an approach known as structured light.
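However the distance itself is measured, each depth pixel can be turned into a 3D point with the standard pinhole camera model. The sketch below shows that step in Python. The focal length and image-center values are assumptions chosen for a 640x480 image, not published Kinect calibration data, and depth_pixel_to_point is a name we invented for the example.

```python
from typing import Tuple

# Assumed intrinsics for a 640x480 depth image; real values come from
# calibrating the specific sensor.
FX = FY = 580.0         # focal length in pixels (assumption)
CX, CY = 320.0, 240.0   # principal point, taken as the image center (assumption)


def depth_pixel_to_point(u: int, v: int, depth_m: float) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with depth in meters to camera-frame (x, y, z)."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


# The center pixel lies on the optical axis, so x and y are zero.
print(depth_pixel_to_point(320, 240, 2.0))  # (0.0, 0.0, 2.0)
```

Applying this to every pixel in a frame produces the cloud of 3D points that the rest of the system works from.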

The microphone is actually an array of four microphones that can isolate the voices of the players from other background noises, allowing players to use their voices as an added control feature.
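We don't know exactly how the Kinect processes its four microphone signals, but one textbook way for an array to favor sound from a single direction is delay-and-sum beamforming: shift each channel so the target sound lines up, then average. The Python sketch below illustrates the idea on toy data. The function name delay_and_sum and the whole-sample delays are our own simplifications.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Average the channels after advancing each one by its delay in samples.

    delays[i] is how many samples later the target sound reaches microphone i
    than it reaches the earliest microphone. Samples that would be read past
    the end of a channel are dropped.
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]


# Toy data: one click that reaches four microphones 0, 1, 2 and 3 samples late.
click = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mics = [[0.0] * d + click[: len(click) - d] for d in range(4)]

aligned = delay_and_sum(mics, [0, 1, 2, 3])    # steered toward the source
unaligned = delay_and_sum(mics, [0, 0, 0, 0])  # not steered
print(max(aligned), max(unaligned))  # 1.0 0.25
```

When the delays match the source, the click adds up at full strength; with the wrong delays it is smeared out and only a quarter as strong. Sound arriving from other directions gets the smeared treatment, which is how the voice stands out.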

These components come together to detect and track 48 different points on each player's body, repeating the process 30 times every second.

Graphical image of an Xbox player

Putting both hardware and software together gives the Kinect the ability to generate 3D images and recognize human beings within its field of vision. It can analyze the person in front of it and go through multiple "filters" to try to determine which of the body structures programmed into its system is the best match. Data is constantly being transferred back and forth between the Kinect and the objects in its field of vision while you simply enjoy the fun of being a character in a game, without holding anything in your hands.

Game: Kinect Sports Hurdles (Jump over hurdles without leaving the living room)

As great as it sounds to be able to play a game without the controller, it doesn't stop at just playing video games. There are tons of possible applications for the Kinect far beyond the gaming world. Read about some of the truly remarkable applications below.

Robotic Applications

Because the Kinect has an infrared projector, infrared camera and color camera, it's a great imaging tool, even for robots. To extend their range and autonomy, robots need to be able to see the environment around them. One way they do this is through simultaneous localization and mapping, or SLAM.

Traditionally, the sensors used for this kind of mapping have been either expensive and cumbersome or cheap and unreliable. Laser arrays are expensive and heavy and can only map in two dimensions. Stereo cameras are light and can map in 3D, but require colossal computing power. Ken Conley of Willow Garage can now sell his Kinect-equipped TurtleBot for $500, a gigantic savings over the previous non-Kinect version that cost over $250,000!

The TurtleBot is a customizable mobile robotic platform that rides on an iRobot Create platform and uses the open-source ROS (Robot Operating System) platform. The TurtleBot uses the Kinect to see the world in 3D and for detecting and tracking people. Right out of the box, you can program TurtleBot to build maps of your home and navigate from your kitchen to your favorite seat in the living room. It also has the capability to take pictures from around your house and stitch them together to create a 360-degree panorama.
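To give a feel for the map-building step, here is a minimal Python sketch that marks obstacles on a grid from a handful of range readings. It covers only the mapping half of the problem and assumes the robot already knows where it is; full SLAM also has to estimate that position, and ROS provides far more complete tools for both. The function mark_hits, the grid size, and the readings are all invented for the example.

```python
import math
from typing import List, Tuple


def mark_hits(grid: List[List[int]], pose: Tuple[float, float, float],
              readings: List[Tuple[float, float]], cell_m: float) -> None:
    """Mark the grid cells where range readings end.

    pose is (x, y, heading) in meters and radians. Each reading is
    (bearing relative to the heading in radians, range in meters).
    Readings that land outside the grid are ignored.
    """
    x, y, heading = pose
    for bearing, rng in readings:
        hit_x = x + rng * math.cos(heading + bearing)
        hit_y = y + rng * math.sin(heading + bearing)
        col, row = int(hit_x // cell_m), int(hit_y // cell_m)
        if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            grid[row][col] = 1


# A 5x5 grid of half-meter cells, with the robot near the middle facing +x.
grid = [[0] * 5 for _ in range(5)]
readings = [(0.0, 1.0), (math.pi / 2, 0.5), (math.pi, 5.0)]  # last one leaves the map
mark_hits(grid, (1.25, 1.25, 0.0), readings, cell_m=0.5)

for row in reversed(grid):  # print with +y pointing up
    print("".join("#" if cell else "." for cell in row))
```

Each "#" is a cell where something reflected the sensor's light. Repeating this as the robot moves fills in the walls and furniture of a room.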

With heavier-duty and more robust platforms like the PR2 robot, a user can also give gesture commands to control the robot and even remotely control its limbs as if they were the user's own. Perhaps one day we can use such robots to lift heavy objects for us and do other chores.

Scientific Applications

Last year, Ph.D. student Ken Mankoff squeezed his way into a small cavern underneath the Rieperbreen Glacier in Svalbard, Norway with a backpack containing a laptop, a battery pack and a Kinect. Once inside, he used the Kinect sensor to scan the cave floor in 3D to map its size and the irregularities on the surface. This helps scientists better understand how the ice above flows toward the sea. The Kinect is quickly becoming a vital tool because of the 3D data it captures in visible and infrared wavelengths with very high accuracy.

The Kinect is in a league of its own, effectively capturing 9 million data points per second. Traditional scanning tools can be bulky and use LIDAR (Light Detection And Ranging) to send laser pulses to accurately measure surfaces over many miles, but these systems cost between $10,000 and $200,000 and have to be ordered from special manufacturers and operated by trained professionals. The Kinect, on the other hand, costs around $120, takes measurements in the 3 to 16 foot range, and can fit in your pocket. The Kinect is an inspiring device because it is low in cost and most students are already familiar with it.
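The 9 million figure is easy to check with a few lines of Python, assuming the depth image has the same 640x480 resolution and 30 fps frame rate quoted earlier for the camera. The feet-to-meters conversion is included for readers who think in metric.

```python
WIDTH, HEIGHT = 640, 480   # resolution quoted earlier for the camera
FRAMES_PER_SECOND = 30     # frame rate quoted earlier
METERS_PER_FOOT = 0.3048

points_per_frame = WIDTH * HEIGHT
points_per_second = points_per_frame * FRAMES_PER_SECOND
print(points_per_frame, points_per_second)  # 307200 9216000

# The 3 to 16 foot working range, expressed in meters.
near_m = 3 * METERS_PER_FOOT
far_m = 16 * METERS_PER_FOOT
print(round(near_m, 2), round(far_m, 2))  # 0.91 4.88
```

So "9 million" is a rounded count of depth pixels per second, roughly 9.2 million if every pixel returns a reading.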

Small and cheap hardly means incapable. The Kinect even has relevance in space. Naor Movshovitz, a planetary science Ph.D. student at UC Santa Cruz, said Kinect data would be useful for future missions where we may have to deflect medium to large asteroids that threaten to impact Earth.

We have pretty good data on how objects impact the Earth's surface, but how do impacts differ when there is extremely low gravity? The idea is to use one of NASA's reduced-gravity airplanes to study how a small projectile would impact a dirt pile, while the Kinect would be used to measure the three-dimensional position of objects to get data about how debris is ejected after the projectile's impact.
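Once a piece of debris has a 3D position in two consecutive frames, its velocity follows from simple arithmetic. The Python sketch below shows the calculation, assuming the 30 frames per second mentioned earlier. The positions are invented, and matching the same piece of debris from one frame to the next, which is the hard part, is not shown.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def velocity(p0: Point, p1: Point, fps: float = 30.0) -> Point:
    """Average velocity in m/s between two positions one frame apart."""
    return tuple((b - a) * fps for a, b in zip(p0, p1))


def speed(v: Point) -> float:
    """Magnitude of a velocity vector in m/s."""
    return math.sqrt(sum(c * c for c in v))


# Hypothetical positions (meters) of one fragment in two consecutive frames.
before = (0.0, 0.0, 1.0)
after = (0.125, 0.25, 1.0)

v = velocity(before, after)
print(v)         # (3.75, 7.5, 0.0)
print(speed(v))
```

Doing this for many fragments at once gives a picture of how fast and in which directions the debris leaves the impact site.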

Images: A sample of the 3-D data from the Kinect's scan of the glacier cave. Ken Mankoff.

Sign Language

Students from Georgia Tech created CopyCat, a game that serves both as a platform for collecting gesture data for an ASL (American Sign Language) recognition system and as a practical application that helps deaf children develop working memory and language skills while they play. The Kinect sensor has helped eliminate the need for colored gloves with wrist-mounted accelerometers. The goal is to encourage more complex sign construction rather than the usual one- or two-sign phrases.


This isn't just cool; this is as important a tool as the computer mouse. While it is still in its infancy, we fully expect the Kinect and its future generations to transform not only the way we game but also the way we control all the machines in our lives. This technology gets our highest rating!

Robert Cong is a graduate from Cal Poly, San Luis Obispo in Electrical Engineering. His interests include sports, movies, music, and playing with cool, new gadgets.

Ryan Winters is a Product Manager at Jameco Electronics and a Bay Area, California native. He is mostly self-taught and his hobbies include working on cars and computers, fiddling with electronic gadgets and experimenting with robotics.