How 3-D Gestures Work

A Little Light Gesturing

What travels at 299,792,458 meters per second in a vacuum? No, it's not a dust bunny. It's light. It might seem like trivia to you, but the speed of light comes in handy when you're building a 3-D gesture system, particularly if it's a time-of-flight arrangement.

This type of 3-D gesture system pairs a depth sensor and a projector with the camera. The projector emits pulses of light -- typically infrared light, which is invisible to the human eye. The sensor detects the infrared light reflected off everything in front of the projector, and a timer measures how long each pulse takes to leave the projector, bounce off an object and return to the sensor. Because the speed of light is constant, that round-trip time translates directly into distance. As objects move, the travel times change, and the computer interprets those changes as movements and commands.
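The arithmetic behind time-of-flight is simple enough to sketch in a few lines. The speed of light and the halving of the round trip are real physics; the 10-nanosecond pulse time below is a made-up value chosen purely for illustration.

```python
SPEED_OF_LIGHT = 299_792_458  # meters per second, in a vacuum


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to an object, given a light pulse's round-trip time.

    The pulse travels out to the object and back, so the one-way
    distance is half the total path covered in that time.
    """
    return SPEED_OF_LIGHT * round_trip_seconds / 2


# A pulse that returns after 10 nanoseconds reflected off something
# roughly 1.5 meters from the sensor.
print(distance_from_round_trip(10e-9))  # about 1.499 meters
```

Real sensors measure these intervals with specialized timing circuits rather than software, but the conversion from time to distance works the same way.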

Imagine you're playing a tennis video game using a 3-D gesture system. You stand at the ready, waiting to receive a serve from your highly seeded computer opponent. The 3-D gesture system takes note of where you are in relation to your surroundings -- the infrared light hits you and reflects back to the sensor, giving the computer all the data it needs to know your position.

Your opponent serves the ball and you spring into motion, swinging your arm forward to intercept the ball. During this time, the projector continues to fire out pulses of infrared light millions of times per second. As your hand moves away from and then toward the camera, the amount of time it takes for the infrared light to reach the sensor changes. These changes are interpreted by the computer's software as movement and further interpreted as video game commands. Your video game representation returns the serve, wins a point and the virtual crowd goes wild.

Another way to map a three-dimensional body is a method called structured light. With this approach, a projector emits light -- again outside the visible spectrum -- in a grid pattern. When the grid falls on physical objects, it distorts. A sensor captures the distorted pattern and sends the data to a computer, which measures how far each part of the grid has shifted. As you move about, your movements distort the grid in different ways, and those differences give the computer the data it needs to interpret your movements as commands.
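One way a computer can turn that grid distortion into depth is triangulation: a point on a nearer surface shifts the projected pattern farther in the camera image, so depth is inversely proportional to the shift. This is a simplified sketch, assuming an idealized, calibrated projector-camera pair; the standard triangulation relation (depth = focal length × baseline / shift) is real, but the numbers are hypothetical.

```python
def depth_from_shift(focal_length_px: float,
                     baseline_m: float,
                     shift_px: float) -> float:
    """Depth of a grid point from how far it shifted in the image.

    focal_length_px: camera focal length, in pixels
    baseline_m: distance between projector and camera, in meters
    shift_px: how far the grid point moved from its expected position
    """
    return focal_length_px * baseline_m / shift_px


# Hypothetical setup: 600-pixel focal length, 7.5 cm projector-camera
# baseline, and a grid dot displaced by 30 pixels.
print(depth_from_shift(600.0, 0.075, 30.0))  # 1.5 meters
```

The inverse relationship is why these systems resolve nearby objects more precisely than distant ones: at close range, a small change in depth produces a large, easy-to-measure shift in the grid.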

A 3-D gesture system doesn't have to rely on a single technological approach. Some systems combine multiple technologies to figure out where you are and what you're doing.