How Content-recognition Software Works

Developing the Software

Limewire is one of several file sharing programs giving media companies massive headaches.
Photo used under the GNU Free Documentation License

Several software companies plan to offer programs that can analyze audio and video clips, compare them to a database of content and determine whether they are from sources that are protected by copyright. Such software provides an efficient and relatively inexpensive alternative to combing through the vast amount of content on the Internet. It's also more reliable than asking your friend if he knows what song is on the radio.

You might think creating a program that recognizes video or audio content shouldn't be complicated, but it's proving to be a real challenge. For one thing, there are dozens of ways to encode a sound or video file, so a program that simply looks for matching data isn't very useful. After all, a WAV file and an MP3 file of the same song contain very different bytes, even though they sound alike. In addition, songs and videos can be encoded at different bit rates, which means that two MP3 files of the same song may not match either. Software that identifies songs via cell phone has to be able to identify the track despite a poor-quality recording or interfering background noise.
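To see why comparing file data falls short, consider the rough Python sketch below. It is only an illustration, not how any commercial product works. It assumes a one-second 440 Hz test tone stands in for a song, and that 16-bit and 8-bit raw encodings stand in for two different file formats. The helper names (tone, encode_16bit, encode_8bit, dominant_frequency) and the list of candidate pitches are made up for this example.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for this sketch


def tone(freq_hz, seconds=1.0):
    """Return a sine tone as a list of floats in [-1, 1] (our stand-in 'song')."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def encode_16bit(samples):
    """Pack samples as signed 16-bit little-endian values."""
    return struct.pack("<%dh" % len(samples), *(int(s * 32767) for s in samples))


def decode_16bit(data):
    """Unpack signed 16-bit values back to floats in [-1, 1]."""
    return [v / 32767 for v in struct.unpack("<%dh" % (len(data) // 2), data)]


def encode_8bit(samples):
    """Pack samples as unsigned 8-bit values (a coarser encoding)."""
    return bytes(int((s + 1) * 127.5) for s in samples)


def decode_8bit(data):
    """Convert unsigned 8-bit values back to floats in roughly [-1, 1]."""
    return [b / 127.5 - 1 for b in data]


def dominant_frequency(samples, candidates):
    """Return the candidate frequency that carries the most energy."""
    def power(freq_hz):
        step = 2 * math.pi * freq_hz / SAMPLE_RATE
        re = sum(s * math.cos(step * i) for i, s in enumerate(samples))
        im = sum(s * math.sin(step * i) for i, s in enumerate(samples))
        return re * re + im * im
    return max(candidates, key=power)


song = tone(440)
file_a = encode_16bit(song)  # same sound, one encoding
file_b = encode_8bit(song)   # same sound, a different encoding

# Comparing the raw data: the two files do not match.
same_bytes = hashlib.sha256(file_a).digest() == hashlib.sha256(file_b).digest()

# Comparing a property of the sound itself: the two files agree.
candidates = [220, 330, 440, 550, 660]  # hypothetical pitches to test
pitch_a = dominant_frequency(decode_16bit(file_a), candidates)
pitch_b = dominant_frequency(decode_8bit(file_b), candidates)

print("identical bytes:", same_bytes)
print("same dominant pitch:", pitch_a == pitch_b)
```

The two files hold the same sound, yet their data differs, so a byte-for-byte comparison fails. Measuring something about the sound itself (here, a single dominant pitch) still finds the match. Real recognition software has to rely on far richer features than one pitch, but the principle is the same.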

There are other challenges as well. Some video pirates bring recording devices into theaters and capture movies on their own cameras. Some projectionists have been known to set up a digital video camera in the projection room, recording a first-run movie on its premiere night. Other people who bypass legal distribution might crop a video or otherwise alter it. Any program designed to find recordings like these can't rely only on comparing file data or looking for identical files.

In the next section, we'll look at the process for identifying audio files and how it compensates for these challenges.