How Digital Fingerprinting Works

Hunting down movies and music on sites like YouTube is fun, but unlicensed sharing of copyrighted material is illegal.
Screenshot by HowStuffWorks.com staff

Picture this: The classic David Bowie and Queen collaboration "Under Pressure" pops into your head. You haven't heard it in years, and suddenly you can think of nothing but the lyrics. But you can't remember them all, so you instinctively hop online, look up the song and give it a listen. Where do you go first? You could go to iTunes, where you might find the song available as a 99-cent MP3. But instead, you go to YouTube. And then you want to share your find with your friends, so you post a link on Twitter. Within minutes a dozen people are listening to that same song without paying a dime.

This is the kind of situation digital content creators aren't too wild about. Someone owns the rights to most of the content available on the Web, and all too often it's distributed without permission. How can companies monetize something that's so easily (and frequently) duplicated and shared for free?

One answer is a process called digital fingerprinting. Digital fingerprinting technology relies on complex computer-driven analysis to identify a piece of media like a song or video clip. Here's where the fingerprint analogy is born: Just as every person has a unique fingerprint, every piece of media has identifying features that can be spotted by smart software. But what good does this kind of identification really do? Sites like YouTube can scan files, match their fingerprints against a database of copyrighted material and stop users from uploading copyrighted files. Sounds simple, right? Surprisingly, people often confuse digital fingerprinting with watermarking or don't have a clear picture of what the technology entails.

Part of the problem is that the term "digital fingerprinting" can actually refer to two entirely different things. The first meaning we've already covered. The second works from a more traditional fingerprint analogy: The characteristics of your personal computer act as a fingerprint that can be used to track your online activity. Both concepts refer to a unique identifier, but they serve completely different functions -- this second meaning has nothing to do with spotting copyrighted songs or videos. Neither one involves scanning real fingerprints, but they're pretty cool technologies anyway. Let's take a look at how they work.

Watermarking vs. Fingerprinting

It's easy to mix up a digital fingerprint and a digital watermark, but these are two very different technologies with somewhat similar goals. In popular culture, fingerprints usually show up in spy movies or mysteries as visible identifying marks people leave behind. Well, that's not how a digital fingerprint works -- you'll never see any visible evidence that a digital fingerprint exists. The term watermark, on the other hand, typically refers to a completely visible marking on a digital file. Watermarks curb the unlawful dissemination of content more through annoyance than through smart technology [source: Milano].

A watermark is a logo or other identifying marking placed on an image or video that is visible at all times. The watermark aims to discourage Internet users from taking a photograph or a video from one Web site and using it for their own purposes without acknowledging the source. It's pretty hard to pretend a photo belongs to you when it has someone else's logo plastered all over it! Unfortunately, nothing guarantees a watermark will be effective. Pirates can still share watermarked videos, and some photos with smaller watermarks can easily be cropped to hide the identity of their rightful owner. A second form of watermarking embeds a small, imperceptible piece of data in a file that can be used for tracking purposes. While this may sound even less useful than a visible watermark, it actually allows content owners to track the origin of a file by its unique watermark [source: Milano].
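
To make the invisible kind of watermark concrete, here is a minimal Python sketch of one simple approach: hiding an ID number in the least significant bit of each pixel value. This is an illustration under assumptions, not the method any particular vendor uses. The function names, the 16-bit ID and the stand-in pixel data are all hypothetical.

```python
def embed_watermark(pixels, owner_id, bits=16):
    """Hide `owner_id` in the least significant bit of the first `bits` pixels.

    Assumption: `pixels` is a flat list of 0-255 grayscale values and
    `owner_id` fits in `bits` bits. Both are illustrative choices.
    """
    if len(pixels) < bits:
        raise ValueError("image too small to hold the watermark")
    marked = list(pixels)
    for i in range(bits):
        bit = (owner_id >> i) & 1
        marked[i] = (marked[i] & ~1) | bit  # clear the lowest bit, then set it
    return marked


def extract_watermark(pixels, bits=16):
    """Read the hidden ID back out of the lowest bits."""
    value = 0
    for i in range(bits):
        value |= (pixels[i] & 1) << i
    return value


# Stand-in for real image data: 64 made-up grayscale values.
original = [(37 * i + 11) % 256 for i in range(64)]
marked = embed_watermark(original, owner_id=0xBEEF)
recovered = extract_watermark(marked)

# No pixel moves by more than one brightness level, so the eye can't tell.
max_change = max(abs(a - b) for a, b in zip(original, marked))
```

Note that this sketch changes the file, which is exactly what separates watermarking from fingerprinting. A scheme this simple would also be wiped out by re-encoding the image, so treat it as a teaching aid rather than a robust design.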

Digital fingerprinting offers an even more promising way to restrict the spread of copyrighted material. The very makeup of a file, which you could call its DNA, can be analyzed and recognized by a computer program designed to filter out copyrighted material. That fingerprint represents the digital equivalent of a red flag -- when a computer system knows how to interpret its message, it acts as a warning that says "I'm copyrighted!" Of course, it's not quite that simple. The next section will dive into the technology that throws up that red flag.

Digital Fingerprinting Technology

The digital fingerprinting service Audible Magic promises to help companies monetize their copyrighted content.
Screenshot by HowStuffWorks.com staff

Unlike watermarking, digital fingerprinting never involves modifying a file. The most promising use of digital fingerprinting is preventive rather than tracking-based. For fingerprinting to work, software has to be able to accurately identify a piece of media and relate that file to an external database. To achieve this, fingerprinting software samples an audio or video file to pick out tiny portions of the file that are unique to that piece of media. Those samples could involve a handful of red pixels that make up a character's hat at the 57-minute mark of a movie or the exact pitch of a singer's voice 30 seconds into a song [source: Milano]. Those are extremely simplistic examples -- fingerprinting entails sampling multiple criteria to form an accurate representation of the media in question.
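
The sample-and-match idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Audible Magic or any commercial product works: it takes the loudest frequency in each short frame of audio, groups three consecutive frames into a hash, and looks those hashes up in a small database. The names `FRAME`, `tone`, `dominant_bin`, `fingerprint` and `identify` are hypothetical, the "songs" are synthetic sine tones, and the clip is assumed to line up with frame boundaries.

```python
import math

FRAME = 64  # samples per analysis frame (arbitrary choice for the demo)


def tone(bins, frame=FRAME):
    """Synthesize test audio: one frame per entry, a sine wave at that DFT bin."""
    return [math.sin(2 * math.pi * k * n / frame) for k in bins for n in range(frame)]


def dominant_bin(frame_samples):
    """Return the frequency bin with the most energy (naive DFT, fine for a demo)."""
    n = len(frame_samples)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(frame_samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(frame_samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin


def fingerprint(samples, frame=FRAME):
    """Set of hashes, each built from the dominant bins of three consecutive frames."""
    peaks = [dominant_bin(samples[i:i + frame])
             for i in range(0, len(samples) - frame + 1, frame)]
    return {tuple(peaks[i:i + 3]) for i in range(len(peaks) - 2)}


def identify(clip, database, min_overlap=0.5):
    """Return the best-matching title, or None if too few hashes are shared."""
    clip_fp = fingerprint(clip)
    best_title, best_score = None, 0.0
    for title, ref_fp in database.items():
        score = len(clip_fp & ref_fp) / max(len(clip_fp), 1)
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= min_overlap else None


song_a = tone([3, 7, 12, 5, 9, 14, 4, 8, 11, 6])
song_b = tone([10, 2, 13, 6, 15, 3, 9, 5, 12, 7])
database = {"Song A": fingerprint(song_a), "Song B": fingerprint(song_b)}

# A quieter excerpt from the middle of Song A, and a tune the database has never seen.
quiet_clip = [0.3 * s for s in song_a[2 * FRAME:7 * FRAME]]
unknown = tone([1, 20, 2, 21, 3, 22])

match = identify(quiet_clip, database)
no_match = identify(unknown, database)
```

Turning the volume down doesn't change which frequency is loudest in each frame, so the quiet excerpt still produces the same hashes as the original. Real systems sample far more criteria than a single frequency per frame, as the article notes.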

One major digital fingerprinting company, Audible Magic, works for huge content providers including NBC Universal, Sony Music and 20th Century Fox [source: Audible Magic]. Audible Magic boasts that its CopySense technology can identify the source of a video clip within five seconds of playback and can identify an audio file within 10 seconds. And supposedly, that's under any conditions. Not only is Audible Magic's software designed to identify a pristine copy of a movie, but the company also claims its software can recognize a piece of media that was, say, recorded off a movie theater screen with a handheld camera.

Identification is based on what Audible Magic calls "the perceptual characteristics of audio and video." The system is smart enough to see past transformative changes to audio and video files so that transcoding between file formats, equalizing audio, cropping an image or even blurring a picture can't fool CopySense [source: Audible Magic].

Does that mean Audible Magic can identify every piece of content on the Internet? Nope -- fingerprinting only works with media that has been analyzed and has a reference file uploaded to a database. That file contains all the perceptual characteristics Audible Magic uses to identify a song or video. Audible Magic's Global Rights Registry Database covers millions of files from its clients.
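
Here is a rough Python sketch of both points: a fingerprint based on what a picture looks like can survive re-encoding, and a file can only be matched if a reference is already in the database. It uses a simple "average hash" (one bit per pixel, set when the pixel is brighter than the image's mean), which is a well-known toy technique and not a description of CopySense. The function names, the 8x8 test images and the distance threshold are all assumptions made for illustration.

```python
def average_hash(image):
    """One bit per pixel: 1 if the pixel is brighter than the image's mean.

    Assumption: `image` is already a small grayscale grid (here 8x8).
    Real tools would first shrink a full-size frame down to this size.
    """
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Count the positions where two hashes disagree."""
    return sum(x != y for x, y in zip(a, b))


def lookup(image, registry, max_distance=4):
    """Return the closest registered title, or None if nothing is close enough."""
    h = average_hash(image)
    best = min(registry.items(), key=lambda item: hamming(h, item[1]), default=None)
    if best is not None and hamming(h, best[1]) <= max_distance:
        return best[0]
    return None


# Two made-up reference frames: a top-to-bottom and a left-to-right gradient.
reference = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
other_ref = [[c * 32 for c in range(8)] for r in range(8)]
registry = {
    "Reference frame": average_hash(reference),
    "Other frame": average_hash(other_ref),
}

# A washed-out copy of the reference (lower contrast, shifted brightness).
reencoded = [[0.8 * p + 10 for p in row] for row in reference]

# A checkerboard that was never registered.
checkerboard = [[255 if (r + c) % 2 == 0 else 0 for c in range(8)] for r in range(8)]

found = lookup(reencoded, registry)
missing = lookup(checkerboard, registry)
```

The washed-out copy has different pixel values but the same pattern of light and dark, so its hash matches the reference. The checkerboard has no reference on file, so no amount of clever analysis will identify it.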

And that's the gist of digital fingerprinting as a means of copy protection. Up next: how a second kind of fingerprinting can track your identity online.

Digital Fingerprinting and Your Identity

Every time you surf online, you're leaving invisible fingerprints all over the Web.
©iStockphoto.com/barisonal

The last section delved into the technology powering digital fingerprinting as we typically think of it, but the term sometimes refers to an altogether different form of data tracking. This can be pretty confusing. We all know what fingerprinting normally means (we only have one set of fingerprints!), but entering the digital world opens up room for ambiguity. In recent years, digital fingerprinting has been used to describe a method of identity tracking -- essentially, every computer has a unique fingerprint that makes it trackable across the Internet.

You've probably heard of IP addresses, the unique numbers attached to every computer on the Internet. But an IP address isn't an exact identity card for a computer. Real fingerprints never change, but Internet service providers (ISPs) can change users' IP addresses. Digital fingerprinting accounts for other details to pin down the identity of your computer. And here's where things get a little scary: It's shockingly easy for Web sites to read various bits of data about your computer and figure out who you are. The IP address is just the first step -- this shows who your ISP is and what country you live in. The login identity you choose on a Web site can be another clue. If you use the same login on multiple sites, that name may be easy to track down with a simple Google search. The operating system installed on your computer, be it Windows or Mac OS X or Linux, tightens the focus. Even the Web browser you use (and the specific version you're running, like Google Chrome 11.0.696.60 or Firefox 3.6.17) adds detail to your digital fingerprint [source: Wall Street Journal].
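
A minimal Python sketch shows how a handful of details can be boiled down to one identifier. It is an assumption-heavy illustration, not a description of any real tracking product: the attribute names are hypothetical, the screen size and time zone are extra details added for the example, and hashing with SHA-256 and keeping 16 characters is just one convenient choice.

```python
import hashlib
import json


def device_fingerprint(attributes, ignore=("ip_address",)):
    """Hash the stable details of a visit into a short identifier.

    The IP address is left out by default because, as noted above,
    an ISP can change it at any time.
    """
    stable = {k: v for k, v in attributes.items() if k not in ignore}
    # Sort the keys so the same details always produce the same string.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


visit_home = {
    "ip_address": "203.0.113.7",   # documentation-only example address
    "os": "Windows 7",
    "browser": "Firefox 3.6.17",
    "screen": "1280x800",          # assumed extra detail
    "timezone": "UTC-5",           # assumed extra detail
}

# Same laptop on a coffee shop network: only the IP address differs.
visit_cafe = dict(visit_home, ip_address="198.51.100.42")

# A different visitor whose only difference is the browser.
other_user = dict(visit_home, browser="Google Chrome 11.0.696.60")

same_device = device_fingerprint(visit_home) == device_fingerprint(visit_cafe)
different_device = device_fingerprint(visit_home) != device_fingerprint(other_user)
```

The more details that go into the hash, the fewer computers share the same result, which is why each extra clue "tightens the focus."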

As you can see, the technology exists to track your activities on the Internet. But is this really a bad thing? Well, that depends on how much you value your privacy. One use of this technology lies in targeted advertising, which would take your data into account to provide ads more likely to appeal to your interests. If you've ever noticed Google Ads grabbing keywords off a page you're viewing to provide more topical ads or been surprised when Web sites mysteriously identify the city you live in, you've seen a basic version of this technology at work. As this kind of tracking becomes more common, advertising and tracking companies will supposedly offer opt-out Web sites (much like telemarketer "do not call" lists) that will protect your anonymity. Still, you should always be aware of how easily your identity can be traced.

Reasons for Digital Fingerprinting

The last two sections established that the term "digital fingerprinting" applies to two entirely different technologies. The thing they have in common, of course, is a computerized form of identification. Now that we've established how each technology works, let's examine how each is used. YouTube presents an easy starting point. Copyright infringement constantly threatens the video site, and in 2007 Viacom sued Google for $1 billion over clips available on YouTube [source: CNET]. Google didn't upload the clips itself, but it didn't stop users from uploading the clips, either. Policing a site as large as YouTube is a huge challenge -- how can Google keep unlicensed content out?

With digital fingerprinting. Google uses software it calls YouTube Video Identification to sort through uploaded videos and recognize copyrighted content. It also gives copyright owners the control to deny uploads or even monetize their content [source: YouTube]. This form of digital fingerprinting actually serves two purposes: It protects Google from harmful lawsuits and limits the unlicensed spread of copyrighted material. Ideally, this means both the companies that own the copyright and the companies that host that content online are protected by fingerprinting. The content isn't spread illegally, and sites like YouTube avoid nasty lawsuits.

Of course, digital fingerprinting doesn't have to be a restrictive technology. Another excellent example of fingerprinting at work is Shazam, the music identification app that can match a song's audio sample to a musical database [source: Everything Else Matters Too]. On smartphones, Shazam uses the microphone to pick up audio from a song, analyzes it, and uses that data to find a match. Shazam then pulls up a page of information on the song and artist and provides quick access to a music store where an MP3 of the song can be purchased.

We've described how digital fingerprinting can be used to track PCs across the Internet based on various characteristics that make up a digital fingerprint. That same tracking technology can be used for security, as well. Pirates and Internet users who upload and download illicit material can be identified, tracked and even arrested using the power of digital fingerprinting. And because identification doesn't rely on an IP address alone, pirates who access the Internet from different places on the same device can still be pinned down.

Obviously, tracking criminals is a noble use of digital fingerprinting -- but if this is starting to sound like an invasion of privacy to you, you might be onto something.

Legality

Google Ads offers advertisers services that tailor offers to individual users. It's a powerful marketing tool, but is it an invasion of privacy?
Screenshot by HowStuffWorks.com staff

As you've probably figured out by this point, digital fingerprinting can be a powerful -- perhaps even invasive -- technology. Do you like the thought of your every online move being tracked, even if it's only for the purpose of targeted advertising? Here's a better question: Is it even legal?

Identity-tracking fingerprinting treads on shaky ethical ground, and it may be deemed overly invasive or even unlawful in the future. But because it's a developing technology, those legal issues are still being sorted out. And with the Internet being a global network, laws regarding digital fingerprinting may develop completely differently from one country to another.

According to Canada's guidelines, a digital fingerprint likely constitutes personal information, so usage of that information could be in violation of Canadian privacy laws. Canadian organizations are required to exhaust every possible noninvasive method of personal identification before resorting to methods like fingerprinting. Because fingerprinting "may collect more information than is necessary to identify fraudulent and duplicate respondents in online research," Canadian organizations could get in trouble for tracking people unless they've received permission or exhausted all other opportunities [source: Verrinder].

The first form of digital fingerprinting we covered -- matching identifying characteristics of copyrighted media to a database -- doesn't suffer from the same ethical challenges as identity tracking. License holders have the right to protect their content, and nothing about this form of fingerprinting invades the user's privacy. Ideally, fingerprinting will actually decrease the number of copyright infringement lawsuits by stopping the illegal dissemination of licensed media. Viacom's $1 billion lawsuit against YouTube was thrown out of court in 2010 because Google was found to be in compliance with the Digital Millennium Copyright Act (DMCA). Because the site took down illegal videos when notified, it was protected under the DMCA and wasn't held liable for the actions of its users [source: Schonfeld]. With better fingerprinting technology, the lawsuit might never have arisen at all. That statement puts a lot of faith in fingerprinting technology, bringing us to our last topic: How well does it really work?

Effectiveness of Digital Fingerprinting

Digital fingerprinting of copyrighted content is great in theory, but does it really stop internet piracy?
©iStockphoto.com/Brasil2

Digital fingerprinting sounds like the perfect technology to combat Internet piracy. It prevents users from spreading copyrighted content and potentially bypasses the hassle and expense of lawsuits. Once implemented by an organization, digital fingerprinting is a largely automated system, which means less work for content providers and media sites alike. Of course, all that convenience assumes one critical thing: that digital fingerprinting actually works.

Digital fingerprinting must be able to identify thousands or millions of pieces of content -- content that can be disseminated in many media formats, cropped or edited in unexpected ways, or even recorded off a movie theater screen. Video elements like color, bitrate and even resolution can vary from video to video. With all those variables, can digital fingerprinting really work?

In 2007, Audible Magic's CopySense fingerprinting technology was put to the test on an online video site called Soapbox. Soapbox was a Microsoft project that allowed users to upload videos a la YouTube. Even with Audible Magic's fingerprinting technology at work, tech site Gigaom was easily able to upload a copyrighted video from Comedy Central's "The Daily Show" [source: Gigaom]. It took days for the clip to be taken down from Soapbox -- even after Gigaom contacted Microsoft and Audible Magic for comment. Thinking the clip would then be indexed and protected against illicit sharing, Gigaom tried to upload it again. It worked. They had similar success on Myspace, which also employs Audible Magic's fingerprinting.

Audible Magic's fingerprints cover 11 million songs, movies and television shows. But with decades of media at our fingertips in digital form, the software obviously can't safeguard against all illegal uploads. Digital fingerprinting also can't stop most peer-to-peer file sharing, which distributes material directly between users. The effectiveness of digital fingerprinting in the future is entirely up in the air. If companies like Audible Magic continue to improve their recognition systems and expand their fingerprint databases, sites with user-generated content will be easier to maintain and the technology that identifies media will be more powerful than ever. Who knows? In 20 years, apps like Shazam may be able to differentiate between two live concert versions of "Free Bird" based on the length of a guitar solo. Now that would be accuracy!

Sources

  • AudibleMagic.com. "Content Owners." (May 1, 2011). http://audiblemagic.com/customers-contentregistration.php
  • AudibleMagic.com. "Technology Overview." (May 1, 2011). http://audiblemagic.com/technology.php
  • Broache, Anne and Greg Sandoval. "Viacom sues Google over YouTube clips." March 13, 2007. (May 6, 2011). http://news.cnet.com/Viacom-sues-Google-over-YouTube-clips/2100-1030_3-6166668.html
  • Businessweek.com. "Pirate-Proofing Hollywood." June 11, 2007. (May 2, 2011). http://www.businessweek.com/magazine/content/07_24/b4038073.htm?campaign_id=rss_tech
  • EverythingElseMattersToo.com. "How Shazam works." June 3, 2010. (May 7, 2011). http://everythingelsematterstoo.blogspot.com/2010/11/how-shazam-works.html
  • Gannes, Liz. "Does Digital Fingerprinting Work?: An Investigative Report." June 8, 2007. (April 27, 2011). http://gigaom.com/video/does-digital-fingerprinting-work-an-investigative-report/
  • Milano, Dominic. "Content Control: Digital Watermarking and Fingerprinting." (April 28, 2011). https://www.digimarc.com/resources/docs/Rhozet_wp_Fingerprinting_Watermarking.pdf
  • Schonfeld, Erick. "Judge Throws Out Viacom Case Against YouTube (Court Document)." June 23, 2010. (May 6, 2011). http://techcrunch.com/2010/06/23/youtube-declares-victory-in-viacom-case/
  • Verrinder, James. "Digital fingerprinting 'may be unlawful in Canada', warns MRIA." Sept. 29, 2008. (May 3, 2011). http://www.research-live.com/news/legal/digital-fingerprinting-may-be-unlawful-in-canada-warns-mria/4001050.article
  • Wall Street Journal. "What They Know: Your Digital Fingerprint." Nov. 30, 2010. (April 28, 2011). http://online.wsj.com/video/what-they-know-your-digital-fingerprint/49B4A220-88A5-4F53-BA89-20BBB0A83CB2.html
  • YouTube.com. "YouTube Video Identification Beta." (May 5, 2011). http://www.youtube.com/t/video_id_about