How Gnutella Works


At its peak, Napster was perhaps the most popular Web site ever created. In less than a year, it went from zero to 60 million visitors per month. Then it was shut down by a court order because of copyright violations, and wouldn't relaunch until 2003 as a legal music-download site.

The original Napster became so popular so quickly because it offered a unique product -- free music that you could obtain nearly effortlessly from a gigantic database. You no longer had to go to the music store to get music. You no longer had to pay for it. You no longer had to worry about cueing up a CD and finding a cassette to record it onto. And nearly every song in the universe was available.

Given that it was distributing an illegal product, the original Napster's key weakness lay in its architecture -- the way that the creators designed the system. When the courts decided that Napster was promoting copyright infringement, it was very easy for a court order to shut the site down.

The fact that Napster promoted copyright violations did not matter to its users. Most of them have turned to a new file-sharing architecture known as Gnutella. In this article, you will learn about the differences between Gnutella and the old Napster that allow Gnutella to survive today despite a hostile legal environment.

Napster's Architecture

On the Web as it is normally implemented, there are Web servers that hold information and process requests for that information (see How Web Servers Work for details). Web browsers allow individual users to connect to the servers and view the information. Big sites with lots of traffic may have to buy and maintain hundreds of machines to handle all of the requests from users.

Napster pioneered the concept of peer-to-peer file sharing. With the old version of Napster (Napster relaunched itself in 2003 as a legal, pay-for-music site), individual people stored files that they wanted to share (typically MP3 music files) on their hard disks and shared them directly with other people. Users ran a piece of Napster software that made this sharing possible. Each user machine became a mini server.

If you logged into the old Napster to download a song, here's what happened:

  1. You started the Napster software on your machine. Your machine became a small server able to make files available to other Napster users.
  2. Your machine connected to Napster's central servers. It told the central servers which files were available on your machine. So the Napster central servers had a complete list of every shared song available on every hard disk connected to Napster at that time.
  3. You typed in a query for a song. Let's say you were looking for the song "Roxanne" by The Police. Napster's central servers listed all of the machines storing that song.
  4. You picked a version of the song from the list.
  5. Your machine connected to the user's machine that had that song, and downloaded the song directly from that machine.
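
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not Napster's actual software: the class name, method names and IP addresses are made up for the example. The point it shows is that the central server stores a list of who has what, never the songs themselves.

```python
from collections import defaultdict


class CentralIndex:
    """Toy stand-in for a Napster-style central server (hypothetical API)."""

    def __init__(self):
        # song title (lowercased) -> set of peer addresses sharing it
        self._peers_by_title = defaultdict(set)

    def register(self, peer, titles):
        """Step 2: a peer reports which files it is sharing."""
        for title in titles:
            self._peers_by_title[title.lower()].add(peer)

    def unregister(self, peer):
        """Drop a peer's files when it disconnects."""
        for peers in self._peers_by_title.values():
            peers.discard(peer)

    def search(self, query):
        """Step 3: list (title, peer) pairs whose title contains the query."""
        q = query.lower()
        return sorted(
            (title, peer)
            for title, peers in self._peers_by_title.items()
            if q in title
            for peer in peers
        )


index = CentralIndex()
index.register("10.0.0.5", ["Roxanne - The Police.mp3"])
index.register("10.0.0.9", ["Roxanne - The Police.mp3", "Other Song.mp3"])
hits = index.search("roxanne")  # steps 4 and 5 then happen peer to peer
```

Notice that every search goes through the one `index` object. If that object disappears, no machine can find any other machine, even though all the songs are still sitting on users' hard disks.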

The creator of Napster had a couple of reasons for this approach:

  • Napster eventually grew to have billions of songs available. There is no way a central server could have had enough disk space to hold all the songs, or enough bandwidth to handle all the requests.
  • Napster was trying to take advantage of a loophole in copyright law that allows friends to share music with friends. The legal concept behind Napster was, "All of these people are sharing the songs on their hard disks with their friends." The courts did not agree with that logic, but it gave Napster enough time to prove the concept and grow to massive size.

This approach worked great and made fantastic use of the Internet's architecture. By spreading the load for file downloading across millions of machines, Napster accomplished what would have been impossible any other way.

The central database for song titles was Napster's Achilles' heel. When the court ordered Napster to stop the music, the central database went offline, and without it no machine could find the files on any other machine. That killed the entire original Napster network.

With the original Napster gone, what you had at that point was something like 100 million people around the world hungry to share more and more files. It was only a matter of time before another system came along to fill the gap.

Gnutella's Architecture

Currently, the most popular system for sharing files is another peer-to-peer network called Gnutella, or the Gnutella network. There are two main similarities between Gnutella and the old Napster:

  • Users place the files they want to share on their hard disks and make them available to everyone else for downloading in peer-to-peer fashion.
  • Users run a piece of Gnutella software to connect to the Gnutella network.

There are also two big differences between Gnutella and the old Napster:

  • There is no central database that knows all of the files available on the Gnutella network. Instead, all of the machines on the network tell each other about available files using a distributed query approach.
  • There are many different client applications available to access the Gnutella network.

Because of both of these features, it would be difficult for a simple court order to shut Gnutella down. The court would have to find a way to block all Gnutella network traffic at the ISP and the backbone levels of the Internet to stop people from sharing.

Gnutella Clients

The original Napster had one piece of "client software," the software that users ran on their machines to access the Napster servers. Gnutella has dozens of clients available. One of them, XoloX, serves as the example later in this article.

How a Gnutella Client Finds a Song

Given that there is no central server to store the names and locations of all the available files, how does the Gnutella software on your machine find a song on someone else's machine? The process goes something like this:

  • You type in the name of the song or file you want to find.
  • Your machine knows of at least one other Gnutella machine somewhere on the network. It knows this because you've told it the location of the machine by typing in the IP address, or because the software has an IP address for a Gnutella host pre-programmed in. Your machine sends the song name you typed in to the Gnutella machine(s) it knows about.
  • These machines search to see if the requested file is on the local hard disk. If so, they send back the file name (and machine IP address) to the requester.
  • At the same time, all of these machines send out the same request to the machines they are connected to, and the process repeats.
  • A request has a TTL (time to live) limit placed on it. A request might go out six or seven levels deep before it stops propagating. If each machine on the Gnutella network knows of just four others, that means that your request might reach 8,000 or so other machines on the Gnutella network if it propagates seven levels deep. The exact number depends on how much the machines' lists of known neighbors overlap.
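
The process above can be sketched as a small simulation. This is a simplified model, not the real Gnutella wire protocol: the function name, the dictionaries and the machine names are invented for the example, and a single shared `seen` set stands in for whatever each real machine does to avoid answering the same request twice.

```python
from collections import deque


def flood_query(neighbors, files, start, query, ttl):
    """Breadth-first flood of `query` from `start`, at most `ttl` levels deep.

    neighbors: dict mapping a machine to the machines it knows about
    files:     dict mapping a machine to the file names it shares
    Returns a sorted list of (machine, file_name) hits.
    """
    seen = {start}  # machines that have already handled the query
    frontier = deque((peer, ttl) for peer in neighbors.get(start, []))
    hits = []
    while frontier:
        machine, remaining = frontier.popleft()
        if machine in seen or remaining <= 0:
            continue  # duplicate request, or the TTL has run out
        seen.add(machine)
        # Check the local disk and report any matches to the requester.
        for name in files.get(machine, []):
            if query.lower() in name.lower():
                hits.append((machine, name))
        # Pass the same request along with one less level to live.
        for peer in neighbors.get(machine, []):
            frontier.append((peer, remaining - 1))
    return sorted(hits)


neighbors = {"you": ["a"], "a": ["b", "you"], "b": ["c"], "c": []}
files = {"b": ["Roxanne.mp3"], "c": ["roxanne_live.mp3"]}
shallow = flood_query(neighbors, files, "you", "roxanne", ttl=2)
deeper = flood_query(neighbors, files, "you", "roxanne", ttl=3)
```

In this tiny network, machine "c" is three levels away from you, so its copy of the song only turns up when the TTL is raised from 2 to 3.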

It is an extremely simple and clever way of distributing a query to thousands of machines very quickly.
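
To get a feel for the numbers, here is a rough calculation of how many machines a request could reach with four neighbors and seven levels. It is a back-of-the-envelope sketch under two idealized assumptions, neither of which comes from a Gnutella specification: no two machines share a neighbor, and a machine either forwards to all four of its neighbors or to the three that did not send it the request. The function name is made up for the example.

```python
def max_reach(fanout, depth, exclude_sender=False):
    """Upper bound on machines reached, assuming no neighbor lists overlap."""
    # After the first hop, a machine may skip the neighbor it heard from.
    later_hops = fanout - 1 if exclude_sender else fanout
    total, level = 0, fanout
    for _ in range(depth):
        total += level       # machines reached at this level
        level *= later_hops  # machines reached at the next level
    return total


print(max_reach(4, 7))                       # 21844
print(max_reach(4, 7, exclude_sender=True))  # 4372
```

The "8,000 or so" estimate falls between these two idealized bounds. On a real network the figure would be lower than the first bound, because many machines know some of the same neighbors.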

This approach has one big advantage -- Gnutella works all the time. As long as you can get to at least one other machine running Gnutella software, you are able to query the network. No court order is going to shut this system down, because there is no one machine that controls everything. However, Gnutella has at least three disadvantages:

  • There is no guarantee that the file you want is on any of the 8,000 machines you can reach.
  • Queries for files can take some time to get a complete response. It might be a minute or more before all of the responses, seven levels deep, come in.
  • Your machine is part of this network. It is answering requests and passing them along, and in the process routing back responses as well. You give up some amount of your bandwidth to handle requests from all the other users.

Apparently, these disadvantages are minor, because people have downloaded hundreds of millions of copies of Gnutella clients.

XoloX Example: Searching

XoloX is a typical, fairly simple program for connecting into the Gnutella network. It does not have some of the bells and whistles of the more sophisticated clients, but it does work, it is a small file to download (only 600 kilobytes or so), it has no "spyware" or bundled pop-up advertising mixed in with it, and it is very easy to install and use. Its simplicity makes it useful to demonstrate how a typical Gnutella client works.

There are three big things you can do with XoloX: search for files, transfer files to your machine and look at your downloaded files. There are three buttons at the top of the XoloX window that let you toggle between these three activities.

The figure above shows a typical screenshot during a search. All you do is type in the name (or keywords) of the file you are looking for. You can also select the file type: audio, video, etc., or "All Types." Your XoloX client sends out a message containing your search string, and over the course of 30 to 60 seconds a search window fills with results from the thousands of other machines that are processing your query.

One thing you will notice in the search window is a score. The score represents the number of machines currently online that have the same file available. By choosing a file with a high score, you increase your odds of actually getting the file you want.
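
A score like this can be computed by grouping the incoming results. The sketch below is a guess at the general idea rather than XoloX's actual method: it assumes two results are "the same file" when their name and size match, and the function name and tuple layout are invented for the example.

```python
from collections import Counter


def score_results(hits):
    """Count the distinct machines offering each file.

    hits: iterable of (machine, file_name, size_bytes) tuples.
    Assumption: two hits are the same file when name and size both match.
    """
    # A set removes repeated answers from the same machine.
    unique = {(machine, name.lower(), size) for machine, name, size in hits}
    scores = Counter((name, size) for _, name, size in unique)
    # Highest score first; ties are broken by name for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


hits = [
    ("a", "Song.mp3", 100),
    ("b", "song.mp3", 100),
    ("a", "Song.mp3", 100),  # duplicate answer from machine "a"
    ("c", "other.mp3", 5),
]
ranked = score_results(hits)
```

Here the first file ends up with a score of 2 and the second with a score of 1, so the first is the safer pick.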

XoloX Example: Downloading

To download a file, you simply double-click it in the search window. This sends the file name to the transfer window. Once a file name is in the transfer window, your copy of XoloX connects to the peer machine to download the file. One nice thing about XoloX/Gnutella is that if multiple machines have the file available, your client can connect to several of them simultaneously to download the file very quickly. In the figure below, you can see that Filename1.avi in particular is taking advantage of this capability to download at a rate of 69.2 kilobytes per second. XoloX is estimating 43 minutes to complete the download of over 100 megabytes.
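
One plausible way to download from several machines at once is to give each machine a different slice of the file. The sketch below shows that idea along with the arithmetic behind a time estimate. It is an assumption about how such a client could work, not a description of XoloX's internals, and both function names are hypothetical.

```python
def plan_segments(file_size, peers):
    """Split [0, file_size) into one contiguous byte range per peer."""
    if not peers:
        raise ValueError("need at least one peer")
    base, extra = divmod(file_size, len(peers))
    plan, start = [], 0
    for i, peer in enumerate(peers):
        # The first `extra` peers take one additional byte each.
        length = base + (1 if i < extra else 0)
        plan.append((peer, start, start + length))  # end is exclusive
        start += length
    return plan


def eta_minutes(remaining_bytes, rate_kb_per_s):
    """Estimated minutes left at a steady rate (1 KB = 1024 bytes here)."""
    return remaining_bytes / (rate_kb_per_s * 1024) / 60


plan = plan_segments(10, ["a", "b", "c"])
minutes = eta_minutes(174 * 1024 * 1024, 69.2)
```

Working backward from the numbers in the text, 43 minutes at 69.2 kilobytes per second corresponds to roughly 175 megabytes still to come, which fits the description of a file of over 100 megabytes.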

When you pick a file for downloading, it is fairly common for nothing to happen. Either XoloX cannot connect to the machine that has the file, or that machine is already busy serving other people. You can solve this problem by waiting (a busy machine eventually frees up), by choosing files with high scores (increasing the likelihood of finding a free machine), or by deleting a stalled file from the transfer window and replacing it with an identical file from the search window.

Once you have the files on your machine, you can find them in a XoloX directory and in the Files window of XoloX. You can share all the files you've downloaded with other people if you like. You do this by first specifying the directories and file types you want to share in the Preferences dialog.

You can also control how much outgoing bandwidth you allow XoloX to consume when people download files from you. This can keep people from chewing up all your upstream bandwidth.
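
A common way to enforce an upload limit like this is a "token bucket." The sketch below uses that general technique purely as an illustration. The article does not say how XoloX implements its limit, and the class name and parameters are hypothetical. The current time is passed in explicitly so the behavior is easy to follow.

```python
class UploadThrottle:
    """Token bucket: allows at most `rate` bytes per second on average."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes  # largest burst allowed at once
        self.tokens = burst_bytes    # bytes that may be sent right now
        self.last = 0.0              # time of the previous call, in seconds

    def try_send(self, nbytes, now):
        """Return True and spend tokens if `nbytes` may be sent at `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, up to the burst limit.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


throttle = UploadThrottle(rate_bytes_per_s=1000, burst_bytes=2000)
first = throttle.try_send(2000, now=0.0)   # uses up the whole burst
second = throttle.try_send(1, now=0.0)     # nothing left yet
third = throttle.try_send(500, now=0.5)    # half a second refills 500 bytes
```

When `try_send` returns False, the client would simply wait a moment before offering that chunk again, so other people's downloads slow down instead of swamping your connection.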

Is Gnutella Legal?

Gnutella itself is legal. There is no law against sharing public domain files. It's when people use Gnutella to distribute copyrighted music and films that its use becomes illegal. This is the problem that got Napster in trouble. The music industry is officially upset about Gnutella, but there is currently no easy way to control it.

Attacking the Gnutella architecture is one way to disrupt file-sharing activities. There are currently two approaches being used:

  1. Overloading the Gnutella network with a flood of bogus search packets.
  2. Filling the machines on the Gnutella network with corrupted files.

Gnutella's many developers have adapted to problems in the past, so it is probable that new software can work around these threats and keep the files flowing.

The debate at the moment is how much financial damage file-sharing actually causes. Is a shared file a theft, or is it a form of free advertising and exposure, like airtime on the radio?

For more information on file sharing and related topics, including some different perspectives on the legality of sharing copyrighted music, check out the links on the next page.
