How the Wikipedia Scanner Works


Virgil Griffith created the Wikipedia Scanner to catch companies and organizations who were editing Wikipedia articles to their own benefit.
Image courtesy Jake Appelbaum/Virgil Griffith

If you've ever used the online encyclopedia Wikipedia, you might have noticed that editing an article is as easy as clicking the "edit this page" tab, making your changes and clicking "save." But it's this ease of editing that's both Wikipedia's greatest strength and its weakness: Anyone can edit Wikipedia -- even anonymously.

Stories frequently surface about an organization or individual altering a Wikipedia entry, either maliciously or for the sake of a practical joke. But limiting people's ability to easily edit Wikipedia would infringe on one of its most celebrated features and what's arguably the key to its success.

Wikipedia has a variety of tools to deal with malicious editors. The site offers extensive tutorials about what it looks for in a good encyclopedia entry and how to clean up articles. Many Wikipedians -- frequent editors of the encyclopedia -- show fierce loyalty to the site, acting as its watchdogs and correcting potentially damaging edits. Wikipedia keeps a record of all changes so that a defaced or otherwise unsatisfactory article can easily revert to an older version. The site also has several features that can be used or requested by Wikipedians, including locking down pages, blocking certain users and flagging articles that violate the site's guidelines.

Despite all of these tools, Wikipedians can't be everywhere. With more than 1.9 million English-language articles alone, the potential for abuse is enormous, especially by anonymous users [source: Wikimedia]. Enter the WikiScanner, a free program unveiled in August 2007 by Virgil Griffith, a graduate student studying computation and neural systems at the California Institute of Technology and a visiting researcher at the Santa Fe Institute.

Whenever an unregistered, anonymous user edits a Wikipedia entry, the site logs the user's IP address, the unique string of numbers that identifies each computer connected to the Internet. The WikiScanner uses these records to trace the IP addresses of anonymous Wikipedia editors. By comparing the IP addresses against a database of companies that own them, the Scanner can name the editor, or at least the organization responsible for the user's access.

The results have been astounding -- tens of millions of anonymous edits, performed by more than 180,000 organizations, some of them in clear violation of Wikipedia policy. Among the many organizations cited: the FBI, CIA, Britain's Labour Party, the Vatican, Wal-Mart, the Republican Party, the Church of Scientology, Dell Computers, Microsoft, Apple and the United Nations.

In this article, we'll take a close look at how the WikiScanner works, look at some particularly controversial or simply bizarre examples of anonymous editing and consider some of the reactions to the WikiScanner. We'll also show you how you can use the WikiScanner to see what some organizations and companies may be up to on Wikipedia.

Creating the Wikipedia Scanner

The "history" tab on a Wikipedia entry allows you to see

In creating the WikiScanner, Virgil Griffith took advantage of one of Wikipedia's main features -- its extensive records and backlogs. To see an example of what these records look like, simply click on the "history" tab at the top of a Wikipedia entry. There you can see the names of registered users who edited the article, the IP addresses of anonymous editors and notes on changes. You can also easily compare different versions of an article or leave a message for an editor, even an anonymous one.

Wikipedia makes it possible to download a complete version of the encyclopedia, including every article and the records of all edits performed. Griffith downloaded the complete encyclopedia and extracted all of the anonymous changes and their associated IP addresses. He came up with 34,417,493 anonymous edits, made between Feb. 7, 2002, and Aug. 4, 2007, from 2,668,095 different IP addresses [source: Griffith].
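To get a feel for that extraction step, here is a minimal Python sketch. It is not Griffith's actual code. It assumes a heavily simplified, made-up dump in which each revision's contributor is recorded as either a username (registered editor) or an IP address (anonymous editor); a real Wikipedia dump is far larger and carries more fields. The sample data, the function name and the reserved documentation IP addresses are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified dump: anonymous revisions carry an <ip> element,
# registered revisions carry a <username> element.
SAMPLE_DUMP = """
<mediawiki>
  <page>
    <title>Example article</title>
    <revision>
      <timestamp>2007-08-01T12:00:00Z</timestamp>
      <contributor><username>SomeUser</username></contributor>
    </revision>
    <revision>
      <timestamp>2007-08-02T09:30:00Z</timestamp>
      <contributor><ip>192.0.2.10</ip></contributor>
    </revision>
  </page>
  <page>
    <title>Another article</title>
    <revision>
      <timestamp>2007-08-03T15:45:00Z</timestamp>
      <contributor><ip>192.0.2.10</ip></contributor>
    </revision>
    <revision>
      <timestamp>2007-08-04T08:00:00Z</timestamp>
      <contributor><ip>198.51.100.7</ip></contributor>
    </revision>
  </page>
</mediawiki>
"""


def anonymous_edits(dump_xml):
    """Return (article title, IP address, timestamp) for each anonymous revision."""
    root = ET.fromstring(dump_xml.strip())
    edits = []
    for page in root.iter("page"):
        title = page.findtext("title")
        for revision in page.iter("revision"):
            ip = revision.findtext("contributor/ip")
            if ip is not None:  # registered editors have no <ip> element
                edits.append((title, ip, revision.findtext("timestamp")))
    return edits


edits = anonymous_edits(SAMPLE_DUMP)
edits_per_ip = Counter(ip for _, ip, _ in edits)

print(len(edits), "anonymous edits from", len(edits_per_ip), "IP addresses")
```

Counting the edits and the distinct addresses in this way is the same kind of tally that produces figures like the ones above, only on a toy scale.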

A variety of services, some of them publicly available, make it possible to trace IP addresses to the corporations, government entities and organizations that own them. Using software from IP2Location, a company that sells programs that allow users to match IP addresses with their owners, Griffith found that 187,529 organizations had performed anonymous edits of Wikipedia entries [source: Griffith].
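The matching step can be sketched in a few lines of Python. This is only an illustration of the idea, not the IP2Location software or Griffith's code: the ownership table, organization names and addresses below are invented, and a real database would hold a very large number of address ranges.

```python
import ipaddress
from collections import Counter

# Hypothetical ownership table: each address block is registered to one owner.
OWNERS = [
    (ipaddress.ip_network("192.0.2.0/24"), "Example Corp"),
    (ipaddress.ip_network("198.51.100.0/24"), "Example Agency"),
]


def owner_of(ip_text):
    """Return the organization whose address block contains ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    for network, organization in OWNERS:
        if ip in network:
            return organization
    return None


# IP addresses logged for anonymous edits (made-up values).
anonymous_edit_ips = ["192.0.2.10", "192.0.2.77", "198.51.100.7", "203.0.113.5"]

# Tally edits per organization; addresses with no known owner are grouped together.
edits_per_org = Counter(owner_of(ip) or "unknown" for ip in anonymous_edit_ips)

for organization, count in edits_per_org.most_common():
    print(organization, count)
```

A linear scan through the table is fine for a handful of ranges. With millions of addresses to look up, a sorted table searched by bisection would be a more practical choice.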

That's about all there is to it. It may seem like a simple idea in theory, but no one before Virgil Griffith had taken a comprehensive look at anonymous Wikipedia edits and tried to find all of the users behind them. Now accessible through Griffith's Web site, the WikiScanner allows users to search for specific organizations, a single IP address, a range of IP addresses, or a Wikipedia entry. After its initial launch, Griffith temporarily disabled some of the site's features due to the heavy traffic it was receiving but later returned with enough bandwidth "to handle the web traffic of a small country" [source: Griffith].
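Those search options can be pictured as filters over a table of edits. The Python sketch below is a guess at the general shape of such a search, not a description of how Griffith's site is built. The record layout, the `search` function and the sample data are all hypothetical.

```python
import ipaddress
from typing import NamedTuple


class Edit(NamedTuple):
    article: str
    ip: str
    organization: str


# Made-up edit records, already matched to organizations.
EDITS = [
    Edit("Example article", "192.0.2.10", "Example Corp"),
    Edit("Another article", "192.0.2.77", "Example Corp"),
    Edit("Example article", "198.51.100.7", "Example Agency"),
]


def search(edits, organization=None, ip_range=None, article=None):
    """Return the edits that match every filter that was supplied.

    ip_range is a (first, last) pair of addresses; pass the same address
    twice to search for a single IP address.
    """
    if ip_range is not None:
        low, high = (ipaddress.ip_address(text) for text in ip_range)
    results = []
    for edit in edits:
        if organization is not None and organization.lower() not in edit.organization.lower():
            continue
        if ip_range is not None and not low <= ipaddress.ip_address(edit.ip) <= high:
            continue
        if article is not None and edit.article != article:
            continue
        results.append(edit)
    return results


by_org = search(EDITS, organization="example corp")
by_single_ip = search(EDITS, ip_range=("192.0.2.10", "192.0.2.10"))
by_range = search(EDITS, ip_range=("192.0.2.0", "192.0.2.255"))
by_article = search(EDITS, article="Example article")
```

Treating a single address as a range whose first and last values are the same keeps the sketch to one code path for both kinds of IP search.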

Several Web sites, notably "Wired," have established blogs or forums where users can post their favorite WikiScanner results. Griffith's site has a selection of "Editor's Picks" that link to edits made by prominent organizations, divided into categories like government, education, policy, corporate and news. He also provides other intriguing lists, such as the number of anonymous edits from .gov addresses. NASA comes in first with 6,846 edits [source: Griffith]. The Department of Defense Network Information Center, the registered owner of the army.mil domain name, comes in first among .mil addresses with 43,823 edits [source: Griffith].

So what kind of results did the WikiScanner dig up? Read on to find out.

Exposed by the Wikipedia Scanner

Someone on ExxonMobil’s network altered sections of the article about the Exxon Valdez disaster, minimizing the effects of the oil spill.

With more than 34 million anonymous edits performed by 187,529 organizations, it will take quite a while before anyone sifts through the data and finds all of the potentially controversial edits exposed by the WikiScanner. And of course, anonymous editing will go on as long as it remains a feature of Wikipedia. But many organizations have already been outed by the WikiScanner. In this section, we'll look at what Griffith and other users have found.

Not all anonymous editing is malicious. Some of it comes from users who may not want to register for the site or who want to fix a simple grammatical error without bothering to log in. Many of the edits uncovered by the WikiScanner were harmless. Even so, some of the anonymous edits seem malicious or designed to serve an organization's particular interests. Others appear to make the articles into an advertisement or press release. (Wikipedia has a tag it uses to flag articles that appear to be advertisements for a company or product.)

Diebold Election Systems received a lot of attention for the anonymous edits performed by people with access to its network. The maker of electronic voting machines had already been subject to controversy about the quality of its machines and contributions made by the company's CEO to George Bush's political campaigns. The WikiScanner revealed that in November 2005, 15 paragraphs that discussed these controversies were deleted by a user traced to a Diebold-owned IP address. Soon after, another user restored the deleted paragraphs and issued the anonymous editor a warning on his "talk page."

A user traced to a Democratic Party IP address edited entries about conservative talk show host Rush Limbaugh, calling him "idiotic," "racist" and a "bigot" [source: Wikipedia]. The editor also wrote that most of Limbaugh's audience was "legally retarded," described his point of view as "ridiculous" and linked the word "ridiculous" to Wikipedia's article on Conservatism [source: Wikipedia]. Users of Democratic Party computers also performed simple copy edits on articles about presumably uncontroversial topics, such as British tennis player Tim Henman, burrito chain Baja Fresh and green roofs.

Someone at the National Rifle Association made changes to a page about rumors surrounding the Sept. 11 attacks. The anonymous editor altered a paragraph to draw a connection between the Iraqi government under Saddam Hussein and the Sept. 11 attacks [source: Griffith]. Someone at the same organization twice edited the Wikipedia entry about the history of the liger, a cross between a male lion and a female tiger [source: Griffith].

Besides various spats between political rivals, the WikiScanner revealed instances of companies editing pages about their competitors. Someone at Apple edited Microsoft pages, and someone at Microsoft edited Apple pages. Someone at British newspaper "The Guardian" edited an entry for competing paper "The Times."

Many of the anonymous edits are quite humorous, especially when considering the source. A CIA web surfer contributed a long entry about lightsaber combat. Someone at DARPA, the Department of Defense's highly sophisticated research agency, edited entries about "The Real World: Denver," actor Shia LaBeouf and hockey player Bill Guerin. None of those edits appeared malicious or against Wikipedia policy.

Consequences of the Wikipedia Scanner

The Wikipedia Scanner offers a variety of ways to search for anonymous edits and links to an editor's list of particularly "salacious" edits.

On his personal Web site, Griffith says that he came up with the idea for the WikiScanner after hearing that Congressmen were whitewashing their own Wikipedia entries by taking out negative statements or replacing negative words with positive ones [source: Griffith]. For example, a "controversial" or "temperamental" politician might instead be called "dynamic."

Among the other reasons for creating the WikiScanner, Griffith said that he wished "to create minor public relations disasters for companies and organizations I dislike" and "to see what 'interesting organizations' (which I am neutral towards) are up to" [source: Griffith]. He also jokes that he wants "to improve virgil.gr's Google pagerank [sic] for the query 'virgil.'"

The announcement of the WikiScanner spawned numerous newspaper articles and spurred amateur detectives around the world to go hunting for nefarious editing. It appears Griffith achieved one of his goals in that many corporations now face questions over the practices of people with access to their networks.

If the WikiScanner traces an edit to a company's IP address, it doesn't mean that an employee performed the edits or that the company sanctioned or ordered the action. The program can't trace edits to a specific person -- only the IP address and the company that owns it. But Griffith noted that if the edit was performed during the workday, then the person who made the changes was likely "either an agent of that company or a guest that was allowed access to their network" [source: Griffith].

Many of the anonymous edits represent violations of Wikipedia's guidelines, especially its conflict of interest (COI) behavioral rule. Editors who have potential COIs are not prevented from editing articles that may raise a conflict of interest, but they are advised to use discretion [source: Wikipedia]. Some organizations are clearly editing articles to their benefit, likely violating Wikipedia rules in the process.

One of Wikipedia's fundamental principles is maintaining a neutral point of view (NPOV), which Wikipedia founder Jimmy Wales calls "absolute and non-negotiable" [source: Wikimedia]. Many of the controversial edits uncovered by the WikiScanner violate the NPOV principle.

Griffith said that he prefers that open platforms like Wikipedia allow users to edit anonymously, but that tools like WikiScanner could be useful in counteracting malicious editing [source: Griffith]. It could be used to help maintain the integrity of controversial pages and to track when a page is vandalized. Or, the tool may result in people being more careful about anonymous edits, performing them from home or public computer systems.

Virgil Griffith says that he has not spoken with the Wikimedia Foundation, the nonprofit organization behind Wikipedia, but that its comments in the press seemed to be positive. A spokesperson for Wikipedia told BBC News that the organization "really [values] transparency and the scanner really takes this to another level" [source: BBC News]. The spokesperson also said that the WikiScanner may in the future prevent people with conflicts of interest from editing articles [source: BBC News].

In an interview with the "Times Online," Griffith said that his next project might concern the personal information people post on social networking sites [source: Times Online].

For more information about the WikiScanner and to see lists of some of the organizations it exposed, see the sources listed below.

Sources

  • "CIA, FBI computers used for Wikipedia edits." Reuters. ZDNet News. Aug. 17, 2007. http://news.zdnet.com/2100-9588_22-6203109.html
  • "Definition of whitewash." Merriam-Webster Online Dictionary. http://mw1.merriam-webster.com/dictionary/whitewash
  • "List of Wikipedias." Wikimedia. http://meta.wikimedia.org/wiki/List_of_Wikipedias
  • "Student's program sends PR chaos in Wiki-scandal." maltaStar.com. Aug. 16, 2007. http://www.maltastar.com/pages/msFullArt.asp?an=14323
  • "Wikipedia: Conflict of interest." Wikipedia. http://en.wikipedia.org/wiki/Wikipedia:Conflict_of_interest
  • "Wikipedia: Neutral point of view." Wikipedia. http://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view
  • Blakely, Rhys. "Exposed: guess who has been polishing their Wikipedia entries?" Times Online. Aug. 15, 2007. http://business.timesonline.co.uk/tol/business/industry_sectors/media/article2264150.ece
  • Borland, John. "See Who's Editing Wikipedia - Diebold, the CIA, a Campaign." Wired. Aug. 14, 2007. http://www.wired.com/politics/onlinerights/news/2007/08/wiki_tracker
  • Elseworth, Catherine. "Wikipedia sleuth's tool reveals entry fiddling." The Telegraph. Aug. 20, 2007. http://www.telegraph.co.uk/news/main.jhtml?xml=/news/2007/08/16/wiki116.xml
  • Fildes, Jonathan. "Wikipedia 'shows CIA page edits.'" BBC News. Aug. 15, 2007. http://news.bbc.co.uk/2/hi/technology/6947532.stm
  • Griffith, Virgil. "List anonymous wikipedia edits from interesting organizations." http://wikiscanner.virgil.gr/
  • Griffith, Virgil. "WikiScanner FAQ." http://virgil.gr/31.html
  • Johnson, Bobbie. "Companies and party aides cast censorious eye over Wikipedia." Guardian Unlimited. Aug. 15, 2007. http://www.guardian.co.uk/technology/2007/aug/15/wikipedia.corporateaccountability?gusrc=rss&feed=networkfront