What is ‘big data’?

How Big Data is Analyzed and Used

Server farms like this one in San Jose, Calif. are processing massive amounts of data in an effort to identify patterns and associations.
© Bob Sacha/Corbis

Big data has to be collected, massaged, linked together and interpreted for it to be of any use to anyone. Companies and other entities need to filter the vast amount of available data to get to what's most relevant to them. Fortunately, hardware and software that can process, store and analyze huge amounts of information are becoming cheaper and faster, so the work no longer requires massive and prohibitively expensive supercomputers. Some of the software is also becoming more user-friendly, so it doesn't necessarily take a team of programmers and data scientists to wrangle the data (although it never hurts to have knowledgeable people who can understand your requirements).

Companies take advantage of cloud computing services so that they don't even have to buy their own computers to do all that data crunching. Data centers, also called server farms, can distribute batches of data for processing over multiple servers, and the number of servers can be scaled up or down quickly as needed. This scalable distributed computing is accomplished with tools and techniques such as Apache Hadoop (an open-source framework), MapReduce (a programming model that Hadoop implements) and massively parallel processing (MPP). NoSQL databases have been developed as more easily scalable alternatives to traditional SQL-based database systems.
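To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch in Python. It is not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sales records are made up for illustration. In a real cluster, each batch would be mapped on a different server and the grouped results reduced in parallel, but the three steps are the same.

```python
from collections import defaultdict
from itertools import chain

def map_phase(batch):
    """Map step: emit a (product, 1) pair for every sale in one batch."""
    return [(record["product"], 1) for record in batch]

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single total."""
    return key, sum(values)

def map_reduce(batches):
    # Each batch stands in for the share of data one server would handle.
    mapped = chain.from_iterable(map_phase(batch) for batch in batches)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical sales records, already split into two batches.
batches = [
    [{"product": "coffee"}, {"product": "tea"}],
    [{"product": "coffee"}, {"product": "milk"}, {"product": "coffee"}],
]

totals = map_reduce(batches)
print(totals)  # {'coffee': 3, 'tea': 1, 'milk': 1}
```

Because the map step looks at each batch independently, adding more servers simply means splitting the data into more batches, which is what makes the approach scale.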

Much of this big data processing and analysis is aimed at finding patterns and correlations that provide insights that can be exploited or used to make decisions. Businesses can now mine massive amounts of data for information about consumer habits, their products' popularity or more efficient ways to do business. Companies can use big data analytics to target relevant ads, products and services at the customers they believe are most likely to buy them, or to create ads that are more likely to appeal to the public at large. Some are now even starting to send real-time ads and coupons to people's smartphones for businesses near where they have recently used their credit cards.
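One simple kind of pattern a retailer might look for is which products tend to be bought together. The sketch below counts how often each pair of items shows up in the same shopping basket. The baskets and the `pair_counts` function are invented for illustration, and real systems work on vastly more data with more sophisticated statistics, but the underlying idea is similar.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each pair of distinct items."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so ("bread", "jam") and ("jam", "bread")
        # are always counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical purchase data: one list of items per customer visit.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["coffee", "milk"],
    ["bread", "jam"],
]

counts = pair_counts(baskets)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs that turn up together far more often than chance would suggest are the sort of correlation that can drive a "customers also bought" suggestion or a targeted coupon.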

It's not just for making us buy stuff, however. Businesses can use the information to improve efficiency and practices, such as finding the most cost-effective delivery routes or stocking merchandise more appropriately. Government agencies can analyze traffic patterns, crime, utility usage and other statistics to improve policy decisions and public service. Intelligence agencies can use it to, well, spy, and hopefully foil criminal and terrorist plots. News outfits can use it to find trends and develop stories, and, of course, write more articles about big data.

In essence, big data allows entities to use nearly real-time data to inform decisions, rather than relying mostly on old information as in the past. But this ability to see what's going on with us in the present, and even sometimes to predict our future behavior, can be a bit creepy.