How the Google Cloud Works


It might not look like it, but buildings just like this one are where Google stores its clouds.
Craig Mitchelldyer/Getty Images

When Larry Page and Sergey Brin began work on a project called BackRub, they probably didn't envision the enormous corporation that would grow out of their early efforts. This project evolved into Google, a juggernaut of a company that competes on a global scale with other mega corporations. While the company has products ranging from Web-based e-mail to collaborative office applications, its corporate mission has remained the same. Google intends "to organize the world's information and make it universally accessible and useful" [source: Google].

Google's search engine is the oldest and likely most famous tool in the company's arsenal. But the engineers at Google view organizing the world's information as a job too complex for just a search engine. The company designs tools and services that relate to its mission, sometimes in ways that aren't obvious at first glance. One major focus for the company is cloud computing.

Google isn't alone in offering cloud computing services. Companies like Apple and Microsoft offer products that either directly involve cloud computing services or rely on them in some way. Amazon, the online retail giant, has a thriving cloud storage business. That doesn't mean Amazon has a warehouse filled with fluffy, white clouds. It means the company rents out storage space within its massive data centers. If you are running a company that wants to offer a Web site or service to customers, you may consider using a company like Amazon to host your data.

Google also has a reputation for building enormous data centers. In The Dalles, Ore., Google has a data center built next to the Columbia River. Each building in this data center is about the size of a football field. The location has many features that make it attractive for a data center: It's near hydroelectric dams, which makes power accessible and economical. It's also in an area with a fiber-optic network, which allows for lightning-fast data transmissions [source: Markoff and Hansell].

These huge facilities are necessary for Google to carry out its corporate mission. Not only must the company search and index the world's information on the Web but it also has to provide the power for a growing network of cloud computing services. Now, let's take a look at exactly what cloud computing means.

What is Cloud Computing?

Cloud computing is a popular buzz term in technology circles. The phrase has a vague sound to it. What exactly is cloud computing?

At its most basic level, cloud computing is a model for remote computer access. The idea is simple: You use your computer and an Internet connection to make contact with a remote server. This server, which is really just a computer, runs applications using its hardware. You're able to influence the application by executing commands through a Web browser or other user interface. But the remote server is doing all the heavy lifting.

Why would you want to use a cloud computing system? One reason is that it lets you access applications your own computer might not be able to execute. Your computer only has to run a Web browser or simple user interface. In most cloud computing applications, this client-side program places minimal demands on your machine's resources. That means you can take advantage of a variety of programs and services without having to continually invest in the fastest computers. Since the cloud computing service is handling all the processor work, you just need a machine capable of connecting to the Internet.

Another major selling point for cloud computing services is that they allow you to access your data on a variety of devices no matter where you are. If you rely on your own computer to execute programs, you're limited to that machine unless you make special arrangements. You may have to e-mail a file to yourself so that you can access it on another device. You may have to set up a home network to allow file transfers between machines. And there's the risk that you'll duplicate the file in the process, which can be confusing further down the road -- which file is the real one? Cloud computing services store your information on remote servers. You can log into the cloud computing service using your account login and password. You don't have to use the same computer or device each time.

Google is in a particularly good position when it comes to cloud computing. It's a large, stable company, which means customers can be reasonably confident that their services and data won't disappear overnight. Its leadership team includes engineers who know how to create solutions for computer centers. And the company has demonstrated that its philosophy of using inexpensive equipment rather than cutting-edge machines works.

Next, we'll take a closer look at how Google creates a cloud.

The Anatomy of a Cloud

Google's approach to cloud computing may seem perplexing at first. You might think a huge corporation worth billions of dollars would have data centers packed with state-of-the-art, high-tech servers and machines that go ping. Wouldn't Google executives want the best equipment?

But Google's approach is more pragmatic. The company purchases mid-range servers for its data centers. The company has a good reason for this approach. Should something break, it's relatively easy and inexpensive to get a replacement. Repair and maintenance can be huge costs for a data center -- each building may house thousands of machines. To ensure services remain online, Google dedicates several servers to provide the same function. That way, should one server malfunction, another can take its place with a minimal interruption in services. It builds redundancy into the system.
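
To make the idea of redundancy concrete, here's a minimal Python sketch. The `Server` class and `serve_with_failover` function are hypothetical names invented for illustration, not Google's actual software. The sketch assumes a broken server simply refuses the request, at which point the next machine in line takes over.

```python
class Server:
    """A hypothetical stand-in for one inexpensive, mid-range server."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def serve_with_failover(replicas, request):
    """Try each redundant server in turn until one of them answers."""
    for server in replicas:
        try:
            return server.handle(request)
        except ConnectionError:
            continue  # this machine failed, so move on to the next one
    raise RuntimeError("all replicas are down")


# Three servers provide the same function; the first one has malfunctioned.
replicas = [Server("server-a", healthy=False), Server("server-b"), Server("server-c")]
result = serve_with_failover(replicas, "search query")
print(result)
```

The user never learns that server-a failed. The request just takes slightly longer because a second machine had to pick it up.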

Google's philosophy is to keep the back-end system as simple as possible. As systems become more complex, the opportunity for problems to arise increases. Simplifying a system reduces the chance for problems even if the system itself is enormous. The Google cloud's foundation is the Google File System. This is a distributed file system that handles information requests through basic file commands like open, read and write.

The entire file system consists of networks called clusters. The Google File System relies on master servers to coordinate data requests -- each cluster has a single master server. When you interact with information stored on the cloud, your actions translate into data requests. A request may be something simple, like viewing a file, or may involve more complex actions, such as formatting or writing new data. Your computer acts as a client -- a machine that sends data requests to other machines. Ultimately, a master server takes the request and sends a message to the Google machine that houses the data -- Google calls these machines chunkservers. The chunkserver sends the data directly to the client -- the information never passes through the master server.
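
The read path can be sketched in a few lines of Python. Every class and function name here is hypothetical, and the sketch simplifies the hand-off: the client asks the master where the data lives and then contacts the chunkserver itself. The point it illustrates is the one above: the master deals only in locations, and the file data travels straight from the chunkserver to the client.

```python
class ChunkServer:
    """Hypothetical machine that stores the actual file data."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk id -> bytes

    def read(self, chunk_id):
        return self.chunks[chunk_id]


class MasterServer:
    """Hypothetical coordinator. It keeps track of locations, not file contents."""

    def __init__(self):
        self.locations = {}  # file name -> (chunkserver, chunk id)

    def locate(self, filename):
        return self.locations[filename]


def client_read(master, filename):
    # Step 1: the data request goes to the cluster's single master server.
    chunkserver, chunk_id = master.locate(filename)
    # Step 2: the data comes directly from the chunkserver, bypassing the master.
    return chunkserver.read(chunk_id)


# Build a tiny one-master cluster with a single stored file.
chunkserver = ChunkServer("chunkserver-1")
chunkserver.chunks["chunk-001"] = b"quarterly numbers"
master = MasterServer()
master.locations["report.txt"] = (chunkserver, "chunk-001")

print(client_read(master, "report.txt"))
```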

Because Google stores several copies of each piece of information for the sake of redundancy, making changes to data in the cloud is a little complicated. First, your write request goes to a master server. The master server chooses one chunkserver storing the appropriate data to respond to your request -- this becomes the primary replica chunkserver. The master server tells the client the location of all replica chunkservers storing your file. When you make changes, those changes go to the first replica chunkserver to which your computer can connect. The write request moves through the system to all the replica chunkservers, including the primary replica. The primary replica makes the actual change to the data and then sends a message to all other replica chunkservers to do the same. Once the primary replica receives confirmation that all copies of the data have changed, it sends a notification to the client.
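
Here's a rough Python sketch of that write sequence. The names are invented for illustration, and two details are assumptions: the master always picks the first replica as the primary, and the data reaches the replicas through a simple loop rather than a real network.

```python
class ReplicaChunkserver:
    """Hypothetical chunkserver holding one copy of a piece of data."""

    def __init__(self, name):
        self.name = name
        self.pending = None  # data received but not yet written
        self.data = b""      # data actually written to this copy

    def receive(self, payload):
        self.pending = payload

    def apply(self):
        self.data += self.pending
        self.pending = None
        return True  # confirmation that this copy has changed


class MasterServer:
    def __init__(self, replicas):
        self.replicas = replicas

    def prepare_write(self):
        # Assumption for this sketch: the first replica becomes the primary.
        primary = self.replicas[0]
        return primary, list(self.replicas)


def client_write(master, payload):
    # The master names a primary and tells the client where every replica is.
    primary, replicas = master.prepare_write()
    # The change is pushed through the system until every replica holds it.
    for replica in replicas:
        replica.receive(payload)
    # The primary makes the change first, then tells the others to do the same.
    confirmations = [primary.apply()]
    for replica in replicas:
        if replica is not primary:
            confirmations.append(replica.apply())
    # The client is notified only once every copy has confirmed the change.
    return all(confirmations)


replicas = [ReplicaChunkserver(f"chunkserver-{i}") for i in range(3)]
master = MasterServer(replicas)
acknowledged = client_write(master, b"new data")
print(acknowledged, [replica.data for replica in replicas])
```

Separating the "receive" step from the "apply" step is what lets the primary decide when the change officially happens, so all copies stay identical.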

Now that we have the technical details out of the way, let's take a look at some of the things you can do with the Google cloud.

Google Cloud Connect

One of the challenges of working with electronic documents is finding a simple way to collaborate with other people. Using the old method of opening up an application on your computer, creating a file, saving it and then sending it to someone else invites problems. First among those is that this approach generates two copies of the document. If you make changes to your copy while other people make changes to their copies of that same file, how do you incorporate all the changes? Which version of the file is the correct one? What happens if someone opens an older copy of the file and makes changes, not knowing that a more current version of the document already exists? File management becomes challenging.

Google Cloud Connect approaches this problem by leveraging the cloud and the application programming interface (API) for Microsoft Office. After installing a plug-in for the Microsoft Office suite of programs, you can save files to the cloud. This means the cloud copy of the file becomes the master document that everyone uses. Google Cloud Connect assigns each file a unique URL. You can share this URL with others to let them view the document. If you designate someone as an editor, that person can then download the document and open it in Microsoft Office.

If you make changes to the document, those changes will show up for everyone else viewing it. Should other editors make changes, you'll see them reflected in your copy. When multiple people make changes to the same section of a document, Cloud Connect gives you the chance to choose which set of changes to keep.

So how does it work? When you upload a document to Google Cloud Connect, the service inserts some metadata into the file. Metadata is information about other information. In this case, the metadata identifies the file so that changes will track across all copies. The back end is similar to the Google File System and relies on the Google Docs infrastructure. As the documents sync to the master file, Google Cloud Connect sends the updated data out to all downloaded copies of the document using the metadata to guide updates to the right files.
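
The role of the metadata can be illustrated with a small Python sketch. It is only a guess at the general idea: the sources for this article don't describe the metadata format, so the `doc_id` field and the `upload`, `download` and `sync` functions are all hypothetical.

```python
import uuid

master_copies = {}  # document id -> latest text held in the cloud
local_copies = []   # copies that editors have downloaded


def upload(text):
    """Store a document in the cloud and stamp it with identifying metadata."""
    doc = {"meta": {"doc_id": str(uuid.uuid4())}, "text": text}
    master_copies[doc["meta"]["doc_id"]] = text
    return doc


def download(doc_id):
    """Hand an editor a local copy that carries the same metadata."""
    copy = {"meta": {"doc_id": doc_id}, "text": master_copies[doc_id]}
    local_copies.append(copy)
    return copy


def sync(edited):
    """Update the master file, then every downloaded copy with a matching id."""
    doc_id = edited["meta"]["doc_id"]
    master_copies[doc_id] = edited["text"]
    for copy in local_copies:
        if copy["meta"]["doc_id"] == doc_id:
            copy["text"] = edited["text"]


original = upload("first draft")
doc_id = original["meta"]["doc_id"]
alice = download(doc_id)
bob = download(doc_id)

alice["text"] = "second draft"
sync(alice)
print(bob["text"])
```

Because the identifier travels inside each file, the service can match an edited copy to its master document no matter what the file is named on someone's hard drive.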

Microsoft offers its own online collaboration tool called SharePoint. But unlike Google Cloud Connect, SharePoint isn't free. Businesses interested in SharePoint must purchase a license to use it on their computers. But since SharePoint is a Microsoft product for Microsoft Office applications, there's a tight integration of features that Google can't match.

Next, we'll look at Google's Cloud Print service.

Google Cloud Print

On Aug. 10, 2011, IBM executive Mark Dean caused a bit of a stir when he referred to the world being in a "post-PC era." Dean wasn't saying that the PC was dead or obsolete. But he was pointing out how people are using mobile devices more often when they perform basic activities traditionally done on computers. Smartphones and tablets are pushing desktop and laptop computers to a support role. One task that has traditionally depended on a computer is printing.

Traditionally, to send a print job to a printer you'd either have to connect the printer directly to your computer, or connect both the printer and your computer to a network. Google Cloud Print is a service that extends the printer's function to any device that can connect to the Internet. You can be on the other side of the world and send a print job to the machine sitting on your desk at home.

To use Google Cloud Print, you need the following:

  • a free Google profile
  • an app, program or Web site that incorporates the Google Cloud Print feature
  • at least one cloud-ready printer or printer connected to a computer logged onto the Internet

When you use Google Cloud Print through an app or Web site, the print request goes through the Google servers. Google routes the request to the appropriate printer associated with your Google account. If you register more than one printer -- there's no limit to the number of printers you can connect to your account -- you'll have to designate the machine you want the print job to go to. Assuming the respective printer is on and has an active Internet connection, paper and ink, the print job should execute on the machine even if you're in another part of the world. You can also share your printer with other people, allowing them to send print jobs to it through Google Cloud Print.
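
The routing logic described above might look something like the following Python sketch. This is not Google's API. `CloudPrintSketch`, `Printer` and the account name are hypothetical, and the sketch only models the decisions involved: look up the printers tied to an account, insist on a choice when there are several, and check that the chosen printer is online.

```python
class Printer:
    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.jobs = []  # documents this printer has received


class CloudPrintSketch:
    """Hypothetical model of a service that routes print jobs by account."""

    def __init__(self):
        self.printers_by_account = {}  # account -> {printer name: Printer}

    def register(self, account, printer):
        self.printers_by_account.setdefault(account, {})[printer.name] = printer

    def submit(self, account, document, printer_name=None):
        printers = self.printers_by_account.get(account, {})
        if not printers:
            raise LookupError("no printers registered for this account")
        if printer_name is None:
            # With more than one printer, the user has to designate one.
            if len(printers) > 1:
                raise ValueError("several printers registered; choose one")
            printer_name = next(iter(printers))
        printer = printers[printer_name]
        if not printer.online:
            raise ConnectionError(f"{printer.name} is offline")
        printer.jobs.append(document)
        return printer.name


service = CloudPrintSketch()
service.register("you@example.com", Printer("home-office"))
service.register("you@example.com", Printer("garage", online=False))
used = service.submit("you@example.com", "boarding-pass.pdf", printer_name="home-office")
print(used)
```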

Because most printers aren't cloud-ready, most Google Cloud Print users will need to have a computer act as a liaison. Google Cloud Print is an extension built into the Google Chrome browser. Google turns the setting off by default -- you have to choose to enable it. Once enabled, the service activates a small piece of code called a connector. The connector's job is to interface between the printer and the outside world. The connector uses your computer's printer software to send commands to the printer. As of this writing, Google has connectors built for PCs and Macs and is working on one for Linux machines.

If you have a cloud-ready printer, you can connect the printer to the Internet directly without the need for a dedicated computer. You have to register the cloud printer with Google Cloud Print to take advantage of its capabilities. The big advantage of the cloud printer is that you don't have to keep a computer powered on, online and connected to your Google account in order to receive print jobs. You connect a cloud printer to the Google Cloud Print service by registering the printer's unique email address with Google.

Because Google allows app and Web site developers to incorporate Google Cloud Print into their products as they see fit, there's no standard approach to executing a print job. You might see one user interface on one site and a completely different approach on another. Also, Google Cloud Print depends upon developers incorporating the feature into their products. Not every app or site will have Google Cloud Print built into it, which limits its functionality. Naturally, Google builds the service into its own products, but many people rely on services from multiple sources and may find that Google Cloud Print isn't adopted widely enough to meet all their needs.

Google Music Cloud

Mobile access to music isn't a new trend. We've had car radios and portable radios for decades. Then came inventions like the portable cassette player, portable CD player and MP3 players. With each generation of product, we expanded our options to take our music with us on the go. But each of these gadgets gave us limited access and it wasn't always easy to share music across multiple devices. Google's Music service aims to change that.

At its most basic level, Google Music is a cloud storage service coupled with a simple music player interface. You can upload songs to your Google Music account and access them with a computer or Internet-capable device using the Google Music app. Google allows you to upload up to 20,000 songs for free. Google limits the file size for an individual song to 250 megabytes, which might require you to use a lower bit rate when converting tracks to digital files.
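
A quick calculation shows when the 250-megabyte cap starts to matter. The Python sketch below assumes a constant bit rate and treats a megabyte as 1,000,000 bytes. The function name and the three sample bit rates are illustrative choices, with 1,411 kbps standing in for roughly uncompressed CD audio.

```python
LIMIT_MEGABYTES = 250  # per-song cap stated above


def max_track_seconds(bit_rate_kbps, limit_megabytes=LIMIT_MEGABYTES):
    """Longest track, in seconds, that fits under the cap at a given bit rate."""
    limit_bits = limit_megabytes * 1_000_000 * 8
    return limit_bits / (bit_rate_kbps * 1000)


for bit_rate in (128, 320, 1411):
    minutes = max_track_seconds(bit_rate) / 60
    print(f"{bit_rate} kbps: about {minutes:.0f} minutes fit under the cap")
```

At 320 kbps the cap allows a track of roughly 104 minutes, so ordinary songs are nowhere near the limit. It's mainly very long recordings at high bit rates that would need to be converted at a lower rate.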

Google Music supports MP3 and AAC files across all platforms. The Windows version of Google Music also supports WMA files, and the Linux version supports Ogg files. And while you can upload FLAC files to Google Music, Google will transcode those files into MP3 format at 320 kbps. Because MP3 is a lossy format, this compression might have an impact on the sound quality.

While you can log into your Google Music account from multiple computers and devices, only one device can actually play music at any given time. Two people can't listen on different devices through the same account at the same time. This is how Google discourages people from sharing a single account as a way to pirate music.
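
The one-device rule can be sketched in Python as follows. The `MusicAccount` class is hypothetical, and the sketch makes an assumption the sources don't confirm: starting playback on a second device stops the first, rather than the second request being refused.

```python
class MusicAccount:
    """Hypothetical account that allows only one streaming device at a time."""

    def __init__(self, owner):
        self.owner = owner
        self.playing_on = None  # the one device allowed to stream right now

    def start_playback(self, device):
        """Start streaming on `device` and return the device that was stopped, if any."""
        stopped = self.playing_on if self.playing_on != device else None
        self.playing_on = device
        return stopped


account = MusicAccount("listener")
account.start_playback("laptop")
stopped = account.start_playback("phone")
print(stopped, account.playing_on)
```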

Even with Google's protection in place, the music industry isn't thrilled with Google Music. Google sought out deals with the record industry before launching Google Music but didn't make much progress. Eventually, the company decided to move forward with a beta test of Google Music without licenses. From Google's perspective, Google Music is like any other storage device. If you purchase a song, you're allowed to transfer that song to an MP3 player or smartphone. You could also store that song on a hard drive connected to your computer. You could even transfer it to a video game console. Google Music is like any other data storage device -- it's just that this storage device might be hundreds of miles away from the person who bought the song.

Google is still trying to make deals with record labels. Right now, the only way to get your music onto Google's service is to upload it yourself. If you have a slow connection and a large music library, this could take hours. With the proper licensing agreements, Google could incorporate a sales platform that would allow you to buy music and automatically store it to your Google Music account.

Google's cloud services are likely just the beginning of a full suite of products that will shift computing away from the consumer and onto servers. As broadband penetration spreads across the globe and the focus shifts to inexpensive computers and mobile devices, cloud services will become more compelling. Using cloud services requires a level of trust in the provider. Google will have to prove that it is reliable and ethical with its cloud services or risk alienating users. Are you willing and ready to have a company like Google handle your data and provide your computer services?

Sources

  • Bright, Peter. "Hands-on: Google Cloud Connect for Office not ready for prime time." Ars Technica. February 2011. (Aug. 23, 2011) http://arstechnica.com/microsoft/news/2011/03/hands-on-google-cloud-connect-for-office-not-ready-for-prime-time.ars
  • Bruno, Antony. "Why Record Labels and Google Music Couldn't Agree on the Cloud." The Hollywood Reporter. May 12, 2011. (Aug. 24, 2011) http://www.hollywoodreporter.com/news/why-record-labels-google-music-187889
  • Dean, Mark. "IBM Leads the Way in the Post-PC Era." A Smarter Planet Blog. Aug. 10, 2011. (Aug. 23, 2011) http://asmarterplanet.com/blog/2011/08/ibm-leads-the-way-in-the-post-pc-era.html
  • Ghemawat, Sanjay, Gobioff, Howard and Leung, Shun-Tak. "The Google File System." Google. 2003. (Aug. 23, 2011) http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/gfs-sosp2003.pdf
  • Google. "Company." (Aug. 23, 2011) http://www.google.com/about/corporate/company/
  • Google. "Google Cloud Connect." (Aug. 23, 2011) http://tools.google.com/dlpage/cloudconnect
  • Google. "Google Cloud Connect for Microsoft Office available to all." Google Docs Blog. Feb. 24, 2011. (Aug. 23, 2011) http://googledocs.blogspot.com/2011/02/google-cloud-connect-for-microsoft.html
  • Google. "Google Cloud Print." (Aug. 23, 2011) http://www.google.com/chrome/intl/en/p/cloudprint.html
  • Google. "Google Data Centers." (Aug. 24, 2011) http://www.google.com/corporate/datacenter/
  • Google. "Google Music Beta." (Aug. 24, 2011) http://music.google.com/music/listen#start_pl
  • Google. "Music Help." (Aug. 24, 2011) http://www.google.com/support/music/
  • Harris, Robin. "Google File System Evaluation." StorageMojo. June 13, 2006. (Aug. 24, 2011) http://storagemojo.com/?page_id=152
  • Markoff, John and Hansell, Saul. "Hiding in Plain Sight, Google Seeks More Power." The New York Times. June 14, 2006. (Aug. 23, 2011) http://www.nytimes.com/2006/06/14/technology/14search.html
  • Metz, Cade. "Google Cloud Connect: The limits of a Microsoft makeover." The Register. February 2011. (Aug. 24, 2011) http://www.theregister.co.uk/2011/02/26/google_cloud_connect_caveat/
  • Pham, Alex and Guynn, Jessica. "Google Music launches without label deals." Los Angeles Times. May 10, 2011. (Aug. 24, 2011) http://latimesblogs.latimes.com/entertainmentnewsbuzz/2011/05/google-music-launches-without-label-agreements.html