How Google Deep Dream Works

Computer Brains and Bikes

You can see that Deep Dream took an image of a beetle and used its data about similar creatures to reconstruct the original photo subject and background.
Deep Dream upload by HowStuffWorks staff

Neural networks don't automatically set about identifying data. They require training first: they need to be fed sets of data to use as reference points. Otherwise they'd just blindly sift through data, unable to make any sense of it.

According to Google's official blog, the training process is based on repetition and analysis. For example, if you want to train an ANN to identify a bicycle, you'd show it many millions of bicycles. Rather than spelling out in computer code what a bicycle looks like, you'd label each image and let the network work out the defining features for itself: two wheels, a seat and handlebars.

Then researchers turn the network loose to see what results it can find. There will be errors. The program might, for instance, return a series of images including motorcycles and mopeds. In those cases, the network's internal parameters are adjusted, typically by feeding it more labeled examples, so it learns that bicycles don't include engines and exhaust systems. The process repeats, again and again, gradually tuning the network until it returns satisfactory results.
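The show, check and adjust cycle described above can be sketched in a few lines of Python. This is a toy illustration only, not Google's method: a real image network learns from raw pixels, while this sketch uses two hand-picked, hypothetical features (has_pedals and has_engine) and a single artificial neuron. The names EXAMPLES, predict and train are assumptions made up for the example.

```python
# Toy single-neuron classifier (a perceptron). Hypothetical features only:
# each example is ([has_pedals, has_engine], label), where label 1 = bicycle.
EXAMPLES = [
    ([1, 0], 1),  # bicycle: pedals, no engine
    ([0, 1], 0),  # motorcycle: engine, no pedals
    ([1, 1], 0),  # moped: pedals and an engine
    ([0, 0], 0),  # kick scooter: neither
]


def predict(weights, bias, features):
    """Return 1 (bicycle) if the weighted sum is positive, else 0."""
    total = bias + sum(w * f for w, f in zip(weights, features))
    return 1 if total > 0 else 0


def train(examples, epochs=20):
    """Show the examples repeatedly, nudging the weights after each mistake."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                # Move each weight toward the correct answer.
                weights = [w + error * f for w, f in zip(weights, features)]
                bias += error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


if __name__ == "__main__":
    weights, bias = train(EXAMPLES)
    for features, label in EXAMPLES:
        print(features, "expected", label, "got", predict(weights, bias, features))
```

Nobody writes a rule saying "bicycles have no engine." The negative weight on has_engine emerges from the corrections made after each wrong guess.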

The Deep Dream team realized that once a network can identify certain objects, it could then also recreate those objects on its own. So a network that knows bicycles on sight can then reproduce an image of bicycles without further input. The idea is that the network is generating creative new imagery thanks to its ability to classify and sort images.

Interestingly, even after sifting through millions of bicycle pictures, computers still make critical mistakes when generating their own pictures of bikes. They might include partial human hands on the handlebars or feet on the pedals. This happens because so many of the training images include people, too, and the computer can't reliably discern where the bike parts end and the people parts begin.

These kinds of mistakes happen for numerous reasons, and even software engineers don't fully understand every aspect of the neural networks they build. But by knowing how neural networks work, you can begin to understand how these flaws occur.

The artificial neurons in the network are arranged in stacked layers. Deep Dream may use as few as 10 layers or as many as 30. Each layer picks up on different details of an image. The initial layers might detect basics such as the borders and edges within a picture. Another might identify specific colors and orientations. Other layers may look for specific shapes that resemble objects like a chair or light bulb. The final layers may react only to more sophisticated objects such as cars, leaves or buildings.
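To make the idea of stacked layers concrete, here is a minimal two-layer sketch in Python. It is an illustration under simplifying assumptions, not Deep Dream's actual architecture: the tiny grayscale IMAGE and the functions edge_layer and shape_layer are hypothetical, and real networks learn their filters during training instead of having them written by hand.

```python
# A 4x4 grayscale image: dark (0) on the left, bright (1) on the right.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]


def edge_layer(image):
    """Early layer: respond wherever brightness rises from one pixel to the next."""
    return [
        [max(0, row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def shape_layer(edge_map):
    """Later layer: combine edge responses in each column into a
    'tall vertical line' score between 0 and 1."""
    height = len(edge_map)
    width = len(edge_map[0])
    return [sum(row[x] for row in edge_map) / height for x in range(width)]


if __name__ == "__main__":
    edges = edge_layer(IMAGE)
    print("edge responses:", edges)
    print("vertical-line score per column:", shape_layer(edges))
```

The second layer never looks at pixels directly. It works only with the first layer's output, which is how deeper layers come to respond to progressively more complex patterns.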

Google's developers call this process inceptionism in reference to this particular neural network architecture. They even posted a public gallery to show examples of Deep Dream's work.

Once the network has pinpointed various aspects of an image, any number of things can occur. With Deep Dream, Google decided to tell the network to make new images.
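One common way to make a recognition network produce imagery is to run it in reverse: keep the network fixed and repeatedly nudge the pixels of an image in whatever direction strengthens the network's response. The Python sketch below shows that idea, known as gradient ascent on the input, in its simplest possible form. It is an assumption-laden toy: the "network" is a single hypothetical linear DETECTOR over four pixels, whereas Deep Dream applies the same principle to the layers of a deep network.

```python
# Hypothetical detector weights: it "likes" bright middle pixels and dark outer ones.
DETECTOR = [-1.0, 1.0, 1.0, -1.0]


def score(image):
    """How strongly the detector responds to the image."""
    return sum(w * p for w, p in zip(DETECTOR, image))


def dream(image, steps=10, step_size=0.05):
    """Nudge the pixels step by step so the detector responds more strongly."""
    image = list(image)
    for _ in range(steps):
        # For a linear detector, the gradient of the score with respect to
        # each pixel is simply that pixel's weight.
        gradient = DETECTOR
        image = [
            min(1.0, max(0.0, p + step_size * g))  # keep pixels within [0, 1]
            for p, g in zip(image, gradient)
        ]
    return image


if __name__ == "__main__":
    start = [0.5, 0.5, 0.5, 0.5]  # flat gray image
    result = dream(start)
    print("before:", start, "score:", score(start))
    print("after: ", result, "score:", score(result))
```

Starting from a flat gray image, the loop gradually paints in the pattern the detector was built to respond to. The same principle is what lets a network amplify whatever it "sees" in a photo.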