Just Below the Surface
As we've already noted, there are millions upon millions of sub-pages strewn throughout millions of domains. There are internal pages with no external links, such as internal.howstuffworks.com, which are used for site maintenance purposes. There are unpublished or unlisted blog posts, picture galleries, file directories, and untold amounts of content that search engines just can't see.
Here's just one example. There are many independent newspaper Web sites online, and sometimes, search engines index a few of the articles on those sites. That's particularly true for major news stories that receive a lot of media attention. A quick Google search will turn up dozens of articles on, for example, World Cup soccer teams.
But if you're looking for a more obscure story, you may have to go directly to a specific newspaper site and then browse or search its content to find what you're looking for. This is especially true as a news story ages. The older the story, the more likely it's stored only in the newspaper's archive, which isn't visible on the surface Web. Consequently, that story may not appear readily in search engines -- so it counts as part of the deep Web.
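To see why pages that nothing links to stay out of sight, here is a rough sketch of a link-following crawl over a toy site. It is a simplified illustration, not how any real search engine works; the page names and the LINKS map are invented for the example.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it links to.
LINKS = {
    "example.com/": ["example.com/news", "example.com/about"],
    "example.com/news": ["example.com/news/world-cup"],
    "example.com/news/world-cup": [],
    "example.com/about": ["example.com/"],
    # Nothing on the public pages links to these three.
    "internal.example.com/": ["internal.example.com/maintenance"],
    "internal.example.com/maintenance": [],
    "example.com/drafts/unlisted-post": [],
}

def crawl(start, links):
    """Breadth-first crawl: return every page reachable by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

surface = crawl("example.com/", LINKS)   # pages the crawler can find
hidden = sorted(set(LINKS) - surface)    # pages that exist but are never reached
print(hidden)
```

In this sketch the crawler starts at the home page and only ever learns about pages that some already-visited page links to. The internal maintenance pages and the unlisted draft exist, but no public page points to them, so they never enter the crawl.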