There are two fundamental reasons, and it goes right back to Google's mission. The first part of Google's mission is to organize the world's information. It turns out that a lot of the world's information is spoken, and we need to make that discoverable, searchable, organizable -- even if it's the audio track of a YouTube video, or a voicemail, or whatever. The other part of Google's mission is to make all that information universally accessible and useful, and one really key part of that is how you interact with the Internet when you're mobile. When you're mobile, you have a small keypad, and you may be walking down a street, riding your bike, or driving a car, so it's often just more convenient to talk than to type.
So we want to make speech a ubiquitously available input/output mode, so that whenever the end user feels "that's the mode by which I want to interact," it's available -- and available with such high performance that when they prefer speech, they just naturally use it.