In my most recent project, I put together a voice-controlled iRobot Create using the Android ADK and my Nexus S. The Android speech recognition API takes care of listening for speech, deciding when the speech input has ended, and sending the resulting recording off to “the cloud” for processing. What you get back is a list of possible matches (this isn’t an exact science, after all).
There are two ways to incorporate speech recognition into an application. In the first approach, your application launches the recognition activity by passing a RecognizerIntent.ACTION_RECOGNIZE_SPEECH intent to startActivityForResult, and collects the results in an onActivityResult method defined in your activity. As you can see in the Voice Recognition API demo, it is very simple to write an application using this interface! The problem I had with this approach is that it gave me too little control over speech recognition error handling. I also wanted speech recognition to be running all of the time. So, in the end I went with the second approach: using the SpeechRecognizer class directly in my code and receiving results through a RecognitionListener. This didn’t make the code all that much more complicated. As an added bonus, your application is not paused and resumed every time it needs results from the speech recognition activity.
With the mechanics out of the way, the next thing I did was create a list of voice commands. Each list of speech recognition matches is compared against the command list. If one of the matches is a command, I add every match in that list to a hash table as a key, storing the actual command as the value. Thus, any time one of those close matches comes up again, it is found in the hash table, and its entry is the (hopefully) intended command.
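The matching scheme described above can be sketched in plain Java, with no Android dependencies. This is only an illustration under my own assumptions, not the code from the project: the class name CommandMatcher, the resolve method, and the lower-casing in normalize are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not the class used in the project.
class CommandMatcher {
    private final List<String> commands;
    // Maps every phrase seen alongside a recognized command to that command.
    private final Map<String, String> learned = new HashMap<>();

    CommandMatcher(String... commands) {
        this.commands = Arrays.asList(commands);
    }

    /** Returns the command for this list of matches, or null if none is known. */
    String resolve(List<String> matches) {
        // First pass: is any candidate exactly one of the commands?
        for (String candidate : matches) {
            String phrase = normalize(candidate);
            if (commands.contains(phrase)) {
                // Remember every candidate in this list as an alias for the command.
                for (String other : matches) {
                    learned.put(normalize(other), phrase);
                }
                return phrase;
            }
        }
        // Second pass: fall back to phrases remembered from earlier lists.
        for (String candidate : matches) {
            String command = learned.get(normalize(candidate));
            if (command != null) {
                return command;
            }
        }
        return null;
    }

    // Assumption: commands are given in lower case, so matching ignores case.
    private static String normalize(String phrase) {
        return phrase.trim().toLowerCase(Locale.ROOT);
    }
}
```

With commands "forward" and "stop", resolving the list ("four word", "forward") returns "forward" and records "four word" as an alias, so a later list containing only "four word" also resolves to "forward". A list with no known phrase returns null.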
Now we have the name of a voice command. We could write another if/else statement to perform the appropriate function call for each of the commands, or we could do something a little fancier. Using reflection, I turned the command name into a method call. So, to implement the command “forward,” you simply have to add a method called forward to the class!
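Here is a minimal sketch of the reflection idea, again in plain Java so it can be tried outside Android. The class RobotCommands, its dispatch method, and the log field are hypothetical names of my own, and the method bodies only record the call where the real code would talk to the robot.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; the real robot calls are stubbed out.
class RobotCommands {
    // Records which commands ran, standing in for actual robot I/O.
    final List<String> log = new ArrayList<>();

    // Adding a voice command means adding a no-argument method with its name.
    public void forward() { log.add("forward"); }
    public void stop() { log.add("stop"); }

    /** Invokes the no-argument method named after the command, if one exists. */
    boolean dispatch(String command) {
        try {
            // getDeclaredMethod only sees methods declared in this class, so
            // inherited Object methods such as notify() cannot be triggered.
            Method method = RobotCommands.class.getDeclaredMethod(command);
            method.invoke(this);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // no method with that name: not a command
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Command failed: " + command, e);
        }
    }
}
```

Calling dispatch("forward") runs forward() and returns true, while an unknown name such as "backflip" returns false. Since any no-argument method declared in the class becomes callable by voice, it is probably wise to pass only names taken from the command list.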
Now, it isn’t quite that slick. I still keep an if/else statement in order to get a match on the speech recognition results, and to store close matches in the hash table. I’ll have to experiment with removing that code to see how it fares.
To browse through the code yourself, check out http://code.google.com/p/adk-moto/source/browse/src/com/jmoyer/adk_moto/ADKMoto.java. It’s GPLv3 licensed, so cut-n-paste away into your open source projects!