Friday, January 14, 2011

River Goes to the Movies

I've done a ton of work on River over the past few days. I spent long hours learning about regex and string handling in C++, and even longer hours applying that knowledge, all just to get basic functionality working. The good news is that I can now program similar tasks very quickly, since I finally understand it all.


IMDB Integration

River can now use IMDB to get information about movies and TV shows.

Sample Commands:
River, what year did the movie AI come out?
River, when was the movie I, Robot released?
What is the movie The Matrix about?
(In context) When did it come out?
(In context) What is the full plot summary?
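As a rough illustration of how commands like these could be matched with regex in C++, here is a minimal sketch. It is not River's actual code: it assumes C++17 and the standard <regex> library, and the names MovieQuestion, MovieQuery, and parseMovieCommand, as well as the patterns themselves, are made up for this example.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical names: the kinds of movie question seen in the sample commands.
enum class MovieQuestion { ReleaseDate, Synopsis, FullPlot };

struct MovieQuery {
    MovieQuestion question;
    std::string title;  // empty means "use the movie from context"
};

// Tries each illustrative pattern in turn; returns nothing if none matches.
std::optional<MovieQuery> parseMovieCommand(const std::string& text) {
    const auto flags = std::regex::ECMAScript | std::regex::icase;
    static const std::regex release(
        R"re((?:River,\s*)?(?:what year did|when was) the movie (.+?) (?:come out|released)\?)re",
        flags);
    static const std::regex synopsis(
        R"re((?:River,\s*)?what is the movie (.+?) about\?)re", flags);
    static const std::regex contextRelease(
        R"re((?:River,\s*)?when did it come out\?)re", flags);
    static const std::regex contextPlot(
        R"re((?:River,\s*)?what is the full plot summary\?)re", flags);

    std::smatch m;
    // regex_match requires the whole command to match, not just a part of it.
    if (std::regex_match(text, m, release)) {
        return MovieQuery{MovieQuestion::ReleaseDate, m[1].str()};
    }
    if (std::regex_match(text, m, synopsis)) {
        return MovieQuery{MovieQuestion::Synopsis, m[1].str()};
    }
    if (std::regex_match(text, contextRelease)) {
        return MovieQuery{MovieQuestion::ReleaseDate, ""};
    }
    if (std::regex_match(text, contextPlot)) {
        return MovieQuery{MovieQuestion::FullPlot, ""};
    }
    return std::nullopt;
}
```

With these patterns, "River, when was the movie I, Robot released?" yields the title "I, Robot", while the two in-context commands come back with an empty title, leaving it to the caller to fill in the last movie discussed.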


One of the features that makes her seem the most "real" is the ability to pull up information in context. She remembers which movie I last asked about, so if I ask for more information, she can immediately tell me what I want to know without my having to repeat the name of the movie. It's still a little slower than I'd like, but I believe I can eventually speed that up by parsing the page using only exact string searches rather than regex. For now, however, regex takes significantly less code and is much faster to program. Since everything is very modular, I can easily go back and update the class in charge of parsing the information to make it perform more efficiently.
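To make the two ideas above concrete, here is a minimal sketch under stated assumptions, not River's real implementation. MovieContext, extractYearRegex, and extractBetween are hypothetical names, and the `<span class="year">` markup is invented for the example rather than taken from an actual IMDB page. It assumes C++17.

```cpp
#include <optional>
#include <regex>
#include <string>

// Remembers the last movie asked about so follow-up questions can omit it.
class MovieContext {
public:
    // An explicit title replaces the remembered one; an empty title
    // falls back to whatever was remembered (if anything).
    std::optional<std::string> resolve(const std::string& title) {
        if (!title.empty()) {
            lastTitle_ = title;
        }
        return lastTitle_;
    }

private:
    std::optional<std::string> lastTitle_;
};

// Regex version: short to write, but runs the regex engine over the page.
std::optional<std::string> extractYearRegex(const std::string& html) {
    static const std::regex pattern(R"re(<span class="year">(\d{4})</span>)re");
    std::smatch m;
    if (std::regex_search(html, m, pattern)) {
        return m[1].str();
    }
    return std::nullopt;
}

// Exact string search version: more code per field, but only plain
// substring searches. Returns the text between the first `open` marker
// and the next `close` marker after it.
std::optional<std::string> extractBetween(const std::string& html,
                                          const std::string& open,
                                          const std::string& close) {
    std::string::size_type start = html.find(open);
    if (start == std::string::npos) {
        return std::nullopt;
    }
    start += open.size();
    const std::string::size_type end = html.find(close, start);
    if (end == std::string::npos) {
        return std::nullopt;
    }
    return html.substr(start, end - start);
}
```

Because both extractors take the page text and return an optional string, a parsing class could swap one for the other without its callers noticing, which is the kind of later optimization a modular design allows. Whether the string-search version is actually faster would need to be measured.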


New Discoveries

As I was playing around with the IMDB features, I realized just how important it is that I implement Unicode support fairly soon. The web is full of Unicode characters, and until I make all the necessary changes to River, she's going to have problems with them. She won't crash when she encounters them, but she does read them off as the letters of their Latin-1 encoding.
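The symptom is easy to reproduce. The sketch below is an illustration only, on the assumption that the pages arrive as UTF-8: decodeLatin1 and decodeUtf8 are hypothetical helper names, and the UTF-8 decoder is deliberately minimal (it replaces malformed sequences with U+FFFD but does not reject overlong encodings or surrogates).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Latin-1 view: every byte is treated as its own character.
std::vector<char32_t> decodeLatin1(const std::string& bytes) {
    std::vector<char32_t> out;
    for (unsigned char b : bytes) {
        out.push_back(b);
    }
    return out;
}

// Minimal UTF-8 view: multi-byte sequences are combined into one code point.
std::vector<char32_t> decodeUtf8(const std::string& bytes) {
    const char32_t replacement = 0xFFFD;
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t cp = 0;
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;
            cp = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
            cp = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;
            cp = lead & 0x07;
        } else {
            out.push_back(replacement);  // stray continuation or invalid byte
            ++i;
            continue;
        }
        if (i + extra >= bytes.size()) {
            out.push_back(replacement);  // sequence cut off at end of input
            break;
        }
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) {
            out.push_back(replacement);
            ++i;
            continue;
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For the UTF-8 bytes of "café" (`"caf\xC3\xA9"`), decodeUtf8 returns four code points ending in U+00E9, while decodeLatin1 returns five, with the final "é" split into U+00C3 and U+00A9. That split is exactly the kind of stray extra letters a speech engine would read aloud.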


Next Steps

This weekend I hope to make progress on implementing ongoing speech recognition training. If I say something and River hears me incorrectly, I'll say "Train Speech" and then type what I actually said. She'll then use the recorded audio along with the text to improve her hearing capabilities.

I intend to program the speech recognition training feature slowly. I expect it to have a long learning curve for me, and while it will improve the program significantly, it's not exactly "fun" to work on. As a result, I am also focusing on other features that are simpler but more fun. I'll be adding to the IMDB functions, and hopefully adding Twitter, Facebook, and Google News integration as well.


Demonstration Video

You can't see River (there's nothing to see but debug statements), but you can hear her.
