Tuesday 22 May 2012

Speech Input to Browser App

Capturing Speech in the Web Browser


Recently I have been looking into integrating audio input into the web application, and as I understand it, there is a very simple speech-to-text input element proposed for HTML5.

I've found an example here

And the full specification here

Specific Uses

This speech-to-text input element can be used to allow voice input to the Kinect Kiosk.

Users will be able to ask their questions aloud, and the system will interpret their speech and attempt to infer an answer using SitePal's AIML.
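As a rough sketch of that flow, the recognised text could be read from the input and handed to whatever does the answer lookup. This assumes Chrome's "webkitspeechchange" event fires once the recognised text has been written into the field. The names QuestionInput, AnswerLookup and forwardSpokenQuestions are made up for illustration, and the lookup callback is only a placeholder, not a real SitePal call.

```typescript
// Smallest surface of an input element this sketch needs.
// A real page would pass the actual speech-enabled <input> element.
interface QuestionInput {
  value: string;
  addEventListener(type: string, listener: () => void): void;
}

// Hypothetical placeholder for the call that sends a question to the AIML service.
type AnswerLookup = (question: string) => void;

function forwardSpokenQuestions(input: QuestionInput, lookup: AnswerLookup): void {
  // Assumption: Chrome raises "webkitspeechchange" after filling in the recognised text.
  input.addEventListener("webkitspeechchange", () => {
    const question = input.value.trim();
    // Ignore empty results, e.g. when nothing was recognised.
    if (question.length > 0) {
      lookup(question);
    }
  });
}
```

Keeping the lookup as a callback means the same wiring works whether the answer comes from AIML or from a stub during testing.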

Limitations

The speech-to-text element requires users to press a button to start a recording, because part of the specification requires that users know they are being recorded.

It also requires Google Chrome at the moment, and is not a W3C standard or supported across web browsers. In other browsers, such as IE and Firefox, the speech element appears as a normal text box.
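Because of that, the page probably needs to check for support before relying on voice input. Here is a minimal sketch, assuming Chrome exposes a "webkitSpeech" property on input elements and turns speech on through the "x-webkit-speech" attribute. InputLike and enableSpeechInput are hypothetical names of my own.

```typescript
// Minimal shape needed from the element; a real HTMLInputElement satisfies it.
interface InputLike {
  setAttribute(name: string, value: string): void;
}

// Returns true when speech input was switched on, false when the browser
// will just show a normal text box.
function enableSpeechInput(input: InputLike): boolean {
  // Assumption: only browsers with speech input define "webkitSpeech" on the element.
  const supported = "webkitSpeech" in input;
  if (supported) {
    // Assumption: this attribute is what makes Chrome show the microphone button.
    input.setAttribute("x-webkit-speech", "");
  }
  return supported;
}
```

The return value can be used to show a "type your question" hint in browsers where the microphone button never appears.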

Other Work


I've also been making progress on face/motion detection using the Kinect and OpenCV, though the classifiers are still not working. I'll make another post on my progress once that is running, and I'll provide some sample code then.
