Adding a Voice Interface to a Web Application (Video)

Matt Buck

At a recent Austin on Rails meeting, we gave a short talk on the benefits of adding an interactive voice interface to a web application. In the single afternoon we spent preparing for the presentation, we were able to add voice interaction to the open source Spree ecommerce platform.

Why voice?

When someone uses a piece of software, the software's first job is to divine the user's intent. What does this series of points, clicks, keyboard presses, swipes, and/or gestures mean? What is it that the user wishes to do? Expressing intent by navigating a visual (or physical) user interface has been the primary way we have used technology to date. No matter what series of points, clicks, and keystrokes we require the user to enter, it all begins in the user's mind with that intent:

  • I want to check my savings account balance.
  • I want to publish my latest blog post draft.
  • I want to see the shipping address for the latest order to my store.

That intent is formulated as a series of words in the user's mind. Rather than making users express it by pointing and clicking, we can let them express it in the most natural way possible, by speaking it:

  • “Show me my savings account balance.”
  • “Publish the latest draft blog post.”
  • “Show me the shipping address for the latest order.”

This kind of organic, natural communication with software will eventually become the primary way we interact with technology. But it's already possible to make these kinds of interactions a reality today.

How we did it

The code for the demo in the above presentation can be found on GitHub. We made use of Google Chrome's support for voice recognition via the Web Speech API, and leveraged the Api.ai Natural Language Understanding platform for parsing user input into machine-readable intents.
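As a rough illustration of that two-step pipeline (speech to text, then text to intent), here is a minimal TypeScript sketch. It is not the demo's code. The Intent shape, the RULES table, and the parseIntent and listenForIntent functions are all hypothetical names, and the keyword matcher is only a toy stand-in for a real Natural Language Understanding service such as Api.ai, whose actual API is not shown here. The recognition object is looked up dynamically because Chrome exposes it under the prefixed name webkitSpeechRecognition.

```typescript
// Hypothetical intent shape, for illustration only. This is not the
// response format of Api.ai or of any other real service.
interface Intent {
  name: string;
  transcript: string;
}

// Toy stand-in for a natural language understanding service: an intent
// matches when every one of its keywords appears in the transcript.
const RULES: Array<{ name: string; keywords: string[] }> = [
  { name: "show_shipping_address", keywords: ["shipping", "address"] },
  { name: "show_balance", keywords: ["balance"] },
  { name: "publish_draft", keywords: ["publish", "draft"] },
];

function parseIntent(transcript: string): Intent {
  // Lowercase and split on anything that is not a letter.
  const words = transcript.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  for (const rule of RULES) {
    const matches = rule.keywords.every((k) => words.indexOf(k) !== -1);
    if (matches) {
      return { name: rule.name, transcript };
    }
  }
  return { name: "unknown", transcript };
}

// Starts one round of speech recognition when the browser exposes the
// Web Speech API. Returns false when it is unavailable, for example in
// Node or in a browser without speech recognition support.
function listenForIntent(onIntent: (intent: Intent) => void): boolean {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition || g.webkitSpeechRecognition;
  if (!Recognition) {
    return false;
  }
  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.interimResults = false; // deliver final results only
  recognition.onresult = (event: any) => {
    // Take the top alternative of the first result.
    onIntent(parseIntent(event.results[0][0].transcript));
  };
  recognition.start();
  return true;
}
```

In a page, calling listenForIntent from a button's click handler would prompt for microphone access and pass the resulting intent to the callback, which could then route to the matching screen. A production version would replace parseIntent with a call to the NLU service and handle recognition errors.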

Let's talk!

If you want to learn more about how Voxable can help you give voice to your product, please get in touch!