Speech Recognition & Text to Speech in the Browser

Speech Recognition

If you’ve ever clicked the microphone on the right-hand side of the Google search input, you’ve already experienced the power of the SpeechRecognition API. The speech recognizer will listen to what you say and convert your words to a string.

Browser support is currently limited to Chrome. Firefox support can be enabled by setting the media.webspeech.recognition.enable flag in about:config.

// This API is currently prefixed in Chrome
var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;

// Create a new recognizer
var recognizer = new SpeechRecognition();

// Start producing results before the person has finished speaking
recognizer.interimResults = true;

// Set the language of the recognizer
recognizer.lang = 'en-US';

// Define a callback to process results
recognizer.onresult = function (event) {
  var result = event.results[event.resultIndex];
  // With interimResults enabled, a result may be interim or final
  console.log(result.isFinal ? 'Final result' : 'Interim result', result[0].transcript);
};

// Start listening...
recognizer.start();
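
Because interimResults is enabled, a single result event can carry both finalized text and text that may still change. The following is a minimal sketch of one way to separate the two. The helper name splitTranscripts and the variable names are made up for illustration and are not part of the API. The browser-only wiring is guarded so it is skipped where the API is unavailable.

```javascript
// Hypothetical helper: split a SpeechRecognitionResultList-like object into
// finalized text and interim text that may still change.
function splitTranscripts(results, startIndex) {
  var finalText = '';
  var interimText = '';
  for (var i = startIndex; i < results.length; i++) {
    // Each result holds one or more alternatives; index 0 is the first one
    var transcript = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Browser-only wiring, skipped when the (possibly prefixed) API is missing
if (typeof window !== 'undefined') {
  var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Recognition) {
    var splitRecognizer = new Recognition();
    splitRecognizer.interimResults = true;
    splitRecognizer.onresult = function (event) {
      // event.resultIndex is the lowest index that changed in this event
      console.log(splitTranscripts(event.results, event.resultIndex));
    };
    // Call splitRecognizer.start() when you are ready to listen
  }
}
```

Keeping the splitting logic in a plain function means it can be exercised with mock data, without a microphone or a browser.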

For more information, visit the Mozilla Developer Network and the Web Speech API Specification.

Speech Synthesis or Text to Speech

Another API that might surprise you is the browser’s speechSynthesis API, which turns a string into spoken audio so you can make your browser talk.

Browser support is a little better for this API, as it works on both Chrome and Safari (including mobile browsers).

speechSynthesis.speak(
  new SpeechSynthesisUtterance('Hello World')
);
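
An utterance also exposes a few properties for tuning how the text is spoken: lang, rate, pitch and volume. Here is a rough sketch of setting them. The configureUtterance helper and the settings object are hypothetical names used only for this example, and the values shown are arbitrary.

```javascript
// Hypothetical helper: copy optional settings onto an utterance-like object.
// Properties that are not supplied are left at the browser's defaults.
function configureUtterance(utterance, settings) {
  if (settings.lang !== undefined) utterance.lang = settings.lang;       // language tag such as 'en-US'
  if (settings.rate !== undefined) utterance.rate = settings.rate;       // 1 is the default speed
  if (settings.pitch !== undefined) utterance.pitch = settings.pitch;    // 1 is the default pitch
  if (settings.volume !== undefined) utterance.volume = settings.volume; // 0 (silent) to 1 (full)
  return utterance;
}

// Browser-only usage, skipped where speech synthesis is unavailable
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  var greeting = configureUtterance(
    new SpeechSynthesisUtterance('Hello World'),
    { lang: 'en-US', rate: 0.9, pitch: 1.2 }
  );
  // The end event fires once the utterance has finished being spoken
  greeting.onend = function () {
    console.log('Finished speaking');
  };
  window.speechSynthesis.speak(greeting);
}
```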

You can also specify which voice to use:

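// Voices can load asynchronously, so this list may be empty at first
var voices = speechSynthesis.getVoices();
var utterance = new SpeechSynthesisUtterance('Hello World');
// Only pick a voice if it exists; otherwise the default voice is used
if (voices.length > 1) utterance.voice = voices[1];
speechSynthesis.speak(utterance);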
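
Picking a voice by position is fragile, since the list differs between browsers and devices and may not be populated when the page first loads. Below is a sketch of one alternative: choose a voice by language and wait for the voiceschanged event if the list is still empty. The names findVoice and speakWithVoice are hypothetical, and this assumes the browser lets you attach a voiceschanged listener to speechSynthesis, which may not hold everywhere.

```javascript
// Hypothetical helper: return the first voice whose language tag starts
// with the given prefix (for example 'en'), or null if there is none.
function findVoice(voices, langPrefix) {
  for (var i = 0; i < voices.length; i++) {
    if (voices[i].lang.indexOf(langPrefix) === 0) {
      return voices[i];
    }
  }
  return null;
}

// Hypothetical helper (browser only): speak once the voice list is available.
function speakWithVoice(text, langPrefix) {
  var synth = window.speechSynthesis;
  var spoken = false;

  function trySpeak() {
    var voices = synth.getVoices();
    if (spoken || voices.length === 0) return;
    spoken = true;
    var utterance = new SpeechSynthesisUtterance(text);
    var voice = findVoice(voices, langPrefix);
    // Fall back to the default voice when no match is found
    if (voice) utterance.voice = voice;
    synth.speak(utterance);
  }

  trySpeak();
  // Assumption: the voice list arrives later via the voiceschanged event
  if (!spoken && synth.addEventListener) {
    synth.addEventListener('voiceschanged', trySpeak);
  }
}

// In a browser: speakWithVoice('Hello World', 'en');
```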

For more information, visit the Mozilla Developer Network and the Web Speech API Specification.