IBM Watson Speech Services for Web Browsers


Allows you to easily add voice recognition and synthesis to any web app with minimal code.

Warning: this library still has a few rough edges and may yet see breaking changes.

For Web Browsers Only

This library is primarily intended for use in browsers. Check out watson-developer-cloud to use Watson services (speech and others) from Node.js.

However, a server-side component is required to generate auth tokens. The examples/ folder includes example Node.js and Python servers, SDKs are available for Node.js, Java, and Python, and there is also a REST API.

Installation - standalone

Pre-compiled bundles are available on GitHub Releases - just download the file and drop it into your website: https://github.com/watson-developer-cloud/speech-javascript-sdk/releases

Installation - npm with browserify

This library is built with browserify and is easy to use in browserify-based projects:

npm install --save watson-speech

API & Examples

The basic API is outlined below; see the complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/

See several examples at https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/static/

All API methods require an auth token that must be generated server-side. (See https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/ for a couple of basic examples in Node.js and Python.)
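A minimal sketch of fetching a token from your own server-side endpoint. The '/api/speech-to-text/token' path here is hypothetical - match it to whatever route your token server (e.g. one of the examples/ servers) actually exposes.

```javascript
// Fetch an auth token from a server-side endpoint. The endpoint path is an
// assumption - use the route your own token server exposes.
function fetchToken(endpoint) {
  return fetch(endpoint).then(function (response) {
    if (!response.ok) {
      throw new Error('Token request failed: ' + response.status);
    }
    return response.text(); // the token is returned as plain text
  });
}

// Usage (in the browser):
// fetchToken('/api/speech-to-text/token').then(function (token) {
//   // pass `token` to the SDK methods
// });
```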

WatsonSpeech.TextToSpeech.synthesize(options)

Speaks the supplied text through an automatically-created <audio> element. Currently limited to text that can fit within a GET URL (this is particularly an issue on Internet Explorer before Windows 10, where the max length is around 1000 characters after the token is accounted for).

Options:

  • text - the text to synthesize // todo: list supported languages
  • voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
  • autoPlay - set to false to prevent the audio from automatically playing
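Taken together, a call might look like the following sketch. It assumes the standalone bundle has been loaded (so WatsonSpeech.TextToSpeech is a global) and that `token` came from your server-side token endpoint; the en-US_MichaelVoice default is an assumption here and must match the language of the text.

```javascript
// Sketch: synthesize speech under the assumptions above.
function speak(token, text, voice) {
  var audio = WatsonSpeech.TextToSpeech.synthesize({
    text: text,
    voice: voice || 'en-US_MichaelVoice', // voices are language-specific
    token: token,
    autoPlay: true
  });
  // synthesize() creates an <audio> element, so playback errors
  // can be observed on it:
  audio.addEventListener('error', function (err) {
    console.log('playback error', err);
  });
  return audio;
}
```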

WatsonSpeech.SpeechToText.recognizeMicrophone(options)

Options:

  • keepMic: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
  • Other options passed to RecognizeStream
  • Other options passed to WritableElementStream if options.outputElement is set

Requires the getUserMedia API, so browser compatibility is limited (see http://caniuse.com/#search=getusermedia). Also note that Chrome requires https (with a few exceptions for localhost and such) - see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features

Pipes results through a FormatStream by default, set options.format=false to disable.

Known issue: Firefox continues to display a microphone icon in the address bar after recording has ceased. This is a browser bug.
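A sketch of the microphone flow under the same assumptions (standalone bundle loaded, token fetched server-side). Calling setEncoding('utf8') makes the stream emit text strings rather than Buffers:

```javascript
// Sketch: live transcription from the microphone.
function transcribeFromMic(token) {
  var stream = WatsonSpeech.SpeechToText.recognizeMicrophone({
    token: token,
    keepMic: true // avoid repeated permission prompts in Firefox
  });
  stream.setEncoding('utf8'); // emit text instead of Buffers
  stream.on('data', function (text) {
    console.log(text); // formatted transcript (FormatStream is on by default)
  });
  stream.on('error', function (err) {
    console.log(err);
  });
  return stream; // call stream.stop() to end recording
}
```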

WatsonSpeech.SpeechToText.recognizeFile(options)

Can recognize and optionally attempt to play a File or Blob (such as from an <input type="file"/> or from an Ajax request).

Options:

  • data: a Blob or File instance.
  • play: (optional, default=false) Attempt to also play the file locally while uploading it for transcription
  • Other options passed to RecognizeStream
  • Other options passed to WritableElementStream if options.outputElement is set

play requires that the browser support the format; most browsers support wav and ogg/opus, but not flac. Will emit a playback-error on the RecognizeStream if playback fails. Playback will automatically stop when .stop() is called on the RecognizeStream.

Pipes results through a TimingStream if options.play=true, set options.realtime=false to disable.

Pipes results through a FormatStream by default, set options.format=false to disable.
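A sketch of the file flow under the same assumptions (standalone bundle loaded, token fetched server-side), wired to an `<input type="file"/>` element:

```javascript
// Sketch: transcribe an audio File or Blob, optionally playing it locally.
function transcribeFile(token, file) {
  var stream = WatsonSpeech.SpeechToText.recognizeFile({
    token: token,
    data: file, // a File or Blob instance
    play: true  // also attempt local playback while uploading
  });
  stream.setEncoding('utf8'); // emit text instead of Buffers
  stream.on('data', function (text) {
    console.log(text); // formatted transcript
  });
  stream.on('playback-error', function (err) {
    console.log('local playback failed; transcription continues', err);
  });
  return stream;
}

// Usage:
// document.querySelector('input[type=file]').onchange = function () {
//   transcribeFile(token, this.files[0]);
// };
```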

Changes

There have been a few breaking changes in recent releases:

  • Removed SpeechToText.recognizeElement() due to quality issues
  • Renamed recognizeBlob to recognizeFile to make the primary usage more apparent
  • Changed playFile option of recognizeBlob() to just play, corrected default

See CHANGELOG.md for a complete list of changes.

todo

  • Further solidify API
  • break components into standalone npm modules where it makes sense
  • run integration tests on travis (fall back to offline server for pull requests)
  • add even more tests
  • better cross-browser testing (IE, Safari, mobile browsers - maybe saucelabs?)
  • update node-sdk to use current version of this lib's RecognizeStream (and also provide the FormatStream + anything else that might be handy)
  • move result and results events to node wrapper (along with the deprecation notice)
  • improve docs
  • consider a wrapper to match https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
  • support a "hard" stop that prevents any further data events, even for already uploaded audio, ensure timing stream also implements this.
  • look for bug where single-word final results may omit word confidence (possibly due to FormatStream?)
  • fix bug where TimingStream shows words slightly before they're spoken
