fix: Update camelCase parameter and fix bug in TTS #153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 22, 2019
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -16,3 +16,4 @@ dist/*.js
dist/*.map
gh-pages/
.idea
watson-speech-*.tgz
9 changes: 9 additions & 0 deletions .npmignore
@@ -8,3 +8,12 @@ gh-pages/
scripts/
docs/
.env
.github/
CHANGELOG.md
bower.json
karma.conf.js
speech-to-text
text-to-speech
util
webpack.config.js
watson-speech-*.tgz
19 changes: 11 additions & 8 deletions README.md
@@ -1,17 +1,16 @@
IBM Watson Speech Services for Web Browsers
===========================================
# IBM Watson Speech Services for Web Browsers

[![Build Status](https://travis-ci.org/watson-developer-cloud/speech-javascript-sdk.svg?branch=master)](https://travis-ci.org/watson-developer-cloud/speech-javascript-sdk)
[![npm-version](https://img.shields.io/npm/v/watson-speech.svg)](https://www.npmjs.com/package/watson-speech)

Allows you to easily add voice recognition and synthesis to any web app with minimal code.

### Built for Browsers

This library is primarily intended for use in web browsers. Check out [watson-developer-cloud](https://www.npmjs.com/package/watson-developer-cloud) to use Watson services (speech and others) from Node.js.

However, a **server-side component is required to generate auth tokens**. The `examples/` folder includes example Node.js and Python servers, and SDKs are available for [Node.js](https://github.com/watson-developer-cloud/node-sdk#authorization), [Java](https://github.com/watson-developer-cloud/java-sdk), [Python](https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/authorization_v1.py), and there is also a [REST API](https://cloud.ibm.com/docs/services/watson?topic=watson-gs-tokens-watson-tokens).


### Installation - standalone

Pre-compiled bundles are available from GitHub Releases - just download the file and drop it into your website: https://github.com/watson-developer-cloud/speech-javascript-sdk/releases
@@ -61,19 +60,23 @@ See [CHANGELOG.md](CHANGELOG.md) for a complete list of changes.
## Development

### Use examples for development

The provided examples can be used to test developmental code in action:
* `cd examples/`
* `npm run dev`

- `cd examples/`
- `npm run dev`

This will build the local code, move the new bundle into the `examples/` directory, and start a new server at `localhost:3000` where the examples will be running.

Note: This requires valid service credentials.

### Testing

The test suite is broken up into offline unit tests and integration tests that test against actual service instances.
* `npm test` will run the linter and the offline tests
* `npm run test-offline` will run the offline tests
* `npm run test-integration` will run the integration tests

- `npm test` will run the linter and the offline tests
- `npm run test-offline` will run the offline tests
- `npm run test-integration` will run the integration tests

To run the integration tests, a file with service credentials is required. This file must be called `stt-auth.json` and must be located in `/test/resources/`. There are tests for usage of both CF and RC service instances. For testing CF, the required keys in this configuration file are `username` and `password`. For testing RC, a key of either `iam_acess_token` or `iam_apikey` is required. Optionally, a service URL for an RC instance can be provided under the key `rc_service_url` if the service is available under a URL other than `https://stream.watsonplatform.net/speech-to-text/api`.
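
As a rough illustration of the credential rules described above, the following sketch inspects a parsed `stt-auth.json` object and reports which integration tests it could support. The function name `describeSttAuth` and its return shape are hypothetical and not part of the repository; the key names are copied exactly as they are spelled in the text above, including `iam_acess_token`.

```javascript
// Hypothetical helper for illustration only; not part of the SDK or its test suite.
// Default RC service URL quoted from the README text above.
const DEFAULT_RC_URL = 'https://stream.watsonplatform.net/speech-to-text/api';

function describeSttAuth(config) {
  // CF instances need both a username and a password.
  const canTestCf = Boolean(config.username && config.password);
  // RC instances need either key; "iam_acess_token" is spelled as in the README.
  const canTestRc = Boolean(config.iam_acess_token || config.iam_apikey);
  return {
    canTestCf: canTestCf,
    canTestRc: canTestRc,
    // The optional override falls back to the documented default URL.
    rcServiceUrl: config.rc_service_url || DEFAULT_RC_URL
  };
}
```

A file containing only `username` and `password` would, under these assumptions, enable the CF tests and leave the RC tests without credentials.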

7 changes: 3 additions & 4 deletions docs/README.md
@@ -1,13 +1,12 @@
API & Examples
--------------
## API & Examples

The basic API is outlined below, see complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/

See several basic examples at http://watson-speech.mybluemix.net/ ([source](https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/))

See a more advanced example at https://speech-to-text-demo.mybluemix.net/

All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization).
All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization).
(See https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/ for a couple of basic examples in Node.js and Python.)

_NOTE_: The `token` parameter only works for CF instances of services. For RC services using IAM for authentication, the `access_token` parameter must be used.
_NOTE_: The `token` parameter only works for CF instances of services. For RC services using IAM for authentication, the `accessToken` parameter must be used.
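
A minimal sketch of the rule in the note above: pick the option name according to the kind of service instance. The helper `buildAuthOptions` and its `isIam` flag are hypothetical names introduced for illustration; only the `token` and `accessToken` option names come from the documentation.

```javascript
// Hypothetical helper for illustration only; not part of the SDK.
// Returns an options fragment using the parameter name that matches the instance type.
function buildAuthOptions(credential, isIam) {
  // RC services authenticated through IAM take `accessToken`; CF instances take `token`.
  return isIam ? { accessToken: credential } : { token: credential };
}
```

The returned object could then be merged into the options passed to an API method, for example with `Object.assign({ text: 'hello' }, buildAuthOptions(value, true))`.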
56 changes: 28 additions & 28 deletions docs/SPEECH-TO-TEXT.md
@@ -8,47 +8,47 @@ The core of the library is the [RecognizeStream] that performs the actual transc

_NOTE_ The RecognizeStream class lives in the Watson Node SDK. Any option available on this class can be passed into the following methods. These parameters are documented at http://watson-developer-cloud.github.io/node-sdk/master/classes/recognizestream.html

### [`.recognizeMicrophone({token||access_token})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text_recognize-microphone.html) -> Stream
### [`.recognizeMicrophone({token||accessToken})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text_recognize-microphone.html) -> Stream

Options:
* `keepMicrophone`: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
* `mediaStream`: Optionally pass in an existing media stream rather than prompting the user for microphone access.
* Other options passed to [RecognizeStream]
* Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
* Other options passed to [FormatStream] if `options.format` is not set to false
* Other options passed to [WritableElementStream] if `options.outputElement` is set
Options:

Requires the `getUserMedia` API, so limited browser compatibility (see http://caniuse.com/#search=getusermedia)
- `keepMicrophone`: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
- `mediaStream`: Optionally pass in an existing media stream rather than prompting the user for microphone access.
- Other options passed to [RecognizeStream]
- Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
- Other options passed to [FormatStream] if `options.format` is not set to false
- Other options passed to [WritableElementStream] if `options.outputElement` is set

Requires the `getUserMedia` API, so limited browser compatibility (see http://caniuse.com/#search=getusermedia)
Also note that Chrome requires https (with a few exceptions for localhost and such) - see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features

No more data will be sent after `.stop()` is called on the returned stream, but additional results may be received for already-sent data.


### [`.recognizeFile({data, token||access_token})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text_recognize-file.html) -> Stream
### [`.recognizeFile({data, token||accessToken})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text_recognize-file.html) -> Stream

Can recognize and optionally attempt to play a URL, [File](https://developer.mozilla.org/en-US/docs/Web/API/File) or [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob)
(such as from an `<input type="file"/>` or from an ajax request.)

Options:
* `file`: a String URL or a `Blob` or `File` instance. Note that [CORS] restrictions apply to URLs.
* `play`: (optional, default=`false`) Attempt to also play the file locally while uploading it for transcription
* Other options passed to [RecognizeStream]
* Other options passed to [TimingStream] if `options.realtime` is true, or unset and `options.play` is true
* Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
* Other options passed to [FormatStream] if `options.format` is not set to false
* Other options passed to [WritableElementStream] if `options.outputElement` is set
Options:

`play` requires that the browser support the format; most browsers support wav and ogg/opus, but not flac.
- `file`: a String URL or a `Blob` or `File` instance. Note that [CORS] restrictions apply to URLs.
- `play`: (optional, default=`false`) Attempt to also play the file locally while uploading it for transcription
- Other options passed to [RecognizeStream]
- Other options passed to [TimingStream] if `options.realtime` is true, or unset and `options.play` is true
- Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
- Other options passed to [FormatStream] if `options.format` is not set to false
- Other options passed to [WritableElementStream] if `options.outputElement` is set

`play` requires that the browser support the format; most browsers support wav and ogg/opus, but not flac.
Will emit an `UNSUPPORTED_FORMAT` error on the RecognizeStream if playback fails. This error is special in that it does not stop the streaming of results.

Playback will automatically stop when `.stop()` is called on the returned stream.
Playback will automatically stop when `.stop()` is called on the returned stream.

For Mobile Safari compatibility, a URL must be provided, and `recognizeFile()` must be called in direct response to a user interaction (so the token must be pre-loaded).
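
The `recognizeFile()` options list above says that a [TimingStream] is involved when `options.realtime` is true, or when it is unset and `options.play` is true. That condition can be sketched as a small predicate. The function name `usesTimingStream` is hypothetical, and the strict `=== true` comparisons are an assumption; the library itself may treat any truthy value the same way.

```javascript
// Hypothetical predicate for illustration only; not part of the SDK.
// Mirrors the documented rule: realtime === true, or realtime unset and play === true.
function usesTimingStream(options) {
  if (typeof options.realtime !== 'undefined') {
    // An explicit setting wins, so realtime: false disables timing even when play is true.
    return options.realtime === true;
  }
  return options.play === true;
}
```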

[RecognizeStream]: http://watson-developer-cloud.github.io/node-sdk/master/classes/recognizestream.html
[TimingStream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/TimingStream.html
[FormatStream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/FormatStream.html
[WritableElementStream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/WritableElementStream.html
[SpeakerStream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/SpeakerStream.html
[CORS]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS

[recognizestream]: http://watson-developer-cloud.github.io/node-sdk/master/classes/recognizestream.html
[timingstream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/TimingStream.html
[formatstream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/FormatStream.html
[writableelementstream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/WritableElementStream.html
[speakerstream]: http://watson-developer-cloud.github.io/speech-javascript-sdk/master/SpeakerStream.html
[cors]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
17 changes: 9 additions & 8 deletions docs/TEXT-TO-SPEECH.md
@@ -2,17 +2,18 @@

## [`WatsonSpeech.TextToSpeech`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_text-to-speech.html)

### [`.synthesize({text, token||access_token})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_text-to-speech_synthesize.html) -> `<audio>`
### [`.synthesize({text, token||accessToken})`](http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_text-to-speech_synthesize.html) -> `<audio>`

Speaks the supplied text through an automatically-created `<audio>` element.
Speaks the supplied text through an automatically-created `<audio>` element.
Currently limited to text that can fit within a GET URL (this is particularly an issue on [Internet Explorer before Windows 10](http://stackoverflow.com/questions/32267442/url-length-limitation-of-microsoft-edge)
where the max length is around 1000 characters after the token is accounted for.)

Options:
* text - the text to speak
* url - the Watson Text to Speech API URL (defaults to https://stream.watsonplatform.net/text-to-speech/api)
* voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
* customization_id - GUID of a custom voice model - omit to use the voice with no customization.
* autoPlay - set to false to prevent the audio from automatically playing
Options:

- text - the text to speak
- url - the Watson Text to Speech API URL (defaults to https://stream.watsonplatform.net/text-to-speech/api)
- voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
- customization_id - GUID of a custom voice model - omit to use the voice with no customization.
- autoPlay - set to false to prevent the audio from automatically playing

Relies on browser audio support: should work reliably in Chrome and Firefox on desktop and Android. Edge works with a little help. Safari and all iOS browsers do not seem to work yet.
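
Because `.synthesize()` is described above as limited to text that fits within a GET URL, a caller may want to check the encoded length before making the request. The sketch below is one way to do that. The name `fitsInGetUrl` and the default budget of 1000 characters are assumptions taken from the approximate Internet Explorer figure quoted above; they are not limits enforced by the SDK.

```javascript
// Hypothetical pre-flight check for illustration only; not part of the SDK.
// Measures the text as it would appear once percent-encoded in a query string.
function fitsInGetUrl(text, budget) {
  // Assumed default budget, based on the "around 1000 characters" figure above.
  const limit = typeof budget === 'number' ? budget : 1000;
  return encodeURIComponent(text).length <= limit;
}
```

Non-ASCII characters expand when percent-encoded, so a string that looks short can still exceed the budget.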