This is a guide to the WebSocket server protocol.
Note: The protocol is still in early development and may change in future releases. Many features are currently missing, and the server hasn't been thoroughly tested.
echogarden serve [options]
Options (all optional):
port
: Port number. Defaults to45054
secure
: Start a secure server? Defaults tofalse
certPath
: Path to a certificate file, required whensecure = true
keyPath
: Path to a private key file, required whensecure = true
deflate
: Use per-message deflate. Defaults totrue
maxPayload
: Maximum raw message payload size (in bytes). Defaults to1000 * 1000000
(1GB)useWorkerThread
: Run worker in a separate thread. Defaults totrue
(recommended leaving as is)
For Node.js clients, a simple client class allows to wrap communications with the server in a more convenient interface, without needing to know the details of the protocol.
Currently, the client is embedded in the main codebase. This means you have to import the echogarden
package to use it:
import { WebSocket } from 'ws'
import { Client } from 'echogarden'
const ws = new WebSocket('ws://localhost:45054')
ws.on("open", async () => {
const client = new Client(ws)
const { audio } = await client.synthesize("Hello World", { engine: 'espeak' })
})
TODO:
- Separate the client to an independent, lightweight, Node.js package, with browser compatibility
- Add support for cancellation signals
- Document how to use with a background worker
The protocol is based on binary WebSocket messages, for both request and response objects. Messages are encoded using the MessagePack encoding scheme.
All messages are objects and have the basic structure:
{
messageType: string
requestId: string
// ... other data specific for the target message type
}
The messageType
property is a string representing the operation to perform. These operations are parallel to the methods provided by the Node.js API. They can be one of:
SynthesisRequest
VoiceListRequest
RecognitionRequest
AlignmentRequest
SpeechTranslationRequest
SpeechLanguageDetectionRequest
TextLanguageDetectionRequest
When sending a message, requestId
should contain a long random string that uniquely identifies your request, like cb7e0f3ec835a213b005c4424c8d5775
.
For example, this message requests synthesis:
{
messageType: 'SynthesisRequest',
requestId: 'cb7e0f3ec835a213b005c4424c8d5775',
input: 'Hello World!',
options: {
engine: 'espeak',
language: 'en-GB'
}
}
After a message is sent, the server may send one or more response messages. Each response message includes the requestId
identifier that was given on the request it replies to. The response message messageType
can be:
SynthesisResponse
,SynthesisSegmentEvent
,SynthesisSentenceEvent
VoiceListResponse
RecognitionResponse
AlignmentResponse
SpeechTranslationResponse
SpeechLanguageDetectionResponse
TextLanguageDetectionResponse
The properties included in the response objects, are similar to the ones returned by the API.
Example response, for the above synthesis request:
{
messageType: 'SynthesisResponse',
requestId: 'cb7e0f3ec835a213b005c4424c8d5775',
audio: {
sampleRate: 22050,
channels: [ ... ] // An array of `Float32Array` channel data
}
// ... other result object properties
}
To cancel an existing request, the client can send a CancellationRequest
message, with the same requestId
of an ongoing request, like:
{
messageType: 'CancellationRequest',
requestId: 'cb7e0f3ec835a213b005c4424c8d5775'
}
TODO: Cancellation requests are currently only supported for synthesis operations. Extend support for other operations.
You can use the startServer
method to start a new server.
async function startServer(serverOptions: ServerOptions, onStarted: (options: ServerOptions) => void)
Example:
import { startServer } from 'echogarden'
await startServer({ port: 1234 }, () => {
console.log("Server is started!")
})
The method would return when the server is closed.
TODO: accept a signal to stop the server.
When an HTTP web-based user interface is developed, it would be integrated with the WebSocket server and share the same port.
Currently, if you start a non-secure server and try to open http://localhost:45054
(or another port you specified), you should see a placeholder message saying:
This is the Echogarden HTTP server!
If you can't see this message, there may be a configuration issue, or some other problem.
Expose more operations to the server.