More TTS providers (mastra-ai#1589)
Inspired by Orate, add more TTS providers
abhiaiyer91 authored Jan 28, 2025
1 parent 164b420 commit 6caa4b3
Showing 82 changed files with 6,121 additions and 57 deletions.
14 changes: 14 additions & 0 deletions .changeset/smart-ligers-dream.md
@@ -0,0 +1,14 @@
---
'@mastra/speech-elevenlabs': patch
'@mastra/speech-replicate': patch
'@mastra/speech-speechify': patch
'@mastra/speech-deepgram': patch
'@mastra/speech-google': patch
'@mastra/speech-openai': patch
'@mastra/speech-playai': patch
'@mastra/speech-azure': patch
'@mastra/speech-murf': patch
'@mastra/speech-ibm': patch
---

Speech modules for TTS providers
85 changes: 84 additions & 1 deletion docs/src/pages/docs/reference/tts/generate.mdx
@@ -81,7 +81,7 @@ const outputPath = path.join(process.cwd(), 'test-outputs/open-aigenerate-test.m
writeFileSync(outputPath, audioResult);
```

-### Basic Audio Generation (OpenAI)
+### Basic Audio Generation (PlayAI)

```typescript
import { PlayAITTS } from '@mastra/tts'
@@ -103,6 +103,89 @@ const outputPath = path.join(process.cwd(), 'test-outputs/open-aigenerate-test.m
writeFileSync(outputPath, audioResult);
```

### Azure Generation

```typescript
import { writeFile } from 'fs/promises';
import path from 'path';

import { AzureTTS } from '@mastra/tts'

const tts = new AzureTTS({
  model: {
    name: 'en-US-JennyNeural',
    apiKey: process.env.AZURE_API_KEY,
    region: process.env.AZURE_REGION,
  },
});

// Generate the complete audio result, then write it to disk.
const { audioResult } = await tts.generate({ text: "What is AI?" });
await writeFile(path.join(process.cwd(), '/test-outputs/azure-output.mp3'), audioResult);
```

### Deepgram Generation

```typescript
import { writeFile } from 'fs/promises';
import path from 'path';

import { DeepgramTTS } from '@mastra/tts'

const tts = new DeepgramTTS({
  model: {
    name: 'aura',
    voice: 'asteria-en',
    apiKey: process.env.DEEPGRAM_API_KEY,
  },
});

// Generate the complete audio result, then write it to disk.
const { audioResult } = await tts.generate({ text: "What is AI?" });
await writeFile(path.join(process.cwd(), '/test-outputs/deepgram-output.mp3'), audioResult);
```

### Google Generation

```typescript
import { writeFile } from 'fs/promises';
import path from 'path';

import { GoogleTTS } from '@mastra/tts'

const tts = new GoogleTTS({
  model: {
    name: 'en-US-Standard-A',
    credentials: process.env.GOOGLE_CREDENTIALS,
  },
});

// Generate the complete audio result, then write it to disk.
const { audioResult } = await tts.generate({ text: "What is AI?" });
await writeFile(path.join(process.cwd(), '/test-outputs/google-output.mp3'), audioResult);
```

### IBM Generation

```typescript
import { writeFile } from 'fs/promises';
import path from 'path';

import { IbmTTS } from '@mastra/tts'

const tts = new IbmTTS({
  model: {
    voice: 'en-US_AllisonV3Voice',
    apiKey: process.env.IBM_API_KEY,
  },
});

// Generate the complete audio result, then write it to disk.
const { audioResult } = await tts.generate({ text: "What is AI?" });
await writeFile(path.join(process.cwd(), '/test-outputs/ibm-output.mp3'), audioResult);
```

### Murf Generation

```typescript
import { writeFile } from 'fs/promises';
import path from 'path';

import { MurfTTS } from '@mastra/tts'

const tts = new MurfTTS({
  model: {
    name: 'GEN2',
    voice: 'en-US-natalie',
    apiKey: process.env.MURF_API_KEY,
  },
});

// Generate the complete audio result, then write it to disk.
const { audioResult } = await tts.generate({ text: "What is AI?" });
await writeFile(path.join(process.cwd(), '/test-outputs/murf-output.mp3'), audioResult);
```
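
The provider examples above all follow the same shape: construct a client, call `generate({ text })`, and write `audioResult` to disk. A minimal sketch of a provider-agnostic helper follows. `SpeechGenerator` and `saveSpeech` are hypothetical names, not part of `@mastra/tts`, and the sketch assumes `audioResult` is a `Buffer` or other `Uint8Array`, as the `writeFile` calls above suggest. It also creates the output directory first, which the examples above leave to the caller.

```typescript
import { mkdir, writeFile } from 'node:fs/promises';
import path from 'node:path';

// Hypothetical structural type: any client whose generate() is shaped
// like the provider examples above.
interface SpeechGenerator {
  generate(args: { text: string }): Promise<{ audioResult: Uint8Array }>;
}

// Generate speech and write it to outputPath, creating parent directories
// as needed. Returns the number of bytes written.
async function saveSpeech(
  tts: SpeechGenerator,
  text: string,
  outputPath: string,
): Promise<number> {
  const { audioResult } = await tts.generate({ text });
  await mkdir(path.dirname(outputPath), { recursive: true });
  await writeFile(outputPath, audioResult);
  return audioResult.byteLength;
}
```

With a real client, the call might look like `await saveSpeech(tts, 'What is AI?', path.join(process.cwd(), 'test-outputs/azure-output.mp3'))`.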

## Related Methods

For streaming audio responses, see the [`stream()`](./stream.mdx) method documentation.
94 changes: 93 additions & 1 deletion docs/src/pages/docs/reference/tts/providers-and-models.mdx
@@ -11,4 +11,96 @@ description: Overview of supported TTS providers and their models.
| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| ElevenLabs | `eleven_multilingual_v2`, `eleven_flash_v2_5`, `eleven_flash_v2`, `eleven_multilingual_sts_v2`, `eleven_english_sts_v2` |
| OpenAI | `tts-1`, `tts-1-hd` |
| PlayAI | `PlayDialog`, `Play3.0-mini` |
| Azure | Various voices available through Azure Cognitive Services |
| Deepgram | `aura` and other models with voice options like `asteria-en` |
| Google | Various voices through Google Cloud Text-to-Speech |
| IBM | Various voices including `en-US_AllisonV3Voice` |
| Murf | `GEN1`, `GEN2` with various voices like `en-US-natalie` |

## Configuration

Each provider requires specific configuration. Here are examples for each provider:

### ElevenLabs Configuration
```typescript
const tts = new ElevenLabsTTS({
  model: {
    name: 'eleven_multilingual_v2',
    apiKey: process.env.ELEVENLABS_API_KEY,
  },
});
```

### OpenAI Configuration
```typescript
const tts = new OpenAITTS({
  model: {
    name: 'tts-1', // or 'tts-1-hd' for higher quality
    apiKey: process.env.OPENAI_API_KEY,
  },
});
```

### PlayAI Configuration
```typescript
const tts = new PlayAITTS({
  model: {
    name: 'PlayDialog', // or 'Play3.0-mini'
    apiKey: process.env.PLAYAI_API_KEY,
  },
  userId: process.env.PLAYAI_USER_ID,
});
```

### Azure Configuration
```typescript
const tts = new AzureTTS({
  model: {
    name: 'en-US-JennyNeural',
    apiKey: process.env.AZURE_API_KEY,
    region: process.env.AZURE_REGION,
  },
});
```

### Deepgram Configuration
```typescript
const tts = new DeepgramTTS({
  model: {
    name: 'aura',
    voice: 'asteria-en',
    apiKey: process.env.DEEPGRAM_API_KEY,
  },
});
```

### Google Configuration
```typescript
const tts = new GoogleTTS({
  model: {
    name: 'en-US-Standard-A',
    credentials: process.env.GOOGLE_CREDENTIALS,
  },
});
```

### IBM Configuration
```typescript
const tts = new IbmTTS({
  model: {
    voice: 'en-US_AllisonV3Voice',
    apiKey: process.env.IBM_API_KEY,
  },
});
```

### Murf Configuration
```typescript
const tts = new MurfTTS({
  model: {
    name: 'GEN2',
    voice: 'en-US-natalie',
    apiKey: process.env.MURF_API_KEY,
  },
});
```
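
The configuration examples above read credentials from environment variables, so a missing variable may only surface when the provider is first called. A minimal sketch of an up-front check follows. The variable names are the ones used in the examples above; `Provider`, `REQUIRED_ENV`, and `missingEnv` are hypothetical names, not part of any Mastra package.

```typescript
// Hypothetical provider keys, one per configuration example above.
type Provider =
  | 'elevenlabs'
  | 'openai'
  | 'playai'
  | 'azure'
  | 'deepgram'
  | 'google'
  | 'ibm'
  | 'murf';

// Environment variables each configuration example reads.
const REQUIRED_ENV: Record<Provider, readonly string[]> = {
  elevenlabs: ['ELEVENLABS_API_KEY'],
  openai: ['OPENAI_API_KEY'],
  playai: ['PLAYAI_API_KEY', 'PLAYAI_USER_ID'],
  azure: ['AZURE_API_KEY', 'AZURE_REGION'],
  deepgram: ['DEEPGRAM_API_KEY'],
  google: ['GOOGLE_CREDENTIALS'],
  ibm: ['IBM_API_KEY'],
  murf: ['MURF_API_KEY'],
};

// Return the names of required variables that are unset or empty.
function missingEnv(
  provider: Provider,
  env: Record<string, string | undefined> = process.env,
): string[] {
  return REQUIRED_ENV[provider].filter((name) => !env[name]);
}
```

Calling `missingEnv('azure')` before constructing `AzureTTS` would let a script fail early with a readable message instead of a provider-side authentication error.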
119 changes: 116 additions & 3 deletions docs/src/pages/docs/reference/tts/stream.mdx
@@ -40,7 +40,7 @@ The `stream()` method is used to interact with the TTS model to produce an audio

## Examples

-### Basic Audio Stream (ElevenLabs)
+### ElevenLabs Streaming

```typescript
import { ElevenLabsTTS } from '@mastra/tts'
@@ -83,7 +83,7 @@ for await (const chunk of audioResult) {
writeStream.end()
```

-### Basic Audio Stream (OpenAI)
+### OpenAI Streaming

```typescript
import { OpenAITTS } from '@mastra/tts'
@@ -126,7 +126,7 @@ for await (const chunk of audioResult) {
writeStream.end()
```

-### Basic Audio Stream (PlayAI)
+### PlayAI Streaming

```typescript
import { PlayAITTS } from '@mastra/tts'
@@ -168,4 +168,117 @@ for await (const chunk of audioResult) {
}

writeStream.end()
```

### Azure Streaming

```typescript
import { createWriteStream } from 'fs';
import path from 'path';

import { AzureTTS } from '@mastra/tts'

const tts = new AzureTTS({
  model: {
    name: 'en-US-JennyNeural',
    apiKey: process.env.AZURE_API_KEY,
    region: process.env.AZURE_REGION,
  },
});

const { audioResult } = await tts.stream({ text: "What is AI?" });

// Create a write stream
const outputPath = path.join(process.cwd(), '/test-outputs/azure-stream.mp3');
const writeStream = createWriteStream(outputPath);

// Pipe the audio stream to the file
audioResult.pipe(writeStream);
```

### Deepgram Streaming

```typescript
import { createWriteStream } from 'fs';
import path from 'path';

import { DeepgramTTS } from '@mastra/tts'

const tts = new DeepgramTTS({
  model: {
    name: 'aura',
    voice: 'asteria-en',
    apiKey: process.env.DEEPGRAM_API_KEY,
  },
});

const { audioResult } = await tts.stream({ text: "What is AI?" });

// Create a write stream
const outputPath = path.join(process.cwd(), '/test-outputs/deepgram-stream.mp3');
const writeStream = createWriteStream(outputPath);

// Pipe the audio stream to the file
audioResult.pipe(writeStream);
```

### Google Streaming

```typescript
import { createWriteStream } from 'fs';
import path from 'path';

import { GoogleTTS } from '@mastra/tts'

const tts = new GoogleTTS({
  model: {
    name: 'en-US-Standard-A',
    credentials: process.env.GOOGLE_CREDENTIALS,
  },
});

const { audioResult } = await tts.stream({ text: "What is AI?" });

// Create a write stream
const outputPath = path.join(process.cwd(), '/test-outputs/google-stream.mp3');
const writeStream = createWriteStream(outputPath);

// Pipe the audio stream to the file
audioResult.pipe(writeStream);
```

### IBM Streaming

```typescript
import { createWriteStream } from 'fs';
import path from 'path';

import { IbmTTS } from '@mastra/tts'

const tts = new IbmTTS({
  model: {
    voice: 'en-US_AllisonV3Voice',
    apiKey: process.env.IBM_API_KEY,
  },
});

const { audioResult } = await tts.stream({ text: "What is AI?" });

// Create a write stream
const outputPath = path.join(process.cwd(), '/test-outputs/ibm-stream.mp3');
const writeStream = createWriteStream(outputPath);

// Pipe the audio stream to the file
audioResult.pipe(writeStream);
```

### Murf Streaming

```typescript
import { createWriteStream } from 'fs';
import path from 'path';

import { MurfTTS } from '@mastra/tts'

const tts = new MurfTTS({
  model: {
    name: 'GEN2',
    voice: 'en-US-natalie',
    apiKey: process.env.MURF_API_KEY,
  },
});

const { audioResult } = await tts.stream({ text: "What is AI?" });

// Create a write stream
const outputPath = path.join(process.cwd(), '/test-outputs/murf-stream.mp3');
const writeStream = createWriteStream(outputPath);

// Pipe the audio stream to the file
audioResult.pipe(writeStream);
```
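
The streaming examples above write audio with either a `for await` loop and `writeStream.write(chunk)` or with `audioResult.pipe(writeStream)`. `pipe()` does not propagate errors from the source stream, and neither pattern waits for the file to be fully flushed. A minimal sketch of an alternative using Node's `pipeline` from `node:stream/promises` follows. It assumes `audioResult` is a Node.js `Readable`, which the `.pipe()` calls imply but the source does not state, and `saveStream` is a hypothetical helper name.

```typescript
import { createWriteStream } from 'node:fs';
import { mkdir } from 'node:fs/promises';
import path from 'node:path';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Write an audio stream to outputPath and resolve once the file is flushed.
// Rejects if either the source stream or the file stream emits an error.
// Returns the number of bytes written.
async function saveStream(audio: Readable, outputPath: string): Promise<number> {
  await mkdir(path.dirname(outputPath), { recursive: true });
  const file = createWriteStream(outputPath);
  await pipeline(audio, file);
  return file.bytesWritten;
}
```

Under that assumption, `await saveStream(audioResult, outputPath)` could replace the final `audioResult.pipe(writeStream)` line in any of the examples above.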
6 changes: 5 additions & 1 deletion examples/dane/src/mastra/agents/package-publisher.ts
@@ -30,7 +30,11 @@ const packages_llm_text = `
- Format: @mastra/vector-{name} -> vector-stores/{name}
- Special case: @mastra/vector-astra -> vector-stores/astra
## 4. Integrations - STRICT RULES:
## 4. Speech packages - STRICT RULES:
- ALL speech packages must be directly under speech/
- Format: @mastra/speech-{name} -> speech/{name}
## 5. Integrations - STRICT RULES:
- ALL integration packages are under integrations/
@mastra/apollos -> integrations/apollo
@mastra/ashby -> integrations/ashby
2 changes: 1 addition & 1 deletion package.json
@@ -25,7 +25,7 @@
"build:packages": "pnpm --filter \"./packages/*\" build",
"build:vector-stores": "pnpm --filter \"./vector-stores/*\" build",
"build:deployers": "pnpm --filter \"./deployers/*\" build",
"build:deployers:dev": "pnpm --filter \"./deployers/*\" build:dev",
"build:speech": "pnpm --filter \"./speech/*\" build",
"build:cli": "pnpm --filter ./packages/cli build",
"build:deployer": "pnpm --filter ./packages/deployer build",
"build:core": "pnpm --filter ./packages/core build",