A Swift CLI tool to help you build custom speech recognition models for iOS 17+ using SFCustomLanguageModelData.
SFSpeechRecognizer sometimes misrecognizes specific technical terms or proper nouns. Starting in iOS 17, you can build a custom language model using SFCustomLanguageModelData that includes specific vocabulary and phrases that use that vocabulary. This CLI tool helps you build those phrases and templates easily.
- Easy to use builder pattern for adding phrases and templates
- Support for X-SAMPA pronunciation strings
- Template system for generating common phrase patterns
- Built-in examples for custom vocabulary handling
- Generates .bin files ready to use in your iOS projects
- Clone this repo
- Start adding your phrases and templates in main.swift. For pronunciations, SFCustomLanguageModelData uses X-SAMPA strings (X-SAMPA Reference). AI tools are good at generating these, so using one is highly recommended (Cursor or other AI-enabled IDEs work well for this).
- Run the tool:
  swift main.swift [optional_output_path]
  Or open and run in Xcode
- It will generate a .bin file that you can drag into your iOS project
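The optional output path in the run command can be handled with a few lines of argument parsing. The following is a minimal sketch, not the tool's actual implementation: the helper name `resolveOutputURL` and the default file name `CustomLMData.bin` are assumptions made for illustration.

```swift
import Foundation

/// Hypothetical helper: picks the output URL from the command-line arguments.
/// `arguments[0]` is the script name; `arguments[1]`, if present, is the output path.
/// The default file name is an assumption, not something this repo guarantees.
func resolveOutputURL(arguments: [String],
                      defaultFileName: String = "CustomLMData.bin") -> URL {
    if arguments.count > 1, !arguments[1].isEmpty {
        return URL(fileURLWithPath: arguments[1])
    }
    // No path given: fall back to a file in the current working directory.
    return URL(fileURLWithPath: FileManager.default.currentDirectoryPath)
        .appendingPathComponent(defaultFileName)
}

let outputURL = resolveOutputURL(arguments: CommandLine.arguments)
print("Writing model data to \(outputURL.path)")
```

Keeping the parsing in a small function makes the fallback behavior easy to check without running the whole tool.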
var builder = SpeechTrainingBuilder()

// 1. Add custom vocabulary with pronunciations
builder.addPhrase("Winawer", count: 100, pronunciation: "wIn'aU@r")

// 2. Add context phrases
builder.addPhrase("Play the Winawer variation", count: 50)

// 3. Create templates for common patterns
builder.addTemplate(
    classes: [
        "prefix": ["Let's play", "Consider"],
        "opening": ["the Winawer"],
        "suffix": ["variation", "defense"]
    ],
    template: "<prefix> <opening> <suffix>",
    count: 500
)
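To see which phrases a template like the one above stands for, it can help to expand it by hand. The sketch below is only an illustration of the combinations: `expandTemplate` is a hypothetical helper, and the real expansion (including how the count is applied) is done by the Speech framework, not by this code.

```swift
import Foundation

/// Hypothetical helper: expands a template such as "<prefix> <opening> <suffix>"
/// into every phrase obtained by substituting each class's values for its
/// `<name>` placeholder.
func expandTemplate(_ template: String, classes: [String: [String]]) -> [String] {
    var phrases = [template]
    // Sort class names so the output order is deterministic.
    for name in classes.keys.sorted() {
        let placeholder = "<\(name)>"
        guard let values = classes[name], template.contains(placeholder) else { continue }
        // Replace the placeholder in every partial phrase with every value.
        phrases = phrases.flatMap { phrase in
            values.map { phrase.replacingOccurrences(of: placeholder, with: $0) }
        }
    }
    return phrases
}

let expanded = expandTemplate(
    "<prefix> <opening> <suffix>",
    classes: [
        "prefix": ["Let's play", "Consider"],
        "opening": ["the Winawer"],
        "suffix": ["variation", "defense"]
    ]
)
for phrase in expanded {
    print(phrase)
}
```

With two prefixes, one opening, and two suffixes, the template covers four distinct phrases, which is why a single template can stand in for many hand-written `addPhrase` calls.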
When building a speech model:
- Start with custom vocabulary and their pronunciations
- Add common phrases that use these words
- Create templates that combine custom and standard vocabulary
- Include variations of how people naturally speak these phrases
This helps the model understand both the pronunciation and context of your custom vocabulary.
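For reference, the same ordering (vocabulary, then phrases, then templates) can be written directly against Apple's SFCustomLanguageModelData result-builder API. This is a hedged sketch based on Apple's published sample for custom language models, not a description of how this tool is implemented: the function name, the identifier `com.example.winawer`, and the locale are assumptions, and the space-separated phoneme string follows Apple's sample rather than this README's pronunciation format.

```swift
import Foundation

// Phonemes for "Winawer" as space-separated X-SAMPA symbols (assumed format,
// following Apple's sample code).
let winawerPhonemes = ["w I n aU @r"]

#if canImport(Speech)
import Speech

/// Hypothetical function: builds a small custom language model and exports it.
@available(macOS 14, iOS 17, *)
func exportWinawerModel(to url: URL) async throws {
    let data = SFCustomLanguageModelData(
        locale: Locale(identifier: "en_US"),
        identifier: "com.example.winawer", // hypothetical identifier
        version: "1.0"
    ) {
        // 1. Custom vocabulary with its pronunciation
        SFCustomLanguageModelData.CustomPronunciation(
            grapheme: "Winawer",
            phonemes: winawerPhonemes
        )
        // 2. A context phrase that uses the word
        SFCustomLanguageModelData.PhraseCount(
            phrase: "Play the Winawer variation",
            count: 50
        )
        // 3. A template combining custom and standard vocabulary
        SFCustomLanguageModelData.PhraseCountsFromTemplates(classes: [
            "prefix": ["Let's play", "Consider"],
            "opening": ["the Winawer"],
            "suffix": ["variation", "defense"]
        ]) {
            SFCustomLanguageModelData.TemplatePhraseCountGenerator.Template(
                "<prefix> <opening> <suffix>",
                count: 500
            )
        }
    }
    // Writes the training data file to the given URL.
    try await data.export(to: url)
}
#endif
```

The `#if canImport(Speech)` guard keeps the file compiling on platforms without the Speech framework; on macOS 14 or later the function can be called from an async context.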
[Coming Soon: Link to SpeechRecognizerService repo for easy integration]
- iOS 17.0+
- macOS 14.0+
- Xcode 15.0+
- Swift 5.9+
For more information about speech recognition in iOS, see Apple's documentation for SFSpeechRecognizer and SFCustomLanguageModelData.
The X-SAMPA pronunciation strings can be tricky to get right. Here are some tips:
- Use AI tools to help generate the strings
- Common patterns:
  - Stress mark: ' before the stressed syllable (note that the X-SAMPA standard itself writes primary stress as ")
  - Syllable boundary: .
  - Schwa sound: @
  - Long vowels: add : after the vowel
  - Example: 'tEm.poU for "tempo"
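A quick way to sanity-check a pronunciation string is to split it at the syllable boundaries and see which syllable carries the stress mark. The helper below is a hypothetical sketch for that purpose only: it accepts either ' or " as a stress mark, looks for the mark only at the start of a syllable, and does not validate the phoneme symbols themselves.

```swift
/// Hypothetical helper: splits an X-SAMPA string at "." and reports, for each
/// syllable, whether it starts with a stress mark (' or ").
func syllables(in xsampa: String) -> [(text: String, stressed: Bool)] {
    xsampa.split(separator: ".").map { raw -> (text: String, stressed: Bool) in
        let stressed = raw.hasPrefix("'") || raw.hasPrefix("\"")
        // Drop the stress mark so only the phoneme symbols remain.
        let text = stressed ? String(raw.dropFirst()) : String(raw)
        return (text: text, stressed: stressed)
    }
}

for syllable in syllables(in: "'tEm.poU") {
    print(syllable.text, syllable.stressed ? "(stressed)" : "")
}
```

For the "tempo" example this yields two syllables, `tEm` (stressed) and `poU` (unstressed).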
MIT License