Developer guide

Adding PySpeech

You'll need to download the latest release and add PySpeech.dll as a reference in your project.

Once that's done, you'll want to add it as a BepInDependency, like so:

[BepInDependency("JS03.PySpeech")]
public class YourPlugin : BaseUnityPlugin {
    // ...
}

Alternatively, you can add it as a soft dependency if your mod does not rely on speech recognition to work:

[BepInDependency("JS03.PySpeech", BepInDependency.DependencyFlags.SoftDependency)]
public class YourPlugin : BaseUnityPlugin {
    // ...
}
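
With a soft dependency, your plugin still loads when PySpeech is missing, so it helps to check for it before touching the Speech class. The sketch below is one common BepInEx 5 pattern, not something this guide prescribes: it looks up the PySpeech GUID in Chainloader.PluginInfos and keeps every PySpeech call inside a separate helper. The plugin metadata and the SpeechCompat helper are hypothetical names, and the using directive for the namespace that contains Speech is left out because this guide does not state it.

```csharp
using System.Runtime.CompilerServices;
using BepInEx;
using BepInEx.Bootstrap;
// Also add the using directive for the namespace that contains Speech (not shown in this guide).

[BepInPlugin("your.name.yourplugin", "YourPlugin", "1.0.0")] // hypothetical metadata
[BepInDependency("JS03.PySpeech", BepInDependency.DependencyFlags.SoftDependency)]
public class YourPlugin : BaseUnityPlugin {
    private void Awake() {
        // Chainloader.PluginInfos is keyed by plugin GUID.
        if (Chainloader.PluginInfos.ContainsKey("JS03.PySpeech")) {
            SpeechCompat.Register();
        } else {
            Logger.LogInfo("PySpeech not found; voice features are disabled.");
        }
    }
}

// Hypothetical helper: the only type in the mod that references PySpeech.
internal static class SpeechCompat {
    // NoInlining keeps this body out of Awake, so PySpeech types are only
    // resolved when Register is actually called.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Register() {
        Speech.RegisterPhrases(new string[] { "cool phrase" });
    }
}
```

Keeping the PySpeech calls in their own type means the rest of the mod never needs PySpeech.dll to be present at runtime.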

Voice Commands

Registering Phrases

To register phrases, all you have to do is call Speech.RegisterPhrases:

string[] phrases = new string[] { "cool phrase", "another cool phrase" };
Speech.RegisterPhrases(phrases);

You can call Speech.RegisterPhrases whenever and wherever you'd like; there are no limitations in that regard.

Custom `SpeechRecognized` event

Creating your own SpeechRecognized event allows you to execute whatever code you want every time the Whisper model gives output.

Here's a simple example of how to set one up:

Speech.RegisterCustomHandler((obj, recognized) => {
    Plugin.logger.LogInfo($"Whisper output: {recognized.Text}");
});

These can be set up anywhere, just like Speech.RegisterPhrases.

Similarity threshold

When the Whisper model gives output and any phrases have been registered, the API runs a case-insensitive similarity check between the recognized text and every registered phrase to find the phrase with the highest similarity score.
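
To picture what a case-insensitive best-match check looks like, here is a minimal standalone sketch. This guide does not say which similarity metric PySpeech uses, so the normalized Levenshtein ratio below is an assumption chosen only for illustration, and SimilaritySketch and its methods are hypothetical names that are not part of the PySpeech API.

```csharp
using System;

// Illustration only: NOT PySpeech's implementation. The metric is an assumption.
public static class SimilaritySketch {
    // Levenshtein edit distance, computed with two rolling rows.
    public static int EditDistance(string a, string b) {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++) {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++) {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Case-insensitive score in [0, 1]; 1 means the strings are identical.
    public static float Score(string a, string b) {
        a = a.ToLowerInvariant();
        b = b.ToLowerInvariant();
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 1f;
        return 1f - (float)EditDistance(a, b) / longest;
    }

    // Returns the phrase with the highest score, or null if phrases is empty.
    public static string BestMatch(string recognized, string[] phrases, out float bestScore) {
        string best = null;
        bestScore = -1f;
        foreach (string phrase in phrases) {
            float score = Score(recognized, phrase);
            if (score > bestScore) { bestScore = score; best = phrase; }
        }
        return best;
    }
}
```

Under this assumed metric, a slightly misheard phrase still scores close to 1 against the phrase it resembles, which is why a threshold rather than an exact comparison is used.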

To check whether one of your phrases was said, use Speech.IsAboveThreshold(phrases, similarityThreshold). It tells you whether the best match belongs to your specific set of phrases and whether its similarity score is above your desired threshold.

Here's an example using Speech.IsAboveThreshold:

string[] phrases = new string[] { "test phrase" };
Speech.RegisterPhrases(phrases);
Speech.RegisterCustomHandler((obj, recognized) => {

    // Check if your phrases contain the best match and its similarity score is above 0.5f
    if (Speech.IsAboveThreshold(phrases, 0.5f)) {
        Plugin.logger.LogInfo("Test phrase was said!");
    }
});
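
The same check can be used to tell several commands apart by keeping one phrase array per command. The snippet below is a sketch that uses only the calls shown in this guide; the phrases and the 0.7f threshold are arbitrary examples, and it assumes that repeated Speech.RegisterPhrases calls add to the registered phrases rather than replace them.

```csharp
// Hypothetical command phrases, one array per command.
string[] openPhrases = new string[] { "open the door", "open door" };
string[] closePhrases = new string[] { "close the door", "close door" };

// Assumption: each call adds to the set of registered phrases.
Speech.RegisterPhrases(openPhrases);
Speech.RegisterPhrases(closePhrases);

Speech.RegisterCustomHandler((obj, recognized) => {
    // Only the set that contains the best match can pass its check.
    if (Speech.IsAboveThreshold(openPhrases, 0.7f)) {
        Plugin.logger.LogInfo("Open command recognized");
    } else if (Speech.IsAboveThreshold(closePhrases, 0.7f)) {
        Plugin.logger.LogInfo("Close command recognized");
    }
});
```

A higher threshold reduces false triggers but may miss commands, especially with the smaller models, so it is worth testing the value in game.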

Real time speech transcription

Thanks to Whisper, PySpeech lets you capture what the player is saying in real time for as long as they are playing.

If you're not interested in voice commands and would instead like to do real-time speech transcription, you just have to create your own SpeechRecognized event as shown in the Voice Commands section. There is no need to register phrases or do similarity checks:

Speech.RegisterCustomHandler((obj, recognized) => {
    Plugin.logger.LogInfo($"Whisper output: {recognized.Text}");
});

Models

PySpeech offers 3 different Whisper models players can choose from:

Tiny: small, fast, not very accurate.
Base: medium, not as fast, more accurate.
Small: big, slow, very accurate.

If you'd like players to use a specific model, say so in your mod description.

The default is the Tiny model.

Languages

Players can also choose the language they want Whisper to recognize from the 100 supported languages, which are listed here.

There is also a Multilingual option that allows for real-time language identification.
