Update voice-settings.mdx

willfrey · Sep 8, 2023 · 3aa11f3
1 parent f395212
commit 3aa11f3
Showing 1 changed file with 7 additions and 11 deletions.
diff --git a/speech-synthesis/voice-settings.mdx b/speech-synthesis/voice-settings.mdx
@@ -4,29 +4,25 @@ description: "A guide on using stability, similarity sliders for tailored voice
 ---
 
 
-Our users have found different workflows that suit them. The one you'll see most often is setting stability around 50 and similarity near 80, with minimal changes thereafter. Of course, this all depends on the original voice and the style of performance you're aiming for.
+Our users have found different workflows that work for them. The one you'll see most often is setting stability around 50 and similarity near 80, with minimal changes thereafter. Of course, this all depends on the original voice and the style of performance you're aiming for.
 
-The AI is non-deterministic, which means that each time you press generate, you will get slightly different performance, even with the exact same settings.
+It's important to note that the AI is non-deterministic; setting the sliders to specific values won't guarantee the same results every time. Instead, the sliders function more as a range, determining how wide the randomization can be between each generation. Setting stability low means a wider range of randomization, often resulting in a more emotive performance, but this is also highly dependent on the voice itself.
+
+Hovering over the `!` icon next to the sliders will provide additional information.
 
 For a more lively and dramatic performance, it is recommended to set the stability slider lower and generate a few times until you find a performance you like.
 
 On the other hand, if you want a more serious performance, even bordering on monotone on very high values, it is recommended to set the stability slider higher. And since it's more consistent and stable, you usually don't need to do as many generations to get what you are looking for. Experiment to find what works best for you!
 
-Some users have taken it a step further with the API, making the sliders dynamic based on text length.
-
-It's important to note that the AI is non-deterministic; setting the sliders to a specific values won't guarantee the same results every time. Instead, the sliders function more as a range, determining how wide the randomization can be between each generation. Setting stability low means a wider range of randomization, often resulting in a more emotive performance, but this is also highly dependent on the voice itself.
-
-Hovering over `!` icon next to the sliders will provide additional information.
-
 
 ## Stability
 
-The stability slider determines how stable the voice is and the randomness of each new generation. Lowering this slider introduces a broader emotional range for the character - this, as mentioned before, is also influenced heavily by the original voice. Setting the slider too low may result in odd performances that are overly random and cause the character to speak too quickly. On the other hand, setting it too high can lead to a monotonous voice with limited emotion.
+The stability slider determines how stable the voice is and the randomness between each generation. Lowering this slider introduces a broader emotional range for the voice. As mentioned before, this is also influenced heavily by the original voice. Setting the slider too low may result in odd performances that are overly random and cause the character to speak too quickly. On the other hand, setting it too high can lead to a monotonous voice with limited emotion.
 
 
 ## Similarity
 
-The similarity slider dictates how closely the AI should adhere to the original voice when attempting to replicate it. If the original audio is of poor quality and the similarity slider is set too high, the AI may reproduce artefacts or background noise when trying to mimic the voice if those were present in the original recording.
+The similarity slider dictates how closely the AI should adhere to the original voice when attempting to replicate it. If the original audio is of poor quality and the similarity slider is set too high, the AI may reproduce artifacts or background noise when trying to mimic the voice if those were present in the original recording.
 
 
 ## Style Exaggeration
@@ -36,6 +32,6 @@ With the introduction of the newer models, we also added a style exaggeration se
 In general, we recommend keeping this setting at 0 at all times.
 
 
-## Speaker boost
+## Speaker Boost
 
 This is another setting that was introduced in the new models. The setting itself is quite self-explanatory – it boosts the similarity to the original speaker. However, using this setting requires a slightly higher computational load, which in turn increases latency. The differences introduced by this setting are generally rather subtle.
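
Reviewer note (not part of the diff): the settings described in this file could be passed to an API roughly as sketched below. This is a minimal sketch under stated assumptions, not the documented interface. The field names `stability`, `similarity_boost`, `style`, and `use_speaker_boost`, the 0.0 to 1.0 scale, and the `from_sliders` helper are all hypothetical and should be checked against the API reference. The defaults follow the guide: stability around 50, similarity near 80, style exaggeration at 0. Speaker boost defaults to off here only because the guide says it adds latency.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VoiceSettings:
    # Field names and the 0.0-1.0 scale are assumptions for illustration.
    stability: float = 0.5
    similarity_boost: float = 0.8
    style: float = 0.0
    use_speaker_boost: bool = False


def from_sliders(stability_pct=50, similarity_pct=80, style_pct=0,
                 speaker_boost=False):
    """Convert 0-100 slider positions into a settings dict (hypothetical helper)."""
    sliders = {
        "stability": stability_pct,
        "similarity": similarity_pct,
        "style": style_pct,
    }
    for name, value in sliders.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} slider must be between 0 and 100, got {value}")
    settings = VoiceSettings(
        stability=stability_pct / 100,
        similarity_boost=similarity_pct / 100,
        style=style_pct / 100,
        use_speaker_boost=speaker_boost,
    )
    return asdict(settings)


# A livelier read: lower stability, then generate a few takes and pick one.
lively = from_sliders(stability_pct=30)
```

Because generation is non-deterministic, identical settings will still produce different takes. The dict only fixes the range that the randomization works within.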