Feat: Style text: make emotions and style similar to the style text by mixing bert #240

litagin02 · 2023-12-16T03:55:53Z

This adds a new emotion and style reference, "style text".
It mixes the BERT features of the original text with the mean of the BERT features of the style text. The resulting voice reads the original text, but with the nuance, emotion, and tone with which the style text would be read.
For example, the original text "This is a test of style text." with the style text "Oh my god, I'm very sad, disappointed..." yields a reading of the original text with a very sad emotion.
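The mixing described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: the function names `mean_vector` and `mix_bert`, the `style_weight` parameter, and the use of plain Python lists instead of tensors are all assumptions made for the example. It assumes each token of the original text has one feature vector, and that the style text contributes only its mean vector.

```python
from typing import List

Vector = List[float]


def mean_vector(features: List[Vector]) -> Vector:
    """Average a sequence of per-token feature vectors into a single vector."""
    if not features:
        raise ValueError("features must contain at least one vector")
    dim = len(features[0])
    return [sum(row[d] for row in features) / len(features) for d in range(dim)]


def mix_bert(
    text_features: List[Vector],
    style_features: List[Vector],
    style_weight: float,
) -> List[Vector]:
    """Blend every token vector of the original text with the style mean.

    style_weight is a hypothetical knob: 0.0 keeps the original features,
    1.0 replaces every token vector with the style mean.
    """
    if not 0.0 <= style_weight <= 1.0:
        raise ValueError("style_weight must be between 0 and 1")
    style_mean = mean_vector(style_features)
    mixed = []
    for row in text_features:
        if len(row) != len(style_mean):
            raise ValueError("text and style features must share a dimension")
        mixed.append(
            [(1.0 - style_weight) * x + style_weight * s
             for x, s in zip(row, style_mean)]
        )
    return mixed


# Toy 2-dimensional features; real BERT features are much wider.
text = [[0.0, 2.0], [4.0, 6.0]]
style = [[1.0, 1.0], [3.0, 5.0]]
mixed = mix_bert(text, style, style_weight=0.5)
```

Because only the mean of the style features is used, the style text can have any length; the output keeps one vector per token of the original text.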

I'm Japanese, so I mainly tested with my own Japanese models, and this emotion reference seems to work well (whereas the text prompt and audio prompt are not very effective for models fine-tuned from the pretrained 2.2 models).
I also tested Chinese and English using the pretrained 2.2 model, and the results look good.
I added this feature for the 2.1 and 2.2 models and didn't touch older models (older models simply ignore the style text).
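One way to picture the "older models ignore style text" behavior is a small gate in front of inference. The sketch below is hypothetical: the version strings, the `STYLE_TEXT_VERSIONS` set, and the `effective_style_text` helper are invented for illustration and are not taken from the repository.

```python
from typing import Optional

# Assumption: model versions are identified by strings such as "2.1" or "2.2".
STYLE_TEXT_VERSIONS = {"2.1", "2.2"}


def effective_style_text(model_version: str, style_text: Optional[str]) -> Optional[str]:
    """Return the style text to use, or None when it should be ignored."""
    if model_version not in STYLE_TEXT_VERSIONS:
        # Older models: silently drop the style text instead of raising.
        return None
    # Treat an empty string the same as "no style text given".
    return style_text or None
```

With this shape, callers can always pass the style text along and let the gate decide whether it has any effect.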

(Judging from my experiments, BERT seems to carry a lot of information about emotion and reading style.)

feat: update fastapi.py. Add more error log information

fix: update fastapi.py. Adapt to 2.2 reference

* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Delete attentions_onnx.py
* Delete models_onnx.py
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update __init__.py
* Update __init__.py
* Update __init__.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Stardust-minus · 2023-12-16T03:59:52Z

Interesting, we will test it.
Thanks for your great idea!

…y mixing bert (#240) (#241)

* fix:(oldVersion210) Load on demand Emotion model
* feat: update fastapi.py. Add more error log information
* Switch pyopenjtalk to pyopenjtalk-prebuilt
* fix: update fastapi.py. Adapt to 2.2 reference
* Update resample.py
* Fix ONNX export bug (#237)
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Delete attentions_onnx.py
* Delete models_onnx.py
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update __init__.py
* Update __init__.py
* Update __init__.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

* Fix onnx
* Format export
* Feat: style-text and bert mixing (JA only)
* Ensure the same tensor shape
* Update
* update gradio version
* Fix
* Style text for chinese and english (ver 2.2)
* Style text for chinese and english (ver 2.1)
* Style text in FastAPI
* Translate style text desc in chinese

---------

Co-authored-by: litagin02 <139731664+litagin02@users.noreply.github.com>
Co-authored-by: Sora <654163754@qq.com>
Co-authored-by: Sihan Wang <wangsihan1995@gmail.com>
Co-authored-by: Ναρουσέ·μ·γιουμεμί·Χινακάννα <40709280+NaruseMioShirakana@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Fix inputs of duration discriminator
* Add LSTM
* Update models.py
* Update tensorboard scalar
* Noise injection for minimizing modality gap
* Update infer.py
* support bf16 run
* del unused_para flag
* support bf16 config
* add grad clip
* fix(logger and grad):add dur grad,fix grad clip
* Update webui_preprocess.py
* Fix English G2P
* fix(bert_gen):add pass
* Pass SDP to DD
* Update webui_preprocess.py
* Update config.json
* Update webui.py
* Update chinese_bert.py
* Upload webui for deploy
* Update webui.py
* torch.save as pt not npy
* Update config.json
* add freeze emo vq
* Update webui_preprocess.py
* Fix tone_sandhi.py
* Comment up grad clip
* Fix in-place addition
* Add SLM discriminator
* Add DDP for WD
* Feat: Style text: make emotions and style similar to the style text by mixing bert (#240) (#241)
* fix:(oldVersion210) Load on demand Emotion model
* feat: update fastapi.py. Add more error log information
* Switch pyopenjtalk to pyopenjtalk-prebuilt
* fix: update fastapi.py. Adapt to 2.2 reference
* Update resample.py
* Fix ONNX export bug (#237)
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Delete attentions_onnx.py
* Delete models_onnx.py
* Add files via upload
* Add files via upload
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update __init__.py
* Update __init__.py
* Update __init__.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

* Fix onnx
* Format export
* Feat: style-text and bert mixing (JA only)
* Ensure the same tensor shape
* Update
* update gradio version
* Fix
* Style text for chinese and english (ver 2.2)
* Style text for chinese and english (ver 2.1)
* Style text in FastAPI
* Translate style text desc in chinese

---------

Co-authored-by: litagin02 <139731664+litagin02@users.noreply.github.com>
Co-authored-by: Sora <654163754@qq.com>
Co-authored-by: Sihan Wang <wangsihan1995@gmail.com>
Co-authored-by: Ναρουσέ·μ·γιουμεμί·Χινακάννα <40709280+NaruseMioShirakana@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Remove CLAP
* Revert "Remove CLAP" This reverts commit 62fd59b. Revert
* Remove CLAP
* bf16 audo grad cilp
* Update webui and infer utils
* Update webui.py
* Update webui.py
* Update webui-preprocess.py
* Update webui_preprocess.py

---------

Co-authored-by: Sihan Wang <wangsihan1995@gmail.com>
Co-authored-by: OedoSoldier <31711261+OedoSoldier@users.noreply.github.com>
Co-authored-by: litagin02 <139731664+litagin02@users.noreply.github.com>
Co-authored-by: Sora <654163754@qq.com>
Co-authored-by: Ναρουσέ·μ·γιουμεμί·Χινακάννα <40709280+NaruseMioShirakana@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Stardust-minus and others added 20 commits December 13, 2023 16:25

fix:(oldVersion210) Load on demand Emotion model

e220e63

feat: update fastapi.py. Add more error log information

252ddd5

Merge pull request fishaudio#235 from jiangyuxiaoxiao/master

2f1cee0

feat: update fastapi.py. Add more error log information

Switch pyopenjtalk to pyopenjtalk-prebuilt

dbc2e9a

fix: update fastapi.py. Adapt to 2.2 reference

9e4cc4c

Merge pull request fishaudio#236 from jiangyuxiaoxiao/master

7a2053e

fix: update fastapi.py. Adapt to 2.2 reference

Update resample.py

bf9ab35

Fix onnx

203638b

Format export

8d6307f

Feat: style-text and bert mixing (JA only)

9c69949

Ensure the same tensor shape

99f7aab

Update

daf2776

update gradio version

21e9698

Fix

120564a

Merge branch 'master' into style-text

9edc071

Style text for chinese and english (ver 2.2)

99b261b

Style text for chinese and english (ver 2.1)

88aecdc

Style text in FastAPI

8fb80dc

Translate style text desc in chinese

0f24236

Stardust-minus changed the base branch from master to dev-bert-style December 16, 2023 04:05

Merge branch 'dev-bert-style' into style-text

96d9031

Stardust-minus merged commit 3eb0630 into fishaudio:dev-bert-style Dec 16, 2023
1 of 2 checks passed

Stardust-minus mentioned this pull request Dec 16, 2023

Feat: Style text: make emotions and style similar to the style text b… #241

Merged

litagin02 deleted the style-text branch December 20, 2023 01:01
