Feat: Style text: make emotions and style similar to the style text by mixing bert #240

Merged
merged 21 commits into from
Dec 16, 2023

litagin02
Contributor

This adds a new emotion and style reference, "style text".
It mixes the BERT features of the original text with the mean of the BERT features of the style text, so the resulting voice reads the original text but with the nuance, emotion, and tone of a reading of the style text.
For example, the original text "This is a test of style text." with the style text "Oh my god, I'm very sad, disappointed..." yields a reading of the original text with a very sad emotion.
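
The mixing described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names, the plain-list representation (the real features would be tensors), and the `style_weight` parameter are all assumptions made for the example. It assumes a linear blend between each token vector of the original text and the token-averaged vector of the style text.

```python
from typing import List

Matrix = List[List[float]]  # one row per token, one column per feature


def mean_over_tokens(features: Matrix) -> List[float]:
    """Average a (tokens x dim) feature matrix over its token axis."""
    if not features:
        raise ValueError("style features must contain at least one token")
    n_tokens = len(features)
    dim = len(features[0])
    return [sum(row[d] for row in features) / n_tokens for d in range(dim)]


def mix_bert(original: Matrix, style: Matrix, style_weight: float = 0.5) -> Matrix:
    """Blend every original token vector with the mean style vector.

    style_weight is a hypothetical knob for this sketch: 0.0 keeps the
    original features, 1.0 replaces each token vector with the style mean.
    """
    if not 0.0 <= style_weight <= 1.0:
        raise ValueError("style_weight must be in [0, 1]")
    style_mean = mean_over_tokens(style)
    if original and len(original[0]) != len(style_mean):
        raise ValueError("original and style features must share a dimension")
    # The output keeps the original token count, so it still aligns with
    # the original text; only the feature values move toward the style mean.
    return [
        [(1.0 - style_weight) * x + style_weight * m for x, m in zip(row, style_mean)]
        for row in original
    ]


# Toy features: 3 tokens of original text, 2 tokens of style text, 2 dims.
original = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
style = [[4.0, 2.0], [2.0, 4.0]]
mixed = mix_bert(original, style, style_weight=0.5)
print(mixed)
```

Because the style text is reduced to a single mean vector, its length does not need to match the original text, and the mixed features keep one row per original token.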

I'm Japanese, so I mainly tested with my own Japanese models, and this emotion reference seems to work well (whereas the text prompt and audio prompt are not very effective for models fine-tuned from the pretrained 2.2 models).
I also tested Chinese and English using the pretrained 2.2 model, and the results look good.
I added this feature for the 2.1 and 2.2 models and didn't touch older models (older models simply ignore the style text).

(Based on my experiments, BERT features seem to carry a lot of information about emotion and reading style.)

Stardust-minus and others added 20 commits December 13, 2023 16:25
feat: update fastapi.py. Add more error log information
fix: update fastapi.py. Adapt to 2.2 reference
* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add files via upload

* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Delete attentions_onnx.py

* Delete models_onnx.py

* Add files via upload

* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update __init__.py

* Update __init__.py

* Update __init__.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@Stardust-minus
Member

Interesting, we will test it.
Thanks for your great idea!

@Stardust-minus Stardust-minus changed the base branch from master to dev-bert-style December 16, 2023 04:05
@Stardust-minus Stardust-minus merged commit 3eb0630 into fishaudio:dev-bert-style Dec 16, 2023
1 of 2 checks passed
Stardust-minus added a commit that referenced this pull request Dec 16, 2023
…y mixing bert (#240) (#241)

* fix:(oldVersion210) Load on demand Emotion model

* feat: update fastapi.py. Add more error log information

* Switch pyopenjtalk to pyopenjtalk-prebuilt

* fix: update fastapi.py. Adapt to 2.2 reference

* Update resample.py

* Fix ONNX export bug (#237)

* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add files via upload

* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Delete attentions_onnx.py

* Delete models_onnx.py

* Add files via upload

* Add files via upload

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update __init__.py

* Update __init__.py

* Update __init__.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------



* Fix onnx

* Format export

* Feat: style-text and bert mixing (JA only)

* Ensure the same tensor shape

* Update

* update gradio version

* Fix

* Style text for chinese and english (ver 2.2)

* Style text for chinese and english (ver 2.1)

* Style text in FastAPI

* Translate style text desc in chinese

---------

Co-authored-by: litagin02 <139731664+litagin02@users.noreply.github.com>
Co-authored-by: Sora <654163754@qq.com>
Co-authored-by: Sihan Wang <wangsihan1995@gmail.com>
Co-authored-by: Ναρουσέ·μ·γιουμεμί·Χινακάννα <40709280+NaruseMioShirakana@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Stardust-minus added a commit that referenced this pull request Dec 19, 2023
* Fix inputs of duration discriminator

* Add LSTM

* Update models.py

* Update tensorboard scalar

* Noise injection for minimizing modality gap

* Update infer.py

* support bf16 run

* del unused_para flag

* support bf16 config

* add grad clip

* fix(logger and grad):add dur grad,fix grad clip

* Update webui_preprocess.py

* Fix English G2P

* fix(bert_gen):add pass

* Pass SDP to DD

* Update webui_preprocess.py

* Update config.json

* Update webui.py

* Update chinese_bert.py

* Upload webui for deploy

* Update webui.py

* torch.save as pt not npy

* Update config.json

* add freeze emo vq

* Update webui_preprocess.py

* Fix tone_sandhi.py

* Comment up grad clip

* Fix in-place addition

* Add SLM discriminator

* Add DDP for WD

* Feat: Style text: make emotions and style similar to the style text by mixing bert (#240) (#241)

* Remove CLAP

* Revert "Remove CLAP"

This reverts commit 62fd59b.

Revert

* Remove CLAP

* bf16 audo grad cilp

* Update webui and infer utils

* Update webui.py

* Update webui.py

* Update webui-preprocess.py

* Update webui_preprocess.py

---------

Co-authored-by: Sihan Wang <wangsihan1995@gmail.com>
Co-authored-by: OedoSoldier <31711261+OedoSoldier@users.noreply.github.com>
Co-authored-by: litagin02 <139731664+litagin02@users.noreply.github.com>
Co-authored-by: Sora <654163754@qq.com>
Co-authored-by: Ναρουσέ·μ·γιουμεμί·Χινακάννα <40709280+NaruseMioShirakana@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@litagin02 litagin02 deleted the style-text branch December 20, 2023 01:01
5 participants