Commit

Updata Readme.md
ylzz1997 committed May 22, 2023
1 parent 95ea8a8 commit 28dd4fa
Showing 3 changed files with 8 additions and 8 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -41,10 +41,10 @@ This project is only a framework project, which does not have the function of sp

The singing voice conversion model uses the SoftVC content encoder to extract speech features from the source audio. The feature vectors are then fed directly into VITS instead of being converted to a text-based intermediate, so pitch and intonation are preserved. Additionally, the vocoder is changed to [NSF HiFiGAN](https://github.com/openvpi/DiffSinger/tree/refactor/modules/nsf_hifigan) to solve the problem of sound interruption.

### 🆕 4.0-Vec768-Layer12 Version Update Content
### 🆕 4.1-Stable Version Update Content

- Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec); this branch is not compatible with 4.0 models
- Shallow diffusion is updated; you can use the shallow diffusion model to improve the sound quality
- Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec), and is compatible with the 4.0 branch.
- Shallow diffusion is updated; you can use the shallow diffusion model to improve the sound quality.

### 🆕 Questions about compatibility with the 4.0 model

@@ -53,7 +53,7 @@ The singing voice conversion model uses SoftVC content encoder to extract source
```
"model": {
.........
"ssl_dim": 768,
"ssl_dim": 256,
"n_speakers": 200,
"speech_encoder":"vec256l9"
}
6 changes: 3 additions & 3 deletions README_zh_CN.md
@@ -39,9 +39,9 @@

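The singing voice conversion model uses the SoftVC content encoder to extract speech features from the source audio and feeds them, together with F0, into VITS in place of the original text input to achieve singing voice conversion. In addition, the vocoder is replaced with [NSF HiFiGAN](https://github.com/openvpi/DiffSinger/tree/refactor/modules/nsf_hifigan) to solve the problem of sound interruption.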

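### 🆕 4.0-Vec768-Layer12 Version Update Content
### 🆕 4.1-Stable Version Update Content

+ Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec)
+ Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec), and is compatible with the 4.0 branch
+ Shallow diffusion is updated; the shallow diffusion model can be used to improve sound quality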

### 🆕 Questions about compatibility with the 4.0 model
@@ -51,7 +51,7 @@
```
"model": {
.........
"ssl_dim": 768,
"ssl_dim": 256,
"n_speakers": 200,
"speech_encoder":"vec256l9"
}
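
The changed `model` fragment pairs `"ssl_dim": 256` with `"speech_encoder":"vec256l9"`. As a minimal sketch of how an existing config could be brought in line with that pairing, the Python below patches a config dictionary so the two fields agree. The helper `make_compatible` and the `ENCODER_DIMS` table are hypothetical and not part of the repository; only the `vec256l9` and 256 pairing comes from the diff, while the `vec768l12` entry is an assumption inferred from the branch name `4.0-Vec768-Layer12`.

```python
import json

# Encoder name -> feature dimension.
# Only "vec256l9" -> 256 appears in the diff; "vec768l12" -> 768 is an
# assumption inferred from the branch name 4.0-Vec768-Layer12.
ENCODER_DIMS = {"vec256l9": 256, "vec768l12": 768}


def make_compatible(config: dict, encoder: str = "vec256l9") -> dict:
    """Return a copy of `config` whose "model" section names the speech
    encoder and carries the matching ssl_dim (hypothetical helper)."""
    if encoder not in ENCODER_DIMS:
        raise ValueError(f"unknown speech_encoder: {encoder!r}")
    # Round-trip through JSON to deep-copy plain JSON-like data.
    patched = json.loads(json.dumps(config))
    model = patched.setdefault("model", {})
    model["speech_encoder"] = encoder
    model["ssl_dim"] = ENCODER_DIMS[encoder]
    return patched


if __name__ == "__main__":
    # Stand-in for the "model" section of a loaded config.json.
    old = {"model": {"ssl_dim": 768, "n_speakers": 200}}
    print(json.dumps(make_compatible(old), indent=2))
```

Returning a patched copy rather than editing in place keeps the original config available for comparison; other keys in `model`, such as `n_speakers`, pass through unchanged.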
2 changes: 1 addition & 1 deletion sovits4_for_colab.ipynb
@@ -77,7 +77,7 @@
"\n",
"#@markdown\n",
"\n",
"!git clone https://github.com/svc-develop-team/so-vits-svc -b 4.0-Vec768-Layer12\n",
"!git clone https://github.com/svc-develop-team/so-vits-svc -b 4.1-Stable\n",
"%pip uninstall -y torchdata torchtext\n",
"%pip install --upgrade pip setuptools numpy numba\n",
"%pip install pyworld praat-parselmouth fairseq tensorboardX torchcrepe librosa==0.9.1 pyyaml pynvml pyloudnorm\n",
