Commit 60b950c

1. add huggingface SVS; 2. add inference logic from raw inputs; 3. fix typo in readme.
1 parent 774998c commit 60b950c

3 files changed: 16 additions & 10 deletions

docs/README-SVS.md

Lines changed: 9 additions & 9 deletions
@@ -7,7 +7,7 @@
 ## DiffSinger (SVS)
 
 ### PART1. [Run DiffSinger on PopCS](README-SVS-popcs.md)
-In PART1, we only focus on spectrum modeling (acoustic model) and assume the ground-truth (GT) F0 to be given as the pitch information following these papers [1][2][3]. If you want to conduct experiments on F0 prediction, please move to PART2.
+In PART1, we only focus on spectrum modeling (acoustic model) and assume the ground-truth (GT) F0 to be given as the pitch information following these papers [1][2][3]. If you want to conduct experiments with F0 prediction, please move to PART2.
 
 Thus, the pipeline of this part can be summarized as:
 
@@ -57,20 +57,20 @@ Thus, the pipeline of [2.B](README-SVS-opencpop-e2e.md) can be summarized as:
 Click here for detailed instructions: [link](README-SVS-opencpop-e2e.md).
 
 ### FAQ
-Q: Why do I need F0 in Vocoders?
+Q1: Why do I need F0 in Vocoders?
 
-A: See vocoder parts in HiFiSinger, DiffSinger or SingGAN. This is a common practice now.
+A1: See vocoder parts in HiFiSinger, DiffSinger or SingGAN. This is a common practice now.
 
-Q: Why not run MIDI version SVS on PopCS dataset? or Why not release MIDI labels for PopCS dataset?
+Q2: Why not run MIDI version SVS on PopCS dataset? or Why not release MIDI labels for PopCS dataset?
 
-A: Our laboratory has no funds to label PopCS dataset. But there are funds for labeling other singing datasets, which is coming soon.
+A2: Our laboratory has no funds to label PopCS dataset. But there are funds for labeling other singing dataset, which is coming soon.
 
-Q: Why " 'HifiGAN' object has no attribute 'model' "?
+Q3: Why " 'HifiGAN' object has no attribute 'model' "?
 
-A: Please put the pretrained vocoders in your `checkpoints` dictionary.
+A3: Please put the pretrained vocoders in your `checkpoints` dictionary.
 
-Q: How to check whether I use GT information or predicted information during inference from packed test set?
+Q4: How to check whether I use GT information or predicted information during inference from packed test set?
 
-A: Please see codes [here](https://github.com/MoonInTheRiver/DiffSinger/blob/55e2f46068af6e69940a9f8f02d306c24a940cab/tasks/tts/fs2.py#L343).
+A4: Please see codes [here](https://github.com/MoonInTheRiver/DiffSinger/blob/55e2f46068af6e69940a9f8f02d306c24a940cab/tasks/tts/fs2.py#L343).
 
 ...

inference/svs/base_svs_infer.py

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ def preprocess_word_level_input(self, inp):
 # 0 0 1
 if len(note_in_this_word) > 1: # is_slur = True, we should repeat the YUNMU to match the 2nd, 3rd... notes.
     for idx in range(1, len(note_in_this_word)):
-        ph_lst.append(ph_in_this_word[1])
+        ph_lst.append(ph_in_this_word[-1])
         note_lst.append(note_in_this_word[idx])
         midi_dur_lst.append(midi_dur_in_this_word[idx])
         is_slur.append(1)
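
The commit message does not explain the switch from index `1` to `-1`. One plausible reading is that a word whose phoneme list has a single entry (a final with no initial) has no element at index `1`, whereas `-1` always selects the last phoneme, which is the final (YUNMU) that the slur notes should repeat. The sketch below illustrates that reading only: `expand_word` is a hypothetical stand-alone helper, the per-phoneme loop before the slur branch is an assumption about the surrounding code, and the phoneme strings are illustrative.

```python
def expand_word(ph_in_this_word, note_in_this_word, midi_dur_in_this_word):
    """Hypothetical stand-alone version of the patched slur expansion."""
    ph_lst, note_lst, midi_dur_lst, is_slur = [], [], [], []

    # Assumption: every phoneme of the word is first paired with the word's first note.
    for ph in ph_in_this_word:
        ph_lst.append(ph)
        note_lst.append(note_in_this_word[0])
        midi_dur_lst.append(midi_dur_in_this_word[0])
        is_slur.append(0)

    # Patched branch from the diff: repeat the last phoneme for the 2nd, 3rd... notes.
    if len(note_in_this_word) > 1:
        for idx in range(1, len(note_in_this_word)):
            ph_lst.append(ph_in_this_word[-1])
            note_lst.append(note_in_this_word[idx])
            midi_dur_lst.append(midi_dur_in_this_word[idx])
            is_slur.append(1)

    return ph_lst, note_lst, midi_dur_lst, is_slur


# Two phonemes: index 1 and index -1 pick the same element.
two_ph = expand_word(["ch", "ang"], ["A#4/Bb4", "F#4/Gb4"], [0.509550, 0.183420])

# One phoneme: only index -1 is valid; index 1 would be out of range.
one_ph = expand_word(["ai"], ["A4", "F#4"], [0.4, 0.4])

try:
    ["ai"][1]
    old_index_fails = False
except IndexError:
    old_index_fails = True
```

For two-phoneme words both indices give the same result, so under this reading the change only matters for single-phoneme words sung across several notes.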

inference/svs/gradio/gradio_settings.yaml

Lines changed: 6 additions & 0 deletions
@@ -15,6 +15,12 @@ example_inputs:
   你 说 你 不 SP 懂 为 何 在 这 时 牵 手 AP<sep>D#4/Eb4 | D#4/Eb4 | D#4/Eb4 | D#4/Eb4 | rest | D#4/Eb4 | D4 | D4 | D4 | D#4/Eb4 | F4 | D#4/Eb4 | D4 | rest<sep>0.113740 | 0.329060 | 0.287950 | 0.133480 | 0.150900 | 0.484730 | 0.242010 | 0.180820 | 0.343570 | 0.152050 | 0.266720 | 0.280310 | 0.633300 | 0.444590
 - |-
   小酒窝长睫毛AP是你最美的记号<sep>C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4 | F#4/Gb4 C#4/Db4 | C#4/Db4 | rest | C#4/Db4 | A#4/Bb4 | G#4/Ab4 | A#4/Bb4 | G#4/Ab4 | F4 | C#4/Db4<sep>0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420 | 0.315400 0.235020 | 0.361660 | 0.223070 | 0.377270 | 0.340550 | 0.299620 | 0.344510 | 0.283770 | 0.323390 | 0.360340
+- |-
+  小酒窝长睫毛AP那是可爱猪宝宝<sep>C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4 | F#4/Gb4 C#4/Db4 | C#4/Db4 | rest | C#4/Db4 | A#4/Bb4 | G#4/Ab4 | A#4/Bb4 | G#4/Ab4 | F4 | C#4/Db4<sep>0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420 | 0.315400 0.235020 | 0.361660 | 0.223070 | 0.377270 | 0.340550 | 0.299620 | 0.344510 | 0.283770 | 0.323390 | 0.360340
+- |-
+  我真的SP爱你SP句句不轻易<sep>D4 | A4 | F#4 | rest | A4 | D4 | rest | B4 | A4 F#4 | F#4 | A4 | A4<sep>0.8 | 0.4 | 0.967 | 0.3 | 0.4 | 0.967 | 0.4 | 0.8 | 0.4 0.4 | 0.25 | 0.967 | 0.9
+- |-
+  好冷啊 AP 我在东北玩泥巴<sep>F4 | F4 | D4 | rest | D4 | D4 | C4 | C4 | B3 | C4 | D4<sep>0.5 | 0.3 | 0.3 | 0.3 | 0.2 | 0.2 | 0.2 | 0.2 | 0.25 | 0.25 | 0.4
 
 #inference_cls: inference.svs.ds_cascade.DiffSingerCascadeInfer
 #exp_name: 0303_opencpop_ds58_midi
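
Each example input above appears to pack three `<sep>`-separated fields: the lyric text, the notes, and the note durations in seconds. Within the note and duration fields, `|` separates syllables, and a syllable sung across several notes lists them separated by spaces. The following is a minimal sketch of a reader for that layout, inferred from the example lines alone; `parse_example` is a hypothetical name and not the repository's own preprocessing code.

```python
def parse_example(line):
    """Split one example input into text, per-syllable notes, and per-syllable durations."""
    text, notes, durs = line.split("<sep>")

    # "|" separates syllables; whitespace separates the notes of a slurred syllable.
    note_groups = [group.split() for group in notes.split("|")]
    dur_groups = [[float(d) for d in group.split()] for group in durs.split("|")]

    if len(note_groups) != len(dur_groups):
        raise ValueError("notes and durations cover a different number of syllables")
    for n_group, d_group in zip(note_groups, dur_groups):
        if len(n_group) != len(d_group):
            raise ValueError("a syllable has mismatched note and duration counts")

    return text.strip(), note_groups, dur_groups


# One of the example inputs added by this commit, copied verbatim.
line = (
    "我真的SP爱你SP句句不轻易<sep>"
    "D4 | A4 | F#4 | rest | A4 | D4 | rest | B4 | A4 F#4 | F#4 | A4 | A4<sep>"
    "0.8 | 0.4 | 0.967 | 0.3 | 0.4 | 0.967 | 0.4 | 0.8 | 0.4 0.4 | 0.25 | 0.967 | 0.9"
)
text, note_groups, dur_groups = parse_example(line)
slurred = [i for i, group in enumerate(note_groups) if len(group) > 1]
```

Here `slurred` holds the indices of syllables that carry more than one note, which are the cases handled by the slur branch in `preprocess_word_level_input`.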
