Support On-the-fly Features Extraction #145
Self reviewed.
LGTM, a lot of features incoming XD
The recipe should be updated to provide instructions for online feature extraction.
@lmxue Good advice. I plan to update the recipe in the future. This PR is to prepare a codebase for our recent internal research.
Nice work on supporting online feature extraction.
Support on-the-fly features extraction for the large-scale data preprocessing
✨ Description
Support on-the-fly features extraction for the large-scale data preprocessing. Its strengths can be summarized as:
How to use?
Under on-the-fly features extraction, the workflow for a future Amphion model is:
- For each utterance, `utt["Path"]` and `utt["Duration"]` are the two key elements.
- Features Preprocess (no features preprocessing is needed any more!)
- Set `preprocess.features_extraction_mode` as `online`.
- Implement `[Task]OnlineDataset` and `[Model]Trainer`.
Currently, I have supported DiffWaveNetSVC with on-the-fly features extraction. You can see the two main classes: SVCOnlineDataset and DiffusionTrainer.
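The metadata and config steps above can be sketched roughly as follows. This is only an illustration: the keys `Path`, `Duration`, and `preprocess.features_extraction_mode` come from this description, while the example file path, the duration value, and the helpers `is_online` and `validate_utt` are hypothetical and not part of the Amphion codebase.

```python
# Hypothetical metadata entry. Only the keys "Path" and "Duration" are taken
# from the PR description; the values are made up for illustration.
utt = {"Path": "data/example/utt_0001.wav", "Duration": 3.2}

# Hypothetical config fragment written as a Python dict, mirroring the
# key named in the workflow.
cfg = {"preprocess": {"features_extraction_mode": "online"}}


def is_online(cfg: dict) -> bool:
    """Return True when features should be extracted on the fly."""
    # Assumption: any other value (or a missing key) means the offline pipeline.
    return cfg.get("preprocess", {}).get("features_extraction_mode") == "online"


def validate_utt(utt: dict) -> None:
    """Check that a metadata entry carries the two key elements."""
    missing = [key for key in ("Path", "Duration") if key not in utt]
    if missing:
        raise KeyError(f"metadata entry is missing: {missing}")


validate_utt(utt)
print(is_online(cfg))
```

With a check like this, a training script could decide early whether it needs precomputed feature files at all.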
👨💻 Main Changes
- Rename `BaseDataset` and `BaseCollator` to `BaseOfflineDataset` and `BaseOfflineCollator`.
- Add `BaseOnlineDataset` and `BaseOnlineCollator`. The `__getitem__` function returns only the minimum elements (such as the raw waveform and its duration).
- In `audio_features_extractor.py`, I have integrated the common waveform features extraction operations (such as Mel Spectrogram, F0, Energy, and Semantic Features). Note that I have not integrated some of the features that vocoders require. @VocodexElysium
- `text_features_extractor.py` and `descriptive_text_features_extractor.py` are for the future refactoring, integration, and supplementation of TTS, TTA, and TTM. @HeCheng0625 @lmxue @HarryHe11 @viewfinder-annn
- `Amphion/config/[Task]/[Model].json`
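To make the online dataset/collator idea above more concrete, here is a minimal pure-Python sketch. It is not the code in this PR: the class names `ToyOnlineDataset` and `ToyOnlineCollator`, the injected `load_wav` callable, and the function `extract_features_on_the_fly` are hypothetical, frame-level RMS energy stands in for the real extractors, and computing features on the padded batch inside the training step is an assumption.

```python
import math
from typing import Callable, Dict, List

Waveform = List[float]


class ToyOnlineDataset:
    """Hypothetical online dataset: __getitem__ returns only the raw
    waveform and its duration, with no precomputed features."""

    def __init__(self, metadata: List[Dict], load_wav: Callable[[str], Waveform]):
        self.metadata = metadata
        self.load_wav = load_wav  # injected loader (assumption; keeps the sketch free of file I/O)

    def __len__(self) -> int:
        return len(self.metadata)

    def __getitem__(self, index: int) -> Dict:
        utt = self.metadata[index]
        wav = self.load_wav(utt["Path"])
        return {"wav": wav, "wav_len": len(wav), "duration": utt["Duration"]}


class ToyOnlineCollator:
    """Hypothetical collator: zero-pads raw waveforms to the longest one in the batch."""

    def __call__(self, batch: List[Dict]) -> Dict:
        max_len = max(item["wav_len"] for item in batch)
        padded = [item["wav"] + [0.0] * (max_len - item["wav_len"]) for item in batch]
        return {"wav": padded, "wav_len": [item["wav_len"] for item in batch]}


def frame_energy(wav: Waveform, frame_size: int, hop_size: int) -> List[float]:
    """Frame-level RMS energy, used here as a simple example feature."""
    if not wav:
        return []
    energies = []
    for start in range(0, max(len(wav) - frame_size, 0) + 1, hop_size):
        frame = wav[start:start + frame_size]
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def extract_features_on_the_fly(batch: Dict, frame_size: int = 4, hop_size: int = 2) -> Dict:
    """Assumed to be called inside the training step, on the already padded batch."""
    energy = [frame_energy(wav, frame_size, hop_size) for wav in batch["wav"]]
    return {"energy": energy}


def fake_loader(path: str) -> Waveform:
    # Synthetic sine wave instead of reading the file at `path`.
    num_samples = 8 if path.endswith("a.wav") else 6
    return [math.sin(2 * math.pi * i / 8) for i in range(num_samples)]


# Toy metadata; paths and durations are made up for illustration.
metadata = [{"Path": "a.wav", "Duration": 1.0}, {"Path": "b.wav", "Duration": 0.75}]
dataset = ToyOnlineDataset(metadata, fake_loader)
batch = ToyOnlineCollator()([dataset[0], dataset[1]])
features = extract_features_on_the_fly(batch)
print(batch["wav_len"], len(features["energy"][0]))
```

Keeping `__getitem__` this small means the stored data is just audio plus metadata, and the choice of features can change without re-running any preprocessing.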
✅ Checklist