Add NoisePerturbAugmentor and CHiME3 data preparation. #140
xinghai-sun commented Jun 29, 2017
- Add NoisePerturbAugmentor and ImpulseResponseAugmentor.
- CHiME3 data preparation.
Almost LGTM.
deep_speech_2/train.py
default='[{"type": "shift", '
'"params": {"min_shift_ms": -5, "max_shift_ms": 5},'
'"prob": 1.0}]',
default=open('augmentation.config', 'r').read(),
Better to move augmentation.config to a unified directory such as conf.
Done.
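One way to address the point above is to pass the config location as an argument and read the file only after parsing, rather than calling open() inside the argument default. The following is a minimal sketch, not the code merged in this PR; the flag name --augmentation_config_path, the default path conf/augmentation.config, and the helper load_augmentation_config are assumptions made for illustration.

```python
import argparse
import json


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical flag name and default path. The file is not opened here,
    # so building the parser never fails when the file is missing.
    parser.add_argument(
        "--augmentation_config_path",
        default="conf/augmentation.config",
        type=str,
        help="Path to a JSON file describing the augmentation pipeline.")
    return parser.parse_args(argv)


def load_augmentation_config(path):
    """Read the JSON augmentation config and return the parsed list."""
    with open(path, "r") as config_file:
        return json.load(config_file)
```

With this layout a missing config file is reported when load_augmentation_config is called, not as a side effect of importing or parsing arguments.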
deep_speech_2/augmentation.config
@@ -0,0 +1,34 @@
[
We can create a directory to hold configuration files.
Moved to conf/.
:type impulse_manifest: basestring
"""

def __init__(self, rng, impulse_manifest):
impulse_manifest_path is a better name.
Done.
:type noise_manifest: basestring
"""

def __init__(self, rng, min_snr_dB, max_snr_dB, noise_manifest):
noise_manifest_path is a better name.
Done.
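For readers unfamiliar with what the min_snr_dB and max_snr_dB arguments control, the sketch below shows the general idea of noise perturbation: sample a signal-to-noise ratio from the given range, then scale the noise so the mix hits that ratio. It is a pure-Python illustration on lists of floats, not the augmentor's actual implementation; the helper names rms_db, add_noise and perturb are hypothetical.

```python
import math
import random


def rms_db(samples):
    """Return the RMS energy of a sample sequence in decibels.

    Assumes the sequence is not all zeros (log10(0) would raise).
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def add_noise(speech, noise, snr_db):
    """Mix noise into speech so that the speech-to-noise ratio is snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Gain in dB that places the noise snr_db below the speech level.
    gain_db = rms_db(speech) - rms_db(noise) - snr_db
    scale = 10.0 ** (gain_db / 20.0)
    return [s + scale * n for s, n in zip(speech, noise)]


def perturb(speech, noise, rng, min_snr_db, max_snr_db):
    """Sample an SNR uniformly from [min_snr_db, max_snr_db] and mix."""
    snr_db = rng.uniform(min_snr_db, max_snr_db)
    return add_noise(speech, noise, snr_db)
```

Passing the random generator in explicitly, as the constructor under review does with rng, keeps the sampled SNR reproducible across runs.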
deep_speech_2/data_utils/data.py
@@ -202,6 +202,7 @@ def _process_utterance(self, filename, transcript):
"""Load, augment, featurize and normalize for speech data."""
speech_segment = SpeechSegment.from_file(filename, transcript)
self._augmentation_pipeline.transform_audio(speech_segment)
speech_segment.to_wav_file("audio.test.wav")
Maybe we can save audio.test.wav to a temp directory.
Removed.
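The debug dump was removed in the end, but if such a file were ever needed again, the temp-directory suggestion could look roughly like the sketch below. The helper name debug_dump_path and the directory prefix are assumptions for illustration; only the standard tempfile module is used.

```python
import os
import tempfile


def debug_dump_path(filename="audio.test.wav", prefix="augment_debug_"):
    """Return a path for a debug file inside a newly created temp directory."""
    # mkdtemp creates a unique directory, so concurrent runs do not
    # overwrite each other's dumps. The caller is responsible for cleanup.
    temp_dir = tempfile.mkdtemp(prefix=prefix)
    return os.path.join(temp_dir, filename)
```

The returned path could then be handed to a call such as speech_segment.to_wav_file(...) instead of a fixed name in the working directory.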
print("Skip downloading and unpacking. Data already exists in %s." %
target_dir)
# create manifest json file
create_manifest(target_dir, manifest_path)
Maybe we should add some checks here to make sure the audio files and transcription text files exist.
Not needed. First, noise files have no transcriptions. Second, the logic here iterates over all the audio files, so a file that does not exist is never visited.
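The reasoning in this reply can be illustrated with a small sketch: when the manifest is built by walking the directory, every entry comes from a file that was actually found, so a separate existence check adds nothing. This is not the PR's create_manifest; the extension filter, the audio_filepath field name and the returned count are assumptions for illustration.

```python
import json
import os


def create_manifest(data_dir, manifest_path, extension=".wav"):
    """Write one JSON line per audio file found under data_dir.

    Returns the number of entries written.
    """
    entries = []
    # os.walk only yields files that exist, so no extra check is needed.
    for root, _, files in os.walk(data_dir):
        for name in sorted(files):
            if name.endswith(extension):
                audio_path = os.path.abspath(os.path.join(root, name))
                entries.append(json.dumps({"audio_filepath": audio_path}))
    with open(manifest_path, "w") as manifest_file:
        for entry in entries:
            manifest_file.write(entry + "\n")
    return len(entries)
```

A check would still be worthwhile for datasets where each audio file must be paired with a transcript, which is not the case for noise data.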