Skip to content

Compute deltas #268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Sep 19, 2019
Merged

Compute deltas #268

merged 22 commits into from
Sep 19, 2019

Conversation

vincentqb
Copy link
Contributor

@vincentqb vincentqb commented Sep 5, 2019

This PR adds a functional/transform/compliance to compute the deltas of MFCC features, see also here.

  • The effect at the boundary follows Kaldi's convention of replicating the values (see here).
  • To compute the deltas-deltas, simply compute the deltas on the deltas (though Kaldi has a order parameter to compute higher order deltas). The compliance interface loops over the order to computer higher order differences.
  • This function simply returns the deltas. Kaldi instead appends them to the set of available features, see here

To do:

  • Add a test comparing with Kaldi's add_deltas. (leaving off this PR)

CC @cpuhrsch @mravanelli

@vincentqb vincentqb self-assigned this Sep 5, 2019
@vincentqb vincentqb changed the title [WIP] Compute deltas Compute deltas Sep 6, 2019
@vincentqb vincentqb requested review from zhangguanheng66 and removed request for zhangguanheng66 September 6, 2019 22:10
@@ -365,3 +365,36 @@ def forward(self, waveform):
return kaldi.resample_waveform(waveform, self.orig_freq, self.new_freq)

raise ValueError('Invalid resampling method: %s' % (self.resampling_method))


class ComputeDeltas(torch.jit.ScriptModule):
Copy link

@zhangguanheng66 zhangguanheng66 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this depends on the jit mode, it's better to add a jit test to make sure that it works properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of what you have in mind?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just need to add a text case to ensure every operator is traceable.

Copy link
Contributor Author

@vincentqb vincentqb Sep 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was already a test verifying that the shape of the output is correct. I added a test to verify that the output matches the functional, see here. Thanks for pointing out this link, but could you clarify what you would like to verify? If the module were not to compile, the test would not run at all.


def compute_deltas(specgram, win_length=5):
# type: (Tensor, int) -> Tensor
r"""Compute delta coefficients of a spectogram:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This technically isn't limited to spectrograms right? I mean, we could just talk about regular Tensors as inputs.

Also, the edge behavior might be something the user could want to change in the future. Some people rather not replicate the deltas, but just drop those frames altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This technically isn't limited to spectrograms right? I mean, we could just talk about regular Tensors as inputs.

fair enough, it's commonly used on spectrogram, but i could replace this by waveform, since there is a time-component, though it is not limited to waveform. the list of standardized words might not be useful here? :)

Also, the edge behavior might be something the user could want to change in the future. Some people rather not replicate the deltas, but just drop those frames altogether.

This is the behavior that kaldi does, and also does not make configurable :) I'll add the option to change that, by simply passing along the mode option for pad, and default to 'replicate'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants