Skip to content

Refactor audio sample conversion in encoder #704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 29, 2025

Conversation

NicolasHug
Copy link
Member

At a high level this PR just moves our AVFrame conversion logic away from encodeInnerLoop() into a new maybeConvertAVFrame() method, which is now called in encode().

It makes more sense to call this conversion logic before entering encodeInnerLoop(), and this becomes more evident as I'm working on sample rate conversion in another branch.

This is mainly copy/pasted but the diff isn't trivial and has a few subtle changes that I'll describe in the comments below.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 29, 2025
numEncodedSamples += numSamplesToEncode;
// TODO-ENCODING set frame pts correctly, and test against it.
// avFrame->pts += static_cast<int64_t>(numSamplesToEncode);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a drive-by. I realized that commenting this out was letting our tests pass, so this isn't load bearing. We weren't setting the pts on the convertedAVFrame anyway, so our logic was already wrong. I'm adding this TODO to investigate that later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also create an issue.

getNumChannels(avFrame) == outNumChannels_) {
// Note: the clone references the same underlying data, it's a cheap copy.
return UniqueAVFrame(av_frame_clone(avFrame.get()));
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main logic change is this one above: the call to av_frame_clone when no conversion needs to be done. We want to return the original avFrame but we can't use std::move, because that would destruct the original avFrame prematurely (the one created in encode()).

Copy link
Contributor

@scotts scotts May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I would have thought that using std::move() would keep the actual AVFrame alive that is contained in UniqueAVFrame. Yes, the UniqueAVFrame created in the calling scope would go away, but the underlying object it wrapped would live on. Does that not happen?

In order for that to work, though, we'd have to change how we pass the frame into maybeConvertAVFrame(). I think the way with the clearest intent is that it accepts just a plain UniqueAVFrame (not a reference), and then we std::move() the one created in the calling scope into it. (We also could pass by non-const reference, but I think that doesn't make the ownership semantics clear.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! As discussed offline I'll merge now but I want to dig a little bit more about what we could do differently

@NicolasHug NicolasHug merged commit 3056f40 into pytorch:main May 29, 2025
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants