Fix MPS tensor dtype conversion in audio decoding #20

Merged
lmangani merged 2 commits into main from
copilot/fix-decoding-audio-failure-again
Jan 30, 2026
Conversation

Copilot AI commented Jan 30, 2026

Audio decoding fails on Apple Silicon with the error "invalid type: 'torch.mps.FloatTensor'" because PyTorch implicitly converts token indices from torch.long to torch.float when moving tensors to MPS devices.

Changes

  • backend/app/services/music_service.py:850: Explicitly preserve torch.long dtype when moving frames to codec device
# Before: implicit dtype conversion on MPS
frames_for_codec = frames.to(device=pipeline.codec_device)

# After: explicit dtype preservation
frames_for_codec = frames.to(device=pipeline.codec_device, dtype=torch.long)

The frames tensor contains token IDs that are used as indices in codec.detokenize(), so it must remain an integer type. This pattern matches the existing MPS handling at lines 1585 and 1603.
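The change above can be illustrated outside the project with a minimal sketch. The helper name `move_token_frames` and the toy `codebook` are hypothetical stand-ins, not code from music_service.py or the codec; the sketch only assumes that token IDs end up being used as integer indices. It falls back to CPU when MPS is unavailable.

```python
import torch


def move_token_frames(frames: torch.Tensor, device: torch.device) -> torch.Tensor:
    """Move token-ID frames to `device`, forcing an integer index dtype.

    Hypothetical helper: mirrors the `.to(device=..., dtype=torch.long)` call
    from the change above, nothing more.
    """
    return frames.to(device=device, dtype=torch.long)


# Use MPS when available, otherwise CPU, so the sketch runs on any machine.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Toy stand-in for a codec lookup table: 16 entries with 4 values each.
codebook = torch.arange(64, dtype=torch.float32).reshape(16, 4).to(device)

# Token IDs that happen to be stored as floats before the move.
frames = torch.tensor([[0.0, 3.0, 15.0]])

frames_for_codec = move_token_frames(frames, device)

# Integer indices can be used for lookup; the result has shape (1, 3, 4).
decoded = codebook[frames_for_codec]
```

Passing `dtype` in the same `.to()` call keeps the device move and the dtype guarantee in one place, so the tensor that reaches the lookup is always `torch.long` regardless of what dtype it had upstream.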

Original prompt

This section details the original issue you should resolve

<issue_title>Decoding Audio Failure</issue_title>
<issue_description>After PR #14, generation succeeds at a reasonable speed, but once completed it fails with:

Deconding Audio:
Generation Failed: invalid type: 'torch.mps.FloatTensor'

Let's fix this last bug to make sure our generations are correctly decoded to audio.</issue_description>

<agent_instructions>Generation still fails in the last conversion phase:

Generation Failed: invalid type: 'torch.mps.FloatTensor'

Resolve this issue and make sure audio conversion succeeds 100%</agent_instructions>

Comments on the Issue (you are @copilot in this section)

<comment_new>@lmangani
This change was ineffective and caused a new issue:

Generation Failed: tensors used as indices must be long, int, byte or bool tensors</comment_new>
<comment_new>@lmangani
Generation still fails in the last conversion phase:

Generation Failed: invalid type: 'torch.mps.FloatTensor'</comment_new>
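
The "tensors used as indices" error quoted in the comments can be reproduced on CPU, independent of MPS. The following is a minimal sketch, not the project's code: the `codebook` tensor and the ID values are made up for illustration. It shows that a float tensor is rejected as an index and that the same values cast to torch.long are accepted.

```python
import torch

# Toy lookup table: 6 rows of 2 values (illustrative only).
codebook = torch.arange(12, dtype=torch.float32).reshape(6, 2)

# Token IDs stored as floats cannot be used as indices.
float_ids = torch.tensor([1.0, 4.0])
try:
    codebook[float_ids]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# The same IDs cast to an integer dtype index the table as expected.
long_ids = float_ids.to(dtype=torch.long)
rows = codebook[long_ids]

print(error_message)
print(rows)
```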

…indices

Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix decoding audio failure in generation process" to "Fix MPS tensor dtype conversion in audio decoding" Jan 30, 2026
Copilot AI requested a review from lmangani January 30, 2026 11:06
lmangani marked this pull request as ready for review January 30, 2026 12:13
lmangani merged commit b41ca42 into main Jan 30, 2026
Development

Successfully merging this pull request may close these issues.

Decoding Audio Failure
