Fix MPS tensor dtype in audio generation pipeline by Copilot · Pull Request #21 · audiohacking/CTFN-Studio

Copilot · 2026-01-30T13:02:22Z

HeartMuLa's generate_frame() returns float16 tensors on MPS devices, but codec.detokenize() requires long integers for token indexing. A previous fix attempted the conversion at detokenization (line 850), but by then the float dtype was already baked into the stacked frames.
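
The index error quoted in the issue comments can be reproduced on CPU with a small sketch. The `codebook` table and the `lookup` helper below are hypothetical stand-ins, not the project's codec, and the MPS-specific "invalid type" message is not reproduced here:

```python
import torch

# Hypothetical lookup table standing in for a codec codebook: 10 entries, 4 dims.
codebook = torch.arange(10 * 4, dtype=torch.float32).reshape(10, 4)

# Token ids that came back as float16 instead of integers.
tokens_fp16 = torch.tensor([3, 7, 1], dtype=torch.float16)


def lookup(table, token_ids):
    """Index rows of `table` with `token_ids` (hypothetical helper)."""
    return table[token_ids]


try:
    lookup(codebook, tokens_fp16)
    float_lookup_ok = True
except (IndexError, TypeError, RuntimeError) as exc:
    # PyTorch rejects floating-point tensors used as indices.
    float_lookup_ok = False
    print(f"float16 indices rejected: {exc}")

# Casting to long makes the same lookup valid.
rows = lookup(codebook, tokens_fp16.long())
print(rows.shape)
```

The cast is exact here because the ids are small whole numbers; the sketch makes no claim about how the real codec indexes its tables.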

Changes

Convert tokens to long immediately after generation:

Line 761: After initial token generation
Line 797: After each generation loop iteration

with get_autocast_context(pipeline.mula_device.type, pipeline.mula_dtype):
    curr_token = pipeline.mula.generate_frame(...)
# Convert to long immediately after generation (tokens must be integers)
curr_token = curr_token.long()
frames.append(curr_token[0:1,])

This ensures torch.stack(frames) operates on long tensors from the start, preserving dtype through subsequent operations (permute, squeeze, cpu, device transfer).
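
A minimal sketch of that dtype propagation, run on CPU. `fake_generate_frame`, the tensor shapes, and the `codebook` table are assumptions made for illustration; the real `generate_frame()` signature and shapes are not shown in this PR:

```python
import torch


def fake_generate_frame(batch=2, codebooks=8, vocab=1024):
    """Hypothetical stand-in for generate_frame(): integer ids stored as float16."""
    # vocab is kept small so every id is exactly representable in float16.
    return torch.randint(0, vocab, (batch, codebooks)).to(torch.float16)


frames = []
for _ in range(4):
    curr_token = fake_generate_frame()
    # Convert to long immediately after generation, as in the change above.
    curr_token = curr_token.long()
    frames.append(curr_token[0:1,])

# Each frame is (1, 8); stacking four of them gives (4, 1, 8).
stacked = torch.stack(frames)
# permute, squeeze and cpu all keep the int64 dtype.
stacked = stacked.permute(1, 2, 0).squeeze(0).cpu()

# The long tensor can be used directly as an index into a lookup table.
codebook = torch.randn(1024, 16)  # hypothetical table
embedded = codebook[stacked]

print(stacked.dtype, tuple(stacked.shape), tuple(embedded.shape))
```

Because every frame is already long when it is appended, nothing downstream of `torch.stack` needs its own cast.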

Original prompt

This section details the original issue you should resolve

<issue_title>Decoding Audio Failure</issue_title>
<issue_description>After PR #14 generation succeeds at a reasonable speed, but once completed it fails with:
Deconding Audio:
Generation Failed: invalid type: 'torch.mps.FloatTensor'
Lets fix this last bug to make sure our generations are correctly decoded to audio.</issue_description>

<agent_instructions>The fix was not effective. At the end of the generation the app still fails with:

invalid type: 'torch.mps.FloatTensor'
</agent_instructions>

Comments on the Issue (you are @copilot in this section)
<comment_new>@lmangani
This change was ineffective and caused a new issue:
Generation Failed: tensors used as indices must be long, int, byte or bool tensors</comment_new>
<comment_new>@lmangani
Generation still fails in the last conversion phase:
Generation Failed: invalid type: 'torch.mps.FloatTensor'</comment_new>
<comment_new>@lmangani
The fix was not effective. At the end of the generation the app still fails with:
invalid type: 'torch.mps.FloatTensor'</comment_new>

Fixes Decoding Audio Failure #15

Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>

Initial plan

eb70048

Copilot AI assigned Copilot and lmangani Jan 30, 2026

Copilot started work on behalf of lmangani January 30, 2026 13:02 View session

Convert generated tokens to long dtype immediately after generation

66cef0f

Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Fix decoding audio failure in generation process" to "Fix MPS tensor dtype in audio generation pipeline" Jan 30, 2026

Copilot AI requested a review from lmangani January 30, 2026 13:09

Copilot finished work on behalf of lmangani January 30, 2026 13:09

lmangani marked this pull request as ready for review January 30, 2026 13:09

lmangani merged commit c176603 into main Jan 30, 2026

lmangani merged 2 commits into main from copilot/fix-decoding-audio-error-again
