@macpaul commented on Jan 8, 2026

Issue: #11626

The main change is the commit "feat(mps): implement native-like Float8 support via LUT dequantization".
However, it ran into merge conflicts once the master branch was updated to 0.8.0 and later.
I have therefore added several follow-up fixes for the errors I encountered. Please check whether they are adaptable to the master branch.

Signed-off-by: Macpaul Lin <macpaul@gmail.com>

Add a new MPS-specific operations module to handle Float8 tensor support
on Apple Silicon. Since MPS does not natively support Float8 dtypes, this
implementation uses a uint8 storage strategy combined with a GPU-accelerated
Lookup Table (LUT) for efficient dequantization, keeping data on the GPU.
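
As an illustration of the LUT idea only, the sketch below builds a 256-entry table that maps each raw byte to its float value and then dequantizes by plain indexing. It is not the code from comfy/mps_ops.py: it assumes the E4M3FN layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities), uses pure Python instead of tensors, and the names `decode_e4m3fn`, `E4M3FN_LUT`, `dequantize`, and the `scale` parameter are hypothetical.

```python
import math
from typing import List


def decode_e4m3fn(byte: int) -> float:
    """Decode one byte as a Float8 E4M3FN value (assumed format)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        # E4M3FN has no infinities; this bit pattern is NaN.
        return math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1, exponent bias of 7.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# Built once and reused, mirroring the "cached LUT" idea.
E4M3FN_LUT: List[float] = [decode_e4m3fn(b) for b in range(256)]


def dequantize(raw: bytes, scale: float = 1.0) -> List[float]:
    """Index-based dequantization: each stored byte is a table index."""
    return [E4M3FN_LUT[b] * scale for b in raw]


if __name__ == "__main__":
    print(dequantize(bytes([0x38, 0x40, 0xC0]), scale=0.5))
```

Because the table has only 256 entries, the lookup never needs a Float8 cast on the device; on a tensor backend the same pattern would presumably be a 256-element float tensor indexed by the uint8 data.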

- Add comfy/mps_ops.py: Implement cached LUT generation and index-based
  dequantization for MPS.
- Modify comfy/quant_ops.py: Add logic to view Float8 tensors as uint8
  when moving to MPS, and route dequantization to mps_ops.
- Modify comfy/float.py: Add CPU staging for stochastic rounding to
  prevent MPS casting errors during quantization.
- Modify comfy/quant_ops.py: Add fallback for fp8_linear.

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
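
For readers unfamiliar with the stochastic rounding mentioned in the comfy/float.py item, here is a minimal sketch of the rounding rule itself. It does not show the CPU staging and is not the project's implementation: the function name `stochastic_round`, the explicit `grid` of representable values, and the seeded `rng` are assumptions made for illustration.

```python
import bisect
import random
from typing import Sequence


def stochastic_round(x: float, grid: Sequence[float], rng: random.Random) -> float:
    """Round x to one of its two neighbouring grid values.

    The upper neighbour is chosen with probability proportional to how
    close x is to it, so the result equals x in expectation.
    `grid` must be sorted in ascending order.
    """
    # Clamp values outside the representable range.
    if x <= grid[0]:
        return grid[0]
    if x >= grid[-1]:
        return grid[-1]
    hi_idx = bisect.bisect_right(grid, x)
    lo, hi = grid[hi_idx - 1], grid[hi_idx]
    p_up = (x - lo) / (hi - lo)
    return hi if rng.random() < p_up else lo


if __name__ == "__main__":
    # A few values with spacing 0.125, as E4M3 has between 1.0 and 2.0.
    grid = [1.0, 1.125, 1.25, 1.375, 1.5]
    rng = random.Random(0)
    samples = [stochastic_round(1.3, grid, rng) for _ in range(20000)]
    print(sum(samples) / len(samples))
```

Each individual result is either 1.25 or 1.375, but the average over many draws approaches 1.3, which is the property that makes this rounding mode attractive for quantization.
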
…ng errors

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…edTensor

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…Tensor

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…edTensor

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…pe to prevent precision mismatch RuntimeErrors

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…ike for mock QuantizedTensor

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
…r QuantizedTensor

Signed-off-by: Macpaul Lin <macpaul@gmail.com>
@rattus128 added the MacOS MPS device related issues label on Jan 11, 2026