Fix FP8 dequantization on MPS by falling back to CPU #34

Open
snwchd71 wants to merge 1 commit into Comfy-Org:main from snwchd71:fix/mps-fp8-dequantize

snwchd71 commented Apr 4, 2026

Summary

MPS (Apple Silicon) does not support FP8 dtype conversion, causing dequantize_per_tensor_fp8() to crash with:

TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.

This PR adds an explicit device + dtype guard that dequantizes on CPU and transfers only the final result back to MPS. It also emits a one-time logger.warning for visibility.

  • No hot-path overhead: the check is a string compare + frozenset membership test, and the CPU fallback itself runs only when the tensor is on MPS with an FP8 dtype
  • Bit-identical results: every FP8 value is exactly representable in bfloat16/float16, verified with torch.equal
  • Auto-disables: the dtype guard means this becomes a no-op if MPS ever gains FP8 support
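The exact-representability claim can be checked exhaustively without any hardware, since each FP8 format has only 256 bit patterns. The sketch below uses only the standard library and is not part of the PR. It decodes every finite `e4m3fn` and `e5m2` value by hand and confirms that each one survives a round trip through float16 and fits in bfloat16's 16 bits. The helper names are hypothetical. It covers only the dtype cast, not the subsequent multiplication by the scale.

```python
import struct


def decode_fp8(bits, exp_bits, man_bits, bias, fn_variant):
    """Decode one FP8 bit pattern to a float; return None for NaN/inf patterns."""
    sign = -1.0 if bits >> 7 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    max_exp = (1 << exp_bits) - 1
    if fn_variant:
        # e4m3fn: no infinities; only exponent and mantissa all-ones encode NaN.
        if exp == max_exp and man == (1 << man_bits) - 1:
            return None
    elif exp == max_exp:
        return None  # e5m2: IEEE-style inf/NaN
    if exp == 0:  # subnormal
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def fits_float16(v):
    """True if v round-trips through IEEE half precision unchanged."""
    return struct.unpack("<e", struct.pack("<e", v))[0] == v


def fits_bfloat16(v):
    """True if v is a float32 whose low 16 bits are zero (bfloat16 = top 16 bits)."""
    raw = struct.unpack("<I", struct.pack("<f", v))[0]
    return (raw & 0xFFFF) == 0 and struct.unpack("<f", struct.pack("<f", v))[0] == v


def finite_values(exp_bits, man_bits, bias, fn_variant):
    decoded = (decode_fp8(b, exp_bits, man_bits, bias, fn_variant) for b in range(256))
    return [v for v in decoded if v is not None]


e4m3fn_values = finite_values(4, 3, 7, fn_variant=True)
e5m2_values = finite_values(5, 2, 15, fn_variant=False)

all_fit = all(
    fits_float16(v) and fits_bfloat16(v) for v in e4m3fn_values + e5m2_values
)
```

Here `all_fit` should evaluate to `True`: both FP8 formats have at most 3 mantissa bits and an exponent range that float16 and bfloat16 each cover.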

Prior work

Test plan

  • test_dequantize_fp8_cpu_fallback_correctness — verifies fallback math is bit-identical to standard path (runs in CI, parametrized float16/bfloat16)
  • test_dequantize_fp8_on_mps_device — end-to-end on actual MPS hardware (skipped in CI, parametrized float16/bfloat16)
  • Full test suite: 103 passed, 203 skipped (CUDA/Triton), 0 failures
  • ruff check clean

MPS (Apple Silicon) does not support FP8 dtype conversion, causing
dequantize_per_tensor_fp8() to crash. Add an explicit device check
with dtype guard to dequantize on CPU and transfer only the final
result back to MPS. Includes a one-time logger.warning for visibility.

Signed-off-by: spn <snwchd71@users.noreply.github.com>
snwchd71 force-pushed the fix/mps-fp8-dequantize branch from 636c256 to 2d427c4 on April 4, 2026 20:29