Commit Graph

2 Commits

Author SHA1 Message Date
Tashdid Khan
45a2363e6a fix: remove hardcoded local paths from MPS FP8 tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 21:27:35 -05:00
Tashdid Khan
edd44a6874 fix: add CPU fallback for FP8 quantization on MPS (Apple Silicon)
MPS does not support float8_e4m3fn/float8_e5m2 dtypes. When FP8-quantized
models (FLUX, SD3.5, Wan 2.2, LTX-Video) are loaded on Apple Silicon, the
quantization step crashes with:

  TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does
  not have support for that dtype.

This adds device-aware fallbacks that move tensors to CPU for the FP8
quantization step only. The rest of inference remains on MPS.

Three code paths are patched:
- comfy/float.py: stochastic_rounding() — also fixes the secondary
  "Placeholder storage has not been allocated on MPS device!" error
  caused by torch.Generator being bound to MPS.
- comfy/float.py: stochastic_round_quantize_nvfp4*() — these create
  float8_e4m3fn block scales internally.
- comfy/quant_ops.py: _TensorCoreFP8LayoutBase.quantize() — the
  ck.quantize_per_tensor_fp8 path also fails on MPS.

Fixes: #6995, #9255, #11626, #11817

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 21:21:31 -05:00