Add an MPS-specific operations module to support Float8 tensors on
Apple Silicon. MPS does not natively support Float8 dtypes, so this
implementation stores the data as uint8 and dequantizes it through a
GPU-accelerated lookup table (LUT), keeping the data on the GPU
throughout.
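The LUT idea can be sketched in a few lines. This is a minimal pure-Python illustration, not the code in comfy/mps_ops.py (which builds a cached table as a torch tensor); the names `build_e4m3_lut` and `dequantize` are hypothetical. Because a Float8 value is one byte, every possible bit pattern can be decoded once into a 256-entry table, after which dequantization is a plain index:

```python
def build_e4m3_lut():
    """256-entry float table, one entry per float8_e4m3fn byte pattern."""
    lut = []
    for b in range(256):
        sign = -1.0 if b & 0x80 else 1.0
        exp = (b >> 3) & 0x0F   # 4 exponent bits, bias 7
        man = b & 0x07          # 3 mantissa bits
        if exp == 0x0F and man == 0x07:
            lut.append(float("nan"))  # e4m3fn: only this pattern is NaN, no infinities
        elif exp == 0:
            lut.append(sign * (man / 8.0) * 2.0 ** -6)  # subnormals
        else:
            lut.append(sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7))
    return lut

def dequantize(u8_bytes, lut):
    # Index-based dequantization: each stored byte selects its float value.
    return [lut[b] for b in u8_bytes]
```

On the GPU the same idea is tensor indexing: a 256-element float tensor indexed by the uint8 storage (cast to an integer index dtype), so the table lookup itself runs on the MPS device.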
- Add comfy/mps_ops.py: Implement cached LUT generation and index-based
dequantization for MPS.
- Modify comfy/quant_ops.py: Add logic to view Float8 tensors as uint8
  when moving to MPS, route dequantization to mps_ops, and add a
  fallback for fp8_linear.
- Modify comfy/float.py: Add CPU staging for stochastic rounding to
  prevent MPS casting errors during quantization.
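For context on the stochastic-rounding change: the CPU staging only works around the MPS cast, while the rounding itself is device-independent. A minimal sketch of unbiased stochastic rounding to an arbitrary sorted value grid follows; the function name is illustrative, not the one used in comfy/float.py:

```python
import bisect
import random

def stochastic_round(x, grid):
    """Round x to a neighboring value on a sorted grid, picking the upper
    neighbor with probability equal to the fractional distance, so the
    result is unbiased in expectation for x inside the grid range."""
    hi = bisect.bisect_left(grid, x)
    if hi == 0:
        return grid[0]    # clamp below the grid
    if hi == len(grid):
        return grid[-1]   # clamp above the grid
    lo = hi - 1
    p_up = (x - grid[lo]) / (grid[hi] - grid[lo])
    return grid[hi] if random.random() < p_up else grid[lo]
```

Averaged over many rounds, the quantization error cancels out, which is why stochastic rounding is preferred over nearest-value rounding when quantizing weights to low-precision formats such as Float8.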
Signed-off-by: Macpaul Lin <macpaul@gmail.com>