ComfyUI/comfy
rattus b6c79a648a
ops: Fix offloading with FP8MM performance (#11697)
This logic was checking comfy_cast_weights and going straight to
the forward_comfy_cast_weights implementation without attempting
to downscale the input to fp8 in the event comfy_cast_weights
is set.

The main reason comfy_cast_weights would be set would be for async
offload, which is not a good reason to nix FP8MM.

So instead, combine the underlying exclusions for FP8MM, which
are:

* having a weight_function (usually LowVramPatch)
* force_cast_weights (compute dtype override)
* the weight is not quantized
* the input is already quantized
* the model or layer has MM explicitly disabled

If you get past all of those exclusions, quantize the input tensor.
Then hand the input, quantized or not, off to
forward_comfy_cast_weights to handle it. If the weight is offloaded
but the input is quantized, you will get an offloaded FP8MM.
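
The decision described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code in ops.py: LayerState, should_quantize_input and prepare_input are hypothetical names, the flags are modelled as simple booleans, and the quantization step is a placeholder. The point it shows is that comfy_cast_weights is no longer part of the decision, so an offloaded layer can still take the FP8MM path.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class LayerState:
    """Hypothetical stand-in for the per-layer flags named in the commit message."""
    weight_function: Optional[Callable] = None  # usually a LowVramPatch
    force_cast_weights: bool = False            # compute dtype override
    weight_is_quantized: bool = True
    mm_disabled: bool = False                   # model or layer disabled FP8MM
    comfy_cast_weights: bool = False            # e.g. set for async offload


def should_quantize_input(layer: LayerState, input_is_quantized: bool) -> bool:
    """Return True only when none of the listed exclusions applies.

    comfy_cast_weights is deliberately not consulted here.
    """
    excluded = (
        layer.weight_function is not None
        or layer.force_cast_weights
        or not layer.weight_is_quantized
        or input_is_quantized
        or layer.mm_disabled
    )
    return not excluded


def prepare_input(layer: LayerState, x: Any,
                  input_is_quantized: bool = False) -> Tuple[Any, bool]:
    """Return (input, is_quantized) to hand to the cast-weights forward path."""
    if should_quantize_input(layer, input_is_quantized):
        # Placeholder for the real fp8 quantization of the input tensor.
        return ("fp8", x), True
    return x, input_is_quantized


# An offloaded layer (comfy_cast_weights set) still gets a quantized input.
offloaded = LayerState(comfy_cast_weights=True)
print(prepare_input(offloaded, [1.0, 2.0]))
```

In this sketch the input, quantized or not, is returned to the caller, which would then pass it to forward_comfy_cast_weights regardless of whether the weight is offloaded.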
2026-01-07 21:01:16 -05:00
..
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types
extra_samplers
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix noise with ancestral samplers when inferencing on cpu. (#11528) 2025-12-26 22:03:01 -05:00
ldm Fix lowvram issue with ltxv2 text encoder. (#11675) 2026-01-06 17:33:03 -05:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter
taesd New Year ruff cleanup. (#11595) 2026-01-01 22:06:14 -05:00
text_encoders Add memory estimation function to ltxav text encoder. (#11716) 2026-01-07 20:11:22 -05:00
weight_adapter Fix loras not working on mixed fp8. (#10899) 2025-11-26 00:07:58 -05:00
checkpoint_pickle.py
cli_args.py feat(preview): add per-queue live preview method override (#11261) 2025-12-15 15:57:39 -08:00
clip_config_bigg.json
clip_model.py Refactor: move clip_preprocess to comfy.clip_model (#11586) 2025-12-31 17:38:36 -05:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip_384.json
clip_vision_siglip_512.json
clip_vision.py Refactor: move clip_preprocess to comfy.clip_model (#11586) 2025-12-31 17:38:36 -05:00
conds.py Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
context_windows.py Add handling for vace_context in context windows (#11386) 2025-12-30 14:40:42 -08:00
controlnet.py Fix Race condition in --async-offload that can cause corruption (#10501) 2025-10-29 17:17:46 -04:00
diffusers_convert.py
diffusers_load.py
float.py
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py New Year ruff cleanup. (#11595) 2026-01-01 22:06:14 -05:00
latent_formats.py Disable ltxav previews. (#11676) 2026-01-06 17:41:27 -05:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Support "transformer." LoRA prefix for Z-Image (#11135) 2025-12-08 15:17:26 -05:00
model_base.py Support the LTXV 2 model. (#11632) 2026-01-05 01:58:59 -05:00
model_detection.py Support the LTXV 2 model. (#11632) 2026-01-05 01:58:59 -05:00
model_management.py Skip fp4 matrix mult on devices that don't support it. (#11677) 2026-01-06 18:07:26 -05:00
model_patcher.py ops: Fix offloading with FP8MM performance (#11697) 2026-01-07 21:01:16 -05:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py ops: Fix offloading with FP8MM performance (#11697) 2026-01-07 21:01:16 -05:00
options.py
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py Disable comfy kitchen cuda if pytorch cuda less than 13 (#11681) 2026-01-06 22:13:43 -05:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py skip_load_model -> force_full_load (#11390) 2025-12-17 23:29:32 -05:00
samplers.py Support nested tensor denoise masks. (#11431) 2025-12-19 19:59:25 -05:00
sd1_clip_config.json
sd1_clip.py Disable prompt weights on newbie te. (#11434) 2025-12-20 00:19:47 -05:00
sd.py Add memory estimation function to ltxav text encoder. (#11716) 2026-01-07 20:11:22 -05:00
sdxl_clip.py
supported_models_base.py Fix some custom nodes. (#11134) 2025-12-05 18:25:31 -05:00
supported_models.py Increase ltxav mem estimation by a bit. (#11715) 2026-01-07 20:04:56 -05:00
utils.py Support the LTXV 2 model. (#11632) 2026-01-05 01:58:59 -05:00