mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-03-07 18:27:40 +08:00

Dustin 8c374c8b90 Fix SageAttention crash after PR #10276 fp8 weight scaling changes

Problem:
After PR #10276 (commit `139addd5`) introduced convert_func/set_func for proper fp8 weight scaling during LoRA application, users with SageAttention enabled experience 100% reproducible crashes (Exception 0xC0000005 ACCESS_VIOLATION) during KSampler execution.

Root Cause:
PR #10276 added fp8 weight transformations (scale up -> apply LoRA -> scale down) to fix LoRA quality with Wan 2.1/2.2 14B fp8 models. These transformations:
1. Convert weights to float32 and create copies (new memory addresses)
2. Invalidate tensor metadata that SageAttention cached
3. Break SageAttention's internal memory references
4. Cause access violation when SageAttention tries to use old pointers

SageAttention expects weights at original memory addresses without transformations between caching and usage.

Solution:
Add conditional bypass in LowVramPatch.__call__ to detect when SageAttention is active (via --use-sage-attention flag) and skip convert_func/set_func calls. This preserves SageAttention's memory reference stability while maintaining PR #10276 benefits for users without SageAttention.

Trade-offs:
- When SageAttention is enabled with fp8 models + LoRAs, LoRAs are applied to scaled weights instead of properly scaled weights
- Potential quality impact unknown (no issues observed in testing)
- Only affects users who explicitly enable SageAttention flag
- Users without SageAttention continue to benefit from PR #10276

Testing Completed:
- RTX 5090, CUDA 12.8, PyTorch 2.7.0, SageAttention 2.1.1
- Wan 2.2 fp8 models with multiple LoRAs
- Crash eliminated, ~40% SageAttention performance benefit preserved
- No visual quality degradation observed
- Non-SageAttention workflows unaffected

Testing Requested:
- Other GPU architectures (RTX 4090, 3090, etc.)
- Different CUDA/PyTorch version combinations
- fp8 LoRA quality comparison with SageAttention enabled
- Edge cases: mixed fp8/non-fp8 workflows

Files Changed:
- comfy/model_patcher.py: LowVramPatch.__call__ method

Related:
- Issue: SageAttention incompatibility with fp8 weight scaling
- Original PR: #10276 (fp8 LoRA quality fix for Wan models)
- SageAttention: https://github.com/thu-ml/SageAttention

2025-10-12 02:40:30 -04:00
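
The bypass and its trade-off can be illustrated with a toy sketch. This is not the code in comfy/model_patcher.py: the class `SketchPatch`, the `bypass` flag, the scale factor, and the helper functions are hypothetical names chosen for illustration, and plain Python lists stand in for tensors. It only shows how skipping the convert/set steps changes where the LoRA delta lands.

```python
from typing import Callable, List, Optional

Weights = List[float]
SCALE = 4.0  # hypothetical fp8 scale factor, chosen so the arithmetic is exact


def scale_up(weight: Weights) -> Weights:
    """Stand-in for a convert_func: stored weights -> true-magnitude weights."""
    return [w * SCALE for w in weight]


def scale_down(weight: Weights) -> Weights:
    """Stand-in for a set_func: true-magnitude weights -> stored weights."""
    return [w / SCALE for w in weight]


class SketchPatch:
    """Toy model of a patch callable with an optional conversion bypass."""

    def __init__(
        self,
        delta: Weights,
        convert_func: Optional[Callable[[Weights], Weights]] = None,
        set_func: Optional[Callable[[Weights], Weights]] = None,
        bypass: bool = False,  # assumption: True when SageAttention is enabled
    ) -> None:
        self.delta = delta
        self.convert_func = convert_func
        self.set_func = set_func
        self.bypass = bypass

    def __call__(self, weight: Weights) -> Weights:
        convert = not self.bypass
        if convert and self.convert_func is not None:
            # Scale up on a copy, so the caller's list is left untouched.
            weight = self.convert_func(list(weight))
        # Apply the LoRA delta (a plain addition in this toy model).
        out = [w + d for w, d in zip(weight, self.delta)]
        if convert and self.set_func is not None:
            # Scale back down to the stored representation.
            out = self.set_func(out)
        return out


stored = [0.25, 0.5]
delta = [1.0, 1.0]

# Full path: scale up -> apply LoRA -> scale down.
full = SketchPatch(delta, scale_up, scale_down)(stored)
# Bypassed path: the delta is added directly to the stored (scaled) weights.
bypassed = SketchPatch(delta, scale_up, scale_down, bypass=True)(stored)

print(full)      # [0.5, 0.75]
print(bypassed)  # [1.25, 1.5]
```

The two results differ, which is the trade-off noted in the commit message: with the bypass active, the delta is applied to scaled weights rather than to properly scaled ones.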
audio_encoders	Support the HuMo model. (#9903)	2025-09-17 00:12:48 -04:00
cldm	Replace print with logging (#6138)	2024-12-20 16:24:55 -05:00
comfy_types	LoRA Trainer: LoRA training node in weight adapter scheme (#8446)	2025-06-13 19:25:59 -04:00
extra_samplers	Uni pc sampler now works with audio and video models.	2025-01-18 05:27:58 -05:00
image_encoders	Add Hunyuan 3D 2.1 Support (#8714)	2025-09-04 20:36:20 -04:00
k_diffusion	Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884)	2025-09-15 20:05:03 -04:00
ldm	Implement the mmaudio VAE. (#10300)	2025-10-11 22:57:23 -04:00
sd1_tokenizer	Silence clip tokenizer warning. (#8934)	2025-07-16 14:42:07 -04:00
t2i_adapter	Controlnet refactor.	2024-06-27 18:43:11 -04:00
taesd	Improvements to the TAESD3 implementation.	2024-06-16 02:04:24 -04:00
text_encoders	Implement gemma 3 as a text encoder. (#10241)	2025-10-06 22:08:08 -04:00
weight_adapter	Fix LoRA Trainer bugs with FP8 models. (#9854)	2025-09-20 21:24:48 -04:00
checkpoint_pickle.py	Remove pytorch_lightning dependency.	2023-06-13 10:11:33 -04:00
cli_args.py	Print all fast options in --help (#9737)	2025-09-06 01:05:05 -04:00
clip_config_bigg.json	Fix potential issue with non clip text embeddings.	2024-07-30 14:41:13 -04:00
clip_model.py	USO style reference. (#9677)	2025-09-02 15:36:22 -04:00
clip_vision_config_g.json	Add support for clip g vision model to CLIPVisionLoader.	2023-08-18 11:13:29 -04:00
clip_vision_config_h.json	Add support for unCLIP SD2.x models.	2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json	Support llava clip vision model.	2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json	support clip-vit-large-patch14-336 (#4042)	2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json	Add support for unCLIP SD2.x models.	2023-04-01 23:19:15 -04:00
clip_vision_siglip_384.json	Support new flux model variants.	2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json	Support 512 siglip model.	2025-04-05 07:01:01 -04:00
clip_vision.py	Some changes to the previous hunyuan PR. (#9725)	2025-09-04 20:39:02 -04:00
conds.py	Add some warnings and prevent crash when cond devices don't match. (#9169)	2025-08-04 04:20:12 -04:00
context_windows.py	Make step index detection much more robust (#9392)	2025-08-17 18:54:07 -04:00
controlnet.py	Support qwen inpaint controlnet. (#9772)	2025-09-08 17:30:26 -04:00
diffusers_convert.py	Remove useless code.	2025-01-24 06:15:54 -05:00
diffusers_load.py	load_unet -> load_diffusion_model with a model_options argument.	2024-08-12 23:20:57 -04:00
float.py	Clamp output when rounding weight to prevent Nan.	2024-10-19 19:07:10 -04:00
gligen.py	Remove some useless code. (#8812)	2025-07-06 07:07:39 -04:00
hooks.py	Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377)	2025-01-11 12:20:23 -05:00
latent_formats.py	Add support for Chroma Radiance (#9682)	2025-09-13 17:58:43 -04:00
lora_convert.py	Implement the USO subject identity lora. (#9674)	2025-09-01 18:54:02 -04:00
lora.py	Support the omnigen2 umo lora. (#9886)	2025-09-15 18:10:55 -04:00
model_base.py	Basic WIP support for the wan animate model. (#9939)	2025-09-19 03:07:17 -04:00
model_detection.py	Implement gemma 3 as a text encoder. (#10241)	2025-10-06 22:08:08 -04:00
model_management.py	Improve AMD performance. (#10302)	2025-10-12 00:28:01 -04:00
model_patcher.py	Fix SageAttention crash after PR #10276 fp8 weight scaling changes	2025-10-12 02:40:30 -04:00
model_sampling.py	Refactor model sampling sigmas code. (#10250)	2025-10-08 17:49:02 -04:00
ops.py	More surgical fix for #10267 (#10276)	2025-10-09 16:37:35 -04:00
options.py	Only parse command line args when main.py is called.	2023-09-13 11:38:20 -04:00
patcher_extension.py	Implement EasyCache and Invent LazyCache (#9496)	2025-08-22 22:41:08 -04:00
pixel_space_convert.py	Changes to the previous radiance commit. (#9851)	2025-09-13 18:03:34 -04:00
rmsnorm.py	Add warning when using old pytorch. (#9347)	2025-08-15 00:22:26 -04:00
sample.py	Auto reshape 2d to 3d latent for single image generation on video model.	2024-12-29 02:26:49 -05:00
sampler_helpers.py	Added context window support to core sampling code (#9238)	2025-08-13 21:33:05 -04:00
samplers.py	Add 'input_cond' and 'input_uncond' to the args dictionary passed into sampler_cfg_function (#10044)	2025-09-26 19:55:03 -07:00
sd1_clip_config.json	Fix potential issue with non clip text embeddings.	2024-07-30 14:41:13 -04:00
sd1_clip.py	Disable prompt weights for qwen. (#9438)	2025-08-20 01:08:11 -04:00
sd.py	Implement the mmaudio VAE. (#10300)	2025-10-11 22:57:23 -04:00
sdxl_clip.py	Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803)	2025-04-25 19:36:00 -04:00
supported_models_base.py	Mixed precision diffusion models with scaled fp8.	2024-10-21 18:12:51 -04:00
supported_models.py	Lower wan memory estimation value a bit. (#9964)	2025-09-20 22:09:35 -04:00
utils.py	Add WAN ATI support (#8874)	2025-07-24 20:59:19 -04:00