* flux: Do the xq and xk ropes one at a time. This was doing independent, interleaved tensor math on the q and k tensors, holding more than the minimum number of intermediates in VRAM; on a bad day, it would OOM on the xk intermediates. Doing everything for q and then everything for k lets torch garbage collect all of q's intermediates before k allocates its own. This reduces peak VRAM usage for some WAN2.2 inferences (at least).
* wan: Optimize qkv intermediates on attention. As commented, the former logic computed independent pieces of QKV in parallel, which held more inference intermediates in VRAM and spiked VRAM usage. Fully roping Q and garbage collecting its intermediates before touching K reduces peak inference VRAM usage.
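A minimal sketch of the ordering change both commits describe, in plain PyTorch. The function names, tensor shapes, and the complex-pair RoPE formulation here are illustrative assumptions, not ComfyUI's actual flux/wan code:

```python
import torch

def rope_rotate(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    # Illustrative RoPE apply (assumed helper, not ComfyUI's real signature):
    # view the head dim as complex pairs, rotate by freqs_cis, view back.
    x_ = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    out = torch.view_as_real(x_ * freqs_cis)
    return out.reshape(*x.shape).type_as(x)

def rope_interleaved(xq, xk, freqs_cis):
    # Before: q and k temporaries are alive at the same time, so peak VRAM
    # holds both sets of intermediates at once.
    xq_ = torch.view_as_complex(xq.float().reshape(*xq.shape[:-1], -1, 2))
    xk_ = torch.view_as_complex(xk.float().reshape(*xk.shape[:-1], -1, 2))
    xq_out = torch.view_as_real(xq_ * freqs_cis).reshape(*xq.shape).type_as(xq)
    xk_out = torch.view_as_real(xk_ * freqs_cis).reshape(*xk.shape).type_as(xk)
    return xq_out, xk_out

def rope_sequential(xq, xk, freqs_cis):
    # After: finish q completely first. Its temporaries become unreachable
    # when rope_rotate returns, so the allocator can reuse that memory
    # before k allocates its own intermediates, lowering the peak.
    xq_out = rope_rotate(xq, freqs_cis)
    xk_out = rope_rotate(xk, freqs_cis)
    return xq_out, xk_out

# Example shapes (hypothetical): (batch, heads, tokens, head_dim).
q = torch.randn(2, 8, 64, 128)
k = torch.randn(2, 8, 64, 128)
# Unit-magnitude complex rotations, one per (token, channel-pair).
freqs_cis = torch.polar(torch.ones(64, 64), torch.randn(64, 64))
q_out, k_out = rope_sequential(q, k, freqs_cis)
```

Both orderings compute the same result; only the lifetimes of the temporaries differ. On CUDA, dropping the last reference to a tensor returns its blocks to PyTorch's caching allocator, so k's intermediates can recycle q's memory instead of raising the peak.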
| Name |
|---|
| ace |
| audio |
| aura |
| cascade |
| chroma |
| chroma_radiance |
| cosmos |
| flux |
| genmo |
| hidream |
| hunyuan3d |
| hunyuan3dv2_1 |
| hunyuan_video |
| hydit |
| lightricks |
| lumina |
| models |
| modules |
| omnigen |
| pixart |
| qwen_image |
| wan |
| common_dit.py |
| util.py |