ComfyUI/comfy/ldm
Emiliooooo 61235fc35a fix(directml): replace try/except with device-type guard; fix both ZeroDivisionError sites
Improves on the previous directml commit with three research-based refinements:

1. model_management.py — module_mmap_residency() and cast_to_gathered()
   Replace broad try/except NotImplementedError with an explicit
   `t.device.type == 'privateuseone'` guard. Checking device type is
   faster in a hot loop and makes the intent self-documenting.
   Fixes: github.com/Comfy-Org/ComfyUI/issues/8347

2. attention.py — attention_split()
   Replace the "assume 4 GB free" heuristic with `steps = 64`.
   64-slice chunking is safe and correct regardless of card size;
   the 4 GB assumption was fragile on cards with less or more VRAM.

3. diffusionmodules/model.py — slice_attention()
   Apply the identical `steps = 64` guard to the second call site
   for the same ZeroDivisionError (was missed in the previous commit).
   Fixes: github.com/comfyanonymous/ComfyUI/issues/1518

Tested end-to-end on AMD RX 5600 XT (6 GB VRAM), Windows 11,
torch-directml 0.2.5, ComfyUI 0.21.1, DreamShaper 8 (SD 1.5).
Full 20-step txt2img pipeline completes and returns a valid PNG.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-14 21:09:35 -04:00
..
ace Support Ace Step 1.5 XL model. (#13317) 2026-04-07 03:13:47 -04:00
anima Fix anima LLM adapter forward when manual cast (#12504) 2026-02-17 07:56:44 -08:00
audio Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
aura Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
cascade cascade: remove dead weight init code (#13026) 2026-03-17 20:59:10 -04:00
chroma Implement NAG on all the models based on the Flux code. (#12500) 2026-02-16 23:30:34 -05:00
chroma_radiance Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
cogvideo Cogvideox (#13402) 2026-04-29 19:30:08 -04:00
cosmos Some fixes to previous pr. (#12339) 2026-02-06 20:14:52 -05:00
ernie Some optimizations to make Ernie inference a bit faster. (#13472) 2026-04-18 23:02:29 -04:00
flux Add a supports_fp64 function. (#13368) 2026-04-11 21:06:36 -04:00
genmo Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
hidream Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
hidream_o1 feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
hunyuan3d Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
hunyuan3dv2_1 fix: disable SageAttention for Hunyuan3D v2.1 DiT (#12772) 2026-03-16 22:27:27 -04:00
hunyuan_video Implement NAG on all the models based on the Flux code. (#12500) 2026-02-16 23:30:34 -05:00
hydit Change cosmos and hydit models to use the native RMSNorm. (#7934) 2025-05-04 06:26:20 -04:00
kandinsky5 Fix qwen scaled fp8 not working with kandinsky. Make basic t2i wf work. (#11162) 2025-12-06 17:50:10 -08:00
lightricks fix(directml): correct VRAM detection and make torchaudio imports optional 2026-05-14 12:10:31 -04:00
lumina Feat: z-image pixel space (model still training atm) (#12709) 2026-03-02 19:43:47 -05:00
mmaudio/vae Implement the mmaudio VAE. (#10300) 2025-10-11 22:57:23 -04:00
models Add support for small flux.2 decoder (#13314) 2026-04-07 03:44:18 -04:00
modules fix(directml): replace try/except with device-type guard; fix both ZeroDivisionError sites 2026-05-14 21:09:35 -04:00
omnigen Enable Runtime Selection of Attention Functions (#9639) 2025-09-12 18:07:38 -04:00
pixart Remove windows line endings. (#8866) 2025-07-11 02:37:51 -04:00
qwen_image Add pre attention and post input patches to qwen image model. (#12879) 2026-03-11 00:09:35 -04:00
rt_detr CORE-13 feat: Support RT-DETRv4 detection model (#12748) 2026-03-28 23:34:10 -04:00
sam3 Improve SAM3 large input handling (#13767) 2026-05-07 17:18:28 -07:00
supir feat: SUPIR model support (CORE-17) (#13250) 2026-04-18 23:02:01 -04:00
wan Support Wan-Dancer (#13813) 2026-05-09 14:02:56 -07:00
common_dit.py add RMSNorm to comfy.ops 2025-04-14 18:00:33 -04:00
util.py New Year ruff cleanup. (#11595) 2026-01-01 22:06:14 -05:00