Commit Graph

411 Commits

Author SHA1 Message Date
doctorpangloss
8108315b93 Fix distributed tracing 2025-11-18 11:04:09 -08:00
doctorpangloss
607fcf7321 Fix trim 2025-10-23 19:49:59 -07:00
doctorpangloss
99be2c40fa add amd gfx1102 arch for pytorch attention 2025-10-21 14:33:54 -07:00
doctorpangloss
d9269785d3 Better memory trimming and group_offloading logic 2025-10-21 14:27:26 -07:00
doctorpangloss
5186d19441 Merge branch 'master' of github.com:comfyanonymous/ComfyUI into fix-merge 2025-10-21 10:56:30 -07:00
doctorpangloss
be56a14e65 Merge commit 'a4787ac83bf6c83eeb459ed80fc9b36f63d2a3a7' of github.com:comfyanonymous/ComfyUI into fix-merge 2025-10-21 10:53:43 -07:00
comfyanonymous
2c2aa409b0
Log message for cudnn disable on AMD. (#10418) 2025-10-20 15:43:24 -04:00
comfyanonymous
5b80addafd
Turn off cuda malloc by default when --fast autotune is turned on. (#10393) 2025-10-18 22:35:46 -04:00
comfyanonymous
1c10b33f9b
gfx942 doesn't support fp8 operations. (#10348) 2025-10-15 00:21:11 -04:00
comfyanonymous
c8674bc6e9
Enable RDNA4 pytorch attention on ROCm 7.0 and up. (#10332) 2025-10-13 21:19:03 -04:00
comfyanonymous
a125cd84b0
Improve AMD performance. (#10302)
I honestly have no idea why this improves things but it does.
2025-10-12 00:28:01 -04:00
doctorpangloss
8ab28996fb Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-09-26 13:18:23 -07:00
Guy Niv
c8d2117f02
Fix memory leak by properly detaching model finalizer (#9979)
When unloading models in load_models_gpu(), the model finalizer was not
being explicitly detached, leading to a memory leak. This caused
linear memory consumption increase over time as models are repeatedly
loaded and unloaded.

This change prevents orphaned finalizer references from accumulating in
memory during model switching operations.
2025-09-24 22:35:12 -04:00
doctorpangloss
6e98a0c478 Fix linting errors, preliminary rocm 7 support 2025-09-23 15:02:21 -07:00
doctorpangloss
a9a0f96408 Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-09-22 14:29:50 -07:00
DELUXA
8d6653fca6
Enable fp8 ops by default on gfx1200 (#9926) 2025-09-18 19:50:37 -04:00
comfyanonymous
fb763d4333
Fix amd_min_version crash when cpu device. (#9754) 2025-09-07 21:16:29 -04:00
comfyanonymous
bcbd7884e3
Don't enable pytorch attention on AMD if triton isn't available. (#9747) 2025-09-07 00:29:38 -04:00
comfyanonymous
27a0fcccc3
Enable bf16 VAE on RDNA4. (#9746) 2025-09-06 23:25:22 -04:00
doctorpangloss
dfc47e0611 Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-08-22 13:24:52 -07:00
comfyanonymous
0963493a9c
Support for Qwen Diffsynth Controlnets canny and depth. (#9465)
These are not real controlnets but actually a patch on the model so they
will be treated as such.

Put them in the models/model_patches/ folder.

Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.
2025-08-20 22:26:37 -04:00
Simon Lui
c991a5da65
Fix XPU iGPU regressions (#9322)
* Change bf16 check and switch non-blocking to off default with option to force to regain speed on certain classes of iGPUs and refactor xpu check.

* Turn non_blocking off by default for xpu.

* Update README.md for Intel GPUs.
2025-08-13 19:13:35 -04:00
comfyanonymous
5828607ccf
Not sure if AMD actually support fp16 acc but it doesn't crash. (#9258) 2025-08-09 12:49:25 -04:00
comfyanonymous
735bb4bdb1
Users report gfx1201 is buggy on flux with pytorch attention. (#9244) 2025-08-08 04:21:00 -04:00
doctorpangloss
87bed08124 Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-08-01 16:05:47 -07:00
comfyanonymous
7d593baf91
Extra reserved vram on large cards on windows. (#9093) 2025-07-29 04:07:45 -04:00
doctorpangloss
03e5430121 Improvements for Wan 2.2 support
- add xet support and add the xet cache to manageable directories
 - xet is enabled by default
 - fix logging to root in various places
 - improve logging about model unloading and loading
 - TorchCompileNode now supports the VAE
 - torchaudio missing will cause less noise in the logs
 - feature flags will assume to be supporting everything in the distributed progress context
 - fixes progress notifications
2025-07-28 14:36:27 -07:00
doctorpangloss
3684cff31b Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-07-25 12:48:05 -07:00
comfyanonymous
69cb57b342
Print xpu device name. (#9035) 2025-07-24 15:06:25 -04:00
honglyua
0ccc88b03f
Support Iluvatar CoreX (#8585)
* Support Iluvatar CoreX
Co-authored-by: mingjiang.li <mingjiang.li@iluvatar.com>
2025-07-24 13:57:36 -04:00
comfyanonymous
d3504e1778
Enable pytorch attention by default for gfx1201 on torch 2.8 (#9029) 2025-07-23 19:21:29 -04:00
comfyanonymous
a86a58c308
Fix xpu function not implemented p2. (#9027) 2025-07-23 18:18:20 -04:00
comfyanonymous
39dda1d40d
Fix xpu function not implemented. (#9026) 2025-07-23 18:10:59 -04:00
comfyanonymous
5ad33787de
Add default device argument. (#9023) 2025-07-23 14:20:49 -04:00
Simon Lui
255f139863
Add xpu version for async offload and some other things. (#9004) 2025-07-22 15:20:09 -04:00
doctorpangloss
96b4e04315 packaging fixes
- enable user db
 - fix main_pre order everywhere
 - fix absolute to relative imports everywhere
 - async better supported
2025-07-15 10:19:33 -07:00
doctorpangloss
a7aff3565b Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-06-26 16:57:25 -07:00
comfyanonymous
a96e65df18
Disable omnigen2 fp16 on older pytorch versions. (#8672) 2025-06-26 03:39:09 -04:00
doctorpangloss
1d901e32eb Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-06-17 13:20:31 -07:00
doctorpangloss
82388d51a2 Merge branch 'master' of github.com:comfyanonymous/ComfyUI 2025-06-17 10:35:10 -07:00
comfyanonymous
6e28a46454
Apple most likely is never fixing the fp16 attention bug. (#8485) 2025-06-10 13:06:24 -04:00
comfyanonymous
7f800d04fa
Enable AMD fp8 and pytorch attention on some GPUs. (#8474)
Information is from the pytorch source code.
2025-06-09 12:50:39 -04:00
comfyanonymous
97755eed46
Enable fp8 ops by default on gfx1201 (#8464) 2025-06-08 14:15:34 -04:00
comfyanonymous
daf9d25ee2
Cleaner torch version comparisons. (#8453) 2025-06-07 10:01:15 -04:00
comfyanonymous
704fc78854
Put ROCm version in tuple to make it easier to enable stuff based on it. (#8348) 2025-05-30 15:41:02 -04:00
comfyanonymous
89a84e32d2
Disable initial GPU load when novram is used. (#8294) 2025-05-26 16:39:27 -04:00
comfyanonymous
e5799c4899
Enable pytorch attention by default on AMD gfx1151 (#8282) 2025-05-26 04:29:25 -04:00
comfyanonymous
0b50d4c0db
Add argument to explicitly enable fp8 compute support. (#8257)
This can be used to test if your current GPU/pytorch version supports fp8 matrix mult in combination with --fast or the fp8_e4m3fn_fast dtype.
2025-05-23 17:43:50 -04:00
Benjamin Berman
b6d3f1fb08 Accept workflows from the command line 2025-05-07 14:53:39 -07:00
comfyanonymous
0a66d4b0af
Per device stream counters for async offload. (#7873) 2025-04-29 20:28:52 -04:00