doctorpangloss
be56a14e65
Merge commit 'a4787ac83bf6c83eeb459ed80fc9b36f63d2a3a7' of github.com:comfyanonymous/ComfyUI into fix-merge
2025-10-21 10:53:43 -07:00
comfyanonymous
b4f30bd408
PyTorch is stupid. (#10398)
2025-10-19 01:25:35 -04:00
comfyanonymous
5b80addafd
Turn off cuda malloc by default when --fast autotune is turned on. (#10393)
2025-10-18 22:35:46 -04:00
comfyanonymous
9da397ea2f
Disable torch compiler for cast_bias_weight function (#10384)
...
* Disable torch compiler for cast_bias_weight function
* Fix torch compile.
2025-10-17 20:03:28 -04:00
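A minimal sketch of the technique named in this commit: torch.compiler.disable excludes a single function from tracing, so torch.compile skips it instead of graph-breaking inside it. The cast helper below is a hypothetical stand-in, not ComfyUI's actual cast_bias_weight.

```python
import torch

# Hypothetical stand-in for a weight/bias casting helper; the decorator
# tells torch.compile to skip tracing this function entirely.
@torch.compiler.disable
def cast_bias_weight(layer: torch.nn.Linear, dtype: torch.dtype, device):
    weight = layer.weight.to(device=device, dtype=dtype, non_blocking=True)
    bias = None
    if layer.bias is not None:
        bias = layer.bias.to(device=device, dtype=dtype, non_blocking=True)
    return weight, bias
```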
comfyanonymous
b1293d50ef
Workaround also works on cuDNN 91200 (#10375)
2025-10-16 19:59:56 -04:00
comfyanonymous
19b466160c
Workaround for NVIDIA issue where the VAE uses 3x more memory on torch 2.9 (#10373)
2025-10-16 18:16:03 -04:00
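On the version check behind these two workaround commits: torch.backends.cudnn.version() returns cuDNN's version as a single integer, so 9.12.0 reads as 91200. A hedged sketch of such gating; the workaround body (bypassing cuDNN around the VAE call) is an assumption for illustration, not necessarily the actual fix.

```python
import torch

def needs_vae_memory_workaround() -> bool:
    # torch.backends.cudnn.version() encodes e.g. cuDNN 9.12.0 as 91200.
    ver = torch.backends.cudnn.version()
    return ver is not None and ver >= 91200

def decode_with_workaround(vae, latent):
    # Assumed workaround for illustration: bypass cuDNN for the decode.
    if needs_vae_memory_workaround():
        with torch.backends.cudnn.flags(enabled=False):
            return vae.decode(latent)
    return vae.decode(latent)
```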
comfyanonymous
3374e900d0
Faster workflow cancelling. (#10301)
2025-10-13 23:43:53 -04:00
comfyanonymous
139addd53c
More surgical fix for #10267 (#10276)
2025-10-09 16:37:35 -04:00
doctorpangloss
06a5766dd7
Update logging to use a logger everywhere
2025-09-23 16:07:54 -07:00
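The pattern this commit title refers to, for reference: a module-level logger rather than bare print or root-logger calls.

```python
import logging

logger = logging.getLogger(__name__)  # one logger per module

def load_model(path: str):
    # Lazy %-formatting defers string building until the record is emitted.
    logger.info("loading model from %s", path)
```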
doctorpangloss
6e98a0c478
Fix linting errors; preliminary ROCm 7 support
2025-09-23 15:02:21 -07:00
doctorpangloss
a9a0f96408
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-09-22 14:29:50 -07:00
Kohaku-Blueleaf
7be2b49b6b
Fix LoRA Trainer bugs with FP8 models. (#9854)
...
* Fix adapter weight init
* Fix fp8 model training
* Avoid inference tensor
2025-09-20 21:24:48 -04:00
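On the "avoid inference tensor" point: tensors created under torch.inference_mode() cannot participate in autograd later, so a trainer has to clone them outside inference mode before optimizing them. A minimal sketch:

```python
import torch

with torch.inference_mode():
    w = torch.randn(4, 4)          # inference tensor: unusable in autograd

# Cloning outside inference_mode yields a normal tensor that can require grad.
trainable = w.clone().requires_grad_(True)
loss = (trainable ** 2).sum()
loss.backward()
print(trainable.grad.shape)        # torch.Size([4, 4])
```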
doctorpangloss
179c2d35c8
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-09-03 12:04:32 -07:00
contentis
e2d1e5dad9
Enable Convolution AutoTuning (#9301)
2025-09-01 20:33:50 -04:00
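Convolution autotuning in PyTorch is the cuDNN benchmark flag: cuDNN times candidate conv algorithms on the first call for each input shape and caches the fastest. The knob this commit enables, in one line:

```python
import torch

# A win when input shapes are stable; a loss when every call sees a new shape.
torch.backends.cudnn.benchmark = True
```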
doctorpangloss
1e938f5feb
Fix SDPA priorities
2025-08-26 14:33:00 -07:00
doctorpangloss
735a133ad4
Update to 0.3.51
2025-08-22 17:29:18 -07:00
doctorpangloss
dfc47e0611
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-08-22 13:24:52 -07:00
comfyanonymous
4e5c230f6a
Fix last commit not working on older PyTorch. (#9346)
2025-08-14 23:44:02 -04:00
Xiangxi Guo (Ryan)
f0d5d0111f
Avoid torch compile graphbreak for older pytorch versions (#9344)
...
It turns out torch.compile has some gaps in its support for context-manager
decorator syntax. I've sent patches to fix that in PyTorch, but they won't be
available to folks running older versions, hence this trivial patch.
2025-08-14 23:41:37 -04:00
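The rewrite this commit body describes, sketched under assumptions: where older torch.compile versions graph-break on a context manager applied via decorator syntax, the same context manager as an explicit with-block inside the function traces cleanly. torch.no_grad stands in for whatever context manager was involved.

```python
import torch

# Before (could graph-break on older torch.compile):
#   @torch.no_grad()
#   def f(x): ...

# After: the same context manager as an explicit with-block.
def f(x):
    with torch.no_grad():
        return x * 2

compiled = torch.compile(f)
print(compiled(torch.ones(3)))
```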
comfyanonymous
9df8792d4b
Make last PR not crash Comfy on old PyTorch. (#9324)
2025-08-13 15:12:41 -04:00
contentis
3da5a07510
SDPA backend priority (#9299)
2025-08-13 14:53:27 -04:00
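PyTorch exposes SDPA backend selection through torch.nn.attention.sdpa_kernel, and sufficiently new versions accept the backend list as a priority order via set_priority. A hedged sketch of the idea behind these two SDPA commits:

```python
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Try flash attention first, then memory-efficient, then the math fallback.
with sdpa_kernel([SDPBackend.FLASH_ATTENTION,
                  SDPBackend.EFFICIENT_ATTENTION,
                  SDPBackend.MATH], set_priority=True):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
```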
doctorpangloss
69a4906964
Experimental GGUF support
2025-07-28 17:02:20 -07:00
doctorpangloss
04e411c32e
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-07-14 13:45:09 -07:00
comfyanonymous
111f583e00
Fallback to regular op when fp8 op throws exception. (#8761)
2025-07-02 00:57:13 -04:00
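The fallback shape described here, as a generic sketch; fp8_linear is a hypothetical fast path, not ComfyUI's actual function.

```python
import torch

def linear_with_fp8_fallback(x, weight, bias, fp8_linear):
    # The fp8 path may raise on unsupported shapes, dtypes, or hardware;
    # fall back to the plain op when it does.
    try:
        return fp8_linear(x, weight, bias)
    except Exception:
        return torch.nn.functional.linear(x, weight.to(x.dtype), bias)
```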
doctorpangloss
82388d51a2
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-06-17 10:35:10 -07:00
comfyanonymous
d42613686f
Fix issue with fp8 ops on some models. (#8045)
...
_scaled_mm errors when an input is non-contiguous.
2025-05-10 07:52:56 -04:00
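torch._scaled_mm is a private API whose signature has shifted across releases, so treat this as a sketch only: the fix the commit body describes amounts to making inputs contiguous before the call (the second operand wants a column-major layout, hence the transpose-of-contiguous trick).

```python
import torch

def scaled_mm_contiguous(a_fp8, b_fp8, scale_a, scale_b,
                         out_dtype=torch.bfloat16):
    a = a_fp8.contiguous()           # row-major first operand
    b = b_fp8.t().contiguous().t()   # column-major second operand
    return torch._scaled_mm(a, b, scale_a=scale_a, scale_b=scale_b,
                            out_dtype=out_dtype)
```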
comfyanonymous
ac10a0d69e
Make LoRAs work with --async-offload (#7824)
2025-04-26 19:56:22 -04:00
comfyanonymous
0dcc75ca54
Add experimental --async-offload lowvram weight offloading. (#7820)
...
This should speed up the lowvram mode a bit. It is currently only enabled when --async-offload is used, but it will be enabled by default in the future if there are no problems.
2025-04-26 16:11:21 -04:00
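The general mechanism behind an async offload mode, sketched with plain CUDA streams rather than ComfyUI's implementation: weights are copied to the GPU from pinned host memory on a side stream with non_blocking=True, so the transfer overlaps compute on the default stream.

```python
import torch

copy_stream = torch.cuda.Stream()

def prefetch(weight_cpu: torch.Tensor) -> torch.Tensor:
    # Pinned host memory is required for a truly asynchronous H2D copy.
    pinned = weight_cpu.pin_memory()
    with torch.cuda.stream(copy_stream):
        return pinned.to("cuda", non_blocking=True)

def use(weight_gpu: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Make the compute stream wait until the copy has finished.
    torch.cuda.current_stream().wait_stream(copy_stream)
    return x @ weight_gpu.t()
```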
doctorpangloss
5823497d55
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-04-21 13:14:36 -07:00
comfyanonymous
9ad792f927
Basic support for the HiDream I1 model.
2025-04-15 17:35:05 -04:00
comfyanonymous
8a438115fb
Add RMSNorm to comfy.ops
2025-04-14 18:00:33 -04:00
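RMSNorm for reference: normalize by the root-mean-square of the features (no mean subtraction, unlike LayerNorm), then apply a learned scale. A minimal sketch, not comfy.ops' actual class:

```python
import torch

class RMSNorm(torch.nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x / sqrt(mean(x^2) + eps), computed in fp32 for stability.
        rms = x.float().pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return (x.float() * rms).to(x.dtype) * self.weight
```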
catboxanon
1714a4c158
Add CublasOps support (#7574)
...
* CublasOps support
* Guard CublasOps behind --fast arg
2025-04-12 18:29:15 -04:00
doctorpangloss
040a324346
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-03-29 15:57:24 -07:00
comfyanonymous
70e15fd743
No need for scale_input when fp8 matrix mult is disabled.
2025-03-07 04:49:20 -05:00
comfyanonymous
e1474150de
Support fp8_scaled diffusion models that don't use fp8 matrix mult.
2025-03-07 04:39:21 -05:00
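One reading of these two commits, sketched under assumptions: a scaled-fp8 checkpoint stores fp8 weights plus a scale tensor, and with fp8 matrix multiplication disabled the weight is simply dequantized before an ordinary matmul, which is why no input scaling is needed. The names below are illustrative.

```python
import torch

def dequantize_scaled_fp8(weight_fp8: torch.Tensor,
                          scale_weight: torch.Tensor,
                          dtype: torch.dtype = torch.bfloat16) -> torch.Tensor:
    # Illustrative only: recover a usable weight from fp8 storage + scale.
    return weight_fp8.to(dtype) * scale_weight.to(dtype)
```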
doctorpangloss
3c82be86d1
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-03-05 14:38:50 -08:00
comfyanonymous
4dc6709307
Rename argument in last commit and document the options.
2025-03-01 02:43:49 -05:00
Chenlei Hu
4d55f16ae8
Use enum list for --fast options (#7024)
2025-03-01 02:37:35 -05:00
comfyanonymous
cf0b549d48
--fast now takes a number as an argument to indicate how fast you want it.
...
The idea is that you can indicate how much quality vs speed you want.
At the moment:
--fast 2 enables fp16 accumulation if your PyTorch supports it.
--fast 5 enables fp8 matrix mult on fp8 models and the optimization above.
--fast without a number enables all optimizations.
2025-02-28 02:48:20 -05:00
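A sketch of the CLI shape this commit describes (later superseded by the enum list above): an optional integer where omitting the value enables everything. The parser below is hypothetical, not ComfyUI's cli_args.

```python
import argparse

parser = argparse.ArgumentParser()
# "--fast" alone -> a very high level (all optimizations);
# "--fast 2" / "--fast 5" -> that level; flag absent -> 0.
parser.add_argument("--fast", nargs="?", type=int, const=100, default=0,
                    help="quality vs. speed tradeoff; higher is faster")

args = parser.parse_args(["--fast", "5"])
fp16_accumulation = args.fast >= 2
fp8_matrix_mult = args.fast >= 5
```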
doctorpangloss
693038738a
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2025-02-24 09:39:26 -08:00
comfyanonymous
ab888e1e0b
Add add_weight_wrapper function to model patcher.
...
Functions can now easily be added to wrap/modify model weights.
2025-02-12 05:55:35 -05:00
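The shape of a weight-wrapper hook, as a hedged sketch with hypothetical names: a callable receives a weight tensor and returns a modified one, and the patcher applies every registered wrapper when weights are materialized.

```python
import torch
from typing import Callable

WeightWrapper = Callable[[torch.Tensor], torch.Tensor]

class TinyPatcher:
    def __init__(self):
        self.weight_wrappers: dict[str, list[WeightWrapper]] = {}

    def add_weight_wrapper(self, name: str, fn: WeightWrapper) -> None:
        self.weight_wrappers.setdefault(name, []).append(fn)

    def get_weight(self, name: str, weight: torch.Tensor) -> torch.Tensor:
        for fn in self.weight_wrappers.get(name, []):
            weight = fn(weight)
        return weight

# Usage: scale one layer's weight at load time.
patcher = TinyPatcher()
patcher.add_weight_wrapper("blocks.0.attn.qkv", lambda w: w * 0.5)
```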
doctorpangloss
9d5a5dd533
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-12-28 14:24:27 -08:00
comfyanonymous
99a1fb6027
Make fast fp8 take a bit less peak memory.
2024-12-24 18:05:19 -05:00
doctorpangloss
2d1676c717
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-12-09 15:54:37 -08:00
Haoming
fbf68c4e52
Clamp input (#5928)
2024-12-07 14:00:31 -05:00
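The likely intent of "clamp input" in an fp8 path, sketched with torch.finfo: clamp values to the representable fp8 range before casting, so out-of-range activations saturate instead of becoming inf or NaN.

```python
import torch

def to_fp8_clamped(x: torch.Tensor,
                   dtype: torch.dtype = torch.float8_e4m3fn) -> torch.Tensor:
    info = torch.finfo(dtype)  # e.g. +-448 for float8_e4m3fn
    return x.clamp(min=info.min, max=info.max).to(dtype)
```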
doctorpangloss
31eacb6ac9
Improve compilation of models, adding support for Triton
2024-11-01 10:40:58 -07:00
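For context on the compilation commits: torch.compile's default inductor backend generates Triton kernels for GPU graphs, so Triton support is mostly a matter of making the model traceable and having Triton installed. A minimal usage sketch:

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.GELU())

# Inductor (the default backend) emits Triton kernels on GPU.
compiled = torch.compile(model, backend="inductor", dynamic=False)
out = compiled(torch.randn(8, 64))
```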
doctorpangloss
a8d8bff548
Improve support for torch compilation and SageAttention
2024-10-29 19:22:26 -07:00
doctorpangloss
76a80a65ea
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-10-29 15:35:39 -07:00
comfyanonymous
915fdb5745
Fix lowvram edge case.
2024-10-22 16:34:50 -04:00
comfyanonymous
8ce2a1052c
Optimizations to --fast and scaled fp8.
2024-10-22 02:12:28 -04:00