EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-01-20 19:30:20 +08:00

Author	SHA1	Message	Date
Xiangxi Guo (Ryan)	f0d5d0111f	Avoid torch compile graphbreak for older pytorch versions (#9344) Turns out torch.compile has some gaps in context manager decorator syntax support. I've sent patches to fix that in PyTorch, but it won't be available for all the folks running older versions of PyTorch, hence this trivial patch.	2025-08-14 23:41:37 -04:00
comfyanonymous	9df8792d4b	Make last PR not crash comfy on old pytorch. (#9324)	2025-08-13 15:12:41 -04:00
contentis	3da5a07510	SDPA backend priority (#9299)	2025-08-13 14:53:27 -04:00
doctorpangloss	69a4906964	Experimental GGUF support	2025-07-28 17:02:20 -07:00
doctorpangloss	04e411c32e	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-07-14 13:45:09 -07:00
comfyanonymous	111f583e00	Fallback to regular op when fp8 op throws exception. (#8761)	2025-07-02 00:57:13 -04:00
doctorpangloss	82388d51a2	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-06-17 10:35:10 -07:00
comfyanonymous	d42613686f	Fix issue with fp8 ops on some models. (#8045) _scaled_mm errors when an input is non contiguous.	2025-05-10 07:52:56 -04:00
comfyanonymous	ac10a0d69e	Make loras work with --async-offload (#7824)	2025-04-26 19:56:22 -04:00
comfyanonymous	0dcc75ca54	Add experimental --async-offload lowvram weight offloading. (#7820) This should speed up the lowvram mode a bit. It currently is only enabled when --async-offload is used but it will be enabled by default in the future if there are no problems.	2025-04-26 16:11:21 -04:00
doctorpangloss	5823497d55	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-04-21 13:14:36 -07:00
comfyanonymous	9ad792f927	Basic support for hidream i1 model.	2025-04-15 17:35:05 -04:00
comfyanonymous	8a438115fb	add RMSNorm to comfy.ops	2025-04-14 18:00:33 -04:00
catboxanon	1714a4c158	Add CublasOps support (#7574) * CublasOps support * Guard CublasOps behind --fast arg	2025-04-12 18:29:15 -04:00
doctorpangloss	040a324346	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-03-29 15:57:24 -07:00
comfyanonymous	70e15fd743	No need for scale_input when fp8 matrix mult is disabled.	2025-03-07 04:49:20 -05:00
comfyanonymous	e1474150de	Support fp8_scaled diffusion models that don't use fp8 matrix mult.	2025-03-07 04:39:21 -05:00
doctorpangloss	3c82be86d1	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-03-05 14:38:50 -08:00
comfyanonymous	4dc6709307	Rename argument in last commit and document the options.	2025-03-01 02:43:49 -05:00
Chenlei Hu	4d55f16ae8	Use enum list for --fast options (#7024)	2025-03-01 02:37:35 -05:00
comfyanonymous	cf0b549d48	--fast now takes a number as argument to indicate how fast you want it. The idea is that you can indicate how much quality vs speed you want. At the moment: --fast 2 enables fp16 accumulation if your pytorch supports it. --fast 5 enables fp8 matrix mult on fp8 models and the optimization above. --fast without a number enables all optimizations.	2025-02-28 02:48:20 -05:00
doctorpangloss	693038738a	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2025-02-24 09:39:26 -08:00
comfyanonymous	ab888e1e0b	Add add_weight_wrapper function to model patcher. Functions can now easily be added to wrap/modify model weights.	2025-02-12 05:55:35 -05:00
doctorpangloss	9d5a5dd533	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-12-28 14:24:27 -08:00
comfyanonymous	99a1fb6027	Make fast fp8 take a bit less peak memory.	2024-12-24 18:05:19 -05:00
doctorpangloss	2d1676c717	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-12-09 15:54:37 -08:00
Haoming	fbf68c4e52	clamp input (#5928)	2024-12-07 14:00:31 -05:00
doctorpangloss	31eacb6ac9	Improve compilation of models, adding support for triton	2024-11-01 10:40:58 -07:00
doctorpangloss	a8d8bff548	Improve support for torch compilation and sage attention	2024-10-29 19:22:26 -07:00
doctorpangloss	76a80a65ea	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-10-29 15:35:39 -07:00
comfyanonymous	915fdb5745	Fix lowvram edge case.	2024-10-22 16:34:50 -04:00
comfyanonymous	8ce2a1052c	Optimizations to --fast and scaled fp8.	2024-10-22 02:12:28 -04:00
comfyanonymous	0075c6d096	Mixed precision diffusion models with scaled fp8. This change adds support for diffusion models where all the linears are scaled fp8 while the other weights are the original precision.	2024-10-21 18:12:51 -04:00
comfyanonymous	83ca891118	Support scaled fp8 t5xxl model.	2024-10-20 22:27:00 -04:00
comfyanonymous	f9f9faface	Fixed model merging issue with scaled fp8.	2024-10-20 06:24:31 -04:00
comfyanonymous	a68bbafddb	Support diffusion models with scaled fp8 weights.	2024-10-19 23:47:42 -04:00
comfyanonymous	67158994a4	Use the lowvram cast_to function for everything.	2024-10-17 17:25:56 -04:00
doctorpangloss	8512f361fe	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-10-14 15:26:27 -07:00
doctorpangloss	f3da381869	Fix inference mode execution issues	2024-10-10 21:00:09 -07:00
doctorpangloss	a38968f098	Improvements to execution - Validation errors that occur early in the lifecycle of prompt execution now get propagated to their callers in the EmbeddedComfyClient. This includes error messages about missing node classes. - The execution context now includes the node_id and the prompt_id - Latent previews are now sent with a node_id. This is not backwards compatible with old frontends. - Dependency execution errors are now modeled correctly. - Distributed progress encodes image previews with node and prompt IDs. - Typing for models - The frontend was updated to use node IDs with previews - Improvements to torch.compile experiments - Some controlnet_aux nodes were upstreamed	2024-10-10 19:30:18 -07:00
doctorpangloss	69e523b89d	Experimental quantization support. Only Linux is meaningfully supported	2024-10-10 13:43:06 -07:00
comfyanonymous	e38c94228b	Add a weight_dtype fp8_e4m3fn_fast to the Diffusion Model Loader node. This is used to load weights in fp8 and use fp8 matrix multiplication.	2024-10-09 19:43:17 -04:00
doctorpangloss	fa3176f96f	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-09-23 12:50:31 -07:00
comfyanonymous	9c41bc8d10	Remove useless line.	2024-09-23 02:32:29 -04:00
comfyanonymous	dc96a1ae19	Load controlnet in fp8 if weights are in fp8.	2024-09-21 04:50:12 -04:00
doctorpangloss	5155a3e248	Merge WIP	2024-08-25 18:52:29 -07:00
comfyanonymous	8ae23d8e80	Fix onnx export.	2024-08-23 17:52:47 -04:00
comfyanonymous	c7ee4b37a1	Try to fix some lora issues.	2024-08-22 15:32:18 -04:00
comfyanonymous	904bf58e7d	Make --fast work on pytorch nightly.	2024-08-21 14:01:41 -04:00
Svein Ove Aas	5f50263088	Replace use of .view with .reshape (#4522) When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view. As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere. Note that --fast is only marginally faster than the default.	2024-08-21 11:21:48 -04:00
comfyanonymous	03ec517afb	Remove useless line, adjust windows default reserved vram.	2024-08-21 00:47:19 -04:00
comfyanonymous	510f3438c1	Speed up fp8 matrix mult by using better code.	2024-08-20 22:53:26 -04:00
comfyanonymous	9953f22fce	Add --fast argument to enable experimental optimizations. Optimizations that might break things/lower quality will be put behind this flag first and might be enabled by default in the future. Currently the only optimization is float8_e4m3fn matrix multiplication on 4000/ADA series Nvidia cards or later. If you have one of these cards you will see a speed boost when using fp8_e4m3fn flux for example.	2024-08-20 11:55:51 -04:00
comfyanonymous	538cb068bc	Make cast_to a nop if weight is already good.	2024-08-20 10:46:36 -04:00
comfyanonymous	39f114c44b	Less broken non blocking?	2024-08-18 16:53:17 -04:00
comfyanonymous	6730f3e1a3	Disable non blocking. It fixed some perf issues but caused other issues that need to be debugged.	2024-08-18 14:38:09 -04:00
comfyanonymous	73332160c8	Enable non blocking transfers in lowvram mode.	2024-08-18 10:29:33 -04:00
doctorpangloss	0a1ae64b0b	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-08-01 16:19:11 -07:00
comfyanonymous	b85216a3c0	Lower T5 memory usage by a few hundred MB.	2024-07-31 00:52:34 -04:00
comfyanonymous	25853d0be8	Use common function for casting weights to input.	2024-07-30 10:49:14 -04:00
doctorpangloss	8cdc246450	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-06-17 16:19:48 -07:00
comfyanonymous	bb1969cab7	Initial support for the stable audio open model.	2024-06-15 12:14:56 -04:00
doctorpangloss	cb557c960b	Merge branch 'master' of github.com:comfyanonymous/ComfyUI	2024-05-31 07:42:11 -07:00
comfyanonymous	6c23854f54	Fix OSX latent2rgb previews.	2024-05-22 13:56:28 -04:00
doctorpangloss	005e370254	Merge upstream	2024-03-21 13:15:36 -07:00
comfyanonymous	448d9263a2	Fix control loras breaking.	2024-03-14 09:30:21 -04:00
comfyanonymous	db8b59ecff	Lower memory usage for loras in lowvram mode at the cost of perf.	2024-03-13 20:07:27 -04:00
doctorpangloss	7520691021	Merge with master	2024-02-19 10:55:22 -08:00
comfyanonymous	667c92814e	Stable Cascade Stage B.	2024-02-16 13:02:03 -05:00
doctorpangloss	82edb2ff0e	Merge with latest upstream.	2024-01-29 15:06:31 -08:00
comfyanonymous	78a70fda87	Remove useless import.	2024-01-19 15:38:05 -05:00
comfyanonymous	36a7953142	Greatly improve lowvram sampling speed by getting rid of accelerate. Let me know if this breaks anything.	2023-12-22 14:38:45 -05:00
comfyanonymous	77755ab8db	Refactor comfy.ops comfy.ops -> comfy.ops.disable_weight_init This should make it more clear what they actually do. Some unused code has also been removed.	2023-12-11 23:27:13 -05:00
comfyanonymous	ba07cb748e	Use faster manual cast for fp8 in unet.	2023-12-11 18:24:44 -05:00
comfyanonymous	57926635e8	Switch text encoder to manual cast. Use fp16 text encoder weights for CPU inference to lower memory usage.	2023-12-10 23:00:54 -05:00
comfyanonymous	af365e4dd1	All the unet ops with weights are now handled by comfy.ops	2023-12-04 03:12:18 -05:00
comfyanonymous	412d3ff57d	Refactor.	2023-11-11 01:11:06 -05:00
comfyanonymous	00c0b2c507	Initialize text encoder to target dtype.	2023-08-23 21:01:15 -04:00
comfyanonymous	d6e4b342e6	Support for Control Loras. Control loras are controlnets where some of the weights are stored in "lora" format: an up and a down low rank matrix that when multiplied together and added to the unet weight give the controlnet weight. This allows a much smaller memory footprint depending on the rank of the matrices. These controlnets are used just like regular ones.	2023-08-18 11:59:51 -04:00
comfyanonymous	bb1f45d6e8	Properly disable weight initialization in clip models.	2023-06-14 20:13:08 -04:00
comfyanonymous	21f04fe632	Disable default weight values in unet conv2d for faster loading.	2023-06-14 19:46:08 -04:00
comfyanonymous	6971646b8b	Speed up model loading a bit. Default pytorch Linear initializes the weights which is useless and slow.	2023-06-14 12:09:41 -04:00
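
The Control Loras commit (d6e4b342e6) describes its weight format in words: an up and a down low-rank matrix are multiplied together and added to the unet weight to give the controlnet weight. The sketch below illustrates only that arithmetic, using plain Python lists. It is not ComfyUI's implementation; the names matmul, apply_low_rank_delta and parameter_counts, and the example layer sizes, are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: plain-Python matrices, not ComfyUI code.
# All function names here are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def apply_low_rank_delta(base, up, down):
    """Return base + up @ down, the reconstruction the commit message describes."""
    delta = matmul(up, down)
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


def parameter_counts(out_features, in_features, rank):
    """Compare storage for a full matrix against its two low-rank factors."""
    full = out_features * in_features
    factored = rank * (out_features + in_features)
    return full, factored


# Stand-in for a unet weight (3 x 3 identity, purely for readability).
base = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
up = [[1.0], [2.0], [3.0]]    # shape (3, 1): out_features x rank
down = [[0.5, 0.0, -0.5]]     # shape (1, 3): rank x in_features

controlnet_weight = apply_low_rank_delta(base, up, down)
full, factored = parameter_counts(320, 320, 64)  # assumed example sizes
```

With the assumed sizes above, the two factors hold 40960 values against 102400 for the full matrix, which is the memory saving the commit message attributes to the rank of the matrices.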


132 Commits