Commit Graph

1912 Commits

Author SHA1 Message Date
comfyanonymous
ec4fc2a09a
Fix case of weights not being unpinned. (#10533) 2025-10-29 15:48:06 -04:00
comfyanonymous
1a58087ac2
Reduce memory usage for fp8 scaled op. (#10531) 2025-10-29 15:43:51 -04:00
comfyanonymous
e525673f72
Fix issue. (#10527) 2025-10-29 00:37:00 -04:00
comfyanonymous
3fa7a5c04a
Speed up offloading using pinned memory. (#10526)
To enable this feature use: --fast pinned_memory
2025-10-29 00:21:01 -04:00
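A hedged sketch of what pinned-memory offloading generally looks like in PyTorch (illustrative only, not the actual ComfyUI implementation enabled by --fast pinned_memory): weights are staged in page-locked host buffers so host/device copies can run asynchronously.

```python
import torch

def offload_to_pinned(t: torch.Tensor) -> torch.Tensor:
    """Copy a GPU tensor into a pinned (page-locked) host buffer."""
    host = torch.empty(t.shape, dtype=t.dtype, device="cpu", pin_memory=True)
    host.copy_(t, non_blocking=True)  # async device-to-host copy; a real
                                      # implementation syncs the stream
                                      # before reading `host`
    return host

def load_back(host: torch.Tensor, device: str = "cuda") -> torch.Tensor:
    # host-to-device copies from pinned memory can overlap with compute
    return host.to(device, non_blocking=True)
```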
contentis
8817f8fc14
Mixed Precision Quantization System (#10498)
* Implement mixed precision operations with a registry design and metadata for the quant spec in the checkpoint.

* Updated design using Tensor Subclasses

* Fix FP8 MM

* An actually functional POC

* Remove CK reference and ensure correct compute dtype

* Update unit tests

* ruff lint

* Fix missing keys

* Rename quant dtype parameter

* Rename quant dtype parameter

* Fix unittests for CPU build
2025-10-28 16:20:53 -04:00
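A minimal sketch of the registry idea named in this commit, under the assumption that checkpoint metadata records a quant format per weight and the registry maps that name to a handler; the names below (QUANT_REGISTRY, dequant_fp8_scaled, load_weight) are illustrative, not ComfyUI's actual API.

```python
import torch

QUANT_REGISTRY = {}

def register_quant(name):
    """Register a dequantization handler under a quant-format name."""
    def deco(fn):
        QUANT_REGISTRY[name] = fn
        return fn
    return deco

@register_quant("fp8_e4m3_scaled")
def dequant_fp8_scaled(weight, scale, compute_dtype=torch.bfloat16):
    # scaled fp8: stored weight times a per-tensor scale, upcast for compute
    return weight.to(compute_dtype) * scale.to(compute_dtype)

def load_weight(weight, meta):
    # metadata in the checkpoint selects which handler to apply
    fmt = meta.get("quant_format")
    if fmt is None:
        return weight
    return QUANT_REGISTRY[fmt](weight, meta["scale"])
```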
Dr.Lt.Data
fe26f30cb6 Merge branch 'master' into dr-support-pip-cm 2025-10-26 12:52:08 +09:00
comfyanonymous
f6bbc1ac84
Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
comfyanonymous
098a352f13
Add warning for torch-directml usage (#10482)
Added a warning message about the state of torch-directml.
2025-10-25 20:05:22 -04:00
Dr.Lt.Data
3c4b429251 Merge branch 'master' into dr-support-pip-cm 2025-10-25 10:42:34 +09:00
comfyanonymous
426cde37f1
Remove useless function (#10472) 2025-10-24 19:56:51 -04:00
Dr.Lt.Data
0432bccbcf Merge branch 'master' into dr-support-pip-cm 2025-10-24 12:17:46 +09:00
comfyanonymous
1bcda6df98
WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
Dr.Lt.Data
aaf06ace12 Merge branch 'master' into dr-support-pip-cm 2025-10-23 06:54:58 +09:00
comfyanonymous
9cdc64998f
Only disable cudnn on newer AMD GPUs. (#10437) 2025-10-21 19:15:23 -04:00
Dr.Lt.Data
a1a6f4d7fe Merge branch 'master' into dr-support-pip-cm 2025-10-21 07:26:53 +09:00
comfyanonymous
2c2aa409b0
Log message for cudnn disable on AMD. (#10418) 2025-10-20 15:43:24 -04:00
Dr.Lt.Data
ee54914a52 Merge branch 'master' into dr-support-pip-cm 2025-10-20 06:35:52 +09:00
comfyanonymous
b4f30bd408
Pytorch is stupid. (#10398) 2025-10-19 01:25:35 -04:00
comfyanonymous
dad076aee6
Speed up chroma radiance. (#10395) 2025-10-18 23:19:52 -04:00
comfyanonymous
0cf33953a7
Fix batch size above 1 giving bad output in chroma radiance. (#10394) 2025-10-18 23:15:34 -04:00
Dr.Lt.Data
8f59e2a341 Merge branch 'master' into dr-support-pip-cm 2025-10-19 11:39:42 +09:00
comfyanonymous
5b80addafd
Turn off cuda malloc by default when --fast autotune is turned on. (#10393) 2025-10-18 22:35:46 -04:00
Dr.Lt.Data
7d5e73ea94 Merge branch 'master' into dr-support-pip-cm 2025-10-19 09:37:12 +09:00
comfyanonymous
9da397ea2f
Disable torch compiler for cast_bias_weight function (#10384)
* Disable torch compiler for cast_bias_weight function

* Fix torch compile.
2025-10-17 20:03:28 -04:00
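For context, the usual way to keep a single helper out of torch.compile graphs is the torch.compiler.disable decorator; the body below is an assumed shape for such a cast helper, not the real cast_bias_weight function.

```python
import torch

@torch.compiler.disable
def cast_bias_weight_sketch(module, dtype, device):
    # Keep this small, dynamic helper out of compiled graphs so it cannot
    # trigger graph breaks or recompiles in the surrounding compiled code.
    weight = module.weight.to(device=device, dtype=dtype)
    bias = None
    if module.bias is not None:
        bias = module.bias.to(device=device, dtype=dtype)
    return weight, bias
```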
Dr.Lt.Data
6626f7c5c4 Merge branch 'master' into dr-support-pip-cm 2025-10-17 12:42:54 +09:00
comfyanonymous
b1293d50ef
workaround also works on cudnn 91200 (#10375) 2025-10-16 19:59:56 -04:00
comfyanonymous
19b466160c
Workaround for nvidia issue where VAE uses 3x more memory on torch 2.9 (#10373) 2025-10-16 18:16:03 -04:00
Dr.Lt.Data
0802f3a635 Merge branch 'master' into dr-support-pip-cm 2025-10-16 12:06:19 +09:00
Faych
afa8a24fe1
refactor: Replace manual patches merging with merge_nested_dicts (#10360) 2025-10-15 17:16:09 -07:00
Jedrzej Kosinski
493b81e48f
Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
Dr.Lt.Data
19ad129d37 Merge branch 'master' into dr-support-pip-cm 2025-10-16 06:40:04 +09:00
comfyanonymous
1c10b33f9b
gfx942 doesn't support fp8 operations. (#10348) 2025-10-15 00:21:11 -04:00
Dr.Lt.Data
5fbc8a1b80 Merge branch 'master' into dr-support-pip-cm 2025-10-15 06:43:20 +09:00
comfyanonymous
3374e900d0
Faster workflow cancelling. (#10301) 2025-10-13 23:43:53 -04:00
Dr.Lt.Data
b180f47d0e Merge branch 'master' into dr-support-pip-cm 2025-10-14 12:34:58 +09:00
comfyanonymous
dfff7e5332
Better memory estimation for the SD/Flux VAE on AMD. (#10334) 2025-10-13 22:37:19 -04:00
comfyanonymous
e4ea393666
Fix loading old stable diffusion ckpt files on newer numpy. (#10333) 2025-10-13 22:18:58 -04:00
comfyanonymous
c8674bc6e9
Enable RDNA4 pytorch attention on ROCm 7.0 and up. (#10332) 2025-10-13 21:19:03 -04:00
Dr.Lt.Data
2b47f4a38e Merge branch 'master' into dr-support-pip-cm 2025-10-14 07:36:42 +09:00
rattus128
95ca2e56c8
WAN2.2: Fix cache VRAM leak on error (#10308)
Same change pattern as 7e8dd275c2
applied to WAN2.2

If this suffers an exception (such as a VRAM OOM), it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which in turn refs large tensors from the failed
execution.

The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit during normal execution.

It's likely the design intent was for this to be usable as a streaming
encoder where the input comes in batches, but the functions as they
are today don't support that.

So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM, the cache itself is properly garbage collected when
the encode()/decode() functions disappear from the stack.
2025-10-13 15:23:11 -04:00
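A minimal sketch of the pattern this fix describes (model and forward_chunk are illustrative names, not the actual WAN code): holding the feature cache in a local means an OOM that unwinds encode() drops the cache, and the tensors it refs, along with the stack frame.

```python
def encode_sketch(model, chunks):
    feat_cache = []  # local: if a VRAM OOM unwinds this frame, the cache
                     # (and the tensors it references) become garbage too
    try:
        outputs = []
        for chunk in chunks:
            outputs.append(model.forward_chunk(chunk, feat_cache))
        return outputs
    finally:
        feat_cache.clear()  # explicit cleanup on both success and failure
```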
comfyanonymous
e693e4db6a
Always set diffusion model to eval() mode. (#10331) 2025-10-13 14:57:27 -04:00
Dr.Lt.Data
5f50b86114 Merge branch 'master' into dr-support-pip-cm 2025-10-13 06:42:04 +09:00
comfyanonymous
a125cd84b0
Improve AMD performance. (#10302)
I honestly have no idea why this improves things but it does.
2025-10-12 00:28:01 -04:00
comfyanonymous
84e9ce32c6
Implement the mmaudio VAE. (#10300) 2025-10-11 22:57:23 -04:00
Dr.Lt.Data
4e7f2eeae2 Merge branch 'master' into dr-support-pip-cm 2025-10-10 08:15:03 +09:00
comfyanonymous
f1dd6e50f8
Fix bug with applying loras on fp8 scaled without fp8 ops. (#10279) 2025-10-09 19:02:40 -04:00
comfyanonymous
139addd53c
More surgical fix for #10267 (#10276) 2025-10-09 16:37:35 -04:00
Dr.Lt.Data
05cd5348b6 Merge branch 'master' into dr-support-pip-cm 2025-10-09 10:49:23 +09:00
comfyanonymous
6e59934089
Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
Dr.Lt.Data
6b20418ad1 Merge branch 'master' into dr-support-pip-cm 2025-10-07 14:30:16 +09:00
comfyanonymous
8aea746212
Implement gemma 3 as a text encoder. (#10241)
Not useful yet.
2025-10-06 22:08:08 -04:00
comfyanonymous
195e0b0639
Remove useless code. (#10223) 2025-10-05 15:41:19 -04:00
Dr.Lt.Data
8634b19bc7 Merge branch 'master' into dr-support-pip-cm 2025-10-04 07:09:43 +09:00
Finn-Hecker
93d859cfaa
Fix type annotation syntax in MotionEncoder_tc __init__ (#10186)
## Summary
Fixed incorrect type hint syntax in `MotionEncoder_tc.__init__()` parameter list.

## Changes
- Line 647: Changed `num_heads=int` to `num_heads: int` 
- This corrects the parameter annotation from a default value assignment to proper type hint syntax

## Details
The parameter was using assignment syntax (`=`) instead of type annotation syntax (`:`), which would incorrectly set the default value to the `int` class itself rather than annotating the expected type.
2025-10-03 14:32:19 -07:00
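The difference in isolation (a standalone illustration, not the actual MotionEncoder_tc code):

```python
def buggy(num_heads=int):    # '=' makes the default value the `int` class itself
    return num_heads         # buggy() -> <class 'int'>

def fixed(num_heads: int):   # ':' annotates the expected type; no default is set
    return num_heads         # fixed(8) -> 8
```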
Dr.Lt.Data
28092933c1 Merge branch 'master' into dr-support-pip-cm 2025-10-02 12:49:48 +09:00
rattus128
4965c0e2ac
WAN: Fix cache VRAM leak on error (#10141)
If this suffers an exception (such as a VRAM OOM), it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which in turn refs large tensors from the failed
execution.

The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit during normal execution.

It's likely the design intent was for this to be usable as a streaming
encoder where the input comes in batches, but the functions as they
are today don't support that.

So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM, the cache itself is properly garbage collected when
the encode()/decode() functions disappear from the stack.
2025-10-01 18:42:16 -04:00
rattus128
911331c06c
sd: fix VAE tiled fallback VRAM leak (#10139)
When the VAE catches this VRAM OOM, it launches the fallback logic
straight from the exception context.

Python, however, refs the entire call stack that caused the exception,
including any local variables, for the sake of exception reporting and
debugging. In the case of tensors, this can hold on to references
to GBs of VRAM and prevent the VRAM allocator from freeing them.

So dump the except context completely before going back to the VAE
via the tiler, by getting out of the except block with nothing but
a flag.

This greatly increases the reliability of the tiler fallback,
especially on low-VRAM cards: with the bug, if the leak randomly
held more than the headroom needed for a single tile, the tiler
fallback would OOM and fail the flow.
2025-10-01 18:40:28 -04:00
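A hedged sketch of the described pattern (decode and decode_tiled stand in for whatever the real fallback calls): the except block records nothing but a flag, so the exception's traceback, and the GB-sized locals its frames reference, are released before the tiled path allocates.

```python
import torch

def decode_with_fallback(vae, latent):
    use_tiled = False
    try:
        return vae.decode(latent)
    except torch.cuda.OutOfMemoryError:
        use_tiled = True  # keep only a flag; let the traceback (and the
                          # tensors referenced by its frames) be released
    if use_tiled:
        torch.cuda.empty_cache()
        return vae.decode_tiled(latent)
```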
Dr.Lt.Data
17064a993c Merge branch 'master' into dr-support-pip-cm 2025-10-02 07:31:37 +09:00
comfyanonymous
a6f83a4a1a
Support the new hunyuan vae. (#10150) 2025-10-01 17:19:13 -04:00
Dr.Lt.Data
20ac0052f8 Merge branch 'master' into dr-support-pip-cm 2025-09-29 06:58:35 +09:00
rattus128
653ceab414
Reduce Peak WAN inference VRAM usage - part II (#10062)
* flux: math: Use addcmul_ to avoid an expensive VRAM intermediate

The rope process can be the VRAM peak, and holding this intermediate
for the addition result before releasing the original can OOM.
Use addcmul_ for it.

* wan: Delete the self attention before cross attention

This saves VRAM when the cross attention and FFN are in play as the
VRAM peak.
2025-09-27 18:14:16 -04:00
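The in-place idea from the first bullet, in isolation: fusing x = x + a * b into a single addcmul_ call avoids materializing the a * b product (and a fresh output tensor) at the VRAM peak.

```python
import torch

x = torch.randn(1024, 1024)
a = torch.randn(1024, 1024)
b = torch.randn(1024, 1024)

# out-of-place: allocates a temporary for a * b, then another tensor for the sum
# x = x + a * b

# in-place fused kernel: writes the result into x's existing storage
x.addcmul_(a, b)
```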
Jedrzej Kosinski
196954ab8c
Add 'input_cond' and 'input_uncond' to the args dictionary passed into sampler_cfg_function (#10044) 2025-09-26 19:55:03 -07:00
comfyanonymous
1e098d6132
Don't add template to qwen2.5vl when template is in prompt. (#10043)
Make the hunyuan image refiner template_end 36.
2025-09-26 18:34:17 -04:00
Dr.Lt.Data
bc8418f55a Merge branch 'master' into dr-support-pip-cm 2025-09-26 07:00:43 +09:00
Guy Niv
c8d2117f02
Fix memory leak by properly detaching model finalizer (#9979)
When unloading models in load_models_gpu(), the model finalizer was not
being explicitly detached, leading to a memory leak. This caused
linear memory consumption increase over time as models are repeatedly
loaded and unloaded.

This change prevents orphaned finalizer references from accumulating in
memory during model switching operations.
2025-09-24 22:35:12 -04:00
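A hedged sketch of the kind of leak described (LoadedModel, _cleanup, and unload are illustrative names, not ComfyUI's actual classes): a weakref.finalize entry stays registered until it either runs or is detach()ed, so an explicit unload path should detach it.

```python
import weakref

class LoadedModel:
    def __init__(self, model):
        self.model = model
        # register cleanup to run when `model` is garbage collected
        self._finalizer = weakref.finalize(model, self._cleanup)

    @staticmethod
    def _cleanup():
        pass  # e.g. release bookkeeping tied to the model

    def unload(self):
        self._cleanup()
        self._finalizer.detach()  # without this, stale finalizer entries
                                  # (and whatever their callbacks reference)
                                  # accumulate across load/unload cycles
        self.model = None
```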
comfyanonymous
fccab99ec0
Fix issue with .view() in HuMo. (#10014) 2025-09-24 20:09:42 -04:00
Dr.Lt.Data
74c1a58566 Merge branch 'master' into dr-support-pip-cm 2025-09-23 07:28:52 +09:00
comfyanonymous
1fee8827cb
Support for qwen edit plus model. Use the new TextEncodeQwenImageEditPlus. (#9986) 2025-09-22 16:49:48 -04:00
Dr.Lt.Data
7b1ed9b2b8 Merge branch 'master' into dr-support-pip-cm 2025-09-21 11:24:37 +09:00
comfyanonymous
d1d9eb94b1
Lower wan memory estimation value a bit. (#9964)
The previous PR reduced the peak memory requirement.
2025-09-20 22:09:35 -04:00
Dr.Lt.Data
4ea946778b Merge branch 'master' into dr-support-pip-cm 2025-09-21 10:45:28 +09:00
Kohaku-Blueleaf
7be2b49b6b
Fix LoRA Trainer bugs with FP8 models. (#9854)
* Fix adapter weight init

* Fix fp8 model training

* Avoid inference tensor
2025-09-20 21:24:48 -04:00
Dr.Lt.Data
309c92d6c9 Merge branch 'master' into dr-support-pip-cm 2025-09-21 09:33:38 +09:00
comfyanonymous
e8df53b764
Update WanAnimateToVideo to more easily extend videos. (#9959) 2025-09-19 18:48:56 -04:00
Dr.Lt.Data
ca7492c9d4 Merge branch 'master' into dr-support-pip-cm 2025-09-20 07:13:36 +09:00
comfyanonymous
dc95b6acc0
Basic WIP support for the wan animate model. (#9939) 2025-09-19 03:07:17 -04:00
Dr.Lt.Data
fa51f0c60a Merge branch 'master' into dr-support-pip-cm 2025-09-19 12:00:10 +09:00
comfyanonymous
24b0fce099
Do padding of audio embed in model for humo for more flexibility. (#9935) 2025-09-18 19:54:16 -04:00
DELUXA
8d6653fca6
Enable fp8 ops by default on gfx1200 (#9926) 2025-09-18 19:50:37 -04:00
Dr.Lt.Data
0a084a88a2 Merge branch 'master' into dr-support-pip-cm 2025-09-19 08:16:58 +09:00
comfyanonymous
e7ff647d02 --disable-manager -> --enable-manager 2025-09-17 20:58:42 -04:00
comfyanonymous
dd611a7700
Support the HuMo 17B model. (#9912) 2025-09-17 18:39:24 -04:00
Dr.Lt.Data
77e10752fe Merge branch 'master' into dr-support-pip-cm 2025-09-18 07:32:23 +09:00
comfyanonymous
9288c78fc5
Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
Dr.Lt.Data
2c30881d9c Merge branch 'master' into dr-support-pip-cm 2025-09-17 11:56:35 +09:00
rattus128
e42682b24e
Reduce Peak WAN inference VRAM usage (#9898)
* flux: Do the xq and xk ropes one at a time

This was doing independent interleaved tensor math on the q and k
tensors, leading to holding more than the minimum number of
intermediates in VRAM. On a bad day, it would VRAM OOM on xk
intermediates.

Do everything for q and then everything for k, so torch can garbage
collect all of q's intermediates before k allocates its own.

This reduces peak VRAM usage for some WAN2.2 inferences (at least).

* wan: Optimize qkv intermediates on attention

As commented. The former logic computed independent pieces of QKV in
parallel, which held more inference intermediates in VRAM, spiking
VRAM usage. Fully roping Q and garbage collecting the intermediates
before touching K reduces the peak inference VRAM usage.
2025-09-16 19:21:14 -04:00
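A sketch of the reordering described above (rope_one is a simplified, illustrative rotary-embedding helper, not the actual flux code): finish all of q's rope math, letting its temporaries become collectible, before starting on k.

```python
import torch

def rope_one(x: torch.Tensor, freqs_cis: torch.Tensor) -> torch.Tensor:
    # simplified rotary embedding over channel pairs on the last dimension
    x_ = x.float().reshape(*x.shape[:-1], -1, 1, 2)
    out = freqs_cis[..., 0] * x_[..., 0] + freqs_cis[..., 1] * x_[..., 1]
    return out.reshape(*x.shape).type_as(x)

def apply_rope_sequential(xq, xk, freqs_cis):
    xq = rope_one(xq, freqs_cis)  # all q temporaries are garbage by here...
    xk = rope_one(xk, freqs_cis)  # ...so they never coexist with k's
    return xq, xk
```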
Dr.Lt.Data
7fa5990dbc Merge branch 'master' into dr-support-pip-cm 2025-09-17 06:09:40 +09:00
comfyanonymous
a39ac59c3e
Add encoder part of whisper large v3 as an audio encoder model. (#9894)
Not useful yet but some models use it.
2025-09-16 01:19:50 -04:00
Dr.Lt.Data
07212a2466 Merge branch 'master' into dr-support-pip-cm 2025-09-16 12:39:43 +09:00
blepping
1a85483da1
Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884)
Correctly handle the case where w0 is passed by kwargs in BatchedBrownianTree
2025-09-15 20:05:03 -04:00
comfyanonymous
47a9cde5d3
Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
Dr.Lt.Data
f4d7a32cd8 Merge branch 'master' into dr-support-pip-cm 2025-09-15 12:16:00 +09:00
Jedrzej Kosinski
f228367c5e
Make ModuleNotFoundError ImportError instead (#9850) 2025-09-13 21:34:21 -04:00
comfyanonymous
80b7c9455b
Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
blepping
c1297f4eb3
Add support for Chroma Radiance (#9682)
* Initial Chroma Radiance support

* Minor Chroma Radiance cleanups

* Update Radiance nodes to ensure latents/images are on the intermediate device

* Fix Chroma Radiance memory estimation.

* Increase Chroma Radiance memory usage factor

* Increase Chroma Radiance memory usage factor once again

* Ensure images are multiples of 16 for Chroma Radiance
Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node

* Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor

* Update Radiance to support conv nerf final head type.

* Allow setting NeRF embedder dtype for Radiance
Bump Radiance nerf tile size to 32
Support EasyCache/LazyCache on Radiance (maybe)

* Add ChromaRadianceStubVAE node

* Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior

* Convert Chroma Radiance nodes to V3 schema.

* Add ChromaRadianceOptions node and backend support.
Cleanups/refactoring to reduce code duplication with Chroma.

* Fix overriding the NeRF embedder dtype for Chroma Radiance

* Minor Chroma Radiance cleanups

* Move Chroma Radiance to its own directory in ldm
Minor code cleanups and tooltip improvements

* Fix Chroma Radiance embedder dtype overriding

* Remove Radiance dynamic nerf_embedder dtype override feature

* Unbork Radiance NeRF embedder init

* Remove Chroma Radiance image conversion and stub VAE nodes
Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
2025-09-13 17:58:43 -04:00
Kimbing Ng
e5e70636e7
Remove single quote pattern to avoid wrong matches (#9842) 2025-09-13 16:59:19 -04:00
Dr.Lt.Data
ce1df28bef Merge branch 'master' into dr-support-pip-cm 2025-09-13 15:41:22 +09:00
comfyanonymous
29bf807b0e
Cleanup. (#9838) 2025-09-12 21:57:04 -04:00
Jukka Seppänen
2559dee492
Support wav2vec base models (#9637)
* Support wav2vec base models

* trim trailing whitespace

* Do interpolation after
2025-09-12 21:52:58 -04:00
comfyanonymous
a3b04de700
Hunyuan refiner vae now works with tiled. (#9836) 2025-09-12 19:46:46 -04:00