Commit Graph

2533 Commits

Author SHA1 Message Date
comfyanonymous
2c2aa409b0
Log message for cudnn disable on AMD. (#10418) 2025-10-20 15:43:24 -04:00
patientx
657a7872ab
Merge branch 'comfyanonymous:master' into master 2025-10-19 15:20:17 +03:00
comfyanonymous
b4f30bd408
Pytorch is stupid. (#10398) 2025-10-19 01:25:35 -04:00
comfyanonymous
dad076aee6
Speed up chroma radiance. (#10395) 2025-10-18 23:19:52 -04:00
comfyanonymous
0cf33953a7
Fix batch size above 1 giving bad output in chroma radiance. (#10394) 2025-10-18 23:15:34 -04:00
comfyanonymous
5b80addafd
Turn off cuda malloc by default when --fast autotune is turned on. (#10393) 2025-10-18 22:35:46 -04:00
comfyanonymous
9da397ea2f
Disable torch compiler for cast_bias_weight function (#10384)
* Disable torch compiler for cast_bias_weight function

* Fix torch compile.
2025-10-17 20:03:28 -04:00
patientx
76dde47dbb
Merge branch 'comfyanonymous:master' into master 2025-10-18 00:05:02 +03:00
comfyanonymous
b1293d50ef
workaround also works on cudnn 91200 (#10375) 2025-10-16 19:59:56 -04:00
comfyanonymous
19b466160c
Workaround for nvidia issue where VAE uses 3x more memory on torch 2.9 (#10373) 2025-10-16 18:16:03 -04:00
patientx
7b0643ada1
Merge branch 'comfyanonymous:master' into master 2025-10-16 16:40:50 +03:00
Faych
afa8a24fe1
refactor: Replace manual patches merging with merge_nested_dicts (#10360) 2025-10-15 17:16:09 -07:00
Jedrzej Kosinski
493b81e48f
Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
patientx
26589a3a0b
Merge branch 'comfyanonymous:master' into master 2025-10-15 12:18:21 +03:00
comfyanonymous
1c10b33f9b
gfx942 doesn't support fp8 operations. (#10348) 2025-10-15 00:21:11 -04:00
comfyanonymous
3374e900d0
Faster workflow cancelling. (#10301) 2025-10-13 23:43:53 -04:00
comfyanonymous
dfff7e5332
Better memory estimation for the SD/Flux VAE on AMD. (#10334) 2025-10-13 22:37:19 -04:00
comfyanonymous
e4ea393666
Fix loading old stable diffusion ckpt files on newer numpy. (#10333) 2025-10-13 22:18:58 -04:00
comfyanonymous
c8674bc6e9
Enable RDNA4 pytorch attention on ROCm 7.0 and up. (#10332) 2025-10-13 21:19:03 -04:00
patientx
eae7a58e60
Merge branch 'comfyanonymous:master' into master 2025-10-14 02:07:30 +03:00
rattus128
95ca2e56c8
WAN2.2: Fix cache VRAM leak on error (#10308)
Same change pattern as 7e8dd275c2
applied to WAN2.2

If this suffers an exception (such as a VRAM OOM) it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which is in turn referencing large tensors from the
failed execution.

The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit of normal execution.

The design intent is likely that this is usable as a streaming encoder
where the input comes in batches, but the functions as they are today
don't support that.

So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM the cache itself is properly garbage collected when
the encode()/decode() functions disappear from the stack.
2025-10-13 15:23:11 -04:00
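A minimal sketch of the pattern described in the commit above, with hypothetical names (not the actual WAN encoder code): keeping the feature cache as a local variable means an exception inside encode() drops the only reference, so the cached tensors become collectable instead of staying alive through a class attribute.

```python
import torch

class WanStyleEncoder:
    """Illustrative only; not the real WAN encoder."""

    # Before the fix: a cache stored on the class/instance survives an
    # exception inside encode(), keeping large tensors alive after a VRAM OOM.
    # feat_cache = {}

    def encode(self, frames: torch.Tensor) -> torch.Tensor:
        # After the fix: the cache is a local variable. If encode() raises
        # (e.g. an OOM), the stack frame goes away, the cached tensors become
        # unreachable, and they can be garbage collected.
        feat_cache = {}
        out = []
        for i, frame in enumerate(frames):
            feat_cache[i] = frame * 2.0  # stand-in for the real feature math
            out.append(feat_cache[i])
        return torch.stack(out)

print(WanStyleEncoder().encode(torch.randn(4, 3, 8, 8)).shape)
```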
comfyanonymous
e693e4db6a
Always set diffusion model to eval() mode. (#10331) 2025-10-13 14:57:27 -04:00
patientx
fa7942933b
Merge branch 'comfyanonymous:master' into master 2025-10-12 13:56:39 +03:00
comfyanonymous
a125cd84b0
Improve AMD performance. (#10302)
I honestly have no idea why this improves things but it does.
2025-10-12 00:28:01 -04:00
comfyanonymous
84e9ce32c6
Implement the mmaudio VAE. (#10300) 2025-10-11 22:57:23 -04:00
patientx
aa6afacc01
Merge branch 'comfyanonymous:master' into master 2025-10-10 02:25:35 +03:00
comfyanonymous
f1dd6e50f8
Fix bug with applying loras on fp8 scaled without fp8 ops. (#10279) 2025-10-09 19:02:40 -04:00
patientx
3553ce45e5
Merge branch 'comfyanonymous:master' into master 2025-10-09 23:40:21 +03:00
comfyanonymous
139addd53c
More surgical fix for #10267 (#10276) 2025-10-09 16:37:35 -04:00
patientx
77fc639ed2
Merge branch 'comfyanonymous:master' into master 2025-10-09 15:55:12 +03:00
comfyanonymous
6e59934089
Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
patientx
2502069447
Merge branch 'comfyanonymous:master' into master 2025-10-07 14:02:12 +03:00
comfyanonymous
8aea746212
Implement gemma 3 as a text encoder. (#10241)
Not useful yet.
2025-10-06 22:08:08 -04:00
patientx
e5c08bb5c2
Merge branch 'comfyanonymous:master' into master 2025-10-06 00:08:39 +03:00
comfyanonymous
195e0b0639
Remove useless code. (#10223) 2025-10-05 15:41:19 -04:00
patientx
c3a59c8e40
Merge branch 'comfyanonymous:master' into master 2025-10-04 00:51:31 +03:00
Finn-Hecker
93d859cfaa
Fix type annotation syntax in MotionEncoder_tc __init__ (#10186)
## Summary
Fixed incorrect type hint syntax in `MotionEncoder_tc.__init__()` parameter list.

## Changes
- Line 647: Changed `num_heads=int` to `num_heads: int` 
- This corrects the parameter annotation from a default value assignment to proper type hint syntax

## Details
The parameter was using assignment syntax (`=`) instead of type annotation syntax (`:`), which would incorrectly set the default value to the `int` class itself rather than annotating the expected type.
2025-10-03 14:32:19 -07:00
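For illustration, the difference the commit describes, shown with a simplified stand-in class (not the real MotionEncoder_tc):

```python
class Before:
    # '=' makes the *default value* the int class itself; the parameter is
    # untyped, and a caller that omits it gets num_heads == int.
    def __init__(self, hidden_dim, num_heads=int):
        self.hidden_dim = hidden_dim
        self.num_heads = num_heads

class After:
    # ':' is a type annotation; num_heads is declared as an int and no longer
    # carries a bogus default value.
    def __init__(self, hidden_dim, num_heads: int):
        self.hidden_dim = hidden_dim
        self.num_heads = num_heads

print(Before(64).num_heads)    # <class 'int'> -- the bug
print(After(64, 8).num_heads)  # 8
```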
patientx
603dfa1a65
Merge branch 'comfyanonymous:master' into master 2025-10-02 14:06:08 +03:00
rattus128
4965c0e2ac
WAN: Fix cache VRAM leak on error (#10141)
If this suffers an exception (such as a VRAM OOM) it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which is in turn referencing large tensors from the
failed execution.

The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit of normal execution.

The design intent is likely that this is usable as a streaming encoder
where the input comes in batches, but the functions as they are today
don't support that.

So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM the cache itself is properly garbage collected when
the encode()/decode() functions disappear from the stack.
2025-10-01 18:42:16 -04:00
rattus128
911331c06c
sd: fix VAE tiled fallback VRAM leak (#10139)
When the VAE catches a VRAM OOM, it launches the fallback logic
straight from the exception context.

Python, however, keeps references to the entire call stack that caused
the exception, including any local variables, for the sake of exception
reporting and debugging. In the case of tensors, this can hold on to
references to GBs of VRAM and prevent the allocator from freeing them.

So dump the except context completely before going back to the VAE
via the tiler, by getting out of the except block with nothing but
a flag.

This greatly increases the reliability of the tiler fallback,
especially on low-VRAM cards: with the bug, if the leak happened to
hold more than the headroom needed for a single tile, the tiler
fallback would OOM and fail the workflow.
2025-10-01 18:40:28 -04:00
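A minimal sketch of the flag pattern this commit describes, with hypothetical stand-in functions (not the actual VAE code): leaving the `except` block before running the fallback lets Python drop the traceback and its references to the tensors from the failed attempt.

```python
import torch

def decode_full(latents: torch.Tensor) -> torch.Tensor:
    # Stand-in for the real decode; pretend large inputs exhaust VRAM.
    if latents.numel() > 1_000_000:
        raise torch.cuda.OutOfMemoryError("simulated OOM")
    return latents * 2.0

def decode_tiled(latents: torch.Tensor) -> torch.Tensor:
    # Stand-in for the tiled fallback: decode in small chunks.
    return torch.cat([decode_full(chunk) for chunk in latents.split(4)])

def decode_with_fallback(latents: torch.Tensor) -> torch.Tensor:
    oom = False
    try:
        return decode_full(latents)
    except torch.cuda.OutOfMemoryError:
        # Don't run the fallback here: while the except block is active,
        # the traceback keeps the failed call stack (and its tensors) alive.
        oom = True
    # Outside the except block the traceback is dropped, so the memory held
    # by the failed attempt can be freed before the tiler needs headroom.
    if oom:
        return decode_tiled(latents)

print(decode_with_fallback(torch.randn(64, 128, 128)).shape)
```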
comfyanonymous
a6f83a4a1a
Support the new hunyuan vae. (#10150) 2025-10-01 17:19:13 -04:00
patientx
21dc67a0b9
Merge branch 'comfyanonymous:master' into master 2025-09-28 04:38:19 +03:00
rattus128
653ceab414
Reduce Peak WAN inference VRAM usage - part II (#10062)
* flux: math: Use addcmul_ to avoid an expensive VRAM intermediate

The rope computation can be the VRAM peak, and allocating this
intermediate for the addition result before the original is released
can OOM. Use addcmul_ instead.

* wan: Delete the self attention before cross attention

This saves VRAM when the cross attention and FFN are the VRAM peak.
2025-09-27 18:14:16 -04:00
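A small illustration of the in-place fusion mentioned in the first bullet, using generic tensors rather than the actual flux rope code: `addcmul_` accumulates `b * c` directly into an existing tensor instead of allocating separate intermediates for the product and the sum.

```python
import torch

a = torch.randn(4, 1024)
b = torch.randn(4, 1024)
c = torch.randn(4, 1024)

# Naive form: allocates an intermediate for b * c and another for the sum,
# which at the VRAM peak can be the difference between fitting and OOMing.
out_naive = a + b * c

# Fused, in-place form: the result is accumulated directly into the tensor.
out_fused = a.clone()
out_fused.addcmul_(b, c)

assert torch.allclose(out_naive, out_fused)
```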
patientx
7e6b077cd7
Merge branch 'comfyanonymous:master' into master 2025-09-27 14:25:54 +03:00
Jedrzej Kosinski
196954ab8c
Add 'input_cond' and 'input_uncond' to the args dictionary passed into sampler_cfg_function (#10044) 2025-09-26 19:55:03 -07:00
comfyanonymous
1e098d6132
Don't add template to qwen2.5vl when template is in prompt. (#10043)
Make the hunyuan image refiner template_end 36.
2025-09-26 18:34:17 -04:00
patientx
258da26c98
Merge branch 'comfyanonymous:master' into master 2025-09-25 15:08:16 +03:00
Guy Niv
c8d2117f02
Fix memory leak by properly detaching model finalizer (#9979)
When unloading models in load_models_gpu(), the model finalizer was not
being explicitly detached, leading to a memory leak. This caused memory
consumption to increase linearly over time as models were repeatedly
loaded and unloaded.

This change prevents orphaned finalizer references from accumulating in
memory during model switching operations.
2025-09-24 22:35:12 -04:00
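A sketch of the general pattern under the assumption of a `weakref.finalize`-style finalizer (hypothetical names; the real ComfyUI model-management code is not shown here): finalize objects stay registered until they run or are detached, so detaching them on deliberate unload keeps them from piling up.

```python
import weakref

class Model:
    pass

def _cleanup(name: str) -> None:
    print(f"finalizer ran for {name}")

loaded = []

def load_model(name: str) -> None:
    model = Model()
    fin = weakref.finalize(model, _cleanup, name)
    loaded.append((model, fin))

def unload_all() -> None:
    while loaded:
        model, fin = loaded.pop()
        # Explicitly detach the finalizer when unloading on purpose; otherwise
        # still-registered finalize objects (and whatever they reference)
        # accumulate across repeated load/unload cycles.
        fin.detach()

load_model("unet")
unload_all()
```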
comfyanonymous
fccab99ec0
Fix issue with .view() in HuMo. (#10014) 2025-09-24 20:09:42 -04:00
patientx
64aa08cf53
Merge branch 'comfyanonymous:master' into master 2025-09-23 00:01:32 +03:00
comfyanonymous
1fee8827cb
Support for qwen edit plus model. Use the new TextEncodeQwenImageEditPlus. (#9986) 2025-09-22 16:49:48 -04:00
Rando717
5dcd8d2428
Add files via upload
Uploaded nvcuda.zluda_get_nightly_flag.py to get nightly flag info inside the batch file
2025-09-21 19:31:18 +02:00
patientx
9d2b926f56
Merge branch 'comfyanonymous:master' into master 2025-09-21 16:57:32 +03:00
comfyanonymous
d1d9eb94b1
Lower wan memory estimation value a bit. (#9964)
Previous PR reduced the peak memory requirement.
2025-09-20 22:09:35 -04:00
Kohaku-Blueleaf
7be2b49b6b
Fix LoRA Trainer bugs with FP8 models. (#9854)
* Fix adapter weight init

* Fix fp8 model training

* Avoid inference tensor
2025-09-20 21:24:48 -04:00
patientx
c62e820d45
Merge branch 'comfyanonymous:master' into master 2025-09-20 01:51:06 +03:00
comfyanonymous
e8df53b764
Update WanAnimateToVideo to more easily extend videos. (#9959) 2025-09-19 18:48:56 -04:00
comfyanonymous
dc95b6acc0
Basic WIP support for the wan animate model. (#9939) 2025-09-19 03:07:17 -04:00
comfyanonymous
24b0fce099
Do padding of audio embed in model for humo for more flexibility. (#9935) 2025-09-18 19:54:16 -04:00
DELUXA
8d6653fca6
Enable fp8 ops by default on gfx1200 (#9926) 2025-09-18 19:50:37 -04:00
patientx
50e281dc6d
Merge branch 'comfyanonymous:master' into master 2025-09-18 03:02:08 +03:00
comfyanonymous
dd611a7700
Support the HuMo 17B model. (#9912) 2025-09-17 18:39:24 -04:00
patientx
a8b63b21fe
Merge branch 'comfyanonymous:master' into master 2025-09-17 12:29:45 +03:00
comfyanonymous
9288c78fc5
Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
rattus128
e42682b24e
Reduce Peak WAN inference VRAM usage (#9898)
* flux: Do the xq and xk ropes one at a time

This was doing independent, interleaved tensor math on the q and k
tensors, holding more than the minimum number of intermediates in
VRAM. On a bad day, it would VRAM OOM on xk intermediates.

Do everything for q and then everything for k, so torch can garbage
collect all of q's intermediates before k allocates its own.

This reduces peak VRAM usage for some WAN2.2 inferences (at least).

* wan: Optimize qkv intermediates on attention

As commented. The former logic computed independent pieces of QKV in
parallel, which held more inference intermediates in VRAM and spiked
VRAM usage. Fully roping Q and garbage collecting the intermediates
before touching K reduces the peak inference VRAM usage.
2025-09-16 19:21:14 -04:00
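A schematic of the ordering change in the first bullet, using a simplified generic RoPE rather than the actual flux math: finishing all of q's rope work before starting k lets the allocator reclaim q's intermediates instead of holding both sets at the peak.

```python
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def rope_interleaved(q, k, cos, sin):
    # Interleaved: partial results for q and k are alive at the same time,
    # raising the peak number of intermediates held in VRAM.
    q_rot = q * cos
    k_rot = k * cos
    q_rot = q_rot + rotate_half(q) * sin
    k_rot = k_rot + rotate_half(k) * sin
    return q_rot, k_rot

def rope_sequential(q, k, cos, sin):
    # Sequential: finish q completely so its intermediates can be collected
    # before any of k's intermediates are allocated.
    q_rot = q * cos + rotate_half(q) * sin
    k_rot = k * cos + rotate_half(k) * sin
    return q_rot, k_rot

q, k = torch.randn(2, 8, 64), torch.randn(2, 8, 64)
cos, sin = torch.randn(8, 64), torch.randn(8, 64)
a = rope_interleaved(q, k, cos, sin)
b = rope_sequential(q, k, cos, sin)
assert all(torch.allclose(x, y) for x, y in zip(a, b))
```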
patientx
9cdd9e38d2
Merge branch 'comfyanonymous:master' into master 2025-09-17 00:30:44 +03:00
comfyanonymous
a39ac59c3e
Add encoder part of whisper large v3 as an audio encoder model. (#9894)
Not useful yet but some models use it.
2025-09-16 01:19:50 -04:00
blepping
1a85483da1
Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884)
Correctly handle the case where w0 is passed by kwargs in BatchedBrownianTree
2025-09-15 20:05:03 -04:00
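A sketch of the general fix pattern with a toy class (not the actual BatchedBrownianTree): asserts are stripped under `python -O`, so validation that must fail loudly should raise an explicit exception, and an argument that may arrive positionally or by keyword should be read from both places.

```python
class BrownianTreeish:
    """Toy stand-in, not the real BatchedBrownianTree."""

    def __init__(self, t0, t1, *args, **kwargs):
        # w0 may arrive positionally or as a keyword argument.
        w0 = args[0] if args else kwargs.get("w0", 0.0)
        # An assert would disappear under `python -O`; validation that must
        # raise should use an explicit exception instead.
        if t0 > t1:
            raise ValueError("t0 must be <= t1")
        self.t0, self.t1, self.w0 = t0, t1, w0

print(BrownianTreeish(0.0, 1.0, w0=0.5).w0)  # 0.5
print(BrownianTreeish(0.0, 1.0, 0.25).w0)    # 0.25
```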
patientx
a08c1e1613
Merge branch 'comfyanonymous:master' into master 2025-09-16 01:22:18 +03:00
comfyanonymous
47a9cde5d3
Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
patientx
09bd37e843
Merge branch 'comfyanonymous:master' into master 2025-09-14 13:23:58 +03:00
Jedrzej Kosinski
f228367c5e
Make ModuleNotFoundError ImportError instead (#9850) 2025-09-13 21:34:21 -04:00
patientx
78c0630849
Merge branch 'comfyanonymous:master' into master 2025-09-14 03:43:29 +03:00
comfyanonymous
80b7c9455b
Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
blepping
c1297f4eb3
Add support for Chroma Radiance (#9682)
* Initial Chroma Radiance support

* Minor Chroma Radiance cleanups

* Update Radiance nodes to ensure latents/images are on the intermediate device

* Fix Chroma Radiance memory estimation.

* Increase Chroma Radiance memory usage factor

* Increase Chroma Radiance memory usage factor once again

* Ensure images are multiples of 16 for Chroma Radiance
Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node

* Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor

* Update Radiance to support conv nerf final head type.

* Allow setting NeRF embedder dtype for Radiance
Bump Radiance nerf tile size to 32
Support EasyCache/LazyCache on Radiance (maybe)

* Add ChromaRadianceStubVAE node

* Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior

* Convert Chroma Radiance nodes to V3 schema.

* Add ChromaRadianceOptions node and backend support.
Cleanups/refactoring to reduce code duplication with Chroma.

* Fix overriding the NeRF embedder dtype for Chroma Radiance

* Minor Chroma Radiance cleanups

* Move Chroma Radiance to its own directory in ldm
Minor code cleanups and tooltip improvements

* Fix Chroma Radiance embedder dtype overriding

* Remove Radiance dynamic nerf_embedder dtype override feature

* Unbork Radiance NeRF embedder init

* Remove Chroma Radiance image conversion and stub VAE nodes
Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
2025-09-13 17:58:43 -04:00
Kimbing Ng
e5e70636e7
Remove single quote pattern to avoid wrong matches (#9842) 2025-09-13 16:59:19 -04:00
patientx
596049a855
Merge branch 'comfyanonymous:master' into master 2025-09-13 14:38:42 +03:00
comfyanonymous
29bf807b0e
Cleanup. (#9838) 2025-09-12 21:57:04 -04:00
Jukka Seppänen
2559dee492
Support wav2vec base models (#9637)
* Support wav2vec base models

* trim trailing whitespace

* Do interpolation after
2025-09-12 21:52:58 -04:00
comfyanonymous
a3b04de700
Hunyuan refiner vae now works with tiled. (#9836) 2025-09-12 19:46:46 -04:00
patientx
42a2c109ec
Merge branch 'comfyanonymous:master' into master 2025-09-13 01:16:39 +03:00
Jedrzej Kosinski
d7f40442f9
Enable Runtime Selection of Attention Functions (#9639)
* Looking into a @wrap_attn decorator to look for 'optimized_attention_override' entry in transformer_options

* Created logging code for this branch so that it can be used to track down all the code paths where transformer_options would need to be added

* Fix memory usage issue with inspect

* Made WAN attention receive transformer_options, test node added to wan to test out attention override later

* Added **kwargs to all attention functions so transformer_options could potentially be passed through

* Make sure wrap_attn doesn't make itself recurse infinitely, attempt to load SageAttention and FlashAttention if not enabled so that they can be marked as available or not, create registry for available attention

* Turn off attention logging for now, make AttentionOverrideTestNode have a dropdown with available attention (this is a test node only)

* Make flux work with optimized_attention_override

* Add logs to verify optimized_attention_override is passed all the way into attention function

* Make Qwen work with optimized_attention_override

* Made hidream work with optimized_attention_override

* Made wan patches_replace work with optimized_attention_override

* Made SD3 work with optimized_attention_override

* Made HunyuanVideo work with optimized_attention_override

* Made Mochi work with optimized_attention_override

* Made LTX work with optimized_attention_override

* Made StableAudio work with optimized_attention_override

* Made optimized_attention_override work with ACE Step

* Made Hunyuan3D work with optimized_attention_override

* Make CosmosPredict2 work with optimized_attention_override

* Made CosmosVideo work with optimized_attention_override

* Made Omnigen 2 work with optimized_attention_override

* Made StableCascade work with optimized_attention_override

* Made AuraFlow work with optimized_attention_override

* Made Lumina work with optimized_attention_override

* Made Chroma work with optimized_attention_override

* Made SVD work with optimized_attention_override

* Fix WanI2VCrossAttention so that it expects to receive transformer_options

* Fixed Wan2.1 Fun Camera transformer_options passthrough

* Fixed WAN 2.1 VACE transformer_options passthrough

* Add optimized to get_attention_function

* Disable attention logs for now

* Remove attention logging code

* Remove _register_core_attention_functions, as we wouldn't want someone to call that, just in case

* Satisfy ruff

* Remove AttentionOverrideTest node, that's something to cook up for later
2025-09-12 18:07:38 -04:00
comfyanonymous
b149e2e1e3
Better way of doing the generator for the hunyuan image noise aug. (#9834) 2025-09-12 17:53:15 -04:00
comfyanonymous
7757d5a657
Set default hunyuan refiner shift to 4.0 (#9833) 2025-09-12 16:40:12 -04:00
comfyanonymous
e600520f8a
Fix hunyuan refiner blownout colors at noise aug less than 0.25 (#9832) 2025-09-12 16:35:34 -04:00
patientx
39a0d246ee
Merge branch 'comfyanonymous:master' into master 2025-09-12 23:24:35 +03:00
comfyanonymous
fd2b820ec2
Add noise augmentation to hunyuan image refiner. (#9831)
This was missing and should help with colors being blown out.
2025-09-12 16:03:08 -04:00
patientx
4c5915d5cb
Merge branch 'comfyanonymous:master' into master 2025-09-12 09:29:27 +03:00
comfyanonymous
33bd9ed9cb
Implement hunyuan image refiner model. (#9817) 2025-09-12 00:43:20 -04:00
comfyanonymous
18de0b2830
Fast preview for hunyuan image. (#9814) 2025-09-11 19:33:02 -04:00
patientx
aae8c1486f
Merge pull request #297 from Rando717/Rando717-zluda.py
zluda.py "Expanded gfx identifier, lowercase gpu search, detect Triton version"
2025-09-11 20:35:35 +03:00
patientx
06fe8754d2
Merge branch 'comfyanonymous:master' into master 2025-09-11 13:46:42 +03:00
comfyanonymous
e01e99d075
Support hunyuan image distilled model. (#9807) 2025-09-10 23:17:34 -04:00
patientx
666b2e05fa
Merge branch 'comfyanonymous:master' into master 2025-09-10 10:47:09 +03:00
comfyanonymous
543888d3d8
Fix lowvram issue with hunyuan image vae. (#9794) 2025-09-10 02:15:34 -04:00
comfyanonymous
85e34643f8
Support hunyuan image 2.1 regular model. (#9792) 2025-09-10 02:05:07 -04:00
comfyanonymous
5c33872e2f
Fix issue on old torch. (#9791) 2025-09-10 00:23:47 -04:00
comfyanonymous
b288fb0db8
Small refactor of some vae code. (#9787) 2025-09-09 18:09:56 -04:00
Rando717
4057f2984c
Update zluda.py (MEM_BUS_WIDTH#3)
Lowercase the lookup inside MEM_BUS_WIDTH, just in case of inconsistent casing on Radeon Pro (PRO) GPU names.

Also fixed (lowercased) the "Triton device properties" lookup inside MEM_BUS_WIDTH.
2025-09-09 20:04:20 +02:00
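An illustration of the lowercased lookup with a hypothetical excerpt of the table (not the full zluda.py list): normalizing the reported device name before indexing makes "Radeon PRO ..." and "Radeon Pro ..." hit the same entry instead of falling through to the default.

```python
MEM_BUS_WIDTH = {
    "amd radeon rx 5700": 256,      # hypothetical excerpt of the table
    "amd radeon pro w6800": 256,
}

def bus_width_for(device_name: str, default: int = 128) -> int:
    # Lowercase the reported name so casing differences ("PRO" vs "Pro")
    # don't cause a fall-through to the default width.
    return MEM_BUS_WIDTH.get(device_name.lower(), default)

print(bus_width_for("AMD Radeon PRO W6800"))  # 256 instead of the 128 default
```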
Rando717
13ba6a8a8d
Update zluda.py (cleanup print Triton version)
Compacted; no exception handling, and silent if no version string is found
2025-09-09 19:30:54 +02:00
Rando717
ce8900fa25
Update zluda.py (gpu_name_to_gfx)
- Changed the function into a list of rules

- Attached the correct gfx codes to each GPU name

- Addressed a potential incorrect designation for the RX 6000 S Series via sort priority
2025-09-09 18:51:41 +02:00
patientx
a531352603
Merge branch 'comfyanonymous:master' into master 2025-09-09 01:35:58 +03:00
comfyanonymous
103a12cb66
Support qwen inpaint controlnet. (#9772) 2025-09-08 17:30:26 -04:00
patientx
6f38e729cc
Merge branch 'comfyanonymous:master' into master 2025-09-08 22:15:28 +03:00
Rando717
e7d48450a3
Update zluda.py (removed previously added gfx90c)
The 'radeon graphics' check is not reliable enough,
considering 'radeon (tm) graphics' also exists on Vega.

Plus, gfx1036 Raphael (Ryzen 7000) is reported as 'radeon (tm) graphics', same with Granite Ridge (Ryzen 9000).
2025-09-08 21:10:20 +02:00
contentis
97652d26b8
Add explicit casting in apply_rope for Qwen VL (#9759) 2025-09-08 15:08:18 -04:00
Rando717
590f46ab41
Update zluda.py (typo) 2025-09-08 20:31:49 +02:00
Rando717
675d6d8f4c
Update zluda.py (gfx gpu names)
- Expanded GPU gfx names
- Added RDNA4, RDNA3.5, ...
- Added missing Polaris cards to prevent the 'gfx1010' and 'gfx1030' fallback
- Kept gfx designations mostly the same, based on the available custom libs for HIP 5.7/6.2

Might need some adjustments afterwards
2025-09-08 17:55:29 +02:00
Rando717
ddb1e3da47
Update zluda.py (typo) 2025-09-08 17:22:41 +02:00
Rando717
a7336ad630
Update zluda.py (MEM_BUS_WIDTH#2)
Added Vega 10/20 cards.
Can't test; no idea whether it has a real effect or is just a placebo.
2025-09-08 17:19:03 +02:00
Rando717
40199a5244
Update zluda.py (print Triton version)
Added check for Triton version string, if it exists.
Could be useful info for troubleshooting reports.
2025-09-08 17:00:40 +02:00
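A minimal sketch of the Triton version check (an assumed approach, not necessarily the exact zluda.py code): print `triton.__version__` when the module and attribute exist, and stay silent otherwise.

```python
def print_triton_version() -> None:
    try:
        import triton
    except ImportError:
        return  # Triton not installed; stay silent.
    version = getattr(triton, "__version__", None)
    if version:
        print(f"Triton version: {version}")

print_triton_version()
```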
patientx
b46622ffa5
Merge branch 'comfyanonymous:master' into master 2025-09-08 11:14:04 +03:00
comfyanonymous
fb763d4333
Fix amd_min_version crash when cpu device. (#9754) 2025-09-07 21:16:29 -04:00
patientx
9417753a6c
Merge branch 'comfyanonymous:master' into master 2025-09-07 13:16:57 +03:00
comfyanonymous
bcbd7884e3
Don't enable pytorch attention on AMD if triton isn't available. (#9747) 2025-09-07 00:29:38 -04:00
comfyanonymous
27a0fcccc3
Enable bf16 VAE on RDNA4. (#9746) 2025-09-06 23:25:22 -04:00
patientx
afbcd5d57e
Merge branch 'comfyanonymous:master' into master 2025-09-06 11:51:33 +03:00
comfyanonymous
ea6cdd2631
Print all fast options in --help (#9737) 2025-09-06 01:05:05 -04:00
patientx
3ca065a755
fix 2025-09-05 23:11:57 +03:00
patientx
0488fe3748
rmsnorm patch second try 2025-09-05 23:10:27 +03:00
patientx
8966009181
added rmsnorm patch for torch versions older than 2.4 2025-09-05 22:43:39 +03:00
patientx
f9d7fcb696
Merge branch 'comfyanonymous:master' into master 2025-09-05 22:09:30 +03:00
comfyanonymous
2ee7879a0b
Fix lowvram issues with hunyuan3d 2.1 (#9735) 2025-09-05 14:57:35 -04:00
patientx
c7c7269f48
Merge branch 'comfyanonymous:master' into master 2025-09-05 17:11:07 +03:00
comfyanonymous
c9ebe70072
Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
Yousef R. Gamaleldin
261421e218
Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
patientx
d79e93a0a9
Merge branch 'comfyanonymous:master' into master 2025-09-04 12:41:48 +03:00
comfyanonymous
72855db715
Fix potential rope issue. (#9710) 2025-09-03 22:20:13 -04:00
patientx
991209d11d
Merge branch 'comfyanonymous:master' into master 2025-09-03 00:05:33 +03:00
comfyanonymous
e3018c2a5a
uso -> uxo/uno as requested. (#9688) 2025-09-02 16:12:07 -04:00
patientx
b30a38dca0
Merge branch 'comfyanonymous:master' into master 2025-09-02 22:46:44 +03:00
comfyanonymous
3412d53b1d
USO style reference. (#9677)
Load the projector.safetensors file with the ModelPatchLoader node and use
the siglip_vision_patch14_384.safetensors "clip vision" model and the
USOStyleReferenceNode.
2025-09-02 15:36:22 -04:00
patientx
47c6fb34c9
Merge branch 'comfyanonymous:master' into master 2025-09-02 09:46:42 +03:00
contentis
e2d1e5dad9
Enable Convolution AutoTuning (#9301) 2025-09-01 20:33:50 -04:00
comfyanonymous
27e067ce50
Implement the USO subject identity lora. (#9674)
Use the lora with the FluxKontextMultiReferenceLatentMethod node set to "uso"
and a ReferenceLatent node with the reference image.
2025-09-01 18:54:02 -04:00
patientx
9cb469282e
Merge branch 'comfyanonymous:master' into master 2025-08-31 11:24:57 +03:00
chaObserv
32a627bf1f
SEEDS: update noise decomposition and refactor (#9633)
- Update the decomposition to reflect interval dependency
- Extract phi computations into functions
- Use torch.lerp for interpolation
2025-08-31 00:01:45 -04:00
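For reference, the `torch.lerp` substitution mentioned above is a drop-in replacement for the manual interpolation formula:

```python
import torch

a = torch.randn(8)
b = torch.randn(8)
t = 0.3

manual = a + t * (b - a)     # manual linear interpolation
fused = torch.lerp(a, b, t)  # equivalent, as a single fused op

assert torch.allclose(manual, fused)
```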
patientx
c6b0bf480f
Merge branch 'comfyanonymous:master' into master 2025-08-29 09:31:05 +03:00
comfyanonymous
e80a14ad50
Support wan2.2 5B fun control model. (#9611)
Use the Wan22FunControlToVideo node.
2025-08-28 22:13:07 -04:00
patientx
c8af694267
Merge pull request #279 from sfinktah/sfink-cudnn-benchmark
Added env_var for cudnn.benchmark
2025-08-28 23:17:05 +03:00
patientx
1db0a73a2a
Merge branch 'comfyanonymous:master' into master 2025-08-28 09:06:22 +03:00
comfyanonymous
4aa79dbf2c
Adjust flux mem usage factor a bit. (#9588) 2025-08-27 23:08:17 -04:00
patientx
fc93a6f534
Merge branch 'comfyanonymous:master' into master 2025-08-28 02:22:15 +03:00
Gangin Park
3aad339b63
Add DPM++ 2M SDE Heun (RES) sampler (#9542) 2025-08-27 19:07:31 -04:00
comfyanonymous
491755325c
Better s2v memory estimation. (#9584) 2025-08-27 19:02:42 -04:00
Christopher Anderson
cf22cbd8d5 Added env_var for cudnn.benchmark 2025-08-28 09:00:08 +10:00
comfyanonymous
496888fd68
Improve s2v performance when generating videos longer than 120 frames. (#9582) 2025-08-27 16:06:40 -04:00
comfyanonymous
b5ac6ed7ce
Fixes to make controlnet type models work on qwen edit and kontext. (#9581) 2025-08-27 15:26:28 -04:00
Kohaku-Blueleaf
b20ba1f27c
Fix #9537 (#9576) 2025-08-27 12:45:02 -04:00
patientx
eeab23fc0b
Merge branch 'comfyanonymous:master' into master 2025-08-27 10:07:57 +03:00
comfyanonymous
88aee596a3
WIP Wan 2.2 S2V model. (#9568) 2025-08-27 01:10:34 -04:00
patientx
c1aef0126d
Merge pull request #276 from sfinktah/sfink-cudnn-benchmark-env
Deleted torch.backends.cudnn.benchmark line, defaults are fine
2025-08-26 19:34:35 +03:00
patientx
1efeba7066
Merge branch 'comfyanonymous:master' into master 2025-08-26 10:41:38 +03:00
comfyanonymous
914c2a2973
Implement wav2vec2 as an audio encoder model. (#9549)
This is useless on its own but there are multiple models that use it.
2025-08-25 23:26:47 -04:00
Christopher Anderson
110cb0a9d9 Deleted torch.backends.cudnn.benchmark line, defaults are fine 2025-08-26 08:43:31 +10:00
Christopher Anderson
1b9a3b12c2 had to move cudnn disablement up much higher 2025-08-25 14:11:54 +10:00
Christopher Anderson
cd3d60254b argggh, white space hell 2025-08-25 09:52:58 +10:00
Christopher Anderson
184fa5921f worst PR ever, really. 2025-08-25 09:42:27 +10:00
Christopher Anderson
33c43b68c3 worst PR ever 2025-08-25 09:38:22 +10:00
Christopher Anderson
2a06dc8e87 Merge remote-tracking branch 'origin/sfink-cudnn-env' into sfink-cudnn-env
# Conflicts:
#	comfy/customzluda/zluda.py
2025-08-25 09:34:32 +10:00
Christopher Anderson
3504eeeb4a rebased onto upstream master (woops) 2025-08-25 09:32:34 +10:00
Christopher Anderson
7eda4587be Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 09:31:12 +10:00
Christopher Anderson
954644ef83 Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 08:56:48 +10:00
Rando717
053a6b95e5
Update zluda.py (MEM_BUS_WIDTH)
Added more cards, mostly RDNA(1) and Radeon Pro.

Reasoning: every time zluda.py gets updated I have to manually add 256 for my RX 5700, otherwise it defaults to 128. Also, manual local edits break on git pull.
2025-08-24 18:39:40 +02:00
patientx
c92a07594b
Update zluda.py 2025-08-24 12:01:20 +03:00
patientx
dba9d20791
Update zluda.py 2025-08-24 10:23:30 +03:00
patientx
cdc04b5a8a
Merge branch 'comfyanonymous:master' into master 2025-08-23 07:47:07 +03:00
comfyanonymous
41048c69b4
Fix Conditioning masks on 3d latents. (#9506) 2025-08-22 23:15:44 -04:00
Jedrzej Kosinski
fc247150fe
Implement EasyCache and Invent LazyCache (#9496)
* Attempting a universal implementation of EasyCache, starting with flux as test; I screwed up the math a bit, but when I set it just right it works.

* Fixed math to make threshold work as expected, refactored code to use EasyCacheHolder instead of a dict wrapped by object

* Use sigmas from transformer_options instead of timesteps to be compatible with a greater amount of models, make end_percent work

* Make log statement when not skipping useful, preparing for per-cond caching

* Added DIFFUSION_MODEL wrapper around forward function for wan model

* Add subsampling for heuristic inputs

* Add subsampling to output_prev (output_prev_subsampled now)

* Properly consider conds in EasyCache logic

* Created SuperEasyCache to test what happens if caching and reuse is moved outside the scope of conds, added PREDICT_NOISE wrapper to facilitate this test

* Change max reuse_threshold to 3.0

* Mark EasyCache/SuperEasyCache as experimental (beta)

* Make Lumina2 compatible with EasyCache

* Add EasyCache support for Qwen Image

* Fix missing comma, curse you Cursor

* Add EasyCache support to AceStep

* Add EasyCache support to Chroma

* Added EasyCache support to Cosmos Predict t2i

* Make EasyCache not crash with Cosmos Predict ImagToVideo latents, but does not work well at all

* Add EasyCache support to hidream

* Added EasyCache support to hunyuan video

* Added EasyCache support to hunyuan3d

* Added EasyCache support to LTXV (not very good, but does not crash)

* Implemented EasyCache for aura_flow

* Renamed SuperEasyCache to LazyCache, hardcoded subsample_factor to 8 on nodes

* Extra logging when verbose is true for EasyCache
2025-08-22 22:41:08 -04:00
contentis
fe31ad0276
Add elementwise fusions (#9495)
* Add elementwise fusions

* Add addcmul pattern to Qwen
2025-08-22 19:39:15 -04:00
patientx
7bc46389fa
Merge branch 'comfyanonymous:master' into master 2025-08-22 10:52:52 +03:00
comfyanonymous
ff57793659
Support InstantX Qwen controlnet. (#9488) 2025-08-22 00:53:11 -04:00
comfyanonymous
f7bd5e58dd
Make it easier to implement future qwen controlnets. (#9485) 2025-08-21 23:18:04 -04:00
patientx
7ff01ded58
Merge branch 'comfyanonymous:master' into master 2025-08-21 09:24:26 +03:00
comfyanonymous
0963493a9c
Support for Qwen Diffsynth Controlnets canny and depth. (#9465)
These are not real controlnets but actually a patch on the model so they
will be treated as such.

Put them in the models/model_patches/ folder.

Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.
2025-08-20 22:26:37 -04:00
patientx
6dca25e2a8
Merge branch 'comfyanonymous:master' into master 2025-08-20 10:14:34 +03:00
comfyanonymous
8d38ea3bbf
Fix bf16 precision issue with qwen image embeddings. (#9441) 2025-08-20 02:58:54 -04:00
comfyanonymous
5a8f502db5
Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
comfyanonymous
7cd2c4bd6a
Qwen rotary embeddings should now match reference code. (#9437) 2025-08-20 00:45:27 -04:00
comfyanonymous
dfa791eb4b
Rope fix for qwen vl. (#9435) 2025-08-19 20:47:42 -04:00
patientx
1cbb5fdc14
Merge branch 'comfyanonymous:master' into master 2025-08-19 10:21:12 +03:00
comfyanonymous
4977f203fa
P2 of qwen edit model. (#9412)
* P2 of qwen edit model.

* Typo.

* Fix normal qwen.

* Fix.

* Make the TextEncodeQwenImageEdit also set the ref latent.

If you don't want it to set the ref latent and want to use the
ReferenceLatent node with your custom latent instead, just disconnect
the VAE.
2025-08-18 22:38:34 -04:00
patientx
3f09b4dba5
Merge branch 'comfyanonymous:master' into master 2025-08-18 15:14:34 +03:00
Jedrzej Kosinski
7f3b9b16c6
Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
comfyanonymous
ed43784b0d
WIP Qwen edit model: The diffusion model part. (#9383) 2025-08-17 16:45:39 -04:00
patientx
64d6cf045e
Merge branch 'comfyanonymous:master' into master 2025-08-17 11:29:13 +03:00
comfyanonymous
0f2b8525bc
Qwen image model refactor. (#9375) 2025-08-16 17:51:28 -04:00
patientx
5a21015adb
Merge branch 'comfyanonymous:master' into master 2025-08-16 09:54:01 +03:00
comfyanonymous
1702e6df16
Implement wan2.2 camera model. (#9357)
Use the old WanCameraImageToVideo node.
2025-08-15 17:29:58 -04:00
patientx
eb283b5fd7
Merge branch 'comfyanonymous:master' into master 2025-08-16 00:26:31 +03:00
comfyanonymous
c308a8840a
Add FluxKontextMultiReferenceLatentMethod node. (#9356)
This node is only useful if someone trains the kontext model to properly
use multiple reference images via the index method.

The default is the offset method, which feeds the multiple images as if
they were stitched together into one. This method works with the current
flux kontext model.
2025-08-15 15:50:39 -04:00
patientx
13f5f9d78f
Merge branch 'comfyanonymous:master' into master 2025-08-15 10:54:10 +03:00
comfyanonymous
e08ecfbd8a
Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
comfyanonymous
4e5c230f6a
Fix last commit not working on older pytorch. (#9346) 2025-08-14 23:44:02 -04:00
Xiangxi Guo (Ryan)
f0d5d0111f
Avoid torch compile graphbreak for older pytorch versions (#9344)
Turns out torch.compile has some gaps in context manager decorator
syntax support. I've sent patches to fix that in PyTorch, but it won't
be available for all the folks running older versions of PyTorch, hence
this trivial patch.
2025-08-14 23:41:37 -04:00
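A sketch of the kind of workaround described, under the assumption of a context manager applied as a decorator (hypothetical function; the exact ComfyUI change is not shown here): on some older PyTorch versions the decorator form can break tracing under torch.compile, while the explicit `with` form inside the function body compiles cleanly.

```python
import contextlib
import torch

@contextlib.contextmanager
def sampling_state():
    # Stand-in for whatever state the real code toggles around the call.
    yield

# Decorator form: on some older PyTorch versions, a context manager applied
# as a decorator can cause a graph break under torch.compile.
@sampling_state()
def forward_decorated(x):
    return x * 2

# Workaround form: enter the context manager explicitly inside the body,
# which older torch.compile versions handle without breaking.
def forward_explicit(x):
    with sampling_state():
        return x * 2

compiled = torch.compile(forward_explicit, backend="eager")
print(compiled(torch.ones(2)))
```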
comfyanonymous
ad19a069f6
Make SLG nodes work on Qwen Image model. (#9345) 2025-08-14 23:16:01 -04:00
patientx
a927fbd99b
Merge branch 'comfyanonymous:master' into master 2025-08-14 12:16:50 +03:00
Jedrzej Kosinski
e4f7ea105f
Added context window support to core sampling code (#9238)
* Added initial support for basic context windows - in progress

* Add prepare_sampling wrapper for context window to more accurately estimate latent memory requirements, fixed merging wrappers/callbacks dicts in prepare_model_patcher

* Made context windows compatible with different dimensions; works for WAN, but results are bad

* Fix comfy.patcher_extension.merge_nested_dicts calls in prepare_model_patcher in sampler_helpers.py

* Considering adding some callbacks to context window code to allow extensions of behavior without the need to rewrite code

* Made dim slicing cleaner

* Add Wan Context Windows node for testing

* Made context schedule and fuse method functions be stored on the handler instead of needing to be registered in core code to be found

* Moved some code around between node_context_windows.py and context_windows.py

* Change manual context window nodes names/ids

* Added callbacks to IndexListContexHandler

* Adjusted default values for context_length and context_overlap, made schema.inputs definition for WAN Context Windows less annoying

* Make get_resized_cond more robust for various dim sizes

* Fix typo

* Another small fix
2025-08-13 21:33:05 -04:00
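The core idea behind the context-window sampling added here can be sketched as splitting the temporal dimension into overlapping index windows that are denoised separately and fused back together. A simplified window generator, hypothetical and not the actual IndexListContexHandler logic:

```python
def context_windows(length: int, context_length: int, overlap: int):
    """Yield overlapping index windows covering 0..length-1."""
    if length <= context_length:
        yield list(range(length))
        return
    stride = context_length - overlap
    start = 0
    while start + context_length < length:
        yield list(range(start, start + context_length))
        start += stride
    # Final window aligned to the end so every index is covered.
    yield list(range(length - context_length, length))

# e.g. 81 latent frames, windows of 16 with 4 frames of overlap
for window in context_windows(81, 16, 4):
    print(window[0], "...", window[-1])
```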
Simon Lui
c991a5da65
Fix XPU iGPU regressions (#9322)
* Change the bf16 check, switch non-blocking off by default (with an option to force it on) to regain speed on certain classes of iGPUs, and refactor the XPU check.

* Turn non_blocking off by default for xpu.

* Update README.md for Intel GPUs.
2025-08-13 19:13:35 -04:00
patientx
804c7097fa
Merge branch 'comfyanonymous:master' into master 2025-08-13 23:56:43 +03:00
comfyanonymous
9df8792d4b
Make last PR not crash comfy on old pytorch. (#9324) 2025-08-13 15:12:41 -04:00
contentis
3da5a07510
SDPA backend priority (#9299) 2025-08-13 14:53:27 -04:00
patientx
bcafc3f7a3
Merge branch 'comfyanonymous:master' into master 2025-08-13 10:36:20 +03:00
comfyanonymous
560d38f34c
Wan2.2 fun control support. (#9292) 2025-08-12 23:26:33 -04:00
patientx
f80a9bb674
Merge branch 'comfyanonymous:master' into master 2025-08-12 00:33:53 +03:00
PsychoLogicAu
2208aa616d
Support SimpleTuner lycoris lora for Qwen-Image (#9280) 2025-08-11 16:56:16 -04:00
patientx
c2686a3968
Merge branch 'comfyanonymous:master' into master 2025-08-10 12:09:19 +03:00
comfyanonymous
5828607ccf
Not sure if AMD actually supports fp16 acc but it doesn't crash. (#9258) 2025-08-09 12:49:25 -04:00
patientx
89499c6fae
Merge branch 'comfyanonymous:master' into master 2025-08-08 11:40:07 +03:00
comfyanonymous
735bb4bdb1
Users report gfx1201 is buggy on flux with pytorch attention. (#9244) 2025-08-08 04:21:00 -04:00
patientx
8795ae98aa
Merge branch 'comfyanonymous:master' into master 2025-08-06 20:24:47 +03:00
flybirdxx
4c3e57b0ae
Fixed an issue where qwenLora could not be loaded properly. (#9208) 2025-08-06 13:23:11 -04:00
patientx
2e39e0999f
Update zluda.py 2025-08-05 19:21:20 +03:00
patientx
28957a7bd6
Merge branch 'comfyanonymous:master' into master 2025-08-05 13:37:09 +03:00
comfyanonymous
d044a24398
Fix default shift and any latent size for qwen image model. (#9186) 2025-08-05 06:12:27 -04:00
patientx
e419bade03
Merge pull request #244 from sfinktah/sfink-zluda-is-nasty
Bad ideas from zluda update.
2025-08-05 09:48:53 +03:00
patientx
ea8122f065
Merge branch 'comfyanonymous:master' into master 2025-08-05 09:47:31 +03:00
comfyanonymous
c012400240
Initial support for qwen image model. (#9179) 2025-08-04 22:53:25 -04:00
Christopher Anderson
4f853403fe Bad ideas from zluda update. 2025-08-05 06:00:55 +10:00
patientx
88b7fe87ff
Merge branch 'comfyanonymous:master' into master 2025-08-04 12:38:56 +03:00
comfyanonymous
03895dea7c
Fix another issue with the PR. (#9170) 2025-08-04 04:33:04 -04:00
comfyanonymous
84f9759424
Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
comfyanonymous
7991341e89
Various fixes for broken things from earlier PR. (#9168) 2025-08-04 04:02:40 -04:00
patientx
37415c40c1
device identification and setting triton arch override 2025-08-04 10:44:18 +03:00
patientx
d823c0c615
Merge branch 'comfyanonymous:master' into master 2025-08-04 10:42:15 +03:00
comfyanonymous
140ffc7fdc
Fix broken controlnet from last PR. (#9167) 2025-08-04 03:28:12 -04:00
comfyanonymous
182f90b5ec
Lower cond vram use by casting at the same time as device transfer. (#9159) 2025-08-04 03:11:53 -04:00
patientx
7258461c23
Merge branch 'comfyanonymous:master' into master 2025-08-03 16:33:54 +03:00
comfyanonymous
aebac22193
Cleanup. (#9160) 2025-08-03 07:08:11 -04:00
patientx
da4fc8189a
Merge branch 'comfyanonymous:master' into master 2025-08-03 00:17:56 +03:00
comfyanonymous
13aaa66ec2
Make sure context is on the right device. (#9154) 2025-08-02 15:09:23 -04:00
comfyanonymous
5f582a9757
Make sure all the conds are on the right device. (#9151) 2025-08-02 15:00:13 -04:00
patientx
83dbd68651
Merge branch 'comfyanonymous:master' into master 2025-08-01 14:42:25 +03:00
comfyanonymous
1e638a140b
Tiny wan vae optimizations. (#9136) 2025-08-01 05:25:38 -04:00
patientx
321d683af0
Merge branch 'comfyanonymous:master' into master 2025-07-31 14:49:33 +03:00
chaObserv
61b08d4ba6
Replace manual x * sigmoid(x) with torch silu in VAE nonlinearity (#9057) 2025-07-30 19:25:56 -04:00
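For reference, this replacement is a pure equivalence: SiLU (a.k.a. swish) is defined as x * sigmoid(x), so the fused op produces the same values with fewer intermediates.

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 8)

manual = x * torch.sigmoid(x)  # VAE nonlinearity written out by hand
fused = F.silu(x)              # torch's fused SiLU/swish

assert torch.allclose(manual, fused, atol=1e-6)
```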
comfyanonymous
da9dab7edd
Small wan camera memory optimization. (#9111) 2025-07-30 05:55:26 -04:00
patientx
1bd4b6489e
Merge branch 'comfyanonymous:master' into master 2025-07-30 11:11:46 +03:00
comfyanonymous
dca6bdd4fa
Make wan2.2 5B i2v take a lot less memory. (#9102) 2025-07-29 19:44:18 -04:00
patientx
d8ca8134c3
Merge branch 'comfyanonymous:master' into master 2025-07-29 11:56:59 +03:00
comfyanonymous
7d593baf91
Extra reserved vram on large cards on windows. (#9093) 2025-07-29 04:07:45 -04:00
patientx
fc4e82537c
Merge pull request #233 from sfinktah/sfink-flash-attn-gfx-startswith
This will allow much better support for gfx1032 and other things not …
2025-07-28 23:12:38 +03:00
patientx
7ba2a8d3b0
Merge branch 'comfyanonymous:master' into master 2025-07-28 22:15:10 +03:00
comfyanonymous
c60dc4177c
Remove unnecessary clones in the wan2.2 VAE. (#9083) 2025-07-28 14:48:19 -04:00
Christopher Anderson
b5ede18481 This will allow much better support for gfx1032 and other things not specifically named 2025-07-29 04:21:45 +10:00
patientx
769ab3bd25
Merge branch 'comfyanonymous:master' into master 2025-07-28 15:21:30 +03:00
comfyanonymous
a88788dce6
Wan 2.2 support. (#9080) 2025-07-28 08:00:23 -04:00
patientx
5a45e12b61
Merge branch 'comfyanonymous:master' into master 2025-07-26 14:09:19 +03:00
comfyanonymous
0621d73a9c
Remove useless code. (#9059) 2025-07-26 04:44:19 -04:00
comfyanonymous
e6e5d33b35
Remove useless code. (#9041)
This is only needed on old pytorch 2.0 and older.
2025-07-25 04:58:28 -04:00