mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-07-11 00:47:14 +08:00

Author	SHA1	Message	Date
Yousef Rafat	3dd39efa03	some fixes	2025-12-02 23:40:31 +02:00
Yousef Rafat	a58133f188	updated the hunyuan moe forward: split the forward statement between ready and pending experts	2025-12-01 22:58:53 +02:00
Yousef Rafat	76e14d69b2	operations + device + dtype | checkpoint skip	2025-12-01 18:53:43 +02:00
Yousef Rafat	88c350bfed	corrected img_ratio	2025-11-28 20:43:01 +02:00
Yousef Rafat	bd2c2f7375	removed additional token injection	2025-11-28 18:21:20 +02:00
Yousef Rafat	334041f6a6	efficiency improvements	2025-11-26 21:21:45 +02:00
Yousef Rafat	823870db53	fixed some mistakes/errors	2025-11-25 21:34:03 +02:00
Yousef Rafat	b4001bbd27	input correction and improvements	2025-11-25 00:25:27 +02:00
Yousef Rafat	86e9a7a669	updates to the input	2025-11-23 22:20:24 +02:00
Yousef Rafat	a3ac798d4e	decrease peak memory in moe forward	2025-11-22 23:02:42 +02:00
Yousef Rafat	ae8592ebf5	zero-copy and optimized moe loader	2025-11-22 12:37:10 +02:00
Yousef Rafat	4d982e83f6	added clip encoder	2025-11-21 17:22:37 +02:00
Yousef Rafat	b84af5b947	small attention fix	2025-11-17 23:03:52 +02:00
Yousef Rafat	3f71760913	resblock fix	2025-11-17 06:50:54 +02:00
Yousef Rafat	61b1efdaf0	vectorized correct implementation of moe forward	2025-11-16 19:25:37 +02:00
Yousef Rafat	4a5509a4c5	.	2025-11-16 16:20:35 +02:00
Yousef Rafat	d731c58353	improving performance and fixing race condition	2025-11-16 16:19:39 +02:00
Yousef Rafat	12cc6924ac	meta init	2025-11-14 20:10:52 +02:00
Yousef Rafat	7b4c1e8031	async cache revamp. Added async loading and offloading of moe layers, giving consistent memory use without oom errors. Used to give an oom error after the third layer on a 24 GB gpu; now runs to the end with consistent memory and minimal latency	2025-11-14 09:15:16 +02:00
Yousef Rafat	44346c4251	removed all errors	2025-11-08 19:49:02 +02:00
Yousef Rafat	5056a1f4d4	important fixes	2025-11-06 00:24:49 +02:00
Yousef Rafat	9e9c536c8e	fixes from testing	2025-11-04 23:55:16 +02:00
Yousef Rafat	ca119c44fb	returned kv cache for image generation	2025-11-01 23:06:11 +02:00
Yousef Rafat	10a17dc85d	a bunch of fixes	2025-11-01 16:40:49 +02:00
Yousef R. Gamaleldin	575fe3e92e	Merge branch 'master' into yousef-hunyuan-image-3	2025-10-31 23:55:42 +02:00
Yousef Rafat	a2fff60d4c	vectorized implementation of moe/fixes for issues	2025-10-31 23:53:13 +02:00
comfyanonymous	7f374e42c8	ScaleROPE now works on Lumina models. (#10578)	2025-10-31 15:41:40 -04:00
Yousef Rafat	de43880bdb	Hunyuan Image 3.0	2025-10-31 18:56:20 +02:00
comfyanonymous	27d1bd8829	Fix rope scaling. (#10560)	2025-10-30 22:51:58 -04:00
comfyanonymous	614cf9805e	Add a ScaleROPE node. Currently only works on WAN models. (#10559)	2025-10-30 22:11:38 -04:00
comfyanonymous	0cf33953a7	Fix batch size above 1 giving bad output in chroma radiance. (#10394)	2025-10-18 23:15:34 -04:00
rattus128	95ca2e56c8	WAN2.2: Fix cache VRAM leak on error (#10308) Same change pattern as `7e8dd275c2` applied to WAN2.2. If this suffers an exception (such as a VRAM OOM) it will leave the encode() and decode() methods, which skips the cleanup of the WAN feature cache. The comfy node cache then ultimately keeps a reference to this object, which is in turn referencing large tensors from the failed execution. The feature cache is currently set up as a class variable on the encoder/decoder; however, the encode and decode functions always clear it on both entry and exit of normal execution. It's likely the design intent is that this is usable as a streaming encoder where the input comes in batches; however, the functions as they are today don't support that. So simplify by bringing the cache back to a local variable, so that if it does VRAM OOM the cache itself is properly garbage collected when the encode()/decode() functions disappear from the stack.	2025-10-13 15:23:11 -04:00
comfyanonymous	84e9ce32c6	Implement the mmaudio VAE. (#10300)	2025-10-11 22:57:23 -04:00
comfyanonymous	195e0b0639	Remove useless code. (#10223)	2025-10-05 15:41:19 -04:00
Finn-Hecker	93d859cfaa	Fix type annotation syntax in MotionEncoder_tc __init__ (#10186) Summary: Fixed incorrect type hint syntax in `MotionEncoder_tc.__init__()` parameter list. Changes: Line 647: Changed `num_heads=int` to `num_heads: int`. This corrects the parameter annotation from a default value assignment to proper type hint syntax. Details: The parameter was using assignment syntax (`=`) instead of type annotation syntax (`:`), which would incorrectly set the default value to the `int` class itself rather than annotating the expected type.	2025-10-03 14:32:19 -07:00
rattus128	4965c0e2ac	WAN: Fix cache VRAM leak on error (#10141) If this suffers an exception (such as a VRAM OOM) it will leave the encode() and decode() methods, which skips the cleanup of the WAN feature cache. The comfy node cache then ultimately keeps a reference to this object, which is in turn referencing large tensors from the failed execution. The feature cache is currently set up as a class variable on the encoder/decoder; however, the encode and decode functions always clear it on both entry and exit of normal execution. It's likely the design intent is that this is usable as a streaming encoder where the input comes in batches; however, the functions as they are today don't support that. So simplify by bringing the cache back to a local variable, so that if it does VRAM OOM the cache itself is properly garbage collected when the encode()/decode() functions disappear from the stack.	2025-10-01 18:42:16 -04:00
comfyanonymous	a6f83a4a1a	Support the new hunyuan vae. (#10150)	2025-10-01 17:19:13 -04:00
rattus128	653ceab414	Reduce Peak WAN inference VRAM usage - part II (#10062) * flux: math: Use addcmul_ to avoid expensive VRAM intermediate. The rope process can be the VRAM peak, and the intermediate holding the addition result before the original is released can OOM. addcmul_ it. * wan: Delete the self attention before cross attention. This saves VRAM when the cross attention and FFN are in play as the VRAM peak.	2025-09-27 18:14:16 -04:00
comfyanonymous	fccab99ec0	Fix issue with .view() in HuMo. (#10014)	2025-09-24 20:09:42 -04:00
comfyanonymous	e8df53b764	Update WanAnimateToVideo to more easily extend videos. (#9959)	2025-09-19 18:48:56 -04:00
comfyanonymous	dc95b6acc0	Basic WIP support for the wan animate model. (#9939)	2025-09-19 03:07:17 -04:00
comfyanonymous	24b0fce099	Do padding of audio embed in model for humo for more flexibility. (#9935)	2025-09-18 19:54:16 -04:00
comfyanonymous	dd611a7700	Support the HuMo 17B model. (#9912)	2025-09-17 18:39:24 -04:00
comfyanonymous	9288c78fc5	Support the HuMo model. (#9903)	2025-09-17 00:12:48 -04:00
rattus128	e42682b24e	Reduce Peak WAN inference VRAM usage (#9898) * flux: Do the xq and xk ropes one at a time. This was doing independent interleaved tensor math on the q and k tensors, leading to the holding of more than the minimum intermediates in VRAM. On a bad day, it would VRAM OOM on xk intermediates. Do everything q and then everything k, so torch can garbage collect all of q's intermediates before k allocates its intermediates. This reduces peak VRAM usage for some WAN2.2 inferences (at least). * wan: Optimize qkv intermediates on attention. As commented. The former logic computed independent pieces of QKV in parallel, which held more inference intermediates in VRAM, spiking VRAM usage. Fully roping Q and garbage collecting the intermediates before touching K reduces the peak inference VRAM usage.	2025-09-16 19:21:14 -04:00
blepping	1a85483da1	Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) Correctly handle the case where w0 is passed by kwargs in BatchedBrownianTree	2025-09-15 20:05:03 -04:00
Jedrzej Kosinski	f228367c5e	Make ModuleNotFoundError ImportError instead (#9850)	2025-09-13 21:34:21 -04:00
comfyanonymous	80b7c9455b	Changes to the previous radiance commit. (#9851)	2025-09-13 18:03:34 -04:00
blepping	c1297f4eb3	Add support for Chroma Radiance (#9682) * Initial Chroma Radiance support * Minor Chroma Radiance cleanups * Update Radiance nodes to ensure latents/images are on the intermediate device * Fix Chroma Radiance memory estimation. * Increase Chroma Radiance memory usage factor * Increase Chroma Radiance memory usage factor once again * Ensure images are multiples of 16 for Chroma Radiance. Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node * Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor * Update Radiance to support conv nerf final head type. * Allow setting NeRF embedder dtype for Radiance. Bump Radiance nerf tile size to 32. Support EasyCache/LazyCache on Radiance (maybe) * Add ChromaRadianceStubVAE node * Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior * Convert Chroma Radiance nodes to V3 schema. * Add ChromaRadianceOptions node and backend support. Cleanups/refactoring to reduce code duplication with Chroma. * Fix overriding the NeRF embedder dtype for Chroma Radiance * Minor Chroma Radiance cleanups * Move Chroma Radiance to its own directory in ldm. Minor code cleanups and tooltip improvements * Fix Chroma Radiance embedder dtype overriding * Remove Radiance dynamic nerf_embedder dtype override feature * Unbork Radiance NeRF embedder init * Remove Chroma Radiance image conversion and stub VAE nodes. Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE. Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1	2025-09-13 17:58:43 -04:00
comfyanonymous	a3b04de700	Hunyuan refiner vae now works with tiled. (#9836)	2025-09-12 19:46:46 -04:00
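
Note on commits 4965c0e2ac and 95ca2e56c8: the leak pattern those messages describe can be illustrated with a minimal, self-contained sketch. This is not the ComfyUI code. `FakeTensor`, `AttrCacheEncoder`, `LocalCacheEncoder`, and `probe_after_failure` are hypothetical names invented for illustration, and the cache is modelled as an attribute on the encoder object, which is an assumption about the original layout. A weak reference is used to check whether the cached object is still alive after a simulated failure.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for a large VRAM tensor."""


class AttrCacheEncoder:
    """Cache lives on the object; cleanup is skipped if encode() raises."""

    def __init__(self):
        self.feat_cache = None

    def encode(self, fail):
        self.feat_cache = [FakeTensor()]
        self.probe = weakref.ref(self.feat_cache[0])
        if fail:
            raise RuntimeError("simulated VRAM OOM")
        self.feat_cache = None  # only reached on normal exit


class LocalCacheEncoder:
    """Cache is a local variable; it dies with the encode() stack frame."""

    def encode(self, fail):
        feat_cache = [FakeTensor()]
        self.probe = weakref.ref(feat_cache[0])
        if fail:
            raise RuntimeError("simulated VRAM OOM")


def probe_after_failure(encoder):
    """Return True if the cached object is still alive after a failed encode()."""
    try:
        encoder.encode(fail=True)
    except RuntimeError:
        pass
    gc.collect()
    return encoder.probe() is not None


leaked_with_attribute = probe_after_failure(AttrCacheEncoder())
leaked_with_local = probe_after_failure(LocalCacheEncoder())
```

In this sketch the attribute-based encoder keeps the cached object alive for as long as anything references the encoder, while the local-variable version releases it once the exception has been handled.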

454 Commits
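
Note on commit 93d859cfaa: the difference between `num_heads=int` and `num_heads: int` described in that message can be checked with the standard `inspect` module. The classes `BeforeFix` and `AfterFix` below are hypothetical stand-ins, not the actual `MotionEncoder_tc` class, whose other parameters are not shown in the log.

```python
import inspect


class BeforeFix:
    # `num_heads=int` is a default-value assignment: the default is the int class.
    def __init__(self, num_heads=int):
        self.num_heads = num_heads


class AfterFix:
    # `num_heads: int` is a type annotation: no default, so the argument is required.
    def __init__(self, num_heads: int):
        self.num_heads = num_heads


before = inspect.signature(BeforeFix.__init__).parameters["num_heads"]
after = inspect.signature(AfterFix.__init__).parameters["num_heads"]

print(before.default, before.annotation is inspect.Parameter.empty)
print(after.annotation, after.default is inspect.Parameter.empty)
```

With the assignment form, calling the constructor without `num_heads` silently stores the `int` class itself. With the annotation form, omitting the argument raises a `TypeError` at call time.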