Commit Graph

1808 Commits

Author SHA1 Message Date
Yousef R. Gamaleldin
e3d5079d26
Merge ea176bb87d into 10e90a5757 2025-11-20 21:58:25 -05:00
Yousef Rafat
ea176bb87d basic support for hunyuan model 2025-11-20 23:47:57 +02:00
comfyanonymous
cb96d4d18c
Disable workaround on newer cudnn. (#10807) 2025-11-19 23:56:23 -05:00
comfyanonymous
17027f2a6a
Add a way to disable the final norm in the llama based TE models. (#10794) 2025-11-18 22:36:03 -05:00
comfyanonymous
d526974576
Fix hunyuan 3d 2.0 (#10792) 2025-11-18 16:46:19 -05:00
Yousef Rafat
b84af5b947 small attention fix 2025-11-17 23:03:52 +02:00
Yousef Rafat
3f71760913 resblock fix 2025-11-17 06:50:54 +02:00
Yousef Rafat
61b1efdaf0 vectorized correct implementation of moe forward 2025-11-16 19:25:37 +02:00
Yousef Rafat
4a5509a4c5 . 2025-11-16 16:20:35 +02:00
Yousef Rafat
d731c58353 improving performance and fixing race condition 2025-11-16 16:19:39 +02:00
comfyanonymous
bd01d9f7fd
Add left padding support to tokenizers. (#10753) 2025-11-15 06:54:40 -05:00
Yousef Rafat
12cc6924ac meta init 2025-11-14 20:10:52 +02:00
comfyanonymous
443056c401
Fix custom nodes import error. (#10747)
This should fix the import errors but will break if the custom nodes actually try to use the class.
2025-11-14 03:26:05 -05:00
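The usual pattern behind this kind of fix is to keep the old symbol importable while making any actual use fail loudly. A hypothetical sketch of that pattern; the class name here is invented, and the real code touched by #10747 may look different:

```python
# Hypothetical placeholder pattern: the old name still imports cleanly,
# but instantiating it (as a custom node might) raises immediately.
class RemovedClassPlaceholder:
    """Stands in for a class that was removed or relocated."""

    def __init__(self, *args, **kwargs):
        raise NotImplementedError(
            "This class was removed; importing it still works, "
            "but using it is no longer supported."
        )
```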
Yousef Rafat
7b4c1e8031 async cache revamp
Added async loading and offloading of MoE layers, keeping memory usage consistent instead of hitting OOM errors.
Used to give an OOM error after the third layer on a 24 GB GPU; now runs to the end with consistent memory and minimal latency.
2025-11-14 09:15:16 +02:00
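For context, the usual shape of such an async MoE cache is a prefetch loop on a dedicated CUDA stream: upload layer i+1 while layer i computes, then drop layer i back to the CPU. The sketch below is a hypothetical minimal version, not the code in this branch; a real implementation would also offload asynchronously into pinned buffers and call record_stream() on the uploaded tensors.

```python
import torch

copy_stream = torch.cuda.Stream()  # side stream dedicated to weight transfers

def prefetch(layer):
    # Queue the next layer's weight upload on the side stream so it overlaps
    # with compute. Assumes the CPU copies are pinned; otherwise the
    # non_blocking copy silently degrades to a synchronous one.
    with torch.cuda.stream(copy_stream):
        for p in layer.parameters():
            p.data = p.data.to("cuda", non_blocking=True)

def run_moe_layers(layers, x):
    prefetch(layers[0])
    for i, layer in enumerate(layers):
        # Compute must not start until this layer's upload has finished.
        torch.cuda.current_stream().wait_stream(copy_stream)
        if i + 1 < len(layers):
            prefetch(layers[i + 1])   # overlap the next upload with compute
        x = layer(x)
        for p in layer.parameters():  # offload so resident VRAM stays flat
            p.data = p.data.to("cpu")
    return x
```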
comfyanonymous
f60923590c
Use same code for chroma and flux blocks so that optimizations are shared. (#10746) 2025-11-14 01:28:05 -05:00
rattus
94c298f962
flux: reduce VRAM usage (#10737)
Clean up a bunch of stacked tensors on Flux. This takes me from B=19 to B=22
for 1600x1600 on an RTX 5090.
2025-11-13 16:02:03 -08:00
contentis
3b3ef9a77a
Quantized Ops fixes (#10715)
* offload support, bug fixes, remove mixins

* add readme
2025-11-12 18:26:52 -05:00
rattus
1c7eaeca10
qwen: reduce VRAM usage (#10725)
Clean up a bunch of stacked and no-longer-needed tensors on the QWEN
VRAM peak (currently FFN).

With this I go from OOMing at B=37x1328x1328 to being able to
successfully run B=47 (RTX 5090).
2025-11-12 16:20:53 -05:00
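Both this commit and the Flux cleanup above (#10737) follow the same pattern: drop references to large intermediates as soon as they are consumed, so the caching allocator can reuse that VRAM before the next big allocation. A generic, hypothetical sketch, assuming inference mode so autograd does not keep these tensors alive:

```python
import torch

@torch.inference_mode()
def transformer_block(x, attn, mlp, norm1, norm2):
    h = attn(norm1(x))
    x = x + h
    del h            # free the attention output before the MLP allocates
    h = mlp(norm2(x))
    x = x + h
    del h            # free the MLP output before the next block's peak
    return x
```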
rattus
18e7d6dba5
mm/mp: always unload re-used but modified models (#10724)
The partial unloader path in the model re-use flow skips straight to the
actual unload without any check of the patching UUID. This means that
if you do an upscale flow with a model patch on an existing model, it
will not apply your patches.

Fix by delaying the partial_unload until after the UUID checks. This
is done by making partial_unload a mode of partial_load where extra_mem
is negative.
2025-11-12 16:19:53 -05:00
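Concretely, folding unload into load means every unload now passes through the same UUID check that re-applies patches. A hypothetical sketch of that control flow: only partial_unload, partial_load, and extra_mem come from the commit message; the model helpers (repatch, offload_weights, load_more_weights) are invented for illustration.

```python
def partial_load(model, extra_mem):
    # Re-apply patches first: if the patching UUID changed since the model
    # was last used, stale patches must be redone before any unload.
    if model.current_patch_uuid != model.desired_patch_uuid:
        model.repatch()

    if extra_mem < 0:
        # A negative budget means "free this much": partial_unload is just
        # partial_load with extra_mem < 0, so it inherits the checks above.
        model.offload_weights(bytes_to_free=-extra_mem)
    else:
        model.load_more_weights(budget=extra_mem)

def partial_unload(model, bytes_to_free):
    partial_load(model, extra_mem=-bytes_to_free)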
comfyanonymous
1199411747
Don't pin tensor if not a torch.nn.parameter.Parameter (#10718) 2025-11-11 19:33:30 -05:00
rattus
c350009236
ops: Put weight cast on the offload stream (#10697)
This needs to be on the offload stream. The bug reproduced as a black screen
with low-resolution images on a slow bus when using FP8.
2025-11-09 22:52:11 -05:00
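The point is stream ordering: a cast queued on the default stream can race the asynchronous upload it depends on and read garbage. A minimal hypothetical sketch of the safe ordering, not the actual ops code:

```python
import torch

offload_stream = torch.cuda.Stream()

def load_and_cast(weight_cpu, dtype):
    with torch.cuda.stream(offload_stream):
        w = weight_cpu.to("cuda", non_blocking=True)  # async H2D copy
        # The cast must be queued on the same stream as the copy; issued on
        # the default stream it could run before the transfer completes.
        w = w.to(dtype)
    torch.cuda.current_stream().wait_stream(offload_stream)
    return w
```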
comfyanonymous
dea899f221
Unload weights if vram usage goes up between runs. (#10690) 2025-11-09 18:51:33 -05:00
comfyanonymous
e632e5de28
Add logging for model unloading. (#10692) 2025-11-09 18:06:39 -05:00
comfyanonymous
2abd2b5c20
Make ScaleROPE node work on Flux. (#10686) 2025-11-08 15:52:02 -05:00
Yousef Rafat
44346c4251 removed all errors 2025-11-08 19:49:02 +02:00
comfyanonymous
a1a70362ca
Only unpin tensor if it was pinned by ComfyUI (#10677) 2025-11-07 11:15:05 -05:00
rattus
cf97b033ee
mm: guard against double pin and unpin explicitly (#10672)
As commented, if you let CUDA be the one to detect double pinning/unpinning,
it actually creates an async GPU error.
2025-11-06 21:20:48 -05:00
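In other words, the guard keeps its own record of what it pinned instead of relying on CUDA's error return, which also matches #10677's "only unpin what ComfyUI pinned". A minimal sketch of that bookkeeping using the cudart bindings PyTorch exposes; the set-based tracking here is hypothetical:

```python
import torch

_pinned = set()  # data_ptrs of tensors we pinned ourselves

def pin(tensor):
    ptr = tensor.data_ptr()
    if ptr in _pinned:
        return  # already pinned by us; pinning twice triggers an async CUDA error
    torch.cuda.cudart().cudaHostRegister(
        ptr, tensor.numel() * tensor.element_size(), 0)
    _pinned.add(ptr)

def unpin(tensor):
    ptr = tensor.data_ptr()
    if ptr not in _pinned:
        return  # not pinned by us: never unpin foreign memory
    torch.cuda.cudart().cudaHostUnregister(ptr)
    _pinned.discard(ptr)
```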
comfyanonymous
09dc24c8a9
Pinned mem also seems to work on AMD. (#10658) 2025-11-05 19:11:15 -05:00
comfyanonymous
1d69245981
Enable pinned memory by default on Nvidia. (#10656)
Removed the --fast pinned_memory flag.

You can use --disable-pinned-memory to disable it. Please report if it
causes any issues.
2025-11-05 18:08:13 -05:00
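The reason pinning matters: non_blocking host-to-device copies only overlap with GPU compute when the source is page-locked; from pageable memory they silently run synchronously. A small standalone illustration, independent of ComfyUI's actual code:

```python
import torch

src_pageable = torch.randn(1024, 1024)
src_pinned = src_pageable.pin_memory()  # page-locked copy of the same data

a = src_pageable.to("cuda", non_blocking=True)  # silently synchronous
b = src_pinned.to("cuda", non_blocking=True)    # true async DMA transfer
```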
comfyanonymous
97f198e421
Fix qwen controlnet regression. (#10657) 2025-11-05 18:07:35 -05:00
Yousef Rafat
5056a1f4d4 important fixes 2025-11-06 00:24:49 +02:00
comfyanonymous
c4a6b389de
Lower ltxv mem usage to what it was before the previous PR. (#10643)
Bring back qwen behavior to what it was before the previous PR.
2025-11-04 22:47:35 -05:00
contentis
4cd881866b
Use single apply_rope function across models (#10547) 2025-11-04 20:10:11 -05:00
comfyanonymous
7f3e4d486c
Limit amount of pinned memory on windows to prevent issues. (#10638) 2025-11-04 17:37:50 -05:00
Yousef Rafat
9e9c536c8e fixes from testing 2025-11-04 23:55:16 +02:00
comfyanonymous
af4b7b5edb
More fp8 torch.compile regressions fixed. (#10625) 2025-11-03 22:14:20 -05:00
comfyanonymous
0f4ef3afa0
This seems to slow things down slightly on Linux. (#10624) 2025-11-03 21:47:14 -05:00
comfyanonymous
6b88478f9f
Bring back fp8 torch compile performance to what it should be. (#10622) 2025-11-03 19:22:10 -05:00
comfyanonymous
e199c8cc67
Fixes (#10621) 2025-11-03 17:58:24 -05:00
comfyanonymous
0652cb8e2d
Speed up torch.compile (#10620) 2025-11-03 17:37:12 -05:00
comfyanonymous
958a17199a
People should update their pytorch versions. (#10618) 2025-11-03 17:08:30 -05:00
comfyanonymous
97ff9fae7e
Clarify help text for --fast argument (#10609)
Updated help text for the --fast argument to clarify potential risks.
2025-11-02 13:14:04 -05:00
rattus
135fa49ec2
Small speed improvements to --async-offload (#10593)
* ops: don't take an offload stream if you don't need one

* ops: prioritize mem transfer

The async offload stream's reason for existence is to transfer from
RAM to GPU. The post-processing compute steps are a bonus on the side
stream, but if the compute stream is running a long kernel, it can
stall the side stream as it waits to type-cast the bias before
transferring the weight. So do a pure transfer of the weight straight up,
then handle everything for the bias, then go back to fix the weight type
and apply the weight patches.
2025-11-01 18:48:53 -04:00
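Read as code, the reordering looks roughly like the sketch below: raw weight copy first, bias second, weight cast and patches last. All names are hypothetical, including apply_patches, which stands in for whatever weight patching the real ops code performs.

```python
import torch

offload_stream = torch.cuda.Stream()

def apply_patches(w):  # hypothetical stand-in for weight patching
    return w

def load_layer(weight_cpu, bias_cpu, compute_dtype):
    with torch.cuda.stream(offload_stream):
        # 1. Pure transfer first: the DMA copy starts immediately and is
        #    not held up by casts competing with a long compute kernel.
        weight = weight_cpu.to("cuda", non_blocking=True)
        # 2. Everything bias next: small, and compute needs it early.
        bias = bias_cpu.to("cuda", non_blocking=True).to(compute_dtype)
        # 3. Go back to fix the weight type and apply the weight patches.
        weight = apply_patches(weight.to(compute_dtype))
    torch.cuda.current_stream().wait_stream(offload_stream)
    return weight, bias
```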
comfyanonymous
44869ff786
Fix issue with pinned memory. (#10597) 2025-11-01 17:25:59 -04:00
Yousef Rafat
ca119c44fb returned kv cache for image generation 2025-11-01 23:06:11 +02:00
Yousef Rafat
10a17dc85d a bunch of fixes 2025-11-01 16:40:49 +02:00
comfyanonymous
c58c13b2ba
Fix torch compile regression on fp8 ops. (#10580) 2025-11-01 00:25:17 -04:00
Yousef Rafat
1a25a0ad69 Merge branch 'yousef-hunyuan-image-3' of https://github.com/yousef-rafat/ComfyUI into yousef-hunyuan-image-3 2025-10-31 23:58:22 +02:00
Yousef Rafat
70f216bbd0 tiny bug 2025-10-31 23:58:01 +02:00
Yousef R. Gamaleldin
575fe3e92e
Merge branch 'master' into yousef-hunyuan-image-3 2025-10-31 23:55:42 +02:00