Commit Graph

427 Commits

Author SHA1 Message Date
patientx
3662d0a2ce
Merge branch 'comfyanonymous:master' into master 2025-11-10 14:05:53 +03:00
comfyanonymous
dea899f221
Unload weights if vram usage goes up between runs. (#10690) 2025-11-09 18:51:33 -05:00
patientx
8e02689534
Merge branch 'comfyanonymous:master' into master 2025-11-07 20:30:21 +03:00
comfyanonymous
a1a70362ca
Only unpin tensor if it was pinned by ComfyUI (#10677) 2025-11-07 11:15:05 -05:00
patientx
d29dbbd829
Merge branch 'comfyanonymous:master' into master 2025-11-07 14:27:13 +03:00
rattus
cf97b033ee
mm: guard against double pin and unpin explicitly (#10672)
As commented, if you let cuda be the one to detect double pinning/unpinning,
it actually creates an async GPU error.
2025-11-06 21:20:48 -05:00
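A minimal sketch of the guard described above (illustrative only, not ComfyUI's actual model_management code): track which tensors this process has pinned and make pin/unpin idempotent, instead of letting CUDA detect the double call. The do_pin/do_unpin callables are assumed stand-ins for the real host-register calls.

import torch

_pinned_ptrs: set[int] = set()    # data_ptr() of tensors we pinned ourselves

def pin_once(t: torch.Tensor, do_pin) -> None:
    ptr = t.data_ptr()
    if ptr in _pinned_ptrs:       # already pinned by us: return instead of
        return                    # letting CUDA raise an async error later
    do_pin(t)
    _pinned_ptrs.add(ptr)

def unpin_once(t: torch.Tensor, do_unpin) -> None:
    ptr = t.data_ptr()
    if ptr not in _pinned_ptrs:   # only unpin what we pinned (cf. #10677 above)
        return
    do_unpin(t)
    _pinned_ptrs.discard(ptr)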
patientx
3ab45ae725
Merge branch 'comfyanonymous:master' into master 2025-11-06 15:35:41 +03:00
comfyanonymous
09dc24c8a9
Pinned mem also seems to work on AMD. (#10658) 2025-11-05 19:11:15 -05:00
comfyanonymous
1d69245981
Enable pinned memory by default on Nvidia. (#10656)
Removed the --fast pinned_memory flag.

You can use --disable-pinned-memory to disable it. Please report if it
causes any issues.
2025-11-05 18:08:13 -05:00
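The change amounts to a default-on feature with a disable switch. A hypothetical argparse wiring (only the --disable-pinned-memory flag name comes from the commit; the rest is illustrative):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--disable-pinned-memory", action="store_true",
                    help="Disable pinned memory for weight offloading.")
args = parser.parse_args()

use_pinned_memory = not args.disable_pinned_memory   # enabled unless disabled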
patientx
84faf45f09
Merge branch 'comfyanonymous:master' into master 2025-11-05 13:07:02 +03:00
comfyanonymous
7f3e4d486c
Limit amount of pinned memory on windows to prevent issues. (#10638) 2025-11-04 17:37:50 -05:00
patientx
7907b8d6be
Merge branch 'comfyanonymous:master' into master 2025-10-30 03:16:55 +03:00
rattus
ab7ab5be23
Fix Race condition in --async-offload that can cause corruption (#10501)
* mm: factor out the current stream getter

Make this a reusable function.

* ops: sync the offload stream with the consumption of w&b

This sync is necessary as pytorch will queue cuda async frees on the
same stream that created the tensor. In the case of async offload, this
will be on the offload stream.

Weights and biases can go out of scope in python, which then
triggers the pytorch garbage collector to queue the free operation on
the offload stream, possibly before the compute stream has used the
weight. This causes a use-after-free on the weight data, leading to total
corruption of some workflows.

So sync the offload stream with the compute stream after the weight
has been used so the free has to wait for the weight to be used.

cast_bias_weight is extended in a backwards-compatible way, with
the new behaviour opt-in via a defaulted parameter. This handles
custom node packs that call cast_bias_weight by disabling
async-offload for them (as they do not handle the race).

The pattern is now:

cast_bias_weight(... , offloadable=True) #This might be offloaded
thing(weight, bias, ...)
uncast_bias_weight(...)

* controlnet: adopt new cast_bias_weight synchronization scheme

This is necessary for safe async weight offloading.

* mm: sync the last stream in the queue, not the next

Currently this peeks ahead to sync the next stream in the queue of
streams with the compute stream. This doesn't allow much
parallelization, as the end result is that you can only get one weight load
ahead regardless of how many streams you have.

Rotate the loop logic here to synchronize the end of the queue before
returning the next stream. This allows weights to be loaded ahead of the
compute stream's position.
2025-10-29 17:17:46 -04:00
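The synchronization described in this commit can be sketched with plain PyTorch streams (names and shapes are illustrative; this is not the actual cast_bias_weight/uncast_bias_weight implementation). It assumes a CUDA device.

import torch

weight_cpu = torch.randn(1024, 1024, pin_memory=True)   # stand-in for a layer weight
x = torch.randn(8, 1024, device="cuda")

compute_stream = torch.cuda.current_stream()
offload_stream = torch.cuda.Stream()

with torch.cuda.stream(offload_stream):
    # The weight is uploaded on the offload stream, so PyTorch will also queue
    # its eventual async free on that same stream.
    weight_gpu = weight_cpu.to("cuda", non_blocking=True)

# Compute must wait for the upload before consuming the weight ("cast") ...
compute_stream.wait_stream(offload_stream)
out = torch.nn.functional.linear(x, weight_gpu)          # "thing(weight, bias, ...)"

# ... and the offload stream must wait for the compute stream before the weight
# can be freed ("uncast"), otherwise the queued free races with the matmul.
offload_stream.wait_stream(compute_stream)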
patientx
d8528ac31e
Merge branch 'comfyanonymous:master' into master 2025-10-29 12:42:07 +03:00
comfyanonymous
3fa7a5c04a
Speed up offloading using pinned memory. (#10526)
To enable this feature use: --fast pinned_memory
2025-10-29 00:21:01 -04:00
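A rough illustration of why pinned (page-locked) host memory speeds up offloading; this is not ComfyUI's code, just the underlying PyTorch behaviour on a CUDA device:

import torch

gpu_t = torch.randn(4096, 4096, device="cuda")

pageable = torch.empty(4096, 4096)                  # ordinary host memory
pinned = torch.empty(4096, 4096, pin_memory=True)   # page-locked host memory

pageable.copy_(gpu_t)                    # blocks; the driver stages the copy
pinned.copy_(gpu_t, non_blocking=True)   # async DMA, overlaps with other GPU work
torch.cuda.synchronize()                 # wait before touching the pinned copy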
patientx
8590e1f713
Merge branch 'comfyanonymous:master' into master 2025-10-26 14:29:29 +03:00
comfyanonymous
098a352f13
Add warning for torch-directml usage (#10482)
Added a warning message about the state of torch-directml.
2025-10-25 20:05:22 -04:00
comfyanonymous
426cde37f1
Remove useless function (#10472) 2025-10-24 19:56:51 -04:00
patientx
d4bcb93575
Merge branch 'comfyanonymous:master' into master 2025-10-22 11:34:33 +03:00
comfyanonymous
9cdc64998f
Only disable cudnn on newer AMD GPUs. (#10437) 2025-10-21 19:15:23 -04:00
patientx
5bf1c8be44
Merge branch 'comfyanonymous:master' into master 2025-10-21 03:49:14 +03:00
comfyanonymous
2c2aa409b0
Log message for cudnn disable on AMD. (#10418) 2025-10-20 15:43:24 -04:00
patientx
657a7872ab
Merge branch 'comfyanonymous:master' into master 2025-10-19 15:20:17 +03:00
comfyanonymous
5b80addafd
Turn off cuda malloc by default when --fast autotune is turned on. (#10393) 2025-10-18 22:35:46 -04:00
patientx
26589a3a0b
Merge branch 'comfyanonymous:master' into master 2025-10-15 12:18:21 +03:00
comfyanonymous
1c10b33f9b
gfx942 doesn't support fp8 operations. (#10348) 2025-10-15 00:21:11 -04:00
comfyanonymous
c8674bc6e9
Enable RDNA4 pytorch attention on ROCm 7.0 and up. (#10332) 2025-10-13 21:19:03 -04:00
patientx
fa7942933b
Merge branch 'comfyanonymous:master' into master 2025-10-12 13:56:39 +03:00
comfyanonymous
a125cd84b0
Improve AMD performance. (#10302)
I honestly have no idea why this improves things but it does.
2025-10-12 00:28:01 -04:00
patientx
258da26c98
Merge branch 'comfyanonymous:master' into master 2025-09-25 15:08:16 +03:00
Guy Niv
c8d2117f02
Fix memory leak by properly detaching model finalizer (#9979)
When unloading models in load_models_gpu(), the model finalizer was not
being explicitly detached, leading to a memory leak. This caused
memory consumption to increase linearly over time as models were repeatedly
loaded and unloaded.

This change prevents orphaned finalizer references from accumulating in
memory during model switching operations.
2025-09-24 22:35:12 -04:00
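A minimal illustration of the leak pattern (assumed, not the actual load_models_gpu code): weakref.finalize objects stay registered until they fire or are detached, so deliberate unloads should detach them explicitly.

import weakref

class Model:
    pass

def on_unload(name):
    print(f"cleanup for {name}")

model = Model()
finalizer = weakref.finalize(model, on_unload, "model-a")

# When unloading the model deliberately, detach its finalizer as well; otherwise
# every load/unload cycle leaves another live finalizer (plus whatever its
# callback captures) behind, and memory grows over time.
finalizer.detach()
del model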
patientx
c62e820d45
Merge branch 'comfyanonymous:master' into master 2025-09-20 01:51:06 +03:00
DELUXA
8d6653fca6
Enable fp8 ops by default on gfx1200 (#9926) 2025-09-18 19:50:37 -04:00
patientx
b46622ffa5
Merge branch 'comfyanonymous:master' into master 2025-09-08 11:14:04 +03:00
comfyanonymous
fb763d4333
Fix amd_min_version crash when cpu device. (#9754) 2025-09-07 21:16:29 -04:00
patientx
9417753a6c
Merge branch 'comfyanonymous:master' into master 2025-09-07 13:16:57 +03:00
comfyanonymous
bcbd7884e3
Don't enable pytorch attention on AMD if triton isn't available. (#9747) 2025-09-07 00:29:38 -04:00
comfyanonymous
27a0fcccc3
Enable bf16 VAE on RDNA4. (#9746) 2025-09-06 23:25:22 -04:00
patientx
7ff01ded58
Merge branch 'comfyanonymous:master' into master 2025-08-21 09:24:26 +03:00
comfyanonymous
0963493a9c
Support for Qwen Diffsynth Controlnets canny and depth. (#9465)
These are not real controlnets but actually a patch on the model, so they
will be treated as such.

Put them in the models/model_patches/ folder.

Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.
2025-08-20 22:26:37 -04:00
patientx
a927fbd99b
Merge branch 'comfyanonymous:master' into master 2025-08-14 12:16:50 +03:00
Simon Lui
c991a5da65
Fix XPU iGPU regressions (#9322)
* Change the bf16 check, switch non-blocking off by default with an option to force it on to regain speed on certain classes of iGPUs, and refactor the xpu check.

* Turn non_blocking off by default for xpu.

* Update README.md for Intel GPUs.
2025-08-13 19:13:35 -04:00
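A hedged sketch of the non-blocking change (function names are illustrative, not the actual ComfyUI API): host-to-device copies only pass non_blocking=True when the backend is known to handle it well, with an option to force it back on.

import torch

def use_non_blocking(device: torch.device, force: bool = False) -> bool:
    if force:
        return True               # opt back in to regain speed where it is safe
    if device.type == "xpu":
        return False              # off by default on XPU per the commit above
    return device.type == "cuda"

def to_device(t: torch.Tensor, device: torch.device, force: bool = False) -> torch.Tensor:
    return t.to(device, non_blocking=use_non_blocking(device, force))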
patientx
c2686a3968
Merge branch 'comfyanonymous:master' into master 2025-08-10 12:09:19 +03:00
comfyanonymous
5828607ccf
Not sure if AMD actually supports fp16 acc but it doesn't crash. (#9258) 2025-08-09 12:49:25 -04:00
patientx
89499c6fae
Merge branch 'comfyanonymous:master' into master 2025-08-08 11:40:07 +03:00
comfyanonymous
735bb4bdb1
Users report gfx1201 is buggy on flux with pytorch attention. (#9244) 2025-08-08 04:21:00 -04:00
patientx
d8ca8134c3
Merge branch 'comfyanonymous:master' into master 2025-07-29 11:56:59 +03:00
comfyanonymous
7d593baf91
Extra reserved vram on large cards on windows. (#9093) 2025-07-29 04:07:45 -04:00
patientx
970b7fb84f
Merge branch 'comfyanonymous:master' into master 2025-07-24 22:30:55 +03:00
comfyanonymous
69cb57b342
Print xpu device name. (#9035) 2025-07-24 15:06:25 -04:00