EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-02-28 23:07:33 +08:00

Author	SHA1	Message	Date
rattus	8646bd96ef	Merge `12e1560dcc` into `26c5bbb875`	2026-01-25 05:05:38 +01:00
comfyanonymous	26c5bbb875	Move nodes from previous PR into their own file. (#12066 )	2026-01-24 23:02:32 -05:00
Kohaku-Blueleaf	a97c98068f	[Weight-adapter/Trainer] Bypass forward mode in Weight adapter system (#11958 ) * Add API of bypass forward module * bypass implementation * add bypass fwd into nodes list/trainer	2026-01-24 22:56:22 -05:00
comfyanonymous	635406e283	Only enable fp16 on z image models that actually support it. (#12065 )	2026-01-24 22:32:28 -05:00
pythongosssss	ed6002cb60	add support for kwargs inputs to allow arbitrary inputs from frontend (#12063 ) used to output selected combo index Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>	2026-01-24 17:30:40 -08:00
Alexander Piskun	bc72d7f8d1	[API Nodes] add TencentHunyuan3D nodes (#12026 ) * feat(api-nodes): add TencentHunyuan3D nodes * add "(Pro)" to display name --------- Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>	2026-01-24 17:10:09 -08:00
comfyanonymous	aef4e13588	Make empty latent node work with other models. (#12062 )	2026-01-24 19:23:20 -05:00
Rattus	12e1560dcc	remove bad pyt2.4 versions gate	2026-01-25 09:14:52 +10:00
rattus	4e6a1b66a9	speed up and reduce VRAM of QWEN VAE and WAN (less so) (#12036 ) * ops: introduce autopad for conv3d This works around pytorch missing ability to causal pad as part of the kernel and avoids massive weight duplications for padding. * wan-vae: rework causal padding This currently uses F.pad which takes a full deep copy and is liable to be the VRAM peak. Instead, kick spatial padding back to the op and consolidate the temporal padding with the cat for the cache. * wan-vae: implement zero pad fast path The WAN VAE is also QWEN where it is used single-image. These convolutions are however zero padded 3d convolutions, which means the VAE is actually just 2D down the last element of the conv weight in the temporal dimension. Fast path this, to avoid adding zeros that then just evaporate in convoluton math but cost computation.	2026-01-23 19:56:14 -05:00
comfyanonymous	9cf299a9f9	Make regular empty latent node work properly on flux 2 variants. (#12050 )	2026-01-23 19:50:48 -05:00
ComfyUI Wiki	e89b22993a	Support ModelScope-Trainer/DiffSynth LoRA format for Flux.2 Klein models (#12042 )	2026-01-23 15:27:49 -05:00
Jukka Seppänen	55bd606e92	LTX2: Refactor forward function for better VRAM efficiency and fix spatial inpainting (#12046 ) * Disable timestep embed compression when inpainting Spatial inpainting not compatible with the compression * Reduce crossattn peak VRAM * LTX2: Refactor forward function for better VRAM efficiency	2026-01-23 15:26:38 -05:00
Rattus	a9bc2d884c	MPDynamic: Add support for model defined dtype If the model defines a dtype that is different to what is in the state dict, respect that at load time. This is done as part of the casting process.	2026-01-23 16:54:12 +10:00
Rattus	18748d4641	ops: fix __init__ return	2026-01-23 16:54:12 +10:00
Rattus	b9f6ec4ca5	archive the model defined dtypes Scan created models and save off the dtypes as defined by the model creation process. This is needed for assign=True, which will override the dtypes.	2026-01-23 16:54:12 +10:00
Rattus	8371708e09	mp: big bump on the VBAR sizes Now that the model defined dtype is decoupled from the state_dict dtypes we need to be able to handle worst case scenario casts between the SD and VBAR.	2026-01-23 16:54:12 +10:00
Rattus	19c9219fe4	ruff	2026-01-23 16:54:12 +10:00
Rattus	e36ffd2cee	nodes_model_patch: fix copy-paste coding error	2026-01-23 16:54:12 +10:00
Rattus	b915d13e57	mp: handle blank __new__ call This is needed for deepcopy construction. We shouldnt really have deep copies of MP or MODynamic however this is a stay one in some controlnet flows.	2026-01-23 16:54:12 +10:00
Rattus	7f706a01d6	mm: remove left over hooks draft code This is phase 2	2026-01-23 16:54:12 +10:00
Rattus	ec4837c88a	execution: remove per node gc.collect() This isn't worth it and the likelyhood of inference leaving a complex data-structure with cyclic reference behind is now. Remove it. We would replace it with a condition on nodes that actually touch the GPU which might be win.	2026-01-23 16:54:12 +10:00
Rattus	5bd8ec8544	implement lightweight safetensors with READ mmap The CoW MMAP as used by safetensors is hardcoded to CoW which forcibly consumes windows commit charge on a zero copy. RIP. Implement safetensors in pytorch itself with a READ mmap to not get commit charged for all our open models.	2026-01-23 16:54:12 +10:00
Rattus	1c5fc82077	ops: defer creation of the parameters until state dict load If running on Windows, defer creation of the layer parameters until the state dict is loaded. This avoids a massive charge in windows commit charge spike when a model is created and not loaded. This problem doesnt exist on Linux as linux allows RAM overcommit, however windows does not. Before dynamic memory work this was also a non issue as every non-quant model would just immediate RAM load and need the memory anyway. Make the workaround windows specific, as there may be someone out there with some training from scratch workflow (which this might break), and assume said someone is on Linux.	2026-01-23 16:54:12 +10:00
Rattus	441dcd2b17	remove junk arg	2026-01-23 16:54:12 +10:00
Rattus	76f94ecf9f	aimdo version bump	2026-01-23 16:54:12 +10:00
Rattus	7f980124b0	main: Rework aimdo into process Be more tolerant of unsupported platforms and fallback properly. Fixes crash when cuda is not installed at all.	2026-01-23 16:54:12 +10:00
Rattus	a7023384ca	sampling: improve progress meter accuracy for dynamic loading	2026-01-23 16:54:12 +10:00
Rattus	a310ca93d3	clip: support assign load when taking clip from a ckpt	2026-01-23 16:54:10 +10:00
Rattus	6ecbba2232	sd: empty cache on tiler fallback This is needed for aimdo where the cache cant self recover from fragmentation. It is however a good thing to do anyway after an OOM so make it unconditional.	2026-01-23 16:52:31 +10:00
Rattus	61dda30171	ruff	2026-01-23 16:52:31 +10:00
Rattus	79b3fe334b	misc cleanup	2026-01-23 16:52:31 +10:00
Rattus	15ae09fb19	add missing del on unpin	2026-01-23 16:52:31 +10:00
Rattus	6e852baa9a	write better tx commentary	2026-01-23 16:52:31 +10:00
Rattus	a2c8f45c93	mm: fix sync Sync before deleting anything.	2026-01-23 16:52:31 +10:00
Rattus	4d914099fb	main: Go live with --fast dynamic_vram Add the optional command line switch --fast dynamic_vram. This is mutually exclusing --high-vram and --gpu-only which contradict aimdos underlying feature. Add appropriate installation warning and a startup message, match the comfy debug level inconfiguring aimdo. Add comfy-aimdo pip requirement. This will safely stub to a nop for unsupported platforms.	2026-01-23 16:52:31 +10:00
Rattus	81845a9ab2	execution: add aimdo primary pytorch cache integration We need to general pytorch cache defragmentation on an appropriate level for aimdo. Do in here on the per node basis, which has a reasonable chance of purging stale shapes out of the pytorch caching allocator and saving VRAM without costing too much garbage collector thrash. This looks like a lot of GC but because aimdo never fails from pytorch and saves the pytorch allocator from ever need to defrag out of demand, but it needs a oil change every now and then so we gotta do it. Doing it here also means the pytorch temps are cleared from task manager VRAM usage so user anxiety can go down a little when they see their vram drop back at the end of workflows inline with inference usage (rather than assuming full VRAM leaks).	2026-01-23 16:52:31 +10:00
Rattus	6b8f4949c4	models: Use CoreModelPatcher Use CoreModelPatcher for all internal ModelPatcher implementations. This drives conditional use of the aimdo feature, while making sure custom node packs get to keep ModelPatcher unchanged for the moment.	2026-01-23 16:52:31 +10:00
Rattus	56d526c133	ops/mp: implement aimdo Implement a model patcher and caster for aimdo. A new ModelPatcher implementation which backs onto comfy-aimdo to implement varying model load levels that can be adjusted during model use. The patcher defers all load processes to lazily load the model during use (e.g. the first step of a ksampler) and automatically negotiates a load level during the inference to maximize VRAM usage without OOMing. If inference requires more VRAM than is available weights are offloaded to make space before the OOM happens. As for loading the weight onto the GPU, that happens via comfy_cast_weights which is now used in all cases. cast_bias_weight checks whether the VBAR assigned to the model has space for the weight (based on the same load priority semantics as the original ModelPatcher). If it does, the VRAM as returned by the Aimdo allocator is used as the parameter GPU side. The caster is responsible for populating the weight data. This is done using the usual offload_stream (which mean we now have asynchronous load overlapping first use compute). Pinning works a little differently. When a weight is detected during load as unable to fit, a pin is allocated at the time of casting and the weight as used by the layer is DMAd back to the the pin using the GPU DMA TX engine, also using the asynchronous offload streams. This means you get to pin the Lora modified and requantized weights which can be a major speedup for offload+quantize+lora use cases, This works around the JIT Lora + FP8 exclusion and brings FP8MM to heavy offloading users (who probably really need it with more modest GPUs). There is a performance risk in that a CPU+RAM patch has been replace with a GPU+RAM patch but my initial performance results look good. Most users as likely to have a GPU that outruns their CPU in these woods. Some common code is written to consolidate a layers tensors for aimdo mapping, pinning, and DMA transfers. 
interpret_gathered_like() allows unpacking a raw buffer as a set of tensors. This is used consistently to bundle and pack weights, quantization metadata (QuantizedTensor bits) and biases into one payload for DMA in the load process reducing Cuda overhead a little. Some Quantization metadata was missing async offload is some cases which is now added. This also pins quantization metadata and consolidates the number of cuda_host_register calls (which can be expensive).	2026-01-23 16:52:31 +10:00
Rattus	1aa3386c9f	mp: add mode for non comfy weight prioritization non-comfy weights dont get async offload and a few other performance limitations. Load them at top priority accordingly.	2026-01-23 16:52:31 +10:00
Rattus	a2e15d1117	mp/mm: APi expansions for dynamic loading Add two api expansions, a flag for whether a model patcher is dynamic a a very basic RAM freeing system. Implement the semantics of the dynamic model patcher which never frees VRAM ahead of time for the sake of another dynamic model patcher. At the same time add an API for clearing out pins on a reservation of model size x2 heuristic, as pins consume RAM in their own right in the dynamic patcher. This is actually less about OOMing RAM and more about performance, as with assign=True load semantics there needs to be plenty headroom for the OS to load models to dosk cache on demand so err on the side of kicking old pins out.	2026-01-23 16:52:31 +10:00
Rattus	168dd7d6c2	mp: wrap get_free_memory Dynamic load needs to adjust these numbers based on future movements, so wrap this in a MP API.	2026-01-23 16:52:31 +10:00
Rattus	2bf2463ca8	pinned_memory: add python Add a python for managing pinned memory of the weight/bias module level. This allocates, pins and attached a tensor to a module for the pin for this module. It does not set the weight, just allocates a singular ram buffer for population and bulk DMA transfer.	2026-01-23 16:52:31 +10:00
Rattus	92a8183c13	move string_to_seed to utils.py This needs to be visible by ops which may want to do stochastic rounding on the fly.	2026-01-23 16:52:31 +10:00
Rattus	c5e0e80cb3	mm: Implement cast buffer allocations	2026-01-23 16:52:31 +10:00
Rattus	4622c0825e	ops: Do bias dtype conversion on compute stream For consistency with weights.	2026-01-23 16:52:31 +10:00
Rattus	d795a23c12	Reduce RAM and compute time in model saving with Loras Get the model saving logic away from force_patch_weights and instead do the patching JIT during safetensors saving. Firstly switch off force_patch_weights in the load for save which avoids creating CPU side tensors with loras calculated. Then at save time, wrap the tensor to catch safetensors call to .to() and patch it live. This avoids having to ever have a lora-calculated copy of offloaded weights on the CPU. Also take advantage of the presence of the GPU when doing this Lora calculation. The former force_patch_weights would just do eveyrthing on the CPU. Its generally faster to go the GPU and back even if its just a Lora application.	2026-01-23 16:52:31 +10:00
Christian Byrne	79cdbc81cb	feat: Improve ResizeImageMaskNode UX with tooltips and search aliases (#12040 ) - Add search_aliases for discoverability: resize, scale, dimensions, etc. - Add node description for hover tooltip - Add tooltips to all inputs explaining their behavior - Reorder options: most common (scale dimensions) first, most technical (scale to multiple) last Addresses user feedback that 'resize' search returned nothing useful and options like 'match size' and 'scale to multiple' were not self-explanatory.	2026-01-22 22:04:27 -08:00
comfyanonymous	f443b9f2ca	Revert "feat: Improve ResizeImageMaskNode UX with tooltips and search aliases…" (#12038 ) This reverts commit `4e3038114a`.	2026-01-22 23:02:37 -05:00
Christian Byrne	4e3038114a	feat: Improve ResizeImageMaskNode UX with tooltips and search aliases (#12013 ) - Add search_aliases for discoverability: resize, scale, dimensions, etc. - Add node description for hover tooltip - Add tooltips to all inputs explaining their behavior - Reorder options: most common (scale dimensions) first, most technical (scale to multiple) last Addresses user feedback that 'resize' search returned nothing useful and options like 'match size' and 'scale to multiple' were not self-explanatory.	2026-01-22 18:46:55 -08:00
Christian Byrne	bbb8864778	add search aliases to all nodes (#12035 ) * feat: Add search_aliases field to node schema Adds `search_aliases` field to improve node discoverability. Users can define alternative search terms for nodes (e.g., "text concat" → StringConcatenate). Changes: - Add `search_aliases: list[str]` to V3 Schema - Add `SEARCH_ALIASES` support for V1 nodes - Include field in `/object_info` response - Add aliases to high-priority core nodes V1 usage: ```python class MyNode: SEARCH_ALIASES = ["alt name", "synonym"] ``` V3 usage: ```python io.Schema( node_id="MyNode", search_aliases=["alt name", "synonym"], ... ) ``` ## Related PRs - Frontend: Comfy-Org/ComfyUI_frontend#XXXX (draft - merge after this) - Docs: Comfy-Org/docs#XXXX (draft - merge after stable) * Propagate search_aliases through V3 Schema.get_v1_info to NodeInfoV1 * feat: add SEARCH_ALIASES for core nodes (#12016) Add search aliases to 22 core nodes in nodes.py to improve node discoverability: - Checkpoint/model loaders: CheckpointLoader, DiffusersLoader - Conditioning nodes: ConditioningAverage, ConditioningSetArea, ConditioningSetMask, ConditioningZeroOut - Style nodes: StyleModelApply - Image nodes: LoadImageMask, LoadImageOutput, ImageBatch, ImageInvert, ImagePadForOutpaint - Latent nodes: LoadLatent, SaveLatent, LatentBlend, LatentComposite, LatentCrop, LatentFlip, LatentFromBatch, LatentUpscale, LatentUpscaleBy, RepeatLatentBatch * feat: add SEARCH_ALIASES for image, mask, and string nodes (#12017) Add search aliases to nodes in comfy_extras for better discoverability: - nodes_mask.py: mask manipulation nodes - nodes_images.py: image processing nodes - nodes_post_processing.py: post-processing effect nodes - nodes_string.py: string manipulation nodes - nodes_compositing.py: compositing nodes - nodes_morphology.py: morphological operation nodes - nodes_latent.py: latent space nodes Uses search_aliases parameter in io.Schema() for v3 nodes. 
* feat: add SEARCH_ALIASES for audio and video nodes (#12018) Add search aliases to audio and video nodes for better discoverability: - nodes_audio.py: audio loading, saving, and processing nodes - nodes_video.py: video loading and processing nodes - nodes_wan.py: WAN model nodes Uses search_aliases parameter in io.Schema() for v3 nodes. * feat: add SEARCH_ALIASES for model and misc nodes (#12019) Add search aliases to model-related and miscellaneous nodes: - Model nodes: nodes_model_merging.py, nodes_model_advanced.py, nodes_lora_extract.py - Sampler nodes: nodes_custom_sampler.py, nodes_align_your_steps.py - Control nodes: nodes_controlnet.py, nodes_attention_multiply.py, nodes_hooks.py - Training nodes: nodes_train.py, nodes_dataset.py - Utility nodes: nodes_logic.py, nodes_canny.py, nodes_differential_diffusion.py - Architecture-specific: nodes_sd3.py, nodes_pixart.py, nodes_lumina2.py, nodes_kandinsky5.py, nodes_hidream.py, nodes_fresca.py, nodes_hunyuan3d.py - Media nodes: nodes_load_3d.py, nodes_webcam.py, nodes_preview_any.py, nodes_wanmove.py Uses search_aliases parameter in io.Schema() for v3 nodes, SEARCH_ALIASES class attribute for legacy nodes.	2026-01-22 18:36:58 -08:00
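
The `search_aliases` commit above (`bbb8864778`) describes V1 nodes declaring a `SEARCH_ALIASES` class attribute so that alternative terms (e.g. "text concat" → StringConcatenate) can find a node. A minimal sketch of how such aliases could be matched follows. Only the `SEARCH_ALIASES` attribute name and the "text concat" example come from the commit message; the `find_nodes` helper, the `REGISTRY` dict, the other alias strings, and the substring-matching rule are all hypothetical and are not ComfyUI's actual search implementation, which lives in the frontend.

```python
# Illustrative only: node classes here are stand-ins, not ComfyUI's real nodes.
class StringConcatenate:
    # Attribute name taken from the commit message (V1 usage).
    SEARCH_ALIASES = ["text concat", "join strings"]


class ImageInvert:
    # Hypothetical alias values chosen for this example.
    SEARCH_ALIASES = ["negative", "flip colors"]


class LoadLatent:
    pass  # A node that declares no aliases still matches on its own name.


def find_nodes(query, registry):
    """Hypothetical helper: return sorted names of nodes whose name or
    aliases contain the query as a case-insensitive substring."""
    needle = query.strip().lower()
    if not needle:
        return []
    matches = []
    for name, cls in registry.items():
        aliases = getattr(cls, "SEARCH_ALIASES", [])
        terms = [name] + list(aliases)
        if any(needle in term.lower() for term in terms):
            matches.append(name)
    return sorted(matches)


# Hypothetical registry mapping node ids to classes.
REGISTRY = {
    "StringConcatenate": StringConcatenate,
    "ImageInvert": ImageInvert,
    "LoadLatent": LoadLatent,
}

if __name__ == "__main__":
    print(find_nodes("text concat", REGISTRY))
    print(find_nodes("latent", REGISTRY))
```

Reading the attribute with `getattr` and a default keeps nodes without aliases searchable by name, in line with the commit's description of the field as an optional addition.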


4644 Commits