Commit Graph

4607 Commits

Author SHA1 Message Date
Rattus
ec5a81cfa4 implement lightweight safetensors with READ mmap
The CoW MMAP as used by safetensors is hardcoded to CoW which forcibly
consumes windows commit charge on a zero copy. RIP. Implement safetensors
in pytorch itself with a READ mmap to not get commit charged for all our
open models.
2026-01-21 14:35:38 +10:00
Rattus
932b37dcbe ops: defer creation of the parameters until state dict load
If running on Windows, defer creation of the layer parameters until the state
dict is loaded. This avoids a massive charge in windows commit charge spike
when a model is created and not loaded.

This problem doesnt exist on Linux as linux allows RAM overcommit,
however windows does not. Before dynamic memory work this was also a non issue
as every non-quant model would just immediate RAM load and need the memory
anyway.

Make the workaround windows specific, as there may be someone out there with
some training from scratch workflow (which this might break), and assume said
someone is on Linux.
2026-01-21 14:34:42 +10:00
Rattus
82f388f405 remove junk arg 2026-01-21 14:34:42 +10:00
Rattus
b5806c89d0 aimdo version bump 2026-01-21 14:34:40 +10:00
Rattus
b0d6f2a9fc main: Rework aimdo into process
Be more tolerant of unsupported platforms and fallback properly.
Fixes crash when cuda is not installed at all.
2026-01-21 14:34:23 +10:00
Rattus
265ae3ee58 sampling: improve progress meter accuracy for dynamic loading 2026-01-21 14:34:23 +10:00
Rattus
cb41b22d23 clip: support assign load when taking clip from a ckpt 2026-01-21 14:34:19 +10:00
Rattus
28dd1c4c1f sd: empty cache on tiler fallback
This is needed for aimdo where the cache cant self recover from
fragmentation. It is however a good thing to do anyway after an OOM
so make it unconditional.
2026-01-21 14:33:01 +10:00
Rattus
307d25e747 ruff 2026-01-21 14:33:01 +10:00
Rattus
5916464c87 misc cleanup 2026-01-21 14:33:00 +10:00
Rattus
645c4597d2 add missing del on unpin 2026-01-21 14:32:35 +10:00
Rattus
17cdb0284b write better tx commentary 2026-01-21 14:32:35 +10:00
Rattus
2e1c2667e7 mm: fix sync
Sync before deleting anything.
2026-01-21 14:32:35 +10:00
Rattus
33583d95f4 main: Go live with --fast dynamic_vram
Add the optional command line switch --fast dynamic_vram.

This is mutually exclusing --high-vram and --gpu-only which contradict
aimdos underlying feature.

Add appropriate installation warning and a startup message, match the
comfy debug level inconfiguring aimdo.

Add comfy-aimdo pip requirement. This will safely stub to a nop for
unsupported platforms.
2026-01-21 14:32:33 +10:00
Rattus
bacd916833 execution: add aimdo primary pytorch cache integration
We need to general pytorch cache defragmentation on an appropriate level for
aimdo. Do in here on the per node basis, which has a reasonable chance of
purging stale shapes out of the pytorch caching allocator and saving VRAM
without costing too much garbage collector thrash.

This looks like a lot of GC but because aimdo never fails from pytorch and
saves the pytorch allocator from ever need to defrag out of demand, but it
needs a oil change every now and then so we gotta do it. Doing it here also
means the pytorch temps are cleared from task manager VRAM usage so user
anxiety can go down a little when they see their vram drop back at the end
of workflows inline with inference usage (rather than assuming full VRAM
leaks).
2026-01-21 14:32:12 +10:00
Rattus
4ed6c6fc94 models: Use CoreModelPatcher
Use CoreModelPatcher for all internal ModelPatcher implementations. This drives
conditional use of the aimdo feature, while making sure custom node packs get
to keep ModelPatcher unchanged for the moment.
2026-01-21 14:32:12 +10:00
Rattus
8fe566b9cc ops/mp: implement aimdo
Implement a model patcher and caster for aimdo.

A new ModelPatcher implementation which backs onto comfy-aimdo to implement varying model load levels that can be adjusted during model use. The patcher defers all load processes to lazily load the model during use (e.g. the first step of a ksampler) and automatically negotiates a load level during the inference to maximize VRAM usage without OOMing. If inference requires more VRAM than is available weights are offloaded to make space before the OOM happens.

As for loading the weight onto the GPU, that happens via comfy_cast_weights which is now used in all cases. cast_bias_weight checks whether the VBAR assigned to the model has space for the weight (based on the same load priority semantics as the original ModelPatcher). If it does, the VRAM as returned by the Aimdo allocator is used as the parameter GPU side. The caster is responsible for populating the weight data. This is done using the usual offload_stream (which mean we now have asynchronous load overlapping first use compute).

Pinning works a little differently. When a weight is detected during load as unable to fit, a pin is allocated at the time of casting and the weight as used by the layer is DMAd back to the the pin using the GPU DMA TX engine, also using the asynchronous offload streams. This means you get to pin the Lora modified and requantized weights which can be a major speedup for offload+quantize+lora use cases, This works around the JIT Lora + FP8 exclusion and brings FP8MM to heavy offloading users (who probably really need it with more modest GPUs). There is a performance risk in that a CPU+RAM patch has been replace with a GPU+RAM patch but my initial performance results look good. Most users as likely to have a GPU that outruns their CPU in these woods.

Some common code is written to consolidate a layers tensors for aimdo mapping, pinning, and DMA transfers. interpret_gathered_like() allows unpacking a raw buffer as a set of tensors. This is used consistently to bundle and pack weights, quantization metadata (QuantizedTensor bits) and biases into one payload for DMA in the load process reducing Cuda overhead a little. Some Quantization metadata was missing async offload is some cases which is now added. This also pins quantization metadata and consolidates the number of cuda_host_register calls (which can be expensive).
2026-01-21 14:32:12 +10:00
Rattus
f00094a6b6 mp: add mode for non comfy weight prioritization
non-comfy weights dont get async offload and a few other performance
limitations. Load them at top priority accordingly.
2026-01-21 14:32:12 +10:00
Rattus
d2956bb5af mp/mm: APi expansions for dynamic loading
Add two api expansions, a flag for whether a model patcher is dynamic
a a very basic RAM freeing system.

Implement the semantics of the dynamic model patcher which never frees
VRAM ahead of time for the sake of another dynamic model patcher.

At the same time add an API for clearing out pins on a reservation of
model size x2 heuristic, as pins consume RAM in their own right in the
dynamic patcher.

This is actually less about OOMing RAM and more about performance, as
with assign=True load semantics there needs to be plenty headroom for
the OS to load models to dosk cache on demand so err on the side of
kicking old pins out.
2026-01-21 14:32:12 +10:00
Rattus
b06534e676 mp: wrap get_free_memory
Dynamic load needs to adjust these numbers based on future movements,
so wrap this in a MP API.
2026-01-21 14:32:12 +10:00
Rattus
095478f9f8 pinned_memory: add python
Add a python for managing pinned memory of the weight/bias module level.
This allocates, pins and attached a tensor to a module for the pin for this
module. It does not set the weight, just allocates a singular ram buffer
for population and bulk DMA transfer.
2026-01-21 14:32:12 +10:00
Rattus
2e2271135b move string_to_seed to utils.py
This needs to be visible by ops which may want to do stochastic rounding on
the fly.
2026-01-21 14:32:12 +10:00
Rattus
f9a225b590 mm: Implement cast buffer allocations 2026-01-21 14:32:12 +10:00
Rattus
8fda2eb5dc ops: Do bias dtype conversion on compute stream
For consistency with weights.
2026-01-21 14:32:12 +10:00
Rattus
243fb596f9 Reduce RAM and compute time in model saving with Loras
Get the model saving logic away from force_patch_weights and instead do
the patching JIT during safetensors saving.

Firstly switch off force_patch_weights in the load for save which avoids
creating CPU side tensors with loras calculated.

Then at save time, wrap the tensor to catch safetensors call to .to() and
patch it live.

This avoids having to ever have a lora-calculated copy of offloaded
weights on the CPU.

Also take advantage of the presence of the GPU when doing this Lora
calculation. The former force_patch_weights would just do eveyrthing on
the CPU. Its generally faster to go the GPU and back even if its just
a Lora application.
2026-01-21 14:32:12 +10:00
Markury
0fc15700be
Add LyCoris LoKr MLP layer support for Flux2 (#11997)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-20 23:18:33 -05:00
comfyanonymous
e755268e7b
Config for Qwen 3 0.6B model. (#11998) 2026-01-20 23:08:31 -05:00
Mylo
c4a14df9a3
Dynamically detect chroma radiance patch size (#11991)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-20 18:46:11 -05:00
Ivan Zorin
965d0ed509
fix: remove normalization of audio in LTX Mel spectrogram creation (#11990)
For LTX Audio VAE, remove normalization of audio during MEL spectrogram creation.
This aligs inference with training and prevents loud audio from being attenuated.
2026-01-20 18:44:28 -05:00
Alexander Piskun
ddc541ffda
feat(api-nodes): add WaveSpeed nodes (#11945) 2026-01-20 13:05:40 -08:00
comfyanonymous
8ccc0c94fa
Make omni stuff work on regular z image for easier testing. (#11985)
Some checks failed
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
Build package / Build Test (3.10) (push) Has been cancelled
Build package / Build Test (3.11) (push) Has been cancelled
Build package / Build Test (3.12) (push) Has been cancelled
Build package / Build Test (3.13) (push) Has been cancelled
Build package / Build Test (3.14) (push) Has been cancelled
2026-01-20 00:32:00 -05:00
Comfy Org PR Bot
4edb87aa50
Bump comfyui-frontend-package to 1.37.11 (#11976) 2026-01-19 23:57:50 -05:00
ComfyUI Wiki
0fc3b6e3a6
chore: update workflow templates to v0.8.15 (#11984) 2026-01-19 23:17:56 -05:00
comfyanonymous
2108167f9f
Support zimage omni base model. (#11979) 2026-01-19 23:17:38 -05:00
comfyanonymous
9d273d3ab1 ComfyUI v0.10.0 2026-01-19 22:40:18 -05:00
comfyanonymous
70c91b8248
Fix #11963 (#11982) 2026-01-19 22:32:40 -05:00
rkfg
0da5a0fe58
Convert mono audio to fake stereo for LTXV VAE encoding (#11965)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Build package / Build Test (3.10) (push) Waiting to run
Build package / Build Test (3.11) (push) Waiting to run
Build package / Build Test (3.12) (push) Waiting to run
Build package / Build Test (3.13) (push) Waiting to run
Build package / Build Test (3.14) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-19 22:12:02 -05:00
comfyanonymous
e0eacb0688
Simpler way to implement the #11980 loras. (#11981) 2026-01-19 22:00:36 -05:00
Jedrzej Kosinski
7458e20465
Make Autogrow validation work properly (#11977)
* In-progress autogrow validation fixes - properly looks at required/optional inputs, now working on the edge case that all inputs are optional and nothing is plugged in (should just be an empty dictionary passed into node)

* Allow autogrow to work with all inputs being optional

* Revert accidentally pushed changes to nodes_logic.py
2026-01-19 16:58:30 -08:00
Jedrzej Kosinski
b931b37e30
feat(api-nodes): add Bria Edit node (#11978)
Co-authored-by: Alexander Piskun <bigcat88@icloud.com>
2026-01-19 16:47:14 -08:00
ComfyUI Wiki
866a4619db
chore: update workflow templates to v0.8.14 (#11974) 2026-01-19 14:21:35 -08:00
comfyanonymous
1a72bf2046
Readme update. (#11957)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-18 19:53:43 -08:00
Alexander Piskun
034fac7054
chore(api-nodes): auto-discover all nodes_*.py files to avoid merge conflicts when adding new API nodes (#11943)
Some checks failed
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
Build package / Build Test (3.10) (push) Has been cancelled
Build package / Build Test (3.11) (push) Has been cancelled
Build package / Build Test (3.12) (push) Has been cancelled
Build package / Build Test (3.13) (push) Has been cancelled
Build package / Build Test (3.14) (push) Has been cancelled
Generate Pydantic Stubs from api.comfy.org / generate-models (push) Has been cancelled
2026-01-17 22:40:39 -08:00
Christian Byrne
a498556d0d
feat: add advanced parameter to Input classes for advanced widgets support (#11939)
Add 'advanced' boolean parameter to Input and WidgetInput base classes
and propagate to all typed Input subclasses (Boolean, Int, Float, String,
Combo, MultiCombo, Webcam, MultiType, MatchType, ImageCompare).

When set to True, the frontend will hide these inputs by default in a
collapsible 'Advanced Inputs' section in the right side panel, reducing
visual clutter for power-user options.

This enables nodes to expose advanced configuration options (like encoding
parameters, quality settings, etc.) without overwhelming typical users.

Frontend support: ComfyUI_frontend PR #7812
2026-01-17 19:06:03 -08:00
Alexander Piskun
f7ca41ff62
chore(api-nodes): remove check for pyav>=14.2 in code (it was added to requirements.txt long ago) (#11934) 2026-01-17 18:57:57 -08:00
Alexander Piskun
ac26065e61
chore(api-nodes): remove non-used; extract model to separate files (#11927)
* chore(api-nodes): remove non-used; extract model to separate files

* chore(api-nodes): remove non-needed prefix in filenames
2026-01-17 18:52:45 -08:00
comfyanonymous
190c4416cc
Bump comfy-kitchen dependency to version 0.2.7 (#11941) 2026-01-17 21:20:35 -05:00
Theephop
0fd10ffa09
fix: use .cpu() for waveform conversion in AudioFrame creation (#11787)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-17 20:18:24 -05:00
Alex Butler
00c775950a
Update readme rdna3 nightly url (#11937) 2026-01-17 20:18:04 -05:00
comfyanonymous
7ac999bf30
Add image sizes to clip vision outputs. (#11923)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-01-16 23:02:28 -05:00