Commit Graph

2215 Commits

Author SHA1 Message Date
rattus
d214aaf21e
Merge ea5775c620 into 77e2ed5e01 2026-05-15 03:23:20 +00:00
Jukka Seppänen
77e2ed5e01
feat: Support MoGe (CORE-168) (#13878)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-05-15 10:34:56 +08:00
Rattus
d8b442709a make default 2026-05-15 00:37:06 +10:00
Rattus
18a74cb96a cli_args/execution: Implement lower background cache-ram threshold
Limit the amount of RAM background intermediates can use, so that
switching workflows doesn't degrade performance too much.
2026-05-15 00:36:35 +10:00
Rattus
31150538b0 implement pin registration swaps
Uncap the windows pins from 50% by extending the pool and have a pressure
mechanism to move the pin reservations om demand.

This unfortunately implies a GPU sync to do the freeing so significant
hysterisis needs to be added to consolidate these pressure events.
2026-05-15 00:35:23 +10:00
Talmaj
74c17a25e5
Fix void failing with RuntimeError: start (0) + length (464) exceeds dimension size (461). (#13873) 2026-05-13 12:37:30 -07:00
Rattus
3f717816e1 execution: implement pin eviction on RAM presure
Add back proper pin freeing on RAM pressure
2026-05-13 22:23:54 +10:00
Rattus
d61026d020 pins: implement freeing intermediate for pinned memory
Pinning is more important than inactive intermediates and the stream
pin buffer is more important than even active intermediates.
2026-05-13 22:23:54 +10:00
Rattus
ee927aafa8 ops: sync the CPU with only the offload stream activity
This was syncing with the offload stream which itself is synced with the
compute stream, so this was syncing CPU with compute transitively. Define
the event to sync it more gently.
2026-05-13 22:23:54 +10:00
comfyanonymous
2bd65f2091
Better Hidream O1 mem usage factor for non dynamic vram. (#13864) 2026-05-12 20:55:38 -07:00
comfyanonymous
0155ddcbe3
Fix dtype issue with hidream o1 (#13849) 2026-05-11 20:53:13 -07:00
Jukka Seppänen
8e53f001a4
feat: Support HiDream-O1-Image (CORE-187) (#13817)
* Initial HiDream01-image support

* Cleanup nodes

* Cleaner handling of empty placeholder models

* Remove snap_to_predefined, prefer tooltip for the trained resolutions

* Add model and block wrappers

* Fix shift tooltip

* Add node to work around the patch tile issue

Experimental, runs multiple passes with the patch grid offset and blends with various different methods.

* Qwen35 vision rotary_pos_emb cast fix

* Fix embedding layout type

* Some small optimizations

* Cleanup, don't need this fallback

* Prefix KV cache, cleanup

Bit of speed, reduce redundant code

* Get rid of redundant custom sampler, refactor noise scaling

Our existing lcm sampler is mathematically same, just added the missing options to it instead and a node to control them. Refactored the noise scaling and fix it for the stochastic samplers, add a generic node to control the initial noise scale.

* Update nodes_hidream_o1.py

* Fix some cache validation cases

* Keep existing sampling params

* Remove redundant video vision path

* Replace some numpy ops with torch

* Fx RoPE index for batch size > 1

* Prefer torch preprocessing

* Rename block_type to be compatible with existing patch nodes

* Fixes and tweaks
2026-05-11 20:35:53 -07:00
comfyanonymous
0a7d2ffd68
Support anima TE lora kohya format. (#13847) 2026-05-11 20:01:52 -07:00
Rattus
44c0a0602b ops: remove unused arg
This was defeatured in aimdo iteration
2026-05-12 11:04:59 +10:00
Rattus
3a3b75a7e3 implement pinned loras 2026-05-12 11:04:59 +10:00
Rattus
e48dace145 prepare for multiple pin sets 2026-05-12 11:04:59 +10:00
Rattus
8e473d756f lora: re-implement as inplace swiss-army-knife operation 2026-05-12 11:04:59 +10:00
Rattus
2b927e1783 LowVRAMPatch: change to two-phase visit 2026-05-12 11:04:59 +10:00
Rattus
8187cd783e Implement JIT pinned memory pressure
Replace the predictive pin pressure mechanism with JIT PIN memory
pressure.
2026-05-12 11:04:59 +10:00
Rattus
17955235b2 remove old pin path 2026-05-12 11:04:59 +10:00
Rattus
8070cb7780 Add stream host pin buffer for AIMDO casts
Introduce per-offload-stream HostBuffer reuse for pinned staging,
include it in cast buffer reset synchronization.

Defer actual casts that go via this pin path to a separate pass
such that the buffer can be allocated monolithically (to avoid
cudaHostRegister thrash).
2026-05-12 11:04:59 +10:00
Rattus
b66b642068 mm: use aimdo to do transfer from disk to pin
Aimdo implements a faster threaded loader.
2026-05-12 11:04:59 +10:00
Rattus
157965a1c9 pinned_memory: implement with aimdo growable buffer
Use a single growable buffer so we can do threaded pre-warming on
pinned memory.
2026-05-12 11:04:59 +10:00
Rattus
1fe3a13f84 model_management: disable non-dynamic smart memory
Disable smart memory outright for non dynamic models.

This is a minor step towards deprecation of --disable-dynamic-vram
and the legacy ModelPatcher.

This is needed for estimate-free model development, where new models
can opt-out of supplying a memory estimate and not have to worry
about hard VRAM allocations due to legacy non-dynamic model patchers

This is also a general stability increase for a lot of stray use cases
where estimates may still be off and going forward we are not going
to accurately maintain such estimates.
2026-05-12 11:04:59 +10:00
rattus
20e439419c
model_patcher: Fix safetensors saving of fp8 (#13835)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
This was missing proper weight scale casting in the saving path.
2026-05-11 12:48:10 -07:00
box4wangjing
f505cb4070
chore: remove extra word in comment (#13826) 2026-05-11 11:05:09 +08:00
Jukka Seppänen
3200f28e3a
Support Wan-Dancer (#13813)
* initial WanDancer support

* nodes_wandancer: Add list form of chunker.

Create an alternate list form of the node so the chunk gens can be
trivially looped by the comfy executor.

* Closer match to original soxr resampling

* Remove librosa node

* Cleanup

---------

Co-authored-by: Rattus <rattus128@gmail.com>
2026-05-09 14:02:56 -07:00
comfyanonymous
66669b2ded
I don't think there was any because nobody complained. (#13807)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-05-08 17:32:14 -07:00
Alexis Rolland
c5ecd231a2
fix: Fix bug when mask not on same device (CORE-181) (#13801)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-05-08 23:06:29 +08:00
Yousef R. Gamaleldin
d3c18c1636
Add support for BiRefNet background remove model (CORE-46) (#12747) 2026-05-08 17:59:24 +08:00
omahs
bac6fc35fb
Fix typos (#10986) 2026-05-08 17:14:45 +08:00
Talmaj
ef8f25601a
Add I2V for causal forcing model. (#13719) 2026-05-07 18:38:36 -07:00
Jukka Seppänen
8dc3f3f209
Improve SAM3 large input handling (#13767) 2026-05-07 17:18:28 -07:00
Jukka Seppänen
cd8c7a2306
Throttle dynamic VRAM prepare logging (#13704) 2026-05-07 10:41:13 +08:00
Talmaj
78b3096bf3
Void model - pass 1 & 2 (CORE-38) (#13403) 2026-05-05 19:59:04 -07:00
drozbay
e5369c0eec
feat: Context windows - add causal_window_fix to improve blending of context windows (CORE-100) (#13563)
* Context windows: add causal_window_fix toggle

* Fix slice_cond to correctly handle causal anchor index for temporal offsets
2026-05-05 16:40:53 -07:00
drozbay
1655f8089a
Add temporal_downscale_ratio to LatentFormat (#13702)
Co-authored-by: ozbayb <17261091+ozbayb@users.noreply.github.com>
Co-authored-by: Alexis Rolland <alexisrolland@hotmail.com>
Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-05-05 16:30:00 -07:00
Talmaj
fed8d5efa6
feat: Auto-regressive video generation (CORE-25) (#13082)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
2026-05-04 21:01:22 -07:00
Jedrzej Kosinski
e758594e3b
Add deploy environment header (Comfy-Env) to partner node API calls (#13425) 2026-05-04 20:17:56 -07:00
Jedrzej Kosinski
ae457da84b
feat: add generic --feature-flag CLI arg and --list-feature-flags registry (#13685) 2026-05-04 19:50:26 -07:00
rattus
1265955b34
ops: handle multi-compute of the same weight (#13705)
If the same weight is used multiple times within the same prefetch
window, it should only apply compute state mutations once. Mark the
weight as fully resident on the first pass accordingly.
2026-05-04 16:40:57 -07:00
rattus
1ac78180b3
make control-net load order deterministic (#13701)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
Make this deterministic so speeds dont change base of load order. Load
them in reverse order so whatever the caller lists first is the top
priority.
2026-05-04 12:58:06 -07:00
rattus
c47633f3be
prefetch: guard against no offload (#13703)
cast_ will return no stream if there is no work to do. guard against
this is the consume logic.
2026-05-04 12:56:05 -07:00
Silver
b138133ffa
Enable triton comfy kitchen via cli-arg (#12730) 2026-05-03 14:07:21 -04:00
Jukka Seppänen
be95871adc
feat: Gemma4 text generation support (CORE-30) (#13376)
* initial gemma4 support

* parity with reference implementation

outputs can 100% match transformers with same sdpa flags, checkpoint this and then optimize

* Cleanup, video fixes

* cleanup, enable fused rms norm by default

* update comment

* Cleanup

* Update sd.py

* Various fixes

* Add fp8 scaled embedding support

* small fixes

* Translate think tokens

* Fix image encoder attention mask type

So it works with basic attention

* Handle thinking tokens different only for Gemma4

* Code cleanup

* Update nodes_textgen.py

* Use embed scale class instead of buffer

Slight difference to HF, but technically more accurate and simpler code

* Default to fused rms_norm

* Update gemma4.py
2026-05-02 22:46:15 -04:00
rattus
783782d5d7
Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618)
* mm: Use Aimdo raw allocator for cast buffers

pytorch manages allocation of growing buffers on streams poorly. Pyt
has no windows support for the expandable segments allocator (which is
the right tool for this job), while also segmenting the memory by
stream such that it can be generally re-used. So kick the problem to
aimdo which can just grow a virtual region thats freed per stream.

* plan

* ops: move cpu handler up to the caller

* ops: split up prefetch from weight prep block prefetching API

Split up the casting and weight formating/lora stuff in prep for
arbitrary prefetch support.

* ops: implement block prefetching API

allow a model to construct a prefetch list and operate it for increased
async offload.

* ltxv2: Implement block prefetching

* Implement lora async offload

Implement async offload of loras.
2026-05-02 19:23:24 -04:00
Simon Lui
63103d519e
Remove IPEX and clean up checks and add missing synchronize during empty cache. (#13653) 2026-05-01 14:16:41 -07:00
Talmaj
cf9cbec596
Reformat models variable into multiline array CORE-59 (#13513)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
Co-authored-by: Talmaj Marinc <talmaj@comfy.org>
2026-05-01 17:20:11 +08:00
Rainer
e9c311b245
OneTainer ERNIE LoRA support (#13640) 2026-04-30 19:33:41 -04:00
blepping
a164c82913
Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00