Commit Graph

2391 Commits

Author SHA1 Message Date
comfyanonymous
e01e99d075
Support hunyuan image distilled model. (#9807) 2025-09-10 23:17:34 -04:00
patientx
666b2e05fa
Merge branch 'comfyanonymous:master' into master 2025-09-10 10:47:09 +03:00
comfyanonymous
543888d3d8
Fix lowvram issue with hunyuan image vae. (#9794) 2025-09-10 02:15:34 -04:00
comfyanonymous
85e34643f8
Support hunyuan image 2.1 regular model. (#9792) 2025-09-10 02:05:07 -04:00
comfyanonymous
5c33872e2f
Fix issue on old torch. (#9791) 2025-09-10 00:23:47 -04:00
comfyanonymous
b288fb0db8
Small refactor of some vae code. (#9787) 2025-09-09 18:09:56 -04:00
Rando717
4057f2984c
Update zluda.py (MEM_BUS_WIDTH#3)
Lower-cased the lookup inside MEM_BUS_WIDTH, just in case of inconsistent casing on Radeon Pro (PRO) GPU names.

Also fixed/lower-cased the "Triton device properties" lookup inside MEM_BUS_WIDTH.
2025-09-09 20:04:20 +02:00
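For readers unfamiliar with the MEM_BUS_WIDTH table these commits keep touching, here is a minimal sketch of the case-insensitive lookup, assuming a plain dict keyed by lower-cased device names and the 128-bit fallback mentioned further down this log; the entries and names are illustrative, not the actual zluda.py code:

```python
# Sketch only: table contents and function name are assumptions.
MEM_BUS_WIDTH = {
    "amd radeon rx 5700": 256,    # value mentioned in the MEM_BUS_WIDTH commit below
    "amd radeon pro w5700": 256,  # hypothetical Radeon Pro entry
}

def lookup_bus_width(device_name: str) -> int:
    # normalize casing so "Radeon PRO" vs "Radeon Pro" hit the same key
    return MEM_BUS_WIDTH.get(device_name.lower(), 128)  # 128 is the fallback

print(lookup_bus_width("AMD Radeon RX 5700"))  # -> 256
```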
Rando717
13ba6a8a8d
Update zluda.py (cleanup print Triton version)
Compacted; no exception handling, and it stays silent if there is no version string.
2025-09-09 19:30:54 +02:00
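A sketch of what a compacted, exception-free version print could look like under the stated behavior (silent when Triton or its version string is absent); the actual zluda.py code may differ:

```python
import importlib.util

# No try/except needed: probe for the module first, then read the version
# attribute defensively. Prints nothing if either is missing.
if importlib.util.find_spec("triton") is not None:
    import triton
    version = getattr(triton, "__version__", "")
    if version:
        print(f"Triton version: {version}")
```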
Rando717
ce8900fa25
Update zluda.py (gpu_name_to_gfx)
-changed the function into a list of rules

-attached the correct gfx codes to each GPU name

-addressed a potential incorrect designation for the RX 6000 S Series via sort priority
2025-09-09 18:51:41 +02:00
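A sketch of the rule-list approach this commit describes: ordered (substring, gfx) pairs checked top-down, so more specific names win and sort priority resolves overlaps. The entries below are illustrative assumptions, not the real table in zluda.py:

```python
# Ordered rules: first matching substring decides the gfx arch.
GFX_RULES = [
    ("rx 6800", "gfx1030"),  # RDNA2 Navi 21, checked before broader patterns
    ("rx 6700", "gfx1031"),  # RDNA2 Navi 22
    ("rx 5700", "gfx1010"),  # RDNA1 Navi 10
]

def gpu_name_to_gfx(name: str, default: str = "gfx1030") -> str:
    name = name.lower()
    for pattern, gfx in GFX_RULES:
        if pattern in name:
            return gfx
    return default  # fallback when no rule matches

print(gpu_name_to_gfx("AMD Radeon RX 5700"))  # -> gfx1010
```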
patientx
a531352603
Merge branch 'comfyanonymous:master' into master 2025-09-09 01:35:58 +03:00
comfyanonymous
103a12cb66
Support qwen inpaint controlnet. (#9772) 2025-09-08 17:30:26 -04:00
patientx
6f38e729cc
Merge branch 'comfyanonymous:master' into master 2025-09-08 22:15:28 +03:00
Rando717
e7d48450a3
Update zluda.py (removed previously added gfx90c)
The 'radeon graphics' check is not reliable enough,
considering 'radeon (tm) graphics' also exists on Vega.

Plus, gfx1036 Raphael (Ryzen 7000) is called 'radeon (tm) graphics', same with Granite Ridge (Ryzen 9000).
2025-09-08 21:10:20 +02:00
contentis
97652d26b8
Add explicit casting in apply_rope for Qwen VL (#9759) 2025-09-08 15:08:18 -04:00
Rando717
590f46ab41
Update zluda.py (typo) 2025-09-08 20:31:49 +02:00
Rando717
675d6d8f4c
Update zluda.py (gfx gpu names)
-expanded the GPU gfx name list
-added RDNA4, RDNA3.5, ...
-added missing Polaris cards to prevent the 'gfx1010' and 'gfx1030' fallback
-kept gfx designations mostly the same, based on the available custom libs for hip57/62

Might need some adjustments afterwards.
2025-09-08 17:55:29 +02:00
Rando717
ddb1e3da47
Update zluda.py (typo) 2025-09-08 17:22:41 +02:00
Rando717
a7336ad630
Update zluda.py (MEM_BUS_WIDTH#2)
Added Vega10/20 cards.
Can't test it; no clue whether it has a real effect or is just a placebo.
2025-09-08 17:19:03 +02:00
Rando717
40199a5244
Update zluda.py (print Triton version)
Added check for Triton version string, if it exists.
Could be useful info for troubleshooting reports.
2025-09-08 17:00:40 +02:00
patientx
b46622ffa5
Merge branch 'comfyanonymous:master' into master 2025-09-08 11:14:04 +03:00
comfyanonymous
fb763d4333
Fix amd_min_version crash when cpu device. (#9754) 2025-09-07 21:16:29 -04:00
patientx
9417753a6c
Merge branch 'comfyanonymous:master' into master 2025-09-07 13:16:57 +03:00
comfyanonymous
bcbd7884e3
Don't enable pytorch attention on AMD if triton isn't available. (#9747) 2025-09-07 00:29:38 -04:00
comfyanonymous
27a0fcccc3
Enable bf16 VAE on RDNA4. (#9746) 2025-09-06 23:25:22 -04:00
patientx
afbcd5d57e
Merge branch 'comfyanonymous:master' into master 2025-09-06 11:51:33 +03:00
comfyanonymous
ea6cdd2631
Print all fast options in --help (#9737) 2025-09-06 01:05:05 -04:00
patientx
3ca065a755
fix 2025-09-05 23:11:57 +03:00
patientx
0488fe3748
rmsnorm patch second try 2025-09-05 23:10:27 +03:00
patientx
8966009181
added rmsnorm patch for torch versions older than 2.4 2025-09-05 22:43:39 +03:00
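For context, a minimal RMSNorm fallback of the kind such a patch would need, since torch.nn.functional.rms_norm only landed in torch 2.4; this sketch is an assumption, not the repo's actual patch:

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # classic RMSNorm: scale by the reciprocal root-mean-square of the last dim
    x = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * weight
```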
patientx
f9d7fcb696
Merge branch 'comfyanonymous:master' into master 2025-09-05 22:09:30 +03:00
comfyanonymous
2ee7879a0b
Fix lowvram issues with hunyuan3d 2.1 (#9735) 2025-09-05 14:57:35 -04:00
patientx
c7c7269f48
Merge branch 'comfyanonymous:master' into master 2025-09-05 17:11:07 +03:00
comfyanonymous
c9ebe70072
Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
Yousef R. Gamaleldin
261421e218
Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
patientx
d79e93a0a9
Merge branch 'comfyanonymous:master' into master 2025-09-04 12:41:48 +03:00
comfyanonymous
72855db715
Fix potential rope issue. (#9710) 2025-09-03 22:20:13 -04:00
patientx
991209d11d
Merge branch 'comfyanonymous:master' into master 2025-09-03 00:05:33 +03:00
comfyanonymous
e3018c2a5a
uso -> uxo/uno as requested. (#9688) 2025-09-02 16:12:07 -04:00
patientx
b30a38dca0
Merge branch 'comfyanonymous:master' into master 2025-09-02 22:46:44 +03:00
comfyanonymous
3412d53b1d
USO style reference. (#9677)
Load the projector.safetensors file with the ModelPatchLoader node and use
the siglip_vision_patch14_384.safetensors "clip vision" model and the
USOStyleReferenceNode.
2025-09-02 15:36:22 -04:00
patientx
47c6fb34c9
Merge branch 'comfyanonymous:master' into master 2025-09-02 09:46:42 +03:00
contentis
e2d1e5dad9
Enable Convolution AutoTuning (#9301) 2025-09-01 20:33:50 -04:00
comfyanonymous
27e067ce50
Implement the USO subject identity lora. (#9674)
Use the lora with FluxContextMultiReferenceLatentMethod node set to "uso"
and a ReferenceLatent node with the reference image.
2025-09-01 18:54:02 -04:00
patientx
9cb469282e
Merge branch 'comfyanonymous:master' into master 2025-08-31 11:24:57 +03:00
chaObserv
32a627bf1f
SEEDS: update noise decomposition and refactor (#9633)
- Update the decomposition to reflect interval dependency
- Extract phi computations into functions
- Use torch.lerp for interpolation
2025-08-31 00:01:45 -04:00
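For reference, torch.lerp(a, b, w) computes a + w * (b - a); a tiny illustration of the interpolation call the refactor switches to:

```python
import torch

a = torch.zeros(3)
b = torch.ones(3)
# linear interpolation a quarter of the way from a to b
print(torch.lerp(a, b, 0.25))  # tensor([0.2500, 0.2500, 0.2500])
```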
patientx
c6b0bf480f
Merge branch 'comfyanonymous:master' into master 2025-08-29 09:31:05 +03:00
comfyanonymous
e80a14ad50
Support wan2.2 5B fun control model. (#9611)
Use the Wan22FunControlToVideo node.
2025-08-28 22:13:07 -04:00
patientx
c8af694267
Merge pull request #279 from sfinktah/sfink-cudnn-benchmark
Added env_var for cudnn.benchmark
2025-08-28 23:17:05 +03:00
patientx
1db0a73a2a
Merge branch 'comfyanonymous:master' into master 2025-08-28 09:06:22 +03:00
comfyanonymous
4aa79dbf2c
Adjust flux mem usage factor a bit. (#9588) 2025-08-27 23:08:17 -04:00
patientx
fc93a6f534
Merge branch 'comfyanonymous:master' into master 2025-08-28 02:22:15 +03:00
Gangin Park
3aad339b63
Add DPM++ 2M SDE Heun (RES) sampler (#9542) 2025-08-27 19:07:31 -04:00
comfyanonymous
491755325c
Better s2v memory estimation. (#9584) 2025-08-27 19:02:42 -04:00
Christopher Anderson
cf22cbd8d5 Added env_var for cudnn.benchmark 2025-08-28 09:00:08 +10:00
comfyanonymous
496888fd68
Improve s2v performance when generating videos longer than 120 frames. (#9582) 2025-08-27 16:06:40 -04:00
comfyanonymous
b5ac6ed7ce
Fixes to make controlnet type models work on qwen edit and kontext. (#9581) 2025-08-27 15:26:28 -04:00
Kohaku-Blueleaf
b20ba1f27c
Fix #9537 (#9576) 2025-08-27 12:45:02 -04:00
patientx
eeab23fc0b
Merge branch 'comfyanonymous:master' into master 2025-08-27 10:07:57 +03:00
comfyanonymous
88aee596a3
WIP Wan 2.2 S2V model. (#9568) 2025-08-27 01:10:34 -04:00
patientx
c1aef0126d
Merge pull request #276 from sfinktah/sfink-cudnn-benchmark-env
Deleted torch.backends.cudnn.benchmark line, defaults are fine
2025-08-26 19:34:35 +03:00
patientx
1efeba7066
Merge branch 'comfyanonymous:master' into master 2025-08-26 10:41:38 +03:00
comfyanonymous
914c2a2973
Implement wav2vec2 as an audio encoder model. (#9549)
This is useless on its own but there are multiple models that use it.
2025-08-25 23:26:47 -04:00
Christopher Anderson
110cb0a9d9 Deleted torch.backends.cudnn.benchmark line, defaults are fine 2025-08-26 08:43:31 +10:00
Christopher Anderson
1b9a3b12c2 had to move cudnn disablement up much higher 2025-08-25 14:11:54 +10:00
Christopher Anderson
cd3d60254b argggh, white space hell 2025-08-25 09:52:58 +10:00
Christopher Anderson
184fa5921f worst PR ever, really. 2025-08-25 09:42:27 +10:00
Christopher Anderson
33c43b68c3 worst PR ever 2025-08-25 09:38:22 +10:00
Christopher Anderson
2a06dc8e87 Merge remote-tracking branch 'origin/sfink-cudnn-env' into sfink-cudnn-env
# Conflicts:
#	comfy/customzluda/zluda.py
2025-08-25 09:34:32 +10:00
Christopher Anderson
3504eeeb4a rebased onto upstream master (woops) 2025-08-25 09:32:34 +10:00
Christopher Anderson
7eda4587be Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 09:31:12 +10:00
Christopher Anderson
954644ef83 Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 08:56:48 +10:00
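A sketch of how such an env-var gate could be read, assuming simple string parsing; only the variable name and the default of 1 come from the commit:

```python
import os
import torch

# "1" (on) unless the user explicitly sets the variable to "0".
torch.backends.cudnn.enabled = os.environ.get("TORCH_BACKENDS_CUDNN_ENABLED", "1") != "0"
```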
Rando717
053a6b95e5
Update zluda.py (MEM_BUS_WIDTH)
Added more cards, mostly RDNA(1) and Radeon Pro.

Reasoning: every time zluda.py gets updated I have to manually add 256 for my RX 5700, otherwise it defaults to 128. Also, manual local edits fail on git pull.
2025-08-24 18:39:40 +02:00
patientx
c92a07594b
Update zluda.py 2025-08-24 12:01:20 +03:00
patientx
dba9d20791
Update zluda.py 2025-08-24 10:23:30 +03:00
patientx
cdc04b5a8a
Merge branch 'comfyanonymous:master' into master 2025-08-23 07:47:07 +03:00
comfyanonymous
41048c69b4
Fix Conditioning masks on 3d latents. (#9506) 2025-08-22 23:15:44 -04:00
Jedrzej Kosinski
fc247150fe
Implement EasyCache and Invent LazyCache (#9496)
* Attempting a universal implementation of EasyCache, starting with flux as a test; I got the math a bit wrong at first, but it works when set just right.

* Fixed math to make threshold work as expected, refactored code to use EasyCacheHolder instead of a dict wrapped by object

* Use sigmas from transformer_options instead of timesteps to be compatible with a greater amount of models, make end_percent work

* Make log statement when not skipping useful, preparing for per-cond caching

* Added DIFFUSION_MODEL wrapper around forward function for wan model

* Add subsampling for heuristic inputs

* Add subsampling to output_prev (output_prev_subsampled now)

* Properly consider conds in EasyCache logic

* Created SuperEasyCache to test what happens if caching and reuse is moved outside the scope of conds, added PREDICT_NOISE wrapper to facilitate this test

* Change max reuse_threshold to 3.0

* Mark EasyCache/SuperEasyCache as experimental (beta)

* Make Lumina2 compatible with EasyCache

* Add EasyCache support for Qwen Image

* Fix missing comma, curse you Cursor

* Add EasyCache support to AceStep

* Add EasyCache support to Chroma

* Added EasyCache support to Cosmos Predict t2i

* Make EasyCache not crash with Cosmos Predict ImagToVideo latents, though it does not work well at all

* Add EasyCache support to hidream

* Added EasyCache support to hunyuan video

* Added EasyCache support to hunyuan3d

* Added EasyCache support to LTXV (not very good, but does not crash)

* Implemented EasyCache for aura_flow

* Renamed SuperEasyCache to LazyCache, hardcoded subsample_factor to 8 on nodes

* Extra logging when verbose is true for EasyCache
2025-08-22 22:41:08 -04:00
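The caching idea in the PR above can be sketched roughly as follows: subsample the model input, measure its relative change against the previous step, and reuse the cached output while the change stays under the reuse threshold. The names, shapes, and exact metric below are assumptions for illustration, not the PR's code:

```python
import torch

class EasyCacheSketch:
    """Illustrative skip heuristic; not the PR's actual implementation."""

    def __init__(self, reuse_threshold: float = 0.2, subsample_factor: int = 8):
        self.reuse_threshold = reuse_threshold    # the PR caps the node input at 3.0
        self.subsample_factor = subsample_factor  # hardcoded to 8 on the nodes
        self.prev_input = None
        self.cached_output = None

    def __call__(self, x: torch.Tensor, forward):
        # subsample the trailing spatial dims so the heuristic stays cheap
        xs = x[..., ::self.subsample_factor, ::self.subsample_factor]
        if self.prev_input is not None:
            rel_change = (xs - self.prev_input).abs().mean() / (self.prev_input.abs().mean() + 1e-8)
            if rel_change < self.reuse_threshold:
                return self.cached_output  # small change: reuse, skip the model
        self.prev_input = xs
        self.cached_output = forward(x)    # large change: run the model for real
        return self.cached_output
```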
contentis
fe31ad0276
Add elementwise fusions (#9495)
* Add elementwise fusions

* Add addcmul pattern to Qwen
2025-08-22 19:39:15 -04:00
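The addcmul pattern mentioned above fuses x + a * b into a single elementwise kernel; a quick equivalence check:

```python
import torch

x, a, b = torch.randn(4), torch.randn(4), torch.randn(4)
fused = torch.addcmul(x, a, b)  # one fused elementwise op
unfused = x + a * b             # two separate elementwise ops
assert torch.allclose(fused, unfused)
```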
patientx
7bc46389fa
Merge branch 'comfyanonymous:master' into master 2025-08-22 10:52:52 +03:00
comfyanonymous
ff57793659
Support InstantX Qwen controlnet. (#9488) 2025-08-22 00:53:11 -04:00
comfyanonymous
f7bd5e58dd
Make it easier to implement future qwen controlnets. (#9485) 2025-08-21 23:18:04 -04:00
patientx
7ff01ded58
Merge branch 'comfyanonymous:master' into master 2025-08-21 09:24:26 +03:00
comfyanonymous
0963493a9c
Support for Qwen Diffsynth Controlnets canny and depth. (#9465)
These are not real controlnets but actually a patch on the model so they
will be treated as such.

Put them in the models/model_patches/ folder.

Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.
2025-08-20 22:26:37 -04:00
patientx
6dca25e2a8
Merge branch 'comfyanonymous:master' into master 2025-08-20 10:14:34 +03:00
comfyanonymous
8d38ea3bbf
Fix bf16 precision issue with qwen image embeddings. (#9441) 2025-08-20 02:58:54 -04:00
comfyanonymous
5a8f502db5
Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
comfyanonymous
7cd2c4bd6a
Qwen rotary embeddings should now match reference code. (#9437) 2025-08-20 00:45:27 -04:00
comfyanonymous
dfa791eb4b
Rope fix for qwen vl. (#9435) 2025-08-19 20:47:42 -04:00
patientx
1cbb5fdc14
Merge branch 'comfyanonymous:master' into master 2025-08-19 10:21:12 +03:00
comfyanonymous
4977f203fa
P2 of qwen edit model. (#9412)
* P2 of qwen edit model.

* Typo.

* Fix normal qwen.

* Fix.

* Make the TextEncodeQwenImageEdit also set the ref latent.

If you don't want it to set the ref latent and want to use the
ReferenceLatent node with your custom latent instead just disconnect the
VAE.
2025-08-18 22:38:34 -04:00
patientx
3f09b4dba5
Merge branch 'comfyanonymous:master' into master 2025-08-18 15:14:34 +03:00
Jedrzej Kosinski
7f3b9b16c6
Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
comfyanonymous
ed43784b0d
WIP Qwen edit model: The diffusion model part. (#9383) 2025-08-17 16:45:39 -04:00
patientx
64d6cf045e
Merge branch 'comfyanonymous:master' into master 2025-08-17 11:29:13 +03:00
comfyanonymous
0f2b8525bc
Qwen image model refactor. (#9375) 2025-08-16 17:51:28 -04:00
patientx
5a21015adb
Merge branch 'comfyanonymous:master' into master 2025-08-16 09:54:01 +03:00
comfyanonymous
1702e6df16
Implement wan2.2 camera model. (#9357)
Use the old WanCameraImageToVideo node.
2025-08-15 17:29:58 -04:00
patientx
eb283b5fd7
Merge branch 'comfyanonymous:master' into master 2025-08-16 00:26:31 +03:00
comfyanonymous
c308a8840a
Add FluxKontextMultiReferenceLatentMethod node. (#9356)
This node is only useful if someone trains the kontext model to properly
use multiple reference images via the index method.

The default is the offset method, which feeds the multiple images as if
they were stitched together into one. This method works with the current
flux kontext model.
2025-08-15 15:50:39 -04:00
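A conceptual sketch of the offset method described above, assuming reference latents shaped [B, C, H, W]; the real model feeds references through its own conditioning path, so this only illustrates the "stitched together" idea:

```python
import torch

refs = [torch.randn(1, 16, 64, 64) for _ in range(2)]  # two reference latents
offset_input = torch.cat(refs, dim=-1)                 # stitched side by side
print(offset_input.shape)                              # torch.Size([1, 16, 64, 128])
```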
patientx
13f5f9d78f
Merge branch 'comfyanonymous:master' into master 2025-08-15 10:54:10 +03:00
comfyanonymous
e08ecfbd8a
Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
comfyanonymous
4e5c230f6a
Fix last commit not working on older pytorch. (#9346) 2025-08-14 23:44:02 -04:00
Xiangxi Guo (Ryan)
f0d5d0111f
Avoid torch compile graphbreak for older pytorch versions (#9344)
Turns out torch.compile has some gaps in context manager decorator
syntax support. I've sent patches to fix that in PyTorch, but it won't
be available for all the folks running older versions of PyTorch, hence
this trivial patch.
2025-08-14 23:41:37 -04:00
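The gap in question is the context-manager-as-decorator form; a sketch of the kind of rewrite that avoids the graph break, with a hypothetical context manager standing in for the real one:

```python
import contextlib
import torch

@contextlib.contextmanager
def example_ctx():  # hypothetical stand-in context manager
    yield

# Older torch.compile can graph-break on the decorator form:
#
#   @example_ctx()
#   def f(x): ...
#
# The workaround is the explicit with-block form, which traces cleanly:
def f(x):
    with example_ctx():
        return x * 2

f_compiled = torch.compile(f)
```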
comfyanonymous
ad19a069f6
Make SLG nodes work on Qwen Image model. (#9345) 2025-08-14 23:16:01 -04:00
patientx
a927fbd99b
Merge branch 'comfyanonymous:master' into master 2025-08-14 12:16:50 +03:00
Jedrzej Kosinski
e4f7ea105f
Added context window support to core sampling code (#9238)
* Added initial support for basic context windows - in progress

* Add prepare_sampling wrapper for context window to more accurately estimate latent memory requirements, fixed merging wrappers/callbacks dicts in prepare_model_patcher

* Made context windows compatible with different dimensions; works for WAN, but results are bad

* Fix comfy.patcher_extension.merge_nested_dicts calls in prepare_model_patcher in sampler_helpers.py

* Considering adding some callbacks to context window code to allow extensions of behavior without the need to rewrite code

* Made dim slicing cleaner

* Add Wan Context Windows node for testing

* Made context schedule and fuse method functions be stored on the handler instead of needing to be registered in core code to be found

* Moved some code around between node_context_windows.py and context_windows.py

* Change manual context window nodes names/ids

* Added callbacks to IndexListContexHandler

* Adjusted default values for context_length and context_overlap, made schema.inputs definition for WAN Context Windows less annoying

* Make get_resized_cond more robust for various dim sizes

* Fix typo

* Another small fix
2025-08-13 21:33:05 -04:00
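A sketch of the overlapping index windows such a context handler could generate; the names and the clamping of the final window are assumptions, not the PR's code:

```python
def index_windows(length: int, context_length: int, overlap: int):
    # slide a fixed-size window with the given overlap, clamping the last one
    step = max(context_length - overlap, 1)
    windows, start = [], 0
    while start + context_length < length:
        windows.append(list(range(start, start + context_length)))
        start += step
    windows.append(list(range(max(length - context_length, 0), length)))
    return windows

print(index_windows(16, 8, 4))
# [[0..7], [4..11], [8..15]] -- overlapping frames get fused downstream
```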
Simon Lui
c991a5da65
Fix XPU iGPU regressions (#9322)
* Change bf16 check and switch non-blocking to off default with option to force to regain speed on certain classes of iGPUs and refactor xpu check.

* Turn non_blocking off by default for xpu.

* Update README.md for Intel GPUs.
2025-08-13 19:13:35 -04:00
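A sketch of the non-blocking default change described above, under the assumption that it reduces to gating the non_blocking flag per device type with a force override:

```python
import torch

def to_device(t: torch.Tensor, device: torch.device, force_non_blocking: bool = False):
    # non-blocking copies stay on where they are safe; off by default on xpu
    non_blocking = force_non_blocking or device.type not in ("xpu", "cpu")
    return t.to(device, non_blocking=non_blocking)
```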
patientx
804c7097fa
Merge branch 'comfyanonymous:master' into master 2025-08-13 23:56:43 +03:00
comfyanonymous
9df8792d4b
Make last PR not crash comfy on old pytorch. (#9324) 2025-08-13 15:12:41 -04:00
contentis
3da5a07510
SDPA backend priority (#9299) 2025-08-13 14:53:27 -04:00
patientx
bcafc3f7a3
Merge branch 'comfyanonymous:master' into master 2025-08-13 10:36:20 +03:00
comfyanonymous
560d38f34c
Wan2.2 fun control support. (#9292) 2025-08-12 23:26:33 -04:00
patientx
f80a9bb674
Merge branch 'comfyanonymous:master' into master 2025-08-12 00:33:53 +03:00
PsychoLogicAu
2208aa616d
Support SimpleTuner lycoris lora for Qwen-Image (#9280) 2025-08-11 16:56:16 -04:00
patientx
c2686a3968
Merge branch 'comfyanonymous:master' into master 2025-08-10 12:09:19 +03:00
comfyanonymous
5828607ccf
Not sure if AMD actually support fp16 acc but it doesn't crash. (#9258) 2025-08-09 12:49:25 -04:00
patientx
89499c6fae
Merge branch 'comfyanonymous:master' into master 2025-08-08 11:40:07 +03:00
comfyanonymous
735bb4bdb1
Users report gfx1201 is buggy on flux with pytorch attention. (#9244) 2025-08-08 04:21:00 -04:00
patientx
8795ae98aa
Merge branch 'comfyanonymous:master' into master 2025-08-06 20:24:47 +03:00
flybirdxx
4c3e57b0ae
Fixed an issue where qwenLora could not be loaded properly. (#9208) 2025-08-06 13:23:11 -04:00
patientx
2e39e0999f
Update zluda.py 2025-08-05 19:21:20 +03:00
patientx
28957a7bd6
Merge branch 'comfyanonymous:master' into master 2025-08-05 13:37:09 +03:00
comfyanonymous
d044a24398
Fix default shift and any latent size for qwen image model. (#9186) 2025-08-05 06:12:27 -04:00
patientx
e419bade03
Merge pull request #244 from sfinktah/sfink-zluda-is-nasty
Bad ideas from zluda update.
2025-08-05 09:48:53 +03:00
patientx
ea8122f065
Merge branch 'comfyanonymous:master' into master 2025-08-05 09:47:31 +03:00
comfyanonymous
c012400240
Initial support for qwen image model. (#9179) 2025-08-04 22:53:25 -04:00
Christopher Anderson
4f853403fe Bad ideas from zluda update. 2025-08-05 06:00:55 +10:00
patientx
88b7fe87ff
Merge branch 'comfyanonymous:master' into master 2025-08-04 12:38:56 +03:00
comfyanonymous
03895dea7c
Fix another issue with the PR. (#9170) 2025-08-04 04:33:04 -04:00
comfyanonymous
84f9759424
Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
comfyanonymous
7991341e89
Various fixes for broken things from earlier PR. (#9168) 2025-08-04 04:02:40 -04:00
patientx
37415c40c1
device identification and setting triton arch override 2025-08-04 10:44:18 +03:00
patientx
d823c0c615
Merge branch 'comfyanonymous:master' into master 2025-08-04 10:42:15 +03:00
comfyanonymous
140ffc7fdc
Fix broken controlnet from last PR. (#9167) 2025-08-04 03:28:12 -04:00
comfyanonymous
182f90b5ec
Lower cond vram use by casting at the same time as device transfer. (#9159) 2025-08-04 03:11:53 -04:00
patientx
7258461c23
Merge branch 'comfyanonymous:master' into master 2025-08-03 16:33:54 +03:00
comfyanonymous
aebac22193
Cleanup. (#9160) 2025-08-03 07:08:11 -04:00
patientx
da4fc8189a
Merge branch 'comfyanonymous:master' into master 2025-08-03 00:17:56 +03:00
comfyanonymous
13aaa66ec2
Make sure context is on the right device. (#9154) 2025-08-02 15:09:23 -04:00
comfyanonymous
5f582a9757
Make sure all the conds are on the right device. (#9151) 2025-08-02 15:00:13 -04:00
patientx
83dbd68651
Merge branch 'comfyanonymous:master' into master 2025-08-01 14:42:25 +03:00
comfyanonymous
1e638a140b
Tiny wan vae optimizations. (#9136) 2025-08-01 05:25:38 -04:00
patientx
321d683af0
Merge branch 'comfyanonymous:master' into master 2025-07-31 14:49:33 +03:00
chaObserv
61b08d4ba6
Replace manual x * sigmoid(x) with torch silu in VAE nonlinearity (#9057) 2025-07-30 19:25:56 -04:00
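SiLU is exactly x * sigmoid(x), so the replacement is a drop-in; a quick check:

```python
import torch
import torch.nn.functional as F

x = torch.randn(8)
assert torch.allclose(x * torch.sigmoid(x), F.silu(x))  # same nonlinearity, fused kernel
```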
comfyanonymous
da9dab7edd
Small wan camera memory optimization. (#9111) 2025-07-30 05:55:26 -04:00
patientx
1bd4b6489e
Merge branch 'comfyanonymous:master' into master 2025-07-30 11:11:46 +03:00
comfyanonymous
dca6bdd4fa
Make wan2.2 5B i2v take a lot less memory. (#9102) 2025-07-29 19:44:18 -04:00
patientx
d8ca8134c3
Merge branch 'comfyanonymous:master' into master 2025-07-29 11:56:59 +03:00
comfyanonymous
7d593baf91
Extra reserved vram on large cards on windows. (#9093) 2025-07-29 04:07:45 -04:00
patientx
fc4e82537c
Merge pull request #233 from sfinktah/sfink-flash-attn-gfx-startswith
This will allow much better support for gfx1032 and other things not specifically named
2025-07-28 23:12:38 +03:00
patientx
7ba2a8d3b0
Merge branch 'comfyanonymous:master' into master 2025-07-28 22:15:10 +03:00
comfyanonymous
c60dc4177c
Remove unnecessary clones in the wan2.2 VAE. (#9083) 2025-07-28 14:48:19 -04:00
Christopher Anderson
b5ede18481 This will allow much better support for gfx1032 and other things not specifically named 2025-07-29 04:21:45 +10:00
patientx
769ab3bd25
Merge branch 'comfyanonymous:master' into master 2025-07-28 15:21:30 +03:00
comfyanonymous
a88788dce6
Wan 2.2 support. (#9080) 2025-07-28 08:00:23 -04:00
patientx
5a45e12b61
Merge branch 'comfyanonymous:master' into master 2025-07-26 14:09:19 +03:00
comfyanonymous
0621d73a9c
Remove useless code. (#9059) 2025-07-26 04:44:19 -04:00
comfyanonymous
e6e5d33b35
Remove useless code. (#9041)
This is only needed on old pytorch 2.0 and older.
2025-07-25 04:58:28 -04:00
patientx
c3bf1d95e2
Merge branch 'comfyanonymous:master' into master 2025-07-25 10:20:29 +03:00
Eugene Fairley
4293e4da21
Add WAN ATI support (#8874)
* Add WAN ATI support

* Fixes

* Fix length

* Remove extra functions

* Fix

* Fix

* Ruff fix

* Remove torch.no_grad

* Add batch trajectory logic

* Scale inputs before and after motion patch

* Batch image/trajectory

* Ruff fix

* Clean up
2025-07-24 20:59:19 -04:00
patientx
970b7fb84f
Merge branch 'comfyanonymous:master' into master 2025-07-24 22:30:55 +03:00
comfyanonymous
69cb57b342
Print xpu device name. (#9035) 2025-07-24 15:06:25 -04:00
honglyua
0ccc88b03f
Support Iluvatar CoreX (#8585)
* Support Iluvatar CoreX
Co-authored-by: mingjiang.li <mingjiang.li@iluvatar.com>
2025-07-24 13:57:36 -04:00
patientx
30539d0d13
Merge branch 'comfyanonymous:master' into master 2025-07-24 13:59:09 +03:00
Kohaku-Blueleaf
eb2f78b4e0
[Training Node] algo support, grad acc, optional grad ckpt (#9015)
* Add factorization utils for lokr

* Add lokr train impl

* Add loha train impl

* Add adapter map for algo selection

* Add optional grad ckpt and algo selection

* Update __init__.py

* correct key name for loha

* Use custom fwd/bwd func and better init for loha

* Support gradient accumulation

* Fix bugs of loha

* use more stable init

* Add OFT training

* linting
2025-07-23 20:57:27 -04:00
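A generic sketch of the gradient-accumulation pattern this PR adds, with toy stand-ins for the node's actual model and data:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
accum_steps = 4

optimizer.zero_grad()
for i in range(8):
    x, y = torch.randn(2, 4), torch.randn(2, 1)
    loss = torch.nn.functional.mse_loss(model(x), y) / accum_steps  # average over micro-batches
    loss.backward()                                   # gradients accumulate in .grad
    if (i + 1) % accum_steps == 0:
        optimizer.step()                              # apply once per accumulation window
        optimizer.zero_grad()
```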
chaObserv
e729a5cc11
Separate denoised and noise estimation in Euler CFG++ (#9008)
This will change their behavior with the sampling CONST type.
It also combines euler_cfg_pp and euler_ancestral_cfg_pp into one main function.
2025-07-23 19:47:05 -04:00
comfyanonymous
d3504e1778
Enable pytorch attention by default for gfx1201 on torch 2.8 (#9029) 2025-07-23 19:21:29 -04:00
comfyanonymous
a86a58c308
Fix xpu function not implemented p2. (#9027) 2025-07-23 18:18:20 -04:00
comfyanonymous
39dda1d40d
Fix xpu function not implemented. (#9026) 2025-07-23 18:10:59 -04:00
patientx
58f3250106
Merge branch 'comfyanonymous:master' into master 2025-07-23 23:49:46 +03:00
comfyanonymous
5ad33787de
Add default device argument. (#9023) 2025-07-23 14:20:49 -04:00
patientx
bd33a5d382
Merge branch 'comfyanonymous:master' into master 2025-07-23 03:27:52 +03:00
Simon Lui
255f139863
Add xpu version for async offload and some other things. (#9004) 2025-07-22 15:20:09 -04:00
patientx
b049c1df82
Merge branch 'comfyanonymous:master' into master 2025-07-17 00:49:42 +03:00
comfyanonymous
491fafbd64
Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
Harel Cain
9bc2798f72
LTXV VAE decoder: switch default padding mode (#8930) 2025-07-16 13:54:38 -04:00
patientx
d22d65cc68
Merge branch 'comfyanonymous:master' into master 2025-07-16 13:58:56 +03:00
comfyanonymous
50afba747c
Add attempt to work around the safetensors mmap issue. (#8928) 2025-07-16 03:42:17 -04:00
patientx
79e3b67425
Merge branch 'comfyanonymous:master' into master 2025-07-15 12:24:08 +03:00
Yoland Yan
543c24108c
Fix wrong reference bug (#8910) 2025-07-14 20:45:55 -04:00
patientx
3845c2ff7a
Merge branch 'comfyanonymous:master' into master 2025-07-12 14:59:05 +03:00
comfyanonymous
b40143984c
Add model detection error hint for lora. (#8880) 2025-07-12 03:49:26 -04:00
patientx
5ede75293f
Merge branch 'comfyanonymous:master' into master 2025-07-11 17:30:21 +03:00
comfyanonymous
938d3e8216
Remove windows line endings. (#8866) 2025-07-11 02:37:51 -04:00
patientx
43514805ed
Merge branch 'comfyanonymous:master' into master 2025-07-10 21:46:22 +03:00
guill
2b653e8c18
Support for async node functions (#8830)
* Support for async execution functions

This commit adds support for node execution functions defined as async. When
a node's execution function is defined as async, we can continue
executing other nodes while it is processing.

Standard uses of `await` should "just work", but people will still have
to be careful if they spawn actual threads. Because torch doesn't really
have async/await versions of functions, this won't particularly help
with most locally-executing nodes, but it does work for e.g. web
requests to other machines.

In addition to the execute function, the `VALIDATE_INPUTS` and
`check_lazy_status` functions can also be defined as async, though we'll
only resolve one node at a time right now for those.

* Add the execution model tests to CI

* Add a missing file

It looks like this got caught by .gitignore? There's probably a better
place to put it, but I'm not sure what that is.

* Add the websocket library for automated tests

* Add additional tests for async error cases

Also fixes one bug that was found when an async function throws an error
after being scheduled on a task.

* Add a feature flags message to reduce bandwidth

We now only send 1 preview message of the latest type the client can
support.

We'll add a console warning when the client fails to send a feature
flags message at some point in the future.

* Add async tests to CI

* Don't actually add new tests in this PR

Will do it in a separate PR

* Resolve unit test in GPU-less runner

* Just remove the tests that GHA can't handle

* Change line endings to UNIX-style

* Avoid loading model_management.py so early

Because model_management.py has a top-level `logging.info`, we have to
be careful not to import that file before we call `setup_logging`. If we
do, we end up having the default logging handler registered in addition
to our custom one.
2025-07-10 14:46:19 -04:00
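A minimal sketch of an async execution function in the shape this PR describes; the node class and its inputs are hypothetical, only the "define execute as async" idea comes from the PR:

```python
import asyncio

class FetchTextNode:  # hypothetical node
    FUNCTION = "execute"

    async def execute(self, url: str):
        await asyncio.sleep(0.1)  # stand-in for an awaited web request
        return ("fetched: " + url,)

# Other nodes keep executing while this one awaits.
print(asyncio.run(FetchTextNode().execute("https://example.com")))
```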
patientx
b621197729
Merge branch 'comfyanonymous:master' into master 2025-07-09 00:16:45 +03:00
chaObserv
aac10ad23a
Add SA-Solver sampler (#8834) 2025-07-08 16:17:06 -04:00
josephrocca
974254218a
Un-hardcode chroma patch_size (#8840) 2025-07-08 15:56:59 -04:00
patientx
ab04a1c165
Merge branch 'comfyanonymous:master' into master 2025-07-06 14:41:03 +03:00
comfyanonymous
75d327abd5
Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
patientx
65db0a046f
Merge branch 'comfyanonymous:master' into master 2025-07-06 02:44:08 +03:00
comfyanonymous
ee615ac269
Add warning when loading file unsafely. (#8800) 2025-07-05 14:34:57 -04:00
patientx
94464d7867
Merge branch 'comfyanonymous:master' into master 2025-07-04 11:03:58 +03:00
chaObserv
f41f323c52
Add the denoising step to several samplers (#8780) 2025-07-03 19:20:53 -04:00
patientx
455fc30fd8
Merge branch 'comfyanonymous:master' into master 2025-07-03 08:08:09 +03:00
City
d9277301d2
Initial code for new SLG node (#8759) 2025-07-02 20:13:43 -04:00
patientx
ac99b100ef
Merge branch 'comfyanonymous:master' into master 2025-07-02 12:50:51 +03:00
comfyanonymous
111f583e00
Fallback to regular op when fp8 op throws exception. (#8761) 2025-07-02 00:57:13 -04:00
patientx
fa03718ba9
Merge branch 'comfyanonymous:master' into master 2025-07-01 14:19:30 +03:00