EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-02-04 10:40:30 +08:00

Author	SHA1	Message	Date
comfyanonymous	17027f2a6a	Add a way to disable the final norm in the llama based TE models. (#10794 )	2025-11-18 22:36:03 -05:00
comfyanonymous	d526974576	Fix hunyuan 3d 2.0 (#10792 )	2025-11-18 16:46:19 -05:00
comfyanonymous	bd01d9f7fd	Add left padding support to tokenizers. (#10753 )	2025-11-15 06:54:40 -05:00
comfyanonymous	443056c401	Fix custom nodes import error. (#10747 ) This should fix the import errors but will break if the custom nodes actually try to use the class.	2025-11-14 03:26:05 -05:00
comfyanonymous	f60923590c	Use same code for chroma and flux blocks so that optimizations are shared. (#10746 )	2025-11-14 01:28:05 -05:00
rattus	94c298f962	flux: reduce VRAM usage (#10737 ) Clean up a bunch of stack tensors on Flux. This takes me from B=19 to B=22 for 1600x1600 on RTX5090.	2025-11-13 16:02:03 -08:00
contentis	3b3ef9a77a	Quantized Ops fixes (#10715 ) * offload support, bug fixes, remove mixins * add readme	2025-11-12 18:26:52 -05:00
rattus	1c7eaeca10	qwen: reduce VRAM usage (#10725 ) Clean up a bunch of stacked and no-longer-needed tensors on the QWEN VRAM peak (currently FFN). With this I go from OOMing at B=37x1328x1328 to being able to successfully run B=47 (RTX5090).	2025-11-12 16:20:53 -05:00
rattus	18e7d6dba5	mm/mp: always unload re-used but modified models (#10724 ) The partial unloader path in model re-use flow skips straight to the actual unload without any check of the patching UUID. This means that if you do an upscale flow with a model patch on an existing model, it will not apply your patchings. Fix by delaying the partial_unload until after the uuid checks. This is done by making partial_unload a model of partial_load where extra_mem is -ve.	2025-11-12 16:19:53 -05:00
comfyanonymous	1199411747	Don't pin tensor if not a torch.nn.parameter.Parameter (#10718 )	2025-11-11 19:33:30 -05:00
rattus	c350009236	ops: Put weight cast on the offload stream (#10697 ) This needs to be on the offload stream. This reproduced a black screen with low resolution images on a slow bus when using FP8.	2025-11-09 22:52:11 -05:00
comfyanonymous	dea899f221	Unload weights if vram usage goes up between runs. (#10690 )	2025-11-09 18:51:33 -05:00
comfyanonymous	e632e5de28	Add logging for model unloading. (#10692 )	2025-11-09 18:06:39 -05:00
comfyanonymous	2abd2b5c20	Make ScaleROPE node work on Flux. (#10686 )	2025-11-08 15:52:02 -05:00
comfyanonymous	a1a70362ca	Only unpin tensor if it was pinned by ComfyUI (#10677 )	2025-11-07 11:15:05 -05:00
rattus	cf97b033ee	mm: guard against double pin and unpin explicitly (#10672 ) As commented, if you let cuda be the one to detect double pin/unpinning it actually creates an async GPU error.	2025-11-06 21:20:48 -05:00
comfyanonymous	09dc24c8a9	Pinned mem also seems to work on AMD. (#10658 )	2025-11-05 19:11:15 -05:00
comfyanonymous	1d69245981	Enable pinned memory by default on Nvidia. (#10656 ) Removed the --fast pinned_memory flag. You can use --disable-pinned-memory to disable it. Please report if it causes any issues.	2025-11-05 18:08:13 -05:00
comfyanonymous	97f198e421	Fix qwen controlnet regression. (#10657 )	2025-11-05 18:07:35 -05:00
comfyanonymous	c4a6b389de	Lower ltxv mem usage to what it was before previous pr. (#10643 ) Bring back qwen behavior to what it was before previous pr.	2025-11-04 22:47:35 -05:00
contentis	4cd881866b	Use single apply_rope function across models (#10547 )	2025-11-04 20:10:11 -05:00
comfyanonymous	7f3e4d486c	Limit amount of pinned memory on windows to prevent issues. (#10638 )	2025-11-04 17:37:50 -05:00
comfyanonymous	af4b7b5edb	More fp8 torch.compile regressions fixed. (#10625 )	2025-11-03 22:14:20 -05:00
comfyanonymous	0f4ef3afa0	This seems to slow things down slightly on Linux. (#10624 )	2025-11-03 21:47:14 -05:00
comfyanonymous	6b88478f9f	Bring back fp8 torch compile performance to what it should be. (#10622 )	2025-11-03 19:22:10 -05:00
comfyanonymous	e199c8cc67	Fixes (#10621 )	2025-11-03 17:58:24 -05:00
comfyanonymous	0652cb8e2d	Speed up torch.compile (#10620 )	2025-11-03 17:37:12 -05:00
comfyanonymous	958a17199a	People should update their pytorch versions. (#10618 )	2025-11-03 17:08:30 -05:00
comfyanonymous	97ff9fae7e	Clarify help text for --fast argument (#10609 ) Updated help text for the --fast argument to clarify potential risks.	2025-11-02 13:14:04 -05:00
rattus	135fa49ec2	Small speed improvements to --async-offload (#10593 ) * ops: don't take an offload stream if you don't need one * ops: prioritize mem transfer The async offload stream's reason for existence is to transfer from RAM to GPU. The post processing compute steps are a bonus on the side stream, but if the compute stream is running a long kernel, it can stall the side stream, as it waits to type-cast the bias before transferring the weight. So do a pure xfer of the weight straight up, then do everything bias, then go back to fix the weight type and do weight patches.	2025-11-01 18:48:53 -04:00
comfyanonymous	44869ff786	Fix issue with pinned memory. (#10597 )	2025-11-01 17:25:59 -04:00
comfyanonymous	c58c13b2ba	Fix torch compile regression on fp8 ops. (#10580 )	2025-11-01 00:25:17 -04:00
comfyanonymous	7f374e42c8	ScaleROPE now works on Lumina models. (#10578 )	2025-10-31 15:41:40 -04:00
comfyanonymous	27d1bd8829	Fix rope scaling. (#10560 )	2025-10-30 22:51:58 -04:00
comfyanonymous	614cf9805e	Add a ScaleROPE node. Currently only works on WAN models. (#10559 )	2025-10-30 22:11:38 -04:00
rattus	513b0c46fb	Add RAM Pressure cache mode (#10454 ) * execution: Roll the UI cache into the outputs Currently the UI cache is parallel to the output cache with expectations of being a content superset of the output cache. At the same time the UI and output cache are maintained completely separately, making it awkward to free the output cache content without changing the behaviour of the UI cache. There are two actual users (getters) of the UI cache. The first is the case of a direct content hit on the output cache when executing a node. This case is very naturally handled by merging the UI and outputs cache. The second case is the history JSON generation at the end of the prompt. This currently works by asking the cache for all_node_ids and then pulling the cache contents for those nodes. all_node_ids is the nodes of the dynamic prompt. So fold the UI cache into the output cache. The current UI cache setter now writes to a prompt-scope dict. When the output cache is set, just get this value from the dict and tuple up with the outputs. When generating the history, simply iterate the prompt-scope dict. This prepares support for more complex caching strategies (like RAM pressure caching) where less than 1 workflow will be cached and it will be desirable to keep the UI cache and output cache in sync. * sd: Implement RAM getter for VAE * model_patcher: Implement RAM getter for ModelPatcher * sd: Implement RAM getter for CLIP * Implement RAM Pressure cache Implement a cache sensitive to RAM pressure. When RAM headroom drops down below a certain threshold, evict RAM-expensive nodes from the cache. Models and tensors are measured directly for RAM usage. An OOM score is then computed based on the RAM usage of the node. Note that due to indirection through shared objects (like a model patcher), multiple nodes can account for the same RAM as their individual usage. The intent is this will free chains of nodes particularly model loaders and associated loras as they all score similarly and are sorted close to each other. Has a bias towards unloading model nodes mid flow while being able to keep results like text encodings and VAE. * execution: Convert the cache entry to NamedTuple As commented in review. Convert this to a named tuple and abstract away the tuple type completely from graph.py.	2025-10-30 17:39:02 -04:00
Jedrzej Kosinski	998bf60beb	Add units/info for the numbers displayed on 'load completely' and 'load partially' log messages (#10538 )	2025-10-29 19:37:06 -04:00
comfyanonymous	906c089957	Fix small performance regression with fp8 fast and scaled fp8. (#10537 )	2025-10-29 19:29:01 -04:00
comfyanonymous	25de7b1bfa	Try to fix slow load issue on low ram hardware with pinned mem. (#10536 )	2025-10-29 17:20:27 -04:00
rattus	ab7ab5be23	Fix Race condition in --async-offload that can cause corruption (#10501 ) * mm: factor out the current stream getter Make this a reusable function. * ops: sync the offload stream with the consumption of w&b This sync is necessary as pytorch will queue cuda async frees on the same stream that created the tensor. In the case of async offload, this will be on the offload stream. Weights and biases can go out of scope in python which then triggers the pytorch garbage collector to queue the free operation on the offload stream, possibly before the compute stream has used the weight. This causes a use after free on weight data leading to total corruption of some workflows. So sync the offload stream with the compute stream after the weight has been used so the free has to wait for the weight to be used. The cast_bias_weight is extended in a backwards compatible way with the new behaviour opt-in on a defaulted parameter. This handles custom node packs calling cast_bias_weight and defeatures async-offload for them (as they do not handle the race). The pattern is now: cast_bias_weight(... , offloadable=True) #This might be offloaded thing(weight, bias, ...) uncast_bias_weight(...) * controlnet: adopt new cast_bias_weight synchronization scheme This is necessary for safe async weight offloading. * mm: sync the last stream in the queue, not the next Currently this peeks ahead to sync the next stream in the queue of streams with the compute stream. This doesn't allow a lot of parallelization, as the end result is you can only get one weight load ahead regardless of how many streams you have. Rotate the loop logic here to synchronize the end of the queue before returning the next stream. This allows weights to be loaded ahead of the compute stream's position.	2025-10-29 17:17:46 -04:00
comfyanonymous	ec4fc2a09a	Fix case of weights not being unpinned. (#10533 )	2025-10-29 15:48:06 -04:00
comfyanonymous	1a58087ac2	Reduce memory usage for fp8 scaled op. (#10531 )	2025-10-29 15:43:51 -04:00
comfyanonymous	e525673f72	Fix issue. (#10527 )	2025-10-29 00:37:00 -04:00
comfyanonymous	3fa7a5c04a	Speed up offloading using pinned memory. (#10526 ) To enable this feature use: --fast pinned_memory	2025-10-29 00:21:01 -04:00
contentis	8817f8fc14	Mixed Precision Quantization System (#10498 ) * Implement mixed precision operations with a registry design and metadata for quant spec in checkpoint. * Updated design using Tensor Subclasses * Fix FP8 MM * An actually functional POC * Remove CK reference and ensure correct compute dtype * Update unit tests * ruff lint * Implement mixed precision operations with a registry design and metadata for quant spec in checkpoint. * Updated design using Tensor Subclasses * Fix FP8 MM * An actually functional POC * Remove CK reference and ensure correct compute dtype * Update unit tests * ruff lint * Fix missing keys * Rename quant dtype parameter * Rename quant dtype parameter * Fix unittests for CPU build	2025-10-28 16:20:53 -04:00
comfyanonymous	f6bbc1ac84	Fix mistake. (#10484 )	2025-10-25 23:07:29 -04:00
comfyanonymous	098a352f13	Add warning for torch-directml usage (#10482 ) Added a warning message about the state of torch-directml.	2025-10-25 20:05:22 -04:00
comfyanonymous	426cde37f1	Remove useless function (#10472 )	2025-10-24 19:56:51 -04:00
comfyanonymous	1bcda6df98	WIP way to support multi multi dimensional latents. (#10456 )	2025-10-23 21:21:14 -04:00
comfyanonymous	9cdc64998f	Only disable cudnn on newer AMD GPUs. (#10437 )	2025-10-21 19:15:23 -04:00
comfyanonymous	2c2aa409b0	Log message for cudnn disable on AMD. (#10418 )	2025-10-20 15:43:24 -04:00
comfyanonymous	b4f30bd408	Pytorch is stupid. (#10398 )	2025-10-19 01:25:35 -04:00
comfyanonymous	dad076aee6	Speed up chroma radiance. (#10395 )	2025-10-18 23:19:52 -04:00
comfyanonymous	0cf33953a7	Fix batch size above 1 giving bad output in chroma radiance. (#10394 )	2025-10-18 23:15:34 -04:00
comfyanonymous	5b80addafd	Turn off cuda malloc by default when --fast autotune is turned on. (#10393 )	2025-10-18 22:35:46 -04:00
comfyanonymous	9da397ea2f	Disable torch compiler for cast_bias_weight function (#10384 ) * Disable torch compiler for cast_bias_weight function * Fix torch compile.	2025-10-17 20:03:28 -04:00
comfyanonymous	b1293d50ef	workaround also works on cudnn 91200 (#10375 )	2025-10-16 19:59:56 -04:00
comfyanonymous	19b466160c	Workaround for nvidia issue where VAE uses 3x more memory on torch 2.9 (#10373 )	2025-10-16 18:16:03 -04:00
Faych	afa8a24fe1	refactor: Replace manual patches merging with merge_nested_dicts (#10360 )	2025-10-15 17:16:09 -07:00
Jedrzej Kosinski	493b81e48f	Fix order of inputs nested merge_nested_dicts (#10362 )	2025-10-15 16:47:26 -07:00
comfyanonymous	1c10b33f9b	gfx942 doesn't support fp8 operations. (#10348 )	2025-10-15 00:21:11 -04:00
comfyanonymous	3374e900d0	Faster workflow cancelling. (#10301 )	2025-10-13 23:43:53 -04:00
comfyanonymous	dfff7e5332	Better memory estimation for the SD/Flux VAE on AMD. (#10334 )	2025-10-13 22:37:19 -04:00
comfyanonymous	e4ea393666	Fix loading old stable diffusion ckpt files on newer numpy. (#10333 )	2025-10-13 22:18:58 -04:00
comfyanonymous	c8674bc6e9	Enable RDNA4 pytorch attention on ROCm 7.0 and up. (#10332 )	2025-10-13 21:19:03 -04:00
rattus128	95ca2e56c8	WAN2.2: Fix cache VRAM leak on error (#10308 ) Same change pattern as `7e8dd275c2` applied to WAN2.2 If this suffers an exception (such as a VRAM oom) it will leave the encode() and decode() methods which skips the cleanup of the WAN feature cache. The comfy node cache then ultimately keeps a reference to this object which is in turn reffing large tensors from the failed execution. The feature cache is currently set up as a class variable on the encoder/decoder however, the encode and decode functions always clear it on both entry and exit of normal execution. It's likely the design intent is this is usable as a streaming encoder where the input comes in batches, however the functions as they are today don't support that. So simplify by bringing the cache back to local variable, so that if it does VRAM OOM the cache itself is properly garbage collected when the encode()/decode() functions disappear from the stack.	2025-10-13 15:23:11 -04:00
comfyanonymous	e693e4db6a	Always set diffusion model to eval() mode. (#10331 )	2025-10-13 14:57:27 -04:00
comfyanonymous	a125cd84b0	Improve AMD performance. (#10302 ) I honestly have no idea why this improves things but it does.	2025-10-12 00:28:01 -04:00
comfyanonymous	84e9ce32c6	Implement the mmaudio VAE. (#10300 )	2025-10-11 22:57:23 -04:00
comfyanonymous	f1dd6e50f8	Fix bug with applying loras on fp8 scaled without fp8 ops. (#10279 )	2025-10-09 19:02:40 -04:00
comfyanonymous	139addd53c	More surgical fix for #10267 (#10276 )	2025-10-09 16:37:35 -04:00
comfyanonymous	6e59934089	Refactor model sampling sigmas code. (#10250 )	2025-10-08 17:49:02 -04:00
comfyanonymous	8aea746212	Implement gemma 3 as a text encoder. (#10241 ) Not useful yet.	2025-10-06 22:08:08 -04:00
comfyanonymous	195e0b0639	Remove useless code. (#10223 )	2025-10-05 15:41:19 -04:00
Finn-Hecker	93d859cfaa	Fix type annotation syntax in MotionEncoder_tc __init__ (#10186 ) ## Summary Fixed incorrect type hint syntax in `MotionEncoder_tc.__init__()` parameter list. ## Changes - Line 647: Changed `num_heads=int` to `num_heads: int` - This corrects the parameter annotation from a default value assignment to proper type hint syntax ## Details The parameter was using assignment syntax (`=`) instead of type annotation syntax (`:`), which would incorrectly set the default value to the `int` class itself rather than annotating the expected type.	2025-10-03 14:32:19 -07:00
rattus128	4965c0e2ac	WAN: Fix cache VRAM leak on error (#10141 ) If this suffers an exception (such as a VRAM oom) it will leave the encode() and decode() methods which skips the cleanup of the WAN feature cache. The comfy node cache then ultimately keeps a reference to this object which is in turn reffing large tensors from the failed execution. The feature cache is currently set up as a class variable on the encoder/decoder however, the encode and decode functions always clear it on both entry and exit of normal execution. It's likely the design intent is this is usable as a streaming encoder where the input comes in batches, however the functions as they are today don't support that. So simplify by bringing the cache back to local variable, so that if it does VRAM OOM the cache itself is properly garbage collected when the encode()/decode() functions disappear from the stack.	2025-10-01 18:42:16 -04:00
rattus128	911331c06c	sd: fix VAE tiled fallback VRAM leak (#10139 ) When the VAE catches this VRAM OOM, it launches the fallback logic straight from the exception context. Python however refs the entire call stack that caused the exception including any local variables for the sake of exception report and debugging. In the case of tensors, this can hold on to references to GBs of VRAM and inhibit the VRAM allocator from freeing them. So dump the except context completely before going back to the VAE via the tiler by getting out of the except block with nothing but a flag. This greatly increases the reliability of the tiler fallback, especially on low VRAM cards, as with the bug, if the leak randomly leaked more than the headroom needed for a single tile, the tiler fallback would OOM and fail the flow.	2025-10-01 18:40:28 -04:00
comfyanonymous	a6f83a4a1a	Support the new hunyuan vae. (#10150 )	2025-10-01 17:19:13 -04:00
rattus128	653ceab414	Reduce Peak WAN inference VRAM usage - part II (#10062 ) * flux: math: Use addcmul_ to avoid expensive VRAM intermediate The rope process can be the VRAM peak and this intermediate for the addition result before releasing the original can OOM. addcmul_ it. * wan: Delete the self attention before cross attention This saves VRAM when the cross attention and FFN are in play as the VRAM peak.	2025-09-27 18:14:16 -04:00
Jedrzej Kosinski	196954ab8c	Add 'input_cond' and 'input_uncond' to the args dictionary passed into sampler_cfg_function (#10044 )	2025-09-26 19:55:03 -07:00
comfyanonymous	1e098d6132	Don't add template to qwen2.5vl when template is in prompt. (#10043 ) Make the hunyuan image refiner template_end 36.	2025-09-26 18:34:17 -04:00
Guy Niv	c8d2117f02	Fix memory leak by properly detaching model finalizer (#9979 ) When unloading models in load_models_gpu(), the model finalizer was not being explicitly detached, leading to a memory leak. This caused linear memory consumption increase over time as models are repeatedly loaded and unloaded. This change prevents orphaned finalizer references from accumulating in memory during model switching operations.	2025-09-24 22:35:12 -04:00
comfyanonymous	fccab99ec0	Fix issue with .view() in HuMo. (#10014 )	2025-09-24 20:09:42 -04:00
comfyanonymous	1fee8827cb	Support for qwen edit plus model. Use the new TextEncodeQwenImageEditPlus. (#9986 )	2025-09-22 16:49:48 -04:00
comfyanonymous	d1d9eb94b1	Lower wan memory estimation value a bit. (#9964 ) Previous pr reduced the peak memory requirement.	2025-09-20 22:09:35 -04:00
Kohaku-Blueleaf	7be2b49b6b	Fix LoRA Trainer bugs with FP8 models. (#9854 ) * Fix adapter weight init * Fix fp8 model training * Avoid inference tensor	2025-09-20 21:24:48 -04:00
comfyanonymous	e8df53b764	Update WanAnimateToVideo to more easily extend videos. (#9959 )	2025-09-19 18:48:56 -04:00
comfyanonymous	dc95b6acc0	Basic WIP support for the wan animate model. (#9939 )	2025-09-19 03:07:17 -04:00
comfyanonymous	24b0fce099	Do padding of audio embed in model for humo for more flexibility. (#9935 )	2025-09-18 19:54:16 -04:00
DELUXA	8d6653fca6	Enable fp8 ops by default on gfx1200 (#9926 )	2025-09-18 19:50:37 -04:00
comfyanonymous	dd611a7700	Support the HuMo 17B model. (#9912 )	2025-09-17 18:39:24 -04:00
comfyanonymous	9288c78fc5	Support the HuMo model. (#9903 )	2025-09-17 00:12:48 -04:00
rattus128	e42682b24e	Reduce Peak WAN inference VRAM usage (#9898 ) * flux: Do the xq and xk ropes one at a time This was doing independent interleaved tensor math on the q and k tensors, leading to the holding of more than the minimum intermediates in VRAM. On a bad day, it would VRAM OOM on xk intermediates. Do everything q and then everything k, so torch can garbage collect all of q's intermediates before k allocates its intermediates. This reduces peak VRAM usage for some WAN2.2 inferences (at least). * wan: Optimize qkv intermediates on attention As commented. The former logic computed independent pieces of QKV in parallel which held more inference intermediates in VRAM spiking VRAM usage. Fully roping Q and garbage collecting the intermediates before touching K reduces the peak inference VRAM usage.	2025-09-16 19:21:14 -04:00
comfyanonymous	a39ac59c3e	Add encoder part of whisper large v3 as an audio encoder model. (#9894 ) Not useful yet but some models use it.	2025-09-16 01:19:50 -04:00
blepping	1a85483da1	Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884 ) Correctly handle the case where w0 is passed by kwargs in BatchedBrownianTree	2025-09-15 20:05:03 -04:00
comfyanonymous	47a9cde5d3	Support the omnigen2 umo lora. (#9886 )	2025-09-15 18:10:55 -04:00
Jedrzej Kosinski	f228367c5e	Make ModuleNotFoundError ImportError instead (#9850 )	2025-09-13 21:34:21 -04:00
comfyanonymous	80b7c9455b	Changes to the previous radiance commit. (#9851 )	2025-09-13 18:03:34 -04:00
blepping	c1297f4eb3	Add support for Chroma Radiance (#9682 ) * Initial Chroma Radiance support * Minor Chroma Radiance cleanups * Update Radiance nodes to ensure latents/images are on the intermediate device * Fix Chroma Radiance memory estimation. * Increase Chroma Radiance memory usage factor * Increase Chroma Radiance memory usage factor once again * Ensure images are multiples of 16 for Chroma Radiance Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node * Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor * Update Radiance to support conv nerf final head type. * Allow setting NeRF embedder dtype for Radiance Bump Radiance nerf tile size to 32 Support EasyCache/LazyCache on Radiance (maybe) * Add ChromaRadianceStubVAE node * Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior * Convert Chroma Radiance nodes to V3 schema. * Add ChromaRadianceOptions node and backend support. Cleanups/refactoring to reduce code duplication with Chroma. * Fix overriding the NeRF embedder dtype for Chroma Radiance * Minor Chroma Radiance cleanups * Move Chroma Radiance to its own directory in ldm Minor code cleanups and tooltip improvements * Fix Chroma Radiance embedder dtype overriding * Remove Radiance dynamic nerf_embedder dtype override feature * Unbork Radiance NeRF embedder init * Remove Chroma Radiance image conversion and stub VAE nodes Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1	2025-09-13 17:58:43 -04:00
Kimbing Ng	e5e70636e7	Remove single quote pattern to avoid wrong matches (#9842 )	2025-09-13 16:59:19 -04:00
comfyanonymous	29bf807b0e	Cleanup. (#9838 )	2025-09-12 21:57:04 -04:00
Jukka Seppänen	2559dee492	Support wav2vec base models (#9637 ) * Support wav2vec base models * trim trailing whitespace * Do interpolation after	2025-09-12 21:52:58 -04:00
comfyanonymous	a3b04de700	Hunyuan refiner vae now works with tiled. (#9836 )	2025-09-12 19:46:46 -04:00
Jedrzej Kosinski	d7f40442f9	Enable Runtime Selection of Attention Functions (#9639 ) * Looking into a @wrap_attn decorator to look for 'optimized_attention_override' entry in transformer_options * Created logging code for this branch so that it can be used to track down all the code paths where transformer_options would need to be added * Fix memory usage issue with inspect * Made WAN attention receive transformer_options, test node added to wan to test out attention override later * Added *kwargs to all attention functions so transformer_options could potentially be passed through Make sure wrap_attn doesn't make itself recurse infinitely, attempt to load SageAttention and FlashAttention if not enabled so that they can be marked as available or not, create registry for available attention * Turn off attention logging for now, make AttentionOverrideTestNode have a dropdown with available attention (this is a test node only) * Make flux work with optimized_attention_override * Add logs to verify optimized_attention_override is passed all the way into attention function * Make Qwen work with optimized_attention_override * Made hidream work with optimized_attention_override * Made wan patches_replace work with optimized_attention_override * Made SD3 work with optimized_attention_override * Made HunyuanVideo work with optimized_attention_override * Made Mochi work with optimized_attention_override * Made LTX work with optimized_attention_override * Made StableAudio work with optimized_attention_override * Made optimized_attention_override work with ACE Step * Made Hunyuan3D work with optimized_attention_override * Make CosmosPredict2 work with optimized_attention_override * Made CosmosVideo work with optimized_attention_override * Made Omnigen 2 work with optimized_attention_override * Made StableCascade work with optimized_attention_override * Made AuraFlow work with optimized_attention_override * Made Lumina work with optimized_attention_override * Made Chroma work with optimized_attention_override * Made SVD work with optimized_attention_override * Fix WanI2VCrossAttention so that it expects to receive transformer_options * Fixed Wan2.1 Fun Camera transformer_options passthrough * Fixed WAN 2.1 VACE transformer_options passthrough * Add optimized to get_attention_function * Disable attention logs for now * Remove attention logging code * Remove _register_core_attention_functions, as we wouldn't want someone to call that, just in case * Satisfy ruff * Remove AttentionOverrideTest node, that's something to cook up for later	2025-09-12 18:07:38 -04:00
comfyanonymous	b149e2e1e3	Better way of doing the generator for the hunyuan image noise aug. (#9834 )	2025-09-12 17:53:15 -04:00
comfyanonymous	7757d5a657	Set default hunyuan refiner shift to 4.0 (#9833 )	2025-09-12 16:40:12 -04:00
comfyanonymous	e600520f8a	Fix hunyuan refiner blownout colors at noise aug less than 0.25 (#9832 )	2025-09-12 16:35:34 -04:00
comfyanonymous	fd2b820ec2	Add noise augmentation to hunyuan image refiner. (#9831 ) This was missing and should help with colors being blown out.	2025-09-12 16:03:08 -04:00
comfyanonymous	33bd9ed9cb	Implement hunyuan image refiner model. (#9817 )	2025-09-12 00:43:20 -04:00
comfyanonymous	18de0b2830	Fast preview for hunyuan image. (#9814 )	2025-09-11 19:33:02 -04:00
comfyanonymous	e01e99d075	Support hunyuan image distilled model. (#9807 )	2025-09-10 23:17:34 -04:00
comfyanonymous	543888d3d8	Fix lowvram issue with hunyuan image vae. (#9794 )	2025-09-10 02:15:34 -04:00
comfyanonymous	85e34643f8	Support hunyuan image 2.1 regular model. (#9792 )	2025-09-10 02:05:07 -04:00
comfyanonymous	5c33872e2f	Fix issue on old torch. (#9791 )	2025-09-10 00:23:47 -04:00
comfyanonymous	b288fb0db8	Small refactor of some vae code. (#9787 )	2025-09-09 18:09:56 -04:00
comfyanonymous	103a12cb66	Support qwen inpaint controlnet. (#9772 )	2025-09-08 17:30:26 -04:00
contentis	97652d26b8	Add explicit casting in apply_rope for Qwen VL (#9759 )	2025-09-08 15:08:18 -04:00
comfyanonymous	fb763d4333	Fix amd_min_version crash when cpu device. (#9754 )	2025-09-07 21:16:29 -04:00
comfyanonymous	bcbd7884e3	Don't enable pytorch attention on AMD if triton isn't available. (#9747 )	2025-09-07 00:29:38 -04:00
comfyanonymous	27a0fcccc3	Enable bf16 VAE on RDNA4. (#9746 )	2025-09-06 23:25:22 -04:00
comfyanonymous	ea6cdd2631	Print all fast options in --help (#9737 )	2025-09-06 01:05:05 -04:00
comfyanonymous	2ee7879a0b	Fix lowvram issues with hunyuan3d 2.1 (#9735 )	2025-09-05 14:57:35 -04:00
comfyanonymous	c9ebe70072	Some changes to the previous hunyuan PR. (#9725 )	2025-09-04 20:39:02 -04:00
Yousef R. Gamaleldin	261421e218	Add Hunyuan 3D 2.1 Support (#8714 )	2025-09-04 20:36:20 -04:00
comfyanonymous	72855db715	Fix potential rope issue. (#9710 )	2025-09-03 22:20:13 -04:00
comfyanonymous	e3018c2a5a	uso -> uxo/uno as requested. (#9688 )	2025-09-02 16:12:07 -04:00
comfyanonymous	3412d53b1d	USO style reference. (#9677 ) Load the projector.safetensors file with the ModelPatchLoader node and use the siglip_vision_patch14_384.safetensors "clip vision" model and the USOStyleReferenceNode.	2025-09-02 15:36:22 -04:00
contentis	e2d1e5dad9	Enable Convolution AutoTuning (#9301 )	2025-09-01 20:33:50 -04:00
comfyanonymous	27e067ce50	Implement the USO subject identity lora. (#9674 ) Use the lora with FluxContextMultiReferenceLatentMethod node set to "uso" and a ReferenceLatent node with the reference image.	2025-09-01 18:54:02 -04:00
chaObserv	32a627bf1f	SEEDS: update noise decomposition and refactor (#9633 ) - Update the decomposition to reflect interval dependency - Extract phi computations into functions - Use torch.lerp for interpolation	2025-08-31 00:01:45 -04:00
comfyanonymous	e80a14ad50	Support wan2.2 5B fun control model. (#9611 ) Use the Wan22FunControlToVideo node.	2025-08-28 22:13:07 -04:00
comfyanonymous	4aa79dbf2c	Adjust flux mem usage factor a bit. (#9588 )	2025-08-27 23:08:17 -04:00
Gangin Park	3aad339b63	Add DPM++ 2M SDE Heun (RES) sampler (#9542 )	2025-08-27 19:07:31 -04:00
comfyanonymous	491755325c	Better s2v memory estimation. (#9584 )	2025-08-27 19:02:42 -04:00
comfyanonymous	496888fd68	Improve s2v performance when generating videos longer than 120 frames. (#9582 )	2025-08-27 16:06:40 -04:00
comfyanonymous	b5ac6ed7ce	Fixes to make controlnet type models work on qwen edit and kontext. (#9581 )	2025-08-27 15:26:28 -04:00
Kohaku-Blueleaf	b20ba1f27c	Fix #9537 (#9576 )	2025-08-27 12:45:02 -04:00
comfyanonymous	88aee596a3	WIP Wan 2.2 S2V model. (#9568 )	2025-08-27 01:10:34 -04:00
comfyanonymous	914c2a2973	Implement wav2vec2 as an audio encoder model. (#9549 ) This is useless on its own but there are multiple models that use it.	2025-08-25 23:26:47 -04:00
comfyanonymous	41048c69b4	Fix Conditioning masks on 3d latents. (#9506 )	2025-08-22 23:15:44 -04:00
Jedrzej Kosinski	fc247150fe	Implement EasyCache and Invent LazyCache (#9496 ) * Attempting a universal implementation of EasyCache, starting with flux as test; I screwed up the math a bit, but when I set it just right it works. * Fixed math to make threshold work as expected, refactored code to use EasyCacheHolder instead of a dict wrapped by object * Use sigmas from transformer_options instead of timesteps to be compatible with a greater amount of models, make end_percent work * Make log statement when not skipping useful, preparing for per-cond caching * Added DIFFUSION_MODEL wrapper around forward function for wan model * Add subsampling for heuristic inputs * Add subsampling to output_prev (output_prev_subsampled now) * Properly consider conds in EasyCache logic * Created SuperEasyCache to test what happens if caching and reuse is moved outside the scope of conds, added PREDICT_NOISE wrapper to facilitate this test * Change max reuse_threshold to 3.0 * Mark EasyCache/SuperEasyCache as experimental (beta) * Make Lumina2 compatible with EasyCache * Add EasyCache support for Qwen Image * Fix missing comma, curse you Cursor * Add EasyCache support to AceStep * Add EasyCache support to Chroma * Added EasyCache support to Cosmos Predict t2i * Make EasyCache not crash with Cosmos Predict ImagToVideo latents, but does not work well at all * Add EasyCache support to hidream * Added EasyCache support to hunyuan video * Added EasyCache support to hunyuan3d * Added EasyCache support to LTXV (not very good, but does not crash) * Implemented EasyCache for aura_flow * Renamed SuperEasyCache to LazyCache, hardcoded subsample_factor to 8 on nodes * Extra logging when verbose is true for EasyCache	2025-08-22 22:41:08 -04:00
contentis	fe31ad0276	Add elementwise fusions (#9495 ) * Add elementwise fusions * Add addcmul pattern to Qwen	2025-08-22 19:39:15 -04:00
comfyanonymous	ff57793659	Support InstantX Qwen controlnet. (#9488 )	2025-08-22 00:53:11 -04:00
comfyanonymous	f7bd5e58dd	Make it easier to implement future qwen controlnets. (#9485 )	2025-08-21 23:18:04 -04:00
comfyanonymous	0963493a9c	Support for Qwen Diffsynth Controlnets canny and depth. (#9465 ) These are not real controlnets but actually a patch on the model so they will be treated as such. Put them in the models/model_patches/ folder. Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.	2025-08-20 22:26:37 -04:00
comfyanonymous	8d38ea3bbf	Fix bf16 precision issue with qwen image embeddings. (#9441 )	2025-08-20 02:58:54 -04:00
comfyanonymous	5a8f502db5	Disable prompt weights for qwen. (#9438 )	2025-08-20 01:08:11 -04:00
comfyanonymous	7cd2c4bd6a	Qwen rotary embeddings should now match reference code. (#9437 )	2025-08-20 00:45:27 -04:00
comfyanonymous	dfa791eb4b	Rope fix for qwen vl. (#9435 )	2025-08-19 20:47:42 -04:00
comfyanonymous	4977f203fa	P2 of qwen edit model. (#9412 ) * P2 of qwen edit model. * Typo. * Fix normal qwen. * Fix. * Make the TextEncodeQwenImageEdit also set the ref latent. If you don't want it to set the ref latent and want to use the ReferenceLatent node with your custom latent instead just disconnect the VAE.	2025-08-18 22:38:34 -04:00
Jedrzej Kosinski	7f3b9b16c6	Make step index detection much more robust (#9392 )	2025-08-17 18:54:07 -04:00
comfyanonymous	ed43784b0d	WIP Qwen edit model: The diffusion model part. (#9383 )	2025-08-17 16:45:39 -04:00
comfyanonymous	0f2b8525bc	Qwen image model refactor. (#9375 )	2025-08-16 17:51:28 -04:00
comfyanonymous	1702e6df16	Implement wan2.2 camera model. (#9357 ) Use the old WanCameraImageToVideo node.	2025-08-15 17:29:58 -04:00
comfyanonymous	c308a8840a	Add FluxKontextMultiReferenceLatentMethod node. (#9356 ) This node is only useful if someone trains the kontext model to properly use multiple reference images via the index method. The default is the offset method which feeds the multiple images like if they were stitched together as one. This method works with the current flux kontext model.	2025-08-15 15:50:39 -04:00
comfyanonymous	e08ecfbd8a	Add warning when using old pytorch. (#9347 )	2025-08-15 00:22:26 -04:00
comfyanonymous	4e5c230f6a	Fix last commit not working on older pytorch. (#9346 )	2025-08-14 23:44:02 -04:00
Xiangxi Guo (Ryan)	f0d5d0111f	Avoid torch compile graphbreak for older pytorch versions (#9344 ) Turns out torch.compile has some gaps in context manager decorator syntax support. I've sent patches to fix that in PyTorch, but it won't be available for all the folks running older versions of PyTorch, hence this trivial patch.	2025-08-14 23:41:37 -04:00
comfyanonymous	ad19a069f6	Make SLG nodes work on Qwen Image model. (#9345 )	2025-08-14 23:16:01 -04:00
Jedrzej Kosinski	e4f7ea105f	Added context window support to core sampling code (#9238 ) * Added initial support for basic context windows - in progress * Add prepare_sampling wrapper for context window to more accurately estimate latent memory requirements, fixed merging wrappers/callbacks dicts in prepare_model_patcher * Made context windows compatible with different dimensions; works for WAN, but results are bad * Fix comfy.patcher_extension.merge_nested_dicts calls in prepare_model_patcher in sampler_helpers.py * Considering adding some callbacks to context window code to allow extensions of behavior without the need to rewrite code * Made dim slicing cleaner * Add Wan Context Windows node for testing * Made context schedule and fuse method functions be stored on the handler instead of needing to be registered in core code to be found * Moved some code around between node_context_windows.py and context_windows.py * Change manual context window nodes names/ids * Added callbacks to IndexListContexHandler * Adjusted default values for context_length and context_overlap, made schema.inputs definition for WAN Context Windows less annoying * Make get_resized_cond more robust for various dim sizes * Fix typo * Another small fix	2025-08-13 21:33:05 -04:00
Simon Lui	c991a5da65	Fix XPU iGPU regressions (#9322 ) * Change bf16 check and switch non-blocking to off default with option to force to regain speed on certain classes of iGPUs and refactor xpu check. * Turn non_blocking off by default for xpu. * Update README.md for Intel GPUs.	2025-08-13 19:13:35 -04:00
comfyanonymous	9df8792d4b	Make last PR not crash comfy on old pytorch. (#9324 )	2025-08-13 15:12:41 -04:00
contentis	3da5a07510	SDPA backend priority (#9299 )	2025-08-13 14:53:27 -04:00
comfyanonymous	560d38f34c	Wan2.2 fun control support. (#9292 )	2025-08-12 23:26:33 -04:00
PsychoLogicAu	2208aa616d	Support SimpleTuner lycoris lora for Qwen-Image (#9280 )	2025-08-11 16:56:16 -04:00
comfyanonymous	5828607ccf	Not sure if AMD actually supports fp16 acc but it doesn't crash. (#9258 )	2025-08-09 12:49:25 -04:00
comfyanonymous	735bb4bdb1	Users report gfx1201 is buggy on flux with pytorch attention. (#9244 )	2025-08-08 04:21:00 -04:00
flybirdxx	4c3e57b0ae	Fixed an issue where qwenLora could not be loaded properly. (#9208 )	2025-08-06 13:23:11 -04:00
comfyanonymous	d044a24398	Fix default shift and any latent size for qwen image model. (#9186 )	2025-08-05 06:12:27 -04:00
comfyanonymous	c012400240	Initial support for qwen image model. (#9179 )	2025-08-04 22:53:25 -04:00
comfyanonymous	03895dea7c	Fix another issue with the PR. (#9170 )	2025-08-04 04:33:04 -04:00
comfyanonymous	84f9759424	Add some warnings and prevent crash when cond devices don't match. (#9169 )	2025-08-04 04:20:12 -04:00
comfyanonymous	7991341e89	Various fixes for broken things from earlier PR. (#9168 )	2025-08-04 04:02:40 -04:00
comfyanonymous	140ffc7fdc	Fix broken controlnet from last PR. (#9167 )	2025-08-04 03:28:12 -04:00
comfyanonymous	182f90b5ec	Lower cond vram use by casting at the same time as device transfer. (#9159 )	2025-08-04 03:11:53 -04:00
comfyanonymous	aebac22193	Cleanup. (#9160 )	2025-08-03 07:08:11 -04:00
comfyanonymous	13aaa66ec2	Make sure context is on the right device. (#9154 )	2025-08-02 15:09:23 -04:00
comfyanonymous	5f582a9757	Make sure all the conds are on the right device. (#9151 )	2025-08-02 15:00:13 -04:00
comfyanonymous	1e638a140b	Tiny wan vae optimizations. (#9136 )	2025-08-01 05:25:38 -04:00
chaObserv	61b08d4ba6	Replace manual x * sigmoid(x) with torch silu in VAE nonlinearity (#9057 )	2025-07-30 19:25:56 -04:00
comfyanonymous	da9dab7edd	Small wan camera memory optimization. (#9111 )	2025-07-30 05:55:26 -04:00
comfyanonymous	dca6bdd4fa	Make wan2.2 5B i2v take a lot less memory. (#9102 )	2025-07-29 19:44:18 -04:00
comfyanonymous	7d593baf91	Extra reserved vram on large cards on windows. (#9093 )	2025-07-29 04:07:45 -04:00
comfyanonymous	c60dc4177c	Remove unnecessary clones in the wan2.2 VAE. (#9083 )	2025-07-28 14:48:19 -04:00
comfyanonymous	a88788dce6	Wan 2.2 support. (#9080 )	2025-07-28 08:00:23 -04:00
comfyanonymous	0621d73a9c	Remove useless code. (#9059 )	2025-07-26 04:44:19 -04:00
comfyanonymous	e6e5d33b35	Remove useless code. (#9041 ) This is only needed on old pytorch 2.0 and older.	2025-07-25 04:58:28 -04:00
Eugene Fairley	4293e4da21	Add WAN ATI support (#8874 ) * Add WAN ATI support * Fixes * Fix length * Remove extra functions * Fix * Fix * Ruff fix * Remove torch.no_grad * Add batch trajectory logic * Scale inputs before and after motion patch * Batch image/trajectory * Ruff fix * Clean up	2025-07-24 20:59:19 -04:00
comfyanonymous	69cb57b342	Print xpu device name. (#9035 )	2025-07-24 15:06:25 -04:00
honglyua	0ccc88b03f	Support Iluvatar CoreX (#8585 ) * Support Iluvatar CoreX Co-authored-by: mingjiang.li <mingjiang.li@iluvatar.com>	2025-07-24 13:57:36 -04:00
Kohaku-Blueleaf	eb2f78b4e0	[Training Node] algo support, grad acc, optional grad ckpt (#9015 ) * Add factorization utils for lokr * Add lokr train impl * Add loha train impl * Add adapter map for algo selection * Add optional grad ckpt and algo selection * Update __init__.py * correct key name for loha * Use custom fwd/bwd func and better init for loha * Support gradient accumulation * Fix bugs of loha * use more stable init * Add OFT training * linting	2025-07-23 20:57:27 -04:00
chaObserv	e729a5cc11	Separate denoised and noise estimation in Euler CFG++ (#9008 ) This will change their behavior with the sampling CONST type. It also combines euler_cfg_pp and euler_ancestral_cfg_pp into one main function.	2025-07-23 19:47:05 -04:00
comfyanonymous	d3504e1778	Enable pytorch attention by default for gfx1201 on torch 2.8 (#9029 )	2025-07-23 19:21:29 -04:00
comfyanonymous	a86a58c308	Fix xpu function not implemented p2. (#9027 )	2025-07-23 18:18:20 -04:00
comfyanonymous	39dda1d40d	Fix xpu function not implemented. (#9026 )	2025-07-23 18:10:59 -04:00
comfyanonymous	5ad33787de	Add default device argument. (#9023 )	2025-07-23 14:20:49 -04:00
Simon Lui	255f139863	Add xpu version for async offload and some other things. (#9004 )	2025-07-22 15:20:09 -04:00
comfyanonymous	491fafbd64	Silence clip tokenizer warning. (#8934 )	2025-07-16 14:42:07 -04:00
Harel Cain	9bc2798f72	LTXV VAE decoder: switch default padding mode (#8930 )	2025-07-16 13:54:38 -04:00
comfyanonymous	50afba747c	Add attempt to work around the safetensors mmap issue. (#8928 )	2025-07-16 03:42:17 -04:00
Yoland Yan	543c24108c	Fix wrong reference bug (#8910 )	2025-07-14 20:45:55 -04:00
comfyanonymous	b40143984c	Add model detection error hint for lora. (#8880 )	2025-07-12 03:49:26 -04:00
comfyanonymous	938d3e8216	Remove windows line endings. (#8866 )	2025-07-11 02:37:51 -04:00
guill	2b653e8c18	Support for async node functions (#8830 ) * Support for async execution functions This commit adds support for node execution functions defined as async. When a node's execution function is defined as async, we can continue executing other nodes while it is processing. Standard uses of `await` should "just work", but people will still have to be careful if they spawn actual threads. Because torch doesn't really have async/await versions of functions, this won't particularly help with most locally-executing nodes, but it does work for e.g. web requests to other machines. In addition to the execute function, the `VALIDATE_INPUTS` and `check_lazy_status` functions can also be defined as async, though we'll only resolve one node at a time right now for those. * Add the execution model tests to CI * Add a missing file It looks like this got caught by .gitignore? There's probably a better place to put it, but I'm not sure what that is. * Add the websocket library for automated tests * Add additional tests for async error cases Also fixes one bug that was found when an async function throws an error after being scheduled on a task. * Add a feature flags message to reduce bandwidth We now only send 1 preview message of the latest type the client can support. We'll add a console warning when the client fails to send a feature flags message at some point in the future. * Add async tests to CI * Don't actually add new tests in this PR Will do it in a separate PR * Resolve unit test in GPU-less runner * Just remove the tests that GHA can't handle * Change line endings to UNIX-style * Avoid loading model_management.py so early Because model_management.py has a top-level `logging.info`, we have to be careful not to import that file before we call `setup_logging`. If we do, we end up having the default logging handler registered in addition to our custom one.	2025-07-10 14:46:19 -04:00
chaObserv	aac10ad23a	Add SA-Solver sampler (#8834 )	2025-07-08 16:17:06 -04:00
josephrocca	974254218a	Un-hardcode chroma patch_size (#8840 )	2025-07-08 15:56:59 -04:00
comfyanonymous	75d327abd5	Remove some useless code. (#8812 )	2025-07-06 07:07:39 -04:00
comfyanonymous	ee615ac269	Add warning when loading file unsafely. (#8800 )	2025-07-05 14:34:57 -04:00
chaObserv	f41f323c52	Add the denoising step to several samplers (#8780 )	2025-07-03 19:20:53 -04:00
City	d9277301d2	Initial code for new SLG node (#8759 )	2025-07-02 20:13:43 -04:00
comfyanonymous	111f583e00	Fallback to regular op when fp8 op throws exception. (#8761 )	2025-07-02 00:57:13 -04:00
chaObserv	b22e97dcfa	Migrate ER-SDE from VE to VP algorithm and add its sampler node (#8744 ) Apply alpha scaling in the algorithm for reverse-time SDE and add custom ER-SDE sampler node for other solver types (SDE, ODE).	2025-07-01 02:38:52 -04:00
comfyanonymous	170c7bb90c	Fix contiguous issue with pytorch nightly. (#8729 )	2025-06-29 06:38:40 -04:00
comfyanonymous	396454fa41	Reorder the schedulers so simple is the default one. (#8722 )	2025-06-28 18:12:56 -04:00
xufeng	ba9548f756	“--whitelist-custom-nodes” args for comfy core to go with “--disable-all-custom-nodes” for development purposes (#8592 ) * feat: “--whitelist-custom-nodes” args for comfy core to go with “--disable-all-custom-nodes” for development purposes * feat: Simplify custom nodes whitelist logic to use consistent code paths	2025-06-28 15:24:02 -04:00
comfyanonymous	c36be0ea09	Fix memory estimation bug with kontext. (#8709 )	2025-06-27 17:21:12 -04:00
comfyanonymous	9093301a49	Don't add tiny bit of random noise when VAE encoding. (#8705 ) Shouldn't change outputs but might make things a tiny bit more deterministic.	2025-06-27 14:14:56 -04:00
comfyanonymous	ef5266b1c1	Support Flux Kontext Dev model. (#8679 )	2025-06-26 11:28:41 -04:00
comfyanonymous	a96e65df18	Disable omnigen2 fp16 on older pytorch versions. (#8672 )	2025-06-26 03:39:09 -04:00
comfyanonymous	ec70ed6aea	Omnigen2 model implementation. (#8669 )	2025-06-25 19:35:57 -04:00
comfyanonymous	7a13f74220	unet -> diffusion model (#8659 )	2025-06-25 04:52:34 -04:00
chaObserv	8042eb20c6	Singlestep DPM++ SDE for RF (#8627 ) Refactor the algorithm, and apply alpha scaling.	2025-06-24 14:59:09 -04:00
comfyanonymous	1883e70b43	Fix exception when using a noise mask with cosmos predict2. (#8621 ) * Fix exception when using a noise mask with cosmos predict2. * Fix ruff.	2025-06-21 03:30:39 -04:00
comfyanonymous	f7fb193712	Small flux optimization. (#8611 )	2025-06-20 05:37:32 -04:00
comfyanonymous	7e9267fa77	Make flux controlnet work with sd3 text enc. (#8599 )	2025-06-19 18:50:05 -04:00
comfyanonymous	91d40086db	Fix pytorch warning. (#8593 )	2025-06-19 11:04:52 -04:00
chaObserv	8e81c507d2	Multistep DPM++ SDE samplers for RF (#8541 ) Include alpha in sampling and minor refactoring	2025-06-16 14:47:10 -04:00
comfyanonymous	e1c6dc720e	Allow setting min_length with tokenizer_data. (#8547 )	2025-06-16 13:43:52 -04:00
comfyanonymous	7ea79ebb9d	Add correct eps to ltxv rmsnorm. (#8542 )	2025-06-15 12:21:25 -04:00
comfyanonymous	d6a2137fc3	Support Cosmos predict2 image to video models. (#8535 ) Use the CosmosPredict2ImageToVideoLatent node.	2025-06-14 21:37:07 -04:00
chaObserv	53e8d8193c	Generalize SEEDS samplers (#8529 ) Restore VP algorithm for RF and refactor noise_coeffs and half-logSNR calculations	2025-06-14 16:58:16 -04:00
comfyanonymous	29596bd53f	Small cosmos attention code refactor. (#8530 )	2025-06-14 05:02:05 -04:00
Kohaku-Blueleaf	520eb77b72	LoRA Trainer: LoRA training node in weight adapter scheme (#8446 )	2025-06-13 19:25:59 -04:00
comfyanonymous	c69af655aa	Uncap cosmos predict2 res and fix mem estimation. (#8518 )	2025-06-13 07:30:18 -04:00
comfyanonymous	251f54a2ad	Basic initial support for cosmos predict2 text to image 2B and 14B models. (#8517 )	2025-06-13 07:05:23 -04:00
pythongosssss	50c605e957	Add support for sqlite database (#8444 ) * Add support for sqlite database * fix	2025-06-11 16:43:39 -04:00
comfyanonymous	8a4ff747bd	Fix mistake in last commit. (#8496 ) * Move to right place.	2025-06-11 15:13:29 -04:00
comfyanonymous	af1eb58be8	Fix black images on some flux models in fp16. (#8495 )	2025-06-11 15:09:11 -04:00
comfyanonymous	6e28a46454	Apple most likely is never fixing the fp16 attention bug. (#8485 )	2025-06-10 13:06:24 -04:00
comfyanonymous	7f800d04fa	Enable AMD fp8 and pytorch attention on some GPUs. (#8474 ) Information is from the pytorch source code.	2025-06-09 12:50:39 -04:00
comfyanonymous	97755eed46	Enable fp8 ops by default on gfx1201 (#8464 )	2025-06-08 14:15:34 -04:00
comfyanonymous	daf9d25ee2	Cleaner torch version comparisons. (#8453 )	2025-06-07 10:01:15 -04:00
comfyanonymous	3b4b171e18	Alternate fix for #8435 (#8442 )	2025-06-06 09:43:27 -04:00
comfyanonymous	4248b1618f	Let chroma TE work on regular flux. (#8429 )	2025-06-05 10:07:17 -04:00
comfyanonymous	fb4754624d	Make the casting in lists the same as regular inputs. (#8373 )	2025-06-01 05:39:54 -04:00
comfyanonymous	19e45e9b0e	Make it easier to pass lists of tensors to models. (#8358 )	2025-05-31 20:00:20 -04:00
drhead	08b7cc7506	use fused multiply-add pointwise ops in chroma (#8279 )	2025-05-30 18:09:54 -04:00
comfyanonymous	704fc78854	Put ROCm version in tuple to make it easier to enable stuff based on it. (#8348 )	2025-05-30 15:41:02 -04:00
comfyanonymous	f2289a1f59	Delete useless file. (#8327 )	2025-05-29 08:29:37 -04:00
comfyanonymous	5e5e46d40c	Not really tested WAN Phantom Support. (#8321 )	2025-05-28 23:46:15 -04:00

1988 Commits