EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-02-08 20:42:32 +08:00

Author	SHA1	Message	Date
comfyanonymous	d8a7a32779	Cleanup old TODO.	2025-01-20 03:44:13 -05:00
Sergii Dymchenko	ebf038d4fa	Use `torch.special.expm1` (#6388) * Use `torch.special.expm1` This function provides greater precision than `exp(x) - 1` for small values of `x`. Found with TorchFix https://github.com/pytorch-labs/torchfix/ * Use non-alias	2025-01-19 04:54:32 -05:00
catboxanon	b1a02131c9	Remove comfy.samplers self-import (#6506)	2025-01-18 17:49:51 -05:00
comfyanonymous	507199d9a8	Uni pc sampler now works with audio and video models.	2025-01-18 05:27:58 -05:00
comfyanonymous	2f3ab40b62	Add warning when using old pytorch versions.	2025-01-17 18:47:27 -05:00
comfyanonymous	0aa2368e46	Fix some cosmos fp8 issues.	2025-01-16 17:45:37 -05:00
comfyanonymous	cca96a85ae	Fix cosmos VAE failing with videos longer than 121 frames.	2025-01-16 16:30:06 -05:00
comfyanonymous	31831e6ef1	Code refactor.	2025-01-16 07:23:54 -05:00
comfyanonymous	88ceb28e20	Tweak hunyuan memory usage factor.	2025-01-16 06:31:03 -05:00
comfyanonymous	23289a6a5c	Clean up some debug lines.	2025-01-16 04:24:39 -05:00
comfyanonymous	9d8b6c1f46	More accurate memory estimation for cosmos and hunyuan video.	2025-01-16 03:48:40 -05:00
comfyanonymous	6320d05696	Slightly lower hunyuan video memory usage.	2025-01-16 00:23:01 -05:00
comfyanonymous	25683b5b02	Lower cosmos diffusion model memory usage.	2025-01-15 23:46:42 -05:00
comfyanonymous	4758fb64b9	Lower cosmos VAE memory usage by a bit.	2025-01-15 22:57:52 -05:00
comfyanonymous	008761166f	Optimize first attention block in cosmos VAE.	2025-01-15 21:48:46 -05:00
comfyanonymous	cba58fff0b	Remove unsafe embedding load for very old pytorch.	2025-01-15 04:32:23 -05:00
comfyanonymous	2feb8d0b77	Force safe loading of files in torch format on pytorch 2.4+ If this breaks something for you make an issue.	2025-01-15 03:50:27 -05:00
Pam	c78a45685d	Rewrite res_multistep sampler and implement res_multistep_cfg_pp sampler. (#6462)	2025-01-14 18:20:06 -05:00
comfyanonymous	3aaabb12d4	Implement Cosmos Image/Video to World (Video) diffusion models. Use CosmosImageToVideoLatent to set the input image/video.	2025-01-14 05:14:10 -05:00
comfyanonymous	1f1c7b7b56	Remove useless code.	2025-01-13 03:52:37 -05:00
comfyanonymous	90f349f93d	Add res_multistep sampler from the cosmos code. This sampler should work with all models.	2025-01-12 03:10:07 -05:00
Jedrzej Kosinski	6c9bd11fa3	Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377) * Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas' * Cleaned up hooks.py, refactored Hook.should_register and add_hook_patches to use target_dict instead of target so that more information can be provided about the current execution environment if needed * Refactor WrapperHook into TransformerOptionsHook, as there is no need to separate out Wrappers/Callbacks/Patches into different hook types (all affect transformer_options) * Refactored HookGroup to also store a dictionary of hooks separated by hook_type, modified necessary code to no longer need to manually separate out hooks by hook_type * In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch * Refactored 'registered' to be HookGroup instead of a list of Hooks, made AddModelsHook operational and compliant with should_register result, moved TransformerOptionsHook handling out of ModelPatcher.register_all_hook_patches, support patches in TransformerOptionsHook properly by casting any patches/wrappers/hooks to proper device at sample time * Made hook clone code sane, made clear ObjectPatchHook and SetInjectionsHook are not yet operational * Fix performance of hooks when hooks are appended via Cond Pair Set Props nodes by properly caching between positive and negative conds, make hook_patches_backup behave as intended (in the case that something pre-registers WeightHooks on the ModelPatcher instead of registering it at sample time) * Filter only registered hooks on self.conds in CFGGuider.sample * Make hook_scope functional for TransformerOptionsHook * removed 4 whitespace lines to satisfy Ruff, * Add a get_injections function to ModelPatcher * Made TransformerOptionsHook contribute to registered hooks properly, added some doc strings and removed a so-far unused variable * Rename AddModelsHooks to AdditionalModelsHook, rename SetInjectionsHook to InjectionsHook (not yet implemented, but at least getting the naming figured out) * Clean up a typehint	2025-01-11 12:20:23 -05:00
comfyanonymous	ee8a7ab69d	Fast latent preview for Cosmos.	2025-01-11 04:41:24 -05:00
comfyanonymous	2ff3104f70	WIP support for Nvidia Cosmos 7B and 14B text to world (video) models.	2025-01-10 09:14:16 -05:00
comfyanonymous	129d8908f7	Add argument to skip the output reshaping in the attention functions.	2025-01-10 06:27:37 -05:00
comfyanonymous	ff838657fa	Cleaner handling of attention mask in ltxv model code.	2025-01-09 07:12:03 -05:00
comfyanonymous	2307ff6746	Improve some logging messages.	2025-01-08 19:05:22 -05:00
comfyanonymous	d0f3752e33	Properly calculate inner dim for t5 model. This is required to support some different types of t5 models.	2025-01-07 17:33:03 -05:00
comfyanonymous	4209edf48d	Make a few more samplers deterministic.	2025-01-07 02:12:32 -05:00
Chenlei Hu	d055325783	Document get_attr and get_model_object (#6357) * Document get_attr and get_model_object * Update model_patcher.py * Update model_patcher.py * Update model_patcher.py	2025-01-06 20:12:22 -05:00
comfyanonymous	916d1e14a9	Make ancestral samplers more deterministic.	2025-01-06 03:04:32 -05:00
Jedrzej Kosinski	c496e53519	In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch (#6360)	2025-01-06 01:36:47 -05:00
comfyanonymous	d45ebb63f6	Remove old unused function.	2025-01-04 07:20:54 -05:00
comfyanonymous	9e9c8a1c64	Clear cache as often on AMD as Nvidia. I think the issue this was working around has been solved. If you notice that this change slows things down or causes stutters on your AMD GPU with ROCm on Linux please report it.	2025-01-02 08:44:16 -05:00
Andrew Kvochko	0f11d60afb	Fix temporal tiling for decoder, remove redundant tiles. (#6306) This commit fixes the temporal tile size calculation, and removes a redundant tile at the end of the range when its elements are completely covered by the previous tile. Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>	2025-01-01 16:29:01 -05:00
comfyanonymous	79eea51a1d	Fix and enforce all ruff W rules.	2025-01-01 03:08:33 -05:00
blepping	c0338a46a4	Fix unknown sampler error handling in calculate_sigmas function (#6280) Modernize calculate_sigmas function	2024-12-31 17:33:50 -05:00
Jedrzej Kosinski	1c99734e5a	Add missing model_options param (#6296)	2024-12-31 14:46:55 -05:00
filtered	67758f50f3	Fix custom node type-hinting examples (#6281) * Fix import in comfy_types doc / sample * Clarify docstring	2024-12-31 03:41:09 -05:00
comfyanonymous	b7572b2f87	Fix and enforce no trailing whitespace.	2024-12-31 03:16:37 -05:00
blepping	a90aafafc1	Add kl_optimal scheduler (#6206) * Add kl_optimal scheduler * Rename kl_optimal_schedule to kl_optimal_scheduler to be more consistent	2024-12-30 05:09:38 -05:00
comfyanonymous	d9b7cfac7e	Fix and enforce new lines at the end of files.	2024-12-30 04:14:59 -05:00
Jedrzej Kosinski	3507870535	Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas' (#6273)	2024-12-30 03:42:49 -05:00
comfyanonymous	a618f768e0	Auto reshape 2d to 3d latent for single image generation on video model.	2024-12-29 02:26:49 -05:00
comfyanonymous	b504bd606d	Add ruff rule for empty line with trailing whitespace.	2024-12-28 05:23:08 -05:00
comfyanonymous	d170292594	Remove some trailing white space.	2024-12-27 18:02:30 -05:00
filtered	9cfd185676	Add option to log non-error output to stdout (#6243) * nit * Add option to log non-error output to stdout - No change to default behaviour - Adds CLI argument: --log-stdout - With this arg present, any logging of a level below logging.ERROR will be sent to stdout instead of stderr	2024-12-27 14:40:05 -05:00
comfyanonymous	4b5bcd8ac4	Closer memory estimation for hunyuan dit model.	2024-12-27 07:37:00 -05:00
comfyanonymous	ceb50b2cbf	Closer memory estimation for pixart models.	2024-12-27 07:30:09 -05:00
comfyanonymous	160ca08138	Use python 3.9 in launch test instead of 3.8 Fix ruff check.	2024-12-26 20:05:54 -05:00
Huazhong Ji	c4bfdba330	Support ascend npu (#5436) * support ascend npu Co-authored-by: YukMingLaw <lymmm2@163.com> Co-authored-by: starmountain1997 <guozr1997@hotmail.com> Co-authored-by: Ginray <ginray0215@gmail.com>	2024-12-26 19:36:50 -05:00
comfyanonymous	ee9547ba31	Improve temporal VAE Encode (Tiled) math.	2024-12-26 07:18:49 -05:00
comfyanonymous	19a64d6291	Cleanup some mac related code.	2024-12-25 05:32:51 -05:00
comfyanonymous	b486885e08	Disable bfloat16 on older mac.	2024-12-25 05:18:50 -05:00
comfyanonymous	0229228f3f	Clean up the VAE dtypes code.	2024-12-25 04:50:34 -05:00
comfyanonymous	99a1fb6027	Make fast fp8 take a bit less peak memory.	2024-12-24 18:05:19 -05:00
comfyanonymous	73e04987f7	Prevent black images in VAE Decode (Tiled) node. Overlap should be minimum 1 with tiling 2 for tiled temporal VAE decoding.	2024-12-24 07:36:30 -05:00
comfyanonymous	5388df784a	Add temporal tiling to VAE Encode (Tiled) node.	2024-12-24 07:10:09 -05:00
comfyanonymous	bc6dac4327	Add temporal tiling to VAE Decode (Tiled) node. You can now do tiled VAE decoding on the temporal direction for videos.	2024-12-23 20:03:37 -05:00
comfyanonymous	15564688ed	Add a try except block so if torch version is weird it won't crash.	2024-12-23 03:22:48 -05:00
Simon Lui	c6b9c11ef6	Add oneAPI device selector for xpu and some other changes. (#6112) * Add oneAPI device selector and some other minor changes. * Fix device selector variable name. * Flip minor version check sign. * Undo changes to README.md.	2024-12-23 03:18:32 -05:00
comfyanonymous	e44d0ac7f7	Make --novram completely offload weights. This flag is mainly used for testing the weight offloading, it shouldn't actually be used in practice. Remove useless import.	2024-12-23 01:51:08 -05:00
comfyanonymous	56bc64f351	Comment out some useless code.	2024-12-22 23:51:14 -05:00
zhangp365	f7d83b72e0	fixed a bug in ldm/pixart/blocks.py (#6158)	2024-12-22 23:44:20 -05:00
comfyanonymous	80f07952d2	Fix lowvram issue with ltxv vae.	2024-12-22 23:20:17 -05:00
comfyanonymous	57f330caf9	Relax minimum ratio of weights loaded in memory on nvidia. This should make it possible to do higher res images/longer videos by further offloading weights to CPU memory. Please report an issue if this slows down things on your system.	2024-12-22 03:06:37 -05:00
comfyanonymous	da13b6b827	Get rid of meshgrid warning.	2024-12-20 18:02:12 -05:00
comfyanonymous	c86cd58573	Remove useless code.	2024-12-20 17:50:03 -05:00
comfyanonymous	b5fe39211a	Remove some useless code.	2024-12-20 17:43:50 -05:00
comfyanonymous	e946667216	Some fixes/cleanups to pixart code. Commented out the masking related code because it is never used in this implementation.	2024-12-20 17:10:52 -05:00
Chenlei Hu	d7969cb070	Replace print with logging (#6138) * Replace print with logging * nit * nit * nit * nit * nit * nit	2024-12-20 16:24:55 -05:00
City	bddb02660c	Add PixArt model support (#6055) * PixArt initial version * PixArt Diffusers convert logic * pos_emb and interpolation logic * Reduce duplicate code * Formatting * Use optimized attention * Edit empty token logic * Basic PixArt LoRA support * Fix aspect ratio logic * PixArtAlpha text encode with conds * Use same detection key logic for PixArt diffusers	2024-12-20 15:25:00 -05:00
comfyanonymous	418eb7062d	Support new LTXV VAE.	2024-12-20 04:38:29 -05:00
comfyanonymous	cac68ca813	Fix some more video tiled encode issues. The downscale_ratio formula for the temporal had issues with some frame numbers.	2024-12-19 23:14:03 -05:00
comfyanonymous	52c1d933b2	Fix tiled hunyuan video VAE encode issue. Some shapes like 1024x1024 with tile_size 256 and overlap 64 had issues.	2024-12-19 22:55:15 -05:00
comfyanonymous	2dda7c11a3	More proper fix for the memory issue.	2024-12-19 16:21:56 -05:00
comfyanonymous	3ad3248ad7	Fix lowvram bug when using a model multiple times in a row. The memory system would load an extra 64MB each time until either the model was completely in memory or OOM.	2024-12-19 16:04:56 -05:00
comfyanonymous	c441048a4f	Make VAE Encode tiled node work with video VAE.	2024-12-19 05:31:39 -05:00
comfyanonymous	9f4b181ab3	Add fast previews for hunyuan video.	2024-12-18 18:24:23 -05:00
comfyanonymous	cbbf077593	Small optimizations.	2024-12-18 18:23:28 -05:00
comfyanonymous	ff2ff02168	Support old diffusion-pipe hunyuan video loras.	2024-12-18 06:23:54 -05:00
comfyanonymous	4c5c4ddeda	Fix regression in VAE code on old pytorch versions.	2024-12-18 03:08:28 -05:00
comfyanonymous	37e5390f5f	Add: --use-sage-attention to enable SageAttention. You need to have the library installed first.	2024-12-18 01:56:10 -05:00
comfyanonymous	a4f59bc65e	Pick attention implementation based on device in llama code.	2024-12-18 01:30:20 -05:00
comfyanonymous	ca457f7ba1	Properly tokenize the template for hunyuan video.	2024-12-17 16:22:02 -05:00
comfyanonymous	cd6f615038	Fix tiled vae not working with some shapes.	2024-12-17 16:22:02 -05:00
comfyanonymous	e4e1bff605	Support diffusion-pipe hunyuan video lora format.	2024-12-17 07:14:21 -05:00
comfyanonymous	d6656b0c0c	Support llama hunyuan video text encoder in scaled fp8 format.	2024-12-17 04:19:22 -05:00
comfyanonymous	f4cdedea62	Fix regression with ltxv VAE.	2024-12-17 02:17:31 -05:00
comfyanonymous	39b1fc4ccc	Adjust used dtypes for hunyuan video VAE and diffusion model.	2024-12-16 23:31:10 -05:00
comfyanonymous	bda1482a27	Basic Hunyuan Video model support.	2024-12-16 19:35:40 -05:00
comfyanonymous	19ee5d9d8b	Don't expand mask when not necessary. Expanding seems to slow down inference.	2024-12-16 18:22:50 -05:00
Raphael Walker	61b50720d0	Add support for attention masking in Flux (#5942) * fix attention OOM in xformers * allow passing attention mask in flux attention * allow an attn_mask in flux * attn masks can be done using replace patches instead of a separate dict * fix return types * fix return order * enumerate * patch the right keys * arg names * fix a silly bug * fix xformers masks * replace match with if, elif, else * mask with image_ref_size * remove unused import * remove unused import 2 * fix pytorch/xformers attention This corrects a weird inconsistency with skip_reshape. It also allows masks of various shapes to be passed, which will be automatically expanded (in a memory-efficient way) to a size that is compatible with xformers or pytorch sdpa respectively. * fix mask shapes	2024-12-16 18:21:17 -05:00
comfyanonymous	e83063bf24	Support conv3d in PatchEmbed.	2024-12-14 05:46:04 -05:00
comfyanonymous	4e14032c02	Make pad_to_patch_size function work on multi dim.	2024-12-13 07:22:05 -05:00
Chenlei Hu	563291ee51	Enforce all pyflake lint rules (#6033) * Enforce F821 undefined-name * Enforce all pyflake lint rules	2024-12-12 19:29:37 -05:00
Chenlei Hu	2cddbf0821	Lint and fix undefined names (1/N) (#6028)	2024-12-12 18:55:26 -05:00
Chenlei Hu	60749f345d	Lint and fix undefined names (3/N) (#6030)	2024-12-12 18:49:40 -05:00
Chenlei Hu	d9d7f3c619	Lint all unused variables (#5989) * Enable F841 * Autofix * Remove all unused variable assignment	2024-12-12 17:59:16 -05:00
comfyanonymous	fd5dfb812c	Set initial load devices for te and model to mps device on mac.	2024-12-12 06:00:31 -05:00
comfyanonymous	7a7efe8424	Support loading some checkpoint files with nested dicts.	2024-12-11 08:04:54 -05:00
comfyanonymous	44db978531	Fix a few things in text enc code for models with no eos token.	2024-12-10 23:07:26 -05:00
comfyanonymous	1c8d11e48a	Support different types of tokenizers. Support tokenizers without an eos token. Pass full sentences to tokenizer for more efficient tokenizing.	2024-12-10 15:03:39 -05:00
catboxanon	23827ca312	Add `cond_scale` to `sampler_post_cfg_function` (#5985)	2024-12-09 20:13:18 -05:00
Chenlei Hu	0fd4e6c778	Lint unused import (#5973) * Lint unused import * nit * Remove unused imports * revert fix_torch import * nit	2024-12-09 15:24:39 -05:00
comfyanonymous	e2fafe0686	Make CLIP set last layer node work with t5 models.	2024-12-09 03:57:14 -05:00
Haoming	fbf68c4e52	clamp input (#5928)	2024-12-07 14:00:31 -05:00
comfyanonymous	8af9a91e0c	A few improvements to #5937.	2024-12-06 05:49:15 -05:00
Michael Kupchick	005d2d3a13	ltxv: add noise to guidance image to ensure generated motion. (#5937)	2024-12-06 05:46:08 -05:00
comfyanonymous	1e21f4c14e	Make timestep ranges more usable on rectified flow models. This breaks some old workflows but should make the nodes actually useful.	2024-12-05 16:40:58 -05:00
Chenlei Hu	48272448ad	[Developer Experience] Add node typing (#5676) * [Developer Experience] Add node typing * Shim StrEnum * nit * nit * nit	2024-12-04 15:01:00 -05:00
comfyanonymous	452179fe4f	Make ModelPatcher class clone function work with inheritance.	2024-12-03 13:57:57 -05:00
comfyanonymous	c1b92b719d	Some optimizations to euler a.	2024-12-03 06:11:52 -05:00
comfyanonymous	57e8bf6a9f	Fix case where a memory leak could cause crash. Now the only symptom of code messing up and keeping references to a model object when it should not will be endless prints in the log instead of the next workflow crashing ComfyUI.	2024-12-02 19:49:49 -05:00
Jedrzej Kosinski	0ee322ec5f	ModelPatcher Overhaul and Hook Support (#5583 ) * Added hook_patches to ModelPatcher for weights (model) * Initial changes to calc_cond_batch to eventually support hook_patches * Added current_patcher property to BaseModel * Consolidated add_hook_patches_as_diffs into add_hook_patches func, fixed fp8 support for model-as-lora feature * Added call to initialize_timesteps on hooks in process_conds func, and added call prepare current keyframe on hooks in calc_cond_batch * Added default_conds support in calc_cond_batch func * Added initial set of hook-related nodes, added code to register hooks for loras/model-as-loras, small renaming/refactoring * Made CLIP work with hook patches * Added initial hook scheduling nodes, small renaming/refactoring * Fixed MaxSpeed and default conds implementations * Added support for adding weight hooks that aren't registered on the ModelPatcher at sampling time * Made Set Clip Hooks node work with hooks from Create Hook nodes, began work on better Create Hook Model As LoRA node * Initial work on adding 'model_as_lora' lora type to calculate_weight * Continued work on simpler Create Hook Model As LoRA node, started to implement ModelPatcher callbacks, attachments, and additional_models * Fix incorrect ref to create_hook_patches_clone after moving function * Added injections support to ModelPatcher + necessary bookkeeping, added additional_models support in ModelPatcher, conds, and hooks * Added wrappers to ModelPatcher to facilitate standardized function wrapping * Started scaffolding for other hook types, refactored get_hooks_from_cond to organize hooks by type * Fix skip_until_exit logic bug breaking injection after first run of model * Updated clone_has_same_weights function to account for new ModelPatcher properties, improved AutoPatcherEjector usage in partially_load * Added WrapperExecutor for non-classbound functions, added calc_cond_batch wrappers * Refactored callbacks+wrappers to allow storing lists 
by id * Added forward_timestep_embed_patch type, added helper functions on ModelPatcher for emb_patch and forward_timestep_embed_patch, added helper functions for removing callbacks/wrappers/additional_models by key, added custom_should_register prop to hooks * Added get_attachment func on ModelPatcher * Implement basic MemoryCounter system for determing with cached weights due to hooks should be offloaded in hooks_backup * Modified ControlNet/T2IAdapter get_control function to receive transformer_options as additional parameter, made the model_options stored in extra_args in inner_sample be a clone of the original model_options instead of same ref * Added create_model_options_clone func, modified type annotations to use __future__ so that I can use the better type annotations * Refactored WrapperExecutor code to remove need for WrapperClassExecutor (now gone), added sampler.sample wrapper (pending review, will likely keep but will see what hacks this could currently let me get rid of in ACN/ADE) * Added Combine versions of Cond/Cond Pair Set Props nodes, renamed Pair Cond to Cond Pair, fixed default conds never applying hooks (due to hooks key typo) * Renamed Create Hook Model As LoRA nodes to make the test node the main one (more changes pending) * Added uuid to conds in CFGGuider and uuids to transformer_options to allow uniquely identifying conds in batches during sampling * Fixed models not being unloaded properly due to current_patcher reference; the current ComfyUI model cleanup code requires that nothing else has a reference to the ModelPatcher instances * Fixed default conds not respecting hook keyframes, made keyframes not reset cache when strength is unchanged, fixed Cond Set Default Combine throwing error, fixed model-as-lora throwing error during calculate_weight after a recent ComfyUI update, small refactoring/scaffolding changes for hooks * Changed CreateHookModelAsLoraTest to be the new CreateHookModelAsLora, rename old ones as 'direct' and will be 
removed prior to merge * Added initial support within CLIP Text Encode (Prompt) node for scheduling weight hook CLIP strength via clip_start_percent/clip_end_percent on conds, added schedule_clip toggle to Set CLIP Hooks node, small cleanup/fixes * Fix range check in get_hooks_for_clip_schedule so that proper keyframes get assigned to corresponding ranges * Optimized CLIP hook scheduling to treat same strength as same keyframe * Less fragile memory management. * Make encode_from_tokens_scheduled call cleaner, rollback change in model_patcher.py for hook_patches_backup dict * Fix issue. * Remove useless function. * Prevent and detect some types of memory leaks. * Run garbage collector when switching workflow if needed. * Moved WrappersMP/CallbacksMP/WrapperExecutor to patcher_extension.py * Refactored code to store wrappers and callbacks in transformer_options, added apply_model and diffusion_model.forward wrappers * Fix issue. * Refactored hooks in calc_cond_batch to be part of get_area_and_mult tuple, added extra_hooks to ControlBase to allow custom controlnets w/ hooks, small cleanup and renaming * Fixed inconsistency of results when schedule_clip is set to False, small renaming/typo fixing, added initial support for ControlNet extra_hooks to work in tandem with normal cond hooks, initial work on calc_cond_batch merging all subdicts in returned transformer_options * Modified callbacks and wrappers so that unregistered types can be used, allowing custom_nodes to have their own unique callbacks/wrappers if desired * Updated different hook types to reflect actual progress of implementation, initial scaffolding for working WrapperHook functionality * Fixed existing weight hook_patches (pre-registered) not working properly for CLIP * Removed Register/Direct hook nodes since they were present only for testing, removed diff-related weight hook calculation as improved_memory removes unload_model_clones and using sample time registered hooks is less hacky * Added clip 
scheduling support to all other native ComfyUI text encoding nodes (sdxl, flux, hunyuan, sd3) * Made WrapperHook functional, added another wrapper/callback getter, added ON_DETACH callback to ModelPatcher * Made opt_hooks append by default instead of replace, renamed comfy.hooks set functions to be more accurate * Added apply_to_conds to Set CLIP Hooks, modified relevant code to allow text encoding to automatically apply hooks to output conds when apply_to_conds is set to True * Fix cached_hook_patches not respecting target_device/memory_counter results * Fixed issue with setting weights from hooks instead of copying them, added additional memory_counter check when caching hook patches * Remove unnecessary torch.no_grad calls for hook patches * Increased MemoryCounter minimum memory to leave free by 2 until a better way to get inference memory estimate of currently loaded models exists For encode_from_tokens_scheduled, allow start_percent and end_percent in add_dict to limit which scheduled conds get encoded for optimization purposes * Removed a .to call on results of calculate_weight in patch_hook_weight_to_device that was screwing up the intermediate results for fp8 prior to being passed into stochastic_rounding call * Made encode_from_tokens_scheduled work when no hooks are set on patcher * Small cleanup of comments * Turn off hook patch caching when only 1 hook present in sampling, replace some current_hook = None with calls to self.patch_hooks(None) instead to avoid a potential edge case * On Cond/Cond Pair nodes, removed opt_ prefix from optional inputs * Allow both FLOATS and FLOAT for floats_strength input * Revert change, does not work * Made patch_hook_weight_to_device respect set_func and convert_func * Make discard_model_sampling True by default * Add changes manually from 'master' so merge conflict resolution goes more smoothly * Cleaned up text encode nodes with just a single clip.encode_from_tokens_scheduled call * Make sure 
encode_from_tokens_scheduled will respect use_clip_schedule on clip * Made nodes in nodes_hooks be marked as experimental (beta) * Add get_nested_additional_models for cases where additional_models could have their own additional_models, and add robustness for circular additional_models references * Made finalize_default_conds area math consistent with other sampling code * Changed 'opt_hooks' input of Cond/Cond Pair Set Default Combine nodes to 'hooks' * Remove a couple old TODO's and a no longer necessary workaround	2024-12-02 14:51:02 -05:00
comfyanonymous	79d5ceae6e	Improved memory management. (#5450) * Less fragile memory management. * Fix issue. * Remove useless function. * Prevent and detect some types of memory leaks. * Run garbage collector when switching workflow if needed. * Fix issue.	2024-12-02 14:39:34 -05:00
comfyanonymous	2d5b3e0078	Remove useless code.	2024-12-02 06:49:55 -05:00
comfyanonymous	8e4118c0de	make dpm_2_ancestral work with rectified flow.	2024-12-01 07:37:41 -05:00
comfyanonymous	26fb2c68e8	Add a way to disable cropping in the CLIPVisionEncode node.	2024-11-28 20:24:47 -05:00
comfyanonymous	bf2650a80e	Fast previews for ltxv.	2024-11-28 06:46:15 -05:00
comfyanonymous	b666539595	Remove print.	2024-11-27 20:28:39 -05:00
comfyanonymous	95d8713482	Missing parentheses.	2024-11-27 13:45:32 -05:00
comfyanonymous	497db6212f	Alternative fix for #5767	2024-11-26 17:53:04 -05:00
comfyanonymous	4c82741b54	Support official SD3.5 Controlnets.	2024-11-26 11:31:25 -05:00
comfyanonymous	15c39ea757	Support for the official mochi lora format.	2024-11-26 03:34:36 -05:00
comfyanonymous	b7143b74ce	Flux inpaint model does not work in fp16.	2024-11-26 01:33:01 -05:00
comfyanonymous	61196d8857	Add option to inference the diffusion model in fp32 and fp64.	2024-11-25 05:00:23 -05:00
comfyanonymous	b4526d3fc3	Skip layer guidance now works on hydit model.	2024-11-24 05:54:30 -05:00
comfyanonymous	ab885b33ba	Skip layer guidance node now works on LTX-Video.	2024-11-23 10:33:05 -05:00
comfyanonymous	839ed3368e	Some improvements to the lowvram unloading.	2024-11-22 20:59:15 -05:00
comfyanonymous	6e8cdcd3cb	Fix some tiled VAE decoding issues with LTX-Video.	2024-11-22 18:00:34 -05:00
comfyanonymous	e5c3f4b87f	LTXV lowvram fixes.	2024-11-22 17:17:11 -05:00
comfyanonymous	bc6be6c11e	Some fixes to the lowvram system.	2024-11-22 16:40:04 -05:00
comfyanonymous	5818f6cf51	Remove print.	2024-11-22 10:49:15 -05:00
comfyanonymous	5e16f1d24b	Support Lightricks LTX-Video model.	2024-11-22 08:46:39 -05:00
comfyanonymous	2fd9c1308a	Fix mask issue in some attention functions.	2024-11-22 02:10:09 -05:00
comfyanonymous	8f0009aad0	Support new flux model variants.	2024-11-21 08:38:23 -05:00
comfyanonymous	41444b5236	Add some new weight patching functionality. Add a way to reshape lora weights. Allow weight patches to all weight not just .weight and .bias Add a way for a lora to set a weight to a specific value.	2024-11-21 07:19:17 -05:00
comfyanonymous	07f6eeaa13	Fix mask issue with attention_xformers.	2024-11-20 17:07:46 -05:00
comfyanonymous	22535d0589	Skip layer guidance now works on stable audio model.	2024-11-20 07:33:06 -05:00
comfyanonymous	b699a15062	Refactor inpaint/ip2p code.	2024-11-19 03:25:25 -05:00
comfyanonymous	d9f90965c8	Support block replace patches in auraflow.	2024-11-17 08:19:59 -05:00
comfyanonymous	41886af138	Add transformer options blocks replace patch to mochi.	2024-11-16 20:48:14 -05:00
comfyanonymous	3b9a6cf2b1	Fix issue with 3d masks.	2024-11-13 07:18:30 -05:00
comfyanonymous	8ebf2d8831	Add block replace transformer_options to flux.	2024-11-12 08:00:39 -05:00
comfyanonymous	eb476e6ea9	Allow 1D masks for 1D latents.	2024-11-11 14:44:52 -05:00
comfyanonymous	8b275ce5be	Support auto detecting some zsnr anime checkpoints.	2024-11-11 05:34:11 -05:00
comfyanonymous	2a18e98ccf	Refactor so that zsnr can be set in the sampling_settings.	2024-11-11 04:55:56 -05:00
comfyanonymous	bdeb1c171c	Fast previews for mochi.	2024-11-10 03:39:35 -05:00
comfyanonymous	8b90e50979	Properly handle and reshape masks when used on 3d latents.	2024-11-09 15:30:19 -05:00
comfyanonymous	2865f913f7	Free memory before doing tiled decode.	2024-11-07 04:01:24 -05:00
comfyanonymous	b49616f951	Make VAEDecodeTiled node work with video VAEs.	2024-11-07 03:47:12 -05:00
comfyanonymous	5e29e7a488	Remove scaled_fp8 key after reading it to silence warning.	2024-11-06 04:56:42 -05:00
comfyanonymous	8afb97cd3f	Fix unknown VAE being detected as the mochi VAE.	2024-11-05 03:43:27 -05:00
contentis	69694f40b3	fix dynamic shape export (#5490)	2024-11-04 14:59:28 -05:00
comfyanonymous	6c9dbde7de	Fix mochi all in one checkpoint t5xxl key names.	2024-11-03 01:40:42 -05:00
comfyanonymous	fabf449feb	Mochi VAE encoder.	2024-11-01 17:33:09 -04:00
Aarni Koskela	1c8286a44b	Avoid SyntaxWarning in UniPC docstring (#5442)	2024-10-31 15:17:26 -04:00
comfyanonymous	1af4a47fd1	Bump up mac version for attention upcast bug workaround.	2024-10-31 15:15:31 -04:00
comfyanonymous	daa1565b93	Fix diffusers flux controlnet regression.	2024-10-30 13:11:34 -04:00
comfyanonymous	09fdb2b269	Support SD3.5 medium diffusers format weights and loras.	2024-10-30 04:24:00 -04:00
comfyanonymous	30c0c81351	Add a way to patch blocks in SD3.	2024-10-29 00:48:32 -04:00
comfyanonymous	13b0ff8a6f	Update SD3 code.	2024-10-28 21:58:52 -04:00
comfyanonymous	c320801187	Remove useless line.	2024-10-28 17:41:12 -04:00
comfyanonymous	669d9e4c67	Set default shift on mochi to 6.0	2024-10-27 22:21:04 -04:00
comfyanonymous	9ee0a6553a	float16 inference is a bit broken on mochi.	2024-10-27 04:56:40 -04:00
comfyanonymous	5cbb01bc2f	Basic Genmo Mochi video model support. To use: "Load CLIP" node with t5xxl + type mochi "Load Diffusion Model" node with the mochi dit file. "Load VAE" with the mochi vae file. EmptyMochiLatentVideo node for the latent. euler + linear_quadratic in the KSampler node.	2024-10-26 06:54:00 -04:00
comfyanonymous	c3ffbae067	Make LatentUpscale nodes work on 3d latents.	2024-10-26 01:50:51 -04:00
comfyanonymous	d605677b33	Make euler_ancestral work on flow models (credit: Ashen).	2024-10-25 19:53:44 -04:00
PsychoLogicAu	af8cf79a2d	support SimpleTuner lycoris lora for SD3 (#5340 )	2024-10-24 01:18:32 -04:00
comfyanonymous	66b0961a46	Fix ControlLora issue with last commit.	2024-10-23 17:02:40 -04:00
comfyanonymous	754597c8a9	Clean up some controlnet code. Remove self.device which was useless.	2024-10-23 14:19:05 -04:00
comfyanonymous	915fdb5745	Fix lowvram edge case.	2024-10-22 16:34:50 -04:00
contentis	5a8a48931a	remove attention abstraction (#5324 )	2024-10-22 14:02:38 -04:00
comfyanonymous	8ce2a1052c	Optimizations to --fast and scaled fp8.	2024-10-22 02:12:28 -04:00
comfyanonymous	f82314fcfc	Fix duplicate sigmas on beta scheduler.	2024-10-21 20:19:45 -04:00
comfyanonymous	0075c6d096	Mixed precision diffusion models with scaled fp8. This change allows supports for diffusion models where all the linears are scaled fp8 while the other weights are the original precision.	2024-10-21 18:12:51 -04:00
comfyanonymous	83ca891118	Support scaled fp8 t5xxl model.	2024-10-20 22:27:00 -04:00
comfyanonymous	f9f9faface	Fixed model merging issue with scaled fp8.	2024-10-20 06:24:31 -04:00
comfyanonymous	471cd3eace	fp8 casting is fast on GPUs that support fp8 compute.	2024-10-20 00:54:47 -04:00
comfyanonymous	a68bbafddb	Support diffusion models with scaled fp8 weights.	2024-10-19 23:47:42 -04:00
comfyanonymous	73e3a9e676	Clamp output when rounding weight to prevent Nan.	2024-10-19 19:07:10 -04:00
comfyanonymous	67158994a4	Use the lowvram cast_to function for everything.	2024-10-17 17:25:56 -04:00
comfyanonymous	0bedfb26af	Revert "Fix Transformers FutureWarning (#5140 )" This reverts commit `95b7cf9bbe`.	2024-10-16 12:36:19 -04:00
comfyanonymous	f584758271	Cleanup some useless lines.	2024-10-14 21:02:39 -04:00
svdc	95b7cf9bbe	Fix Transformers FutureWarning (#5140 ) * Update sd1_clip.py Fix Transformers FutureWarning * Update sd1_clip.py Fix comment	2024-10-14 20:12:20 -04:00
comfyanonymous	3c60ecd7a8	Fix fp8 ops staying enabled.	2024-10-12 14:10:13 -04:00
comfyanonymous	7ae6626723	Remove useless argument.	2024-10-12 07:16:21 -04:00
comfyanonymous	6632365e16	model_options consistency between functions. weight_dtype -> dtype	2024-10-11 20:51:19 -04:00
Kadir Nar	ad07796777	🐛 Add device to variable c (#5210 )	2024-10-11 20:37:50 -04:00
comfyanonymous	1b80895285	Make clip loader nodes support loading sd3 t5xxl in lower precision. Add attention mask support in the SD3 text encoder code.	2024-10-10 15:06:15 -04:00
Dr.Lt.Data	5f9d5a244b	Hotfix for the div zero occurrence when memory_used_encode is 0 (#5121 ) https://github.com/comfyanonymous/ComfyUI/issues/5069#issuecomment-2382656368	2024-10-09 23:34:34 -04:00
Jonathan Avila	4b2f0d9413	Increase maximum macOS version to 15.0.1 when forcing upcast attention (#5191 )	2024-10-09 22:21:41 -04:00
comfyanonymous	e38c94228b	Add a weight_dtype fp8_e4m3fn_fast to the Diffusion Model Loader node. This is used to load weights in fp8 and use fp8 matrix multiplication.	2024-10-09 19:43:17 -04:00
comfyanonymous	203942c8b2	Fix flux doras with diffusers keys.	2024-10-08 19:03:40 -04:00
comfyanonymous	8dfa0cc552	Make SD3 fast previews a little better.	2024-10-07 09:19:59 -04:00
comfyanonymous	e5ecdfdd2d	Make fast previews for SDXL a little better by adding a bias.	2024-10-06 19:27:04 -04:00
comfyanonymous	7d29fbf74b	Slightly improve the fast previews for flux by adding a bias.	2024-10-06 17:55:46 -04:00
comfyanonymous	7d2467e830	Some minor cleanups.	2024-10-05 13:22:39 -04:00
comfyanonymous	6f021d8aa0	Let --verbose have an argument for the log level.	2024-10-04 10:05:34 -04:00
comfyanonymous	d854ed0bcf	Allow using SD3 type te output on flux model.	2024-10-03 09:44:54 -04:00
comfyanonymous	abcd006b8c	Allow more permutations of clip/t5 in dual clip loader.	2024-10-03 09:26:11 -04:00
comfyanonymous	d985d1d7dc	CLIP Loader node now supports clip_l and clip_g only for SD3.	2024-10-02 04:25:17 -04:00
comfyanonymous	d1cdf51e1b	Refactor some of the TE detection code.	2024-10-01 07:08:41 -04:00
comfyanonymous	b4626ab93e	Add simpletuner lycoris format for SD unet.	2024-09-30 06:03:27 -04:00
comfyanonymous	a9e459c2a4	Use torch.nn.functional.linear in RGB preview code. Add an optional bias to the latent RGB preview code.	2024-09-29 11:27:49 -04:00
comfyanonymous	3bb4dec720	Fix issue with loras, lowvram and --fast fp8.	2024-09-28 14:42:32 -04:00
City	8733191563	Flux torch.compile fix (#5082 )	2024-09-27 22:07:51 -04:00
comfyanonymous	bdd4a22a2e	Fix flux TE not loading t5 embeddings.	2024-09-24 22:57:22 -04:00
chaObserv	479a427a48	Add dpmpp_2m_cfg_pp (#4992 )	2024-09-24 02:42:56 -04:00
comfyanonymous	3a0eeee320	Make --listen listen on both ipv4 and ipv6 at the same time by default.	2024-09-23 04:38:19 -04:00
comfyanonymous	9c41bc8d10	Remove useless line.	2024-09-23 02:32:29 -04:00
comfyanonymous	7a415f47a9	Add an optional VAE input to the ControlNetApplyAdvanced node. Deprecate the other controlnet nodes.	2024-09-22 01:24:52 -04:00
comfyanonymous	dc96a1ae19	Load controlnet in fp8 if weights are in fp8.	2024-09-21 04:50:12 -04:00
comfyanonymous	2d810b081e	Add load_controlnet_state_dict function.	2024-09-21 01:51:51 -04:00
comfyanonymous	9f7e9f0547	Add an error message when a controlnet needs a VAE but none is given.	2024-09-21 01:33:18 -04:00
comfyanonymous	70a708d726	Fix model merging issue.	2024-09-20 02:31:44 -04:00
yoinked	e7d4782736	add laplace scheduler [2407.03297] (#4990 ) * add laplace scheduler [2407.03297] * should be here instead lol * better settings	2024-09-19 23:23:09 -04:00
comfyanonymous	ad66f7c7d8	Add model_options to load_controlnet function.	2024-09-19 08:23:35 -04:00
Simon Lui	de8e8e3b0d	Fix xpu Pytorch nightly build from calling optimize which doesn't exist. (#4978 )	2024-09-19 05:11:42 -04:00
pharmapsychotic	0b7dfa986d	Improve tiling calculations to reduce number of tiles that need to be processed. (#4944 )	2024-09-17 03:51:10 -04:00
comfyanonymous	d514bb38ee	Add some option to model_options for the text encoder. load_device, offload_device and the initial_device can now be set.	2024-09-17 03:49:54 -04:00
comfyanonymous	0849c80e2a	get_key_patches now works without unloading the model.	2024-09-17 01:57:59 -04:00
comfyanonymous	e813abbb2c	Long CLIP L support for SDXL, SD3 and Flux. Use the *CLIPLoader nodes.	2024-09-15 07:59:38 -04:00
comfyanonymous	f48e390032	Support AliMama SD3 and Flux inpaint controlnets. Use the ControlNetInpaintingAliMamaApply node.	2024-09-14 09:05:16 -04:00
comfyanonymous	cf80d28689	Support loading controlnets with different input.	2024-09-13 09:54:37 -04:00
Robin Huang	b962db9952	Add cli arg to override user directory (#4856 ) * Override user directory. * Use overridden user directory. * Remove prints. * Remove references to global user_files. * Remove unused replace_folder function. * Remove newline. * Remove global during get_user_directory. * Add validation.	2024-09-12 08:10:27 -04:00
comfyanonymous	9d720187f1	types -> comfy_types to fix import issue.	2024-09-12 03:57:46 -04:00
comfyanonymous	9f4daca9d9	Doesn't really make sense for cfg_pp sampler to call regular one.	2024-09-11 02:51:36 -04:00
yoinked	b5d0f2a908	Add CFG++ to DPM++ 2S Ancestral (#3871 ) * Update sampling.py * Update samplers.py * my bad * "fix" the sampler * Update samplers.py * i named it wrong * minor sampling improvements mainly using a dynamic rho value (hey this sounds a lot like smea!!!) * revert rho change rho? r? its just 1/2	2024-09-11 02:49:44 -04:00
comfyanonymous	9c5fca75f4	Fix lora issue.	2024-09-08 10:10:47 -04:00
comfyanonymous	32a60a7bac	Support onetrainer text encoder Flux lora.	2024-09-08 09:31:41 -04:00
Jim Winkens	bb52934ba4	Fix import issue (#4815 )	2024-09-07 05:28:32 -04:00
comfyanonymous	ea77750759	Support a generic Comfy format for text encoder loras. This is a format with keys like: text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.lora_up.weight Instead of waiting for me to add support for specific lora formats you can convert your text encoder loras to this format instead. If you want to see an example save a text encoder lora with the SaveLora node with the commit right after this one.	2024-09-07 02:20:39 -04:00
comfyanonymous	c27ebeb1c2	Fix onnx export not working on flux.	2024-09-06 03:21:52 -04:00
comfyanonymous	5cbaa9e07c	Mistoline flux controlnet support.	2024-09-05 00:05:17 -04:00
comfyanonymous	c7427375ee	Prioritize freeing partially offloaded models first.	2024-09-04 19:47:32 -04:00
Jedrzej Kosinski	f04229b84d	Add emb_patch support to UNetModel forward (#4779 )	2024-09-04 14:35:15 -04:00
Silver	f067ad15d1	Make live preview size a configurable launch argument (#4649 ) * Make live preview size a configurable launch argument * Remove import from testing phase * Update cli_args.py	2024-09-03 19:16:38 -04:00
comfyanonymous	483004dd1d	Support newer glora format.	2024-09-03 17:02:19 -04:00
comfyanonymous	00a5d08103	Lower fp8 lora memory usage.	2024-09-03 01:25:05 -04:00
comfyanonymous	d043997d30	Flux onetrainer lora.	2024-09-02 08:22:15 -04:00
comfyanonymous	8d31a6632f	Speed up inference on nvidia 10 series on Linux.	2024-09-01 17:29:31 -04:00
comfyanonymous	b643eae08b	Make minimum_inference_memory() depend on --reserve-vram	2024-09-01 01:18:34 -04:00
comfyanonymous	935ae153e1	Cleanup.	2024-08-30 12:53:59 -04:00
Chenlei Hu	e91662e784	Get logs endpoint & system_stats additions (#4690 ) * Add route for getting output logs * Include ComfyUI version * Move to own function * Changed to memory logger * Unify logger setup logic * Fix get version git fallback --------- Co-authored-by: pythongosssss <125205205+pythongosssss@users.noreply.github.com>	2024-08-30 12:46:37 -04:00
comfyanonymous	63fafaef45	Fix potential issue with hydit controlnets.	2024-08-30 04:58:41 -04:00
comfyanonymous	6eb5d64522	Fix glora lowvram issue.	2024-08-29 19:07:23 -04:00
comfyanonymous	10a79e9898	Implement model part of flux union controlnet.	2024-08-29 18:41:22 -04:00
comfyanonymous	ea3f39bd69	InstantX depth flux controlnet.	2024-08-29 02:14:19 -04:00
comfyanonymous	b33cd61070	InstantX canny controlnet.	2024-08-28 19:02:50 -04:00
comfyanonymous	d31e226650	Unify RMSNorm code.	2024-08-28 16:56:38 -04:00
comfyanonymous	38c22e631a	Fix case where model was not properly unloaded in merging workflows.	2024-08-27 19:03:51 -04:00
Chenlei Hu	6bbdcd28ae	Support weight padding on diff weight patch (#4576 )	2024-08-27 13:55:37 -04:00
comfyanonymous	ab130001a8	Do RMSNorm in native type.	2024-08-27 02:41:56 -04:00
comfyanonymous	2ca8f6e23d	Make the stochastic fp8 rounding reproducible.	2024-08-26 15:12:06 -04:00
comfyanonymous	7985ff88b9	Use less memory in float8 lora patching by doing calculations in fp16.	2024-08-26 14:45:58 -04:00
comfyanonymous	c6812947e9	Fix potential memory leak.	2024-08-26 02:07:32 -04:00
comfyanonymous	9230f65823	Fix some controlnets OOMing when loading.	2024-08-25 05:54:29 -04:00
comfyanonymous	8ae23d8e80	Fix onnx export.	2024-08-23 17:52:47 -04:00
comfyanonymous	7df42b9a23	Fix dora.	2024-08-23 04:58:59 -04:00
comfyanonymous	5d8bbb7281	Cleanup.	2024-08-23 04:06:27 -04:00
comfyanonymous	2c1d2375d6	Fix.	2024-08-23 04:04:55 -04:00
Simon Lui	64ccb3c7e3	Rework IPEX check for future inclusion of XPU into Pytorch upstream and do a bit more optimization of ipex.optimize(). (#4562 )	2024-08-23 03:59:57 -04:00
Scorpinaus	9465b23432	Added SD15_Inpaint_Diffusers model support for unet_config_from_diffusers_unet function (#4565 )	2024-08-23 03:57:08 -04:00
comfyanonymous	c0b0da264b	Missing imports.	2024-08-22 17:20:51 -04:00
comfyanonymous	c26ca27207	Move calculate function to comfy.lora	2024-08-22 17:12:00 -04:00
comfyanonymous	7c6bb84016	Code cleanups.	2024-08-22 17:05:12 -04:00
comfyanonymous	c54d3ed5e6	Fix issue with models staying loaded in memory.	2024-08-22 15:58:20 -04:00
comfyanonymous	c7ee4b37a1	Try to fix some lora issues.	2024-08-22 15:32:18 -04:00
David	7b70b266d8	Generalize MacOS version check for force-upcast-attention (#4548 ) This code automatically forces upcasting attention for MacOS versions 14.5 and 14.6. My computer returns the string "14.6.1" for `platform.mac_ver()[0]`, so this generalizes the comparison to catch more versions. I am running MacOS Sonoma 14.6.1 (latest version) and was seeing black image generation on previously functional workflows after recent software updates. This PR solved the issue for me. See comfyanonymous/ComfyUI#3521	2024-08-22 13:24:21 -04:00
comfyanonymous	8f60d093ba	Fix issue.	2024-08-22 10:38:24 -04:00
comfyanonymous	843a7ff70c	fp16 is actually faster than fp32 on a GTX 1080.	2024-08-21 23:23:50 -04:00
comfyanonymous	a60620dcea	Fix slow performance on 10 series Nvidia GPUs.	2024-08-21 16:39:02 -04:00
comfyanonymous	015f73dc49	Try a different type of flux fp16 fix.	2024-08-21 16:17:15 -04:00
comfyanonymous	904bf58e7d	Make --fast work on pytorch nightly.	2024-08-21 14:01:41 -04:00
Svein Ove Aas	5f50263088	Replace use of .view with .reshape (#4522 ) When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view. As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere. Note that --fast is only marginally faster than the default.	2024-08-21 11:21:48 -04:00
comfyanonymous	03ec517afb	Remove useless line, adjust windows default reserved vram.	2024-08-21 00:47:19 -04:00
comfyanonymous	510f3438c1	Speed up fp8 matrix mult by using better code.	2024-08-20 22:53:26 -04:00
comfyanonymous	ea63b1c092	Simpletrainer lycoris format.	2024-08-20 12:05:13 -04:00
comfyanonymous	9953f22fce	Add --fast argument to enable experimental optimizations. Optimizations that might break things/lower quality will be put behind this flag first and might be enabled by default in the future. Currently the only optimization is float8_e4m3fn matrix multiplication on 4000/ADA series Nvidia cards or later. If you have one of these cards you will see a speed boost when using fp8_e4m3fn flux for example.	2024-08-20 11:55:51 -04:00
comfyanonymous	d1a6bd6845	Support loading long clipl model with the CLIP loader node.	2024-08-20 10:46:36 -04:00
comfyanonymous	83dbac28eb	Properly set if clip text pooled projection instead of using hack.	2024-08-20 10:46:36 -04:00
comfyanonymous	538cb068bc	Make cast_to a nop if weight is already good.	2024-08-20 10:46:36 -04:00
comfyanonymous	1b3eee672c	Fix potential issue with multi devices.	2024-08-20 10:46:36 -04:00
comfyanonymous	9eee470244	New load_text_encoder_state_dicts function. Now you can load text encoders straight from a list of state dicts.	2024-08-19 17:36:35 -04:00
comfyanonymous	045377ea89	Add a --reserve-vram argument if you don't want comfy to use all of it. --reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free. This can also be useful if workflows are failing because of OOM errors but in that case please report it if --reserve-vram improves your situation.	2024-08-19 17:16:18 -04:00
comfyanonymous	4d341b78e8	Bug fixes.	2024-08-19 16:28:55 -04:00
comfyanonymous	6138f92084	Use better dtype for the lowvram lora system.	2024-08-19 15:35:25 -04:00
comfyanonymous	be0726c1ed	Remove duplication.	2024-08-19 15:26:50 -04:00
comfyanonymous	4506ddc86a	Better subnormal fp8 stochastic rounding. Thanks Ashen.	2024-08-19 13:38:03 -04:00
comfyanonymous	20ace7c853	Code cleanup.	2024-08-19 12:48:59 -04:00
comfyanonymous	22ec02afc0	Handle subnormal numbers in float8 rounding.	2024-08-19 05:51:08 -04:00
comfyanonymous	39f114c44b	Less broken non blocking?	2024-08-18 16:53:17 -04:00
comfyanonymous	6730f3e1a3	Disable non blocking. It fixed some perf issues but caused other issues that need to be debugged.	2024-08-18 14:38:09 -04:00
comfyanonymous	73332160c8	Enable non blocking transfers in lowvram mode.	2024-08-18 10:29:33 -04:00
comfyanonymous	2622c55aff	Automatically use RF variant of dpmpp_2s_ancestral if RF model.	2024-08-18 00:47:25 -04:00
Ashen	1beb348ee2	dpmpp_2s_ancestral_RF for rectified flow (Flux, SD3 and Auraflow).	2024-08-18 00:33:30 -04:00
comfyanonymous	d31df04c8a	Indentation.	2024-08-17 23:00:44 -04:00
Xrvk	e68763f40c	Add Flux model support for InstantX style controlnet residuals (#4444 ) * Add Flux model support for InstantX style controlnet residuals * Refactor Flux controlnet residual step to a separate method * Rollback minor change * New format for applying controlnet residuals: input->double_blocks, output->single_blocks * Adjust XLabs Flux controlnet to fit new syntax of applying Flux controlnet residuals * Remove unnecessary import and minor style change	2024-08-17 22:58:23 -04:00
comfyanonymous	4f7a3cb6fb	unet -> diffusion_models.	2024-08-17 21:31:04 -04:00
comfyanonymous	bb222ceddb	Fix loras having a weak effect when applied on fp8.	2024-08-17 15:20:17 -04:00
comfyanonymous	fca42836f2	Add model_options for text encoder.	2024-08-17 11:17:20 -04:00
comfyanonymous	cd5017c1c9	calculate_weight function to use a different dtype.	2024-08-17 01:06:08 -04:00
comfyanonymous	83f343146a	Fix potential lowvram issue.	2024-08-16 17:12:42 -04:00
Matthew Turnshek	1770fc77ed	Implement support for taef1 latent previews (#4409 ) * add taef1 handling to several places * remove guess_latent_channels and add latent_channels info directly to flux model * remove TODO * fix numbers	2024-08-16 12:53:13 -04:00
comfyanonymous	5960f946a9	Move a few files from comfy -> comfy_execution. Python code in the comfy folder should not import things from outside it.	2024-08-15 11:21:14 -04:00
guill	5cfe38f41c	Execution Model Inversion (#2666 ) * Execution Model Inversion This PR inverts the execution model -- from recursively calling nodes to using a topological sort of the nodes. This change allows for modification of the node graph during execution. This allows for two major advantages: 1. The implementation of lazy evaluation in nodes. For example, if a "Mix Images" node has a mix factor of exactly 0.0, the second image input doesn't even need to be evaluated (and visa-versa if the mix factor is 1.0). 2. Dynamic expansion of nodes. This allows for the creation of dynamic "node groups". Specifically, custom nodes can return subgraphs that replace the original node in the graph. This is an incredibly powerful concept. Using this functionality, it was easy to implement: a. Components (a.k.a. node groups) b. Flow control (i.e. while loops) via tail recursion c. All-in-one nodes that replicate the WebUI functionality d. and more All of those were able to be implemented entirely via custom nodes, so those features are not a part of this PR. (There are some front-end changes that should occur before that functionality is made widely available, particularly around variant sockets.) The custom nodes associated with this PR can be found at: https://github.com/BadCafeCode/execution-inversion-demo-comfyui Note that some of them require that variant socket types ("") be enabled. Allow `input_info` to be of type `None` * Handle errors (like OOM) more gracefully * Add a command-line argument to enable variants This allows the use of nodes that have sockets of type '' without applying a patch to the code. Fix an overly aggressive assertion. This could happen when attempting to evaluate `IS_CHANGED` for a node during the creation of the cache (in order to create the cache key). * Fix Pyright warnings * Add execution model unit tests * Fix issue with unused literals Behavior should now match the master branch with regard to undeclared inputs. 
Undeclared inputs that are socket connections will be used while undeclared inputs that are literals will be ignored. * Make custom VALIDATE_INPUTS skip normal validation Additionally, if `VALIDATE_INPUTS` takes an argument named `input_types`, that variable will be a dictionary of the socket type of all incoming connections. If that argument exists, normal socket type validation will not occur. This removes the last hurdle for enabling variant types entirely from custom nodes, so I've removed that command-line option. I've added appropriate unit tests for these changes. * Fix example in unit test This wouldn't have caused any issues in the unit test, but it would have bugged the UI if someone copy+pasted it into their own node pack. * Use fstrings instead of '%' formatting syntax * Use custom exception types. * Display an error for dependency cycles Previously, dependency cycles that were created during node expansion would cause the application to quit (due to an uncaught exception). Now, we'll throw a proper error to the UI. We also make an attempt to 'blame' the most relevant node in the UI. * Add docs on when ExecutionBlocker should be used * Remove unused functionality * Rename ExecutionResult.SLEEPING to PENDING * Remove superfluous function parameter * Pass None for uneval inputs instead of default This applies to `VALIDATE_INPUTS`, `check_lazy_status`, and lazy values in evaluation functions. * Add a test for mixed node expansion This test ensures that a node that returns a combination of expanded subgraphs and literal values functions correctly. * Raise exception for bad get_node calls. 
* Minor refactor of IsChangedCache.get * Refactor `map_node_over_list` function * Fix ui output for duplicated nodes * Add documentation on `check_lazy_status` * Add file for execution model unit tests * Clean up Javascript code as per review * Improve documentation Converted some comments to docstrings as per review * Add a new unit test for mixed lazy results This test validates that when an output list is fed to a lazy node, the node will properly evaluate previous nodes that are needed by any inputs to the lazy node. No code in the execution model has been changed. The test already passes. * Allow kwargs in VALIDATE_INPUTS functions When kwargs are used, validation is skipped for all inputs as if they had been mentioned explicitly. * List cached nodes in `execution_cached` message This was previously just bugged in this PR.	2024-08-15 11:21:11 -04:00
comfyanonymous	0f9c2a7822	Try to fix SDXL OOM issue on some configurations.	2024-08-14 23:08:54 -04:00
comfyanonymous	f1d6cef71c	Revert "Disable cuda malloc by default." This reverts commit `50bf66e5c4`.	2024-08-14 08:38:07 -04:00
comfyanonymous	33fb282d5c	Fix issue.	2024-08-14 02:51:47 -04:00
comfyanonymous	50bf66e5c4	Disable cuda malloc by default.	2024-08-14 02:49:25 -04:00
comfyanonymous	a5af64d3ce	Revert "Not sure if this actually changes anything but it can't hurt." This reverts commit `34608de2e9`.	2024-08-14 01:05:17 -04:00
comfyanonymous	34608de2e9	Not sure if this actually changes anything but it can't hurt.	2024-08-13 13:29:16 -04:00
comfyanonymous	39fb74c5bd	Fix bug when model cannot be partially unloaded.	2024-08-13 03:57:55 -04:00
comfyanonymous	74e124f4d7	Fix some issues with TE being in lowvram mode.	2024-08-12 23:42:21 -04:00
comfyanonymous	a562c17e8a	load_unet -> load_diffusion_model with a model_options argument.	2024-08-12 23:20:57 -04:00
comfyanonymous	5942c17d55	Order of operations matters.	2024-08-12 21:56:18 -04:00
comfyanonymous	c032b11e07	xlabs Flux controlnet implementation. (#4260 ) * xlabs Flux controlnet. * Fix not working on old python. * Remove comment.	2024-08-12 21:22:22 -04:00
comfyanonymous	b8ffb2937f	Memory tweaks.	2024-08-12 15:07:11 -04:00
comfyanonymous	5d43e75e5b	Fix some issues with the model sometimes not getting patched.	2024-08-12 12:27:54 -04:00
comfyanonymous	517f4a94e4	Fix some lora loading slowdowns.	2024-08-12 11:50:32 -04:00
comfyanonymous	52a471c5c7	Change name of log.	2024-08-12 10:35:06 -04:00
comfyanonymous	ad76574cb8	Fix some potential issues with the previous commits.	2024-08-12 00:23:29 -04:00
comfyanonymous	9acfe4df41	Support loading directly to vram with CLIPLoader node.	2024-08-12 00:06:01 -04:00
comfyanonymous	9829b013ea	Fix mistake in last commit.	2024-08-12 00:00:17 -04:00
comfyanonymous	5c69cde037	Load TE model straight to vram if certain conditions are met.	2024-08-11 23:52:43 -04:00
comfyanonymous	e9589d6d92	Add a way to set model dtype and ops from load_checkpoint_guess_config.	2024-08-11 08:50:34 -04:00
comfyanonymous	0d82a798a5	Remove the ckpt_path from load_state_dict_guess_config.	2024-08-11 08:37:35 -04:00
ljleb	925fff26fd	alternative to `load_checkpoint_guess_config` that accepts a loaded state dict (#4249 ) * make alternative fn * add back ckpt path as 2nd argument?	2024-08-11 08:36:52 -04:00
comfyanonymous	75b9b55b22	Fix issues with #4302 and support loading diffusers format flux.	2024-08-10 21:28:24 -04:00
Jaret Burkett	1765f1c60c	FLUX: Added full diffusers mapping for FLUX.1 schnell and dev. Adds full LoRA support from diffusers LoRAs. (#4302 )	2024-08-10 21:26:41 -04:00
comfyanonymous	1de69fe4d5	Fix some issues with inference slowing down.	2024-08-10 16:21:25 -04:00
comfyanonymous	ae197f651b	Speed up hunyuan dit inference a bit.	2024-08-10 07:36:27 -04:00
comfyanonymous	1b5b8ca81a	Fix regression.	2024-08-09 21:45:21 -04:00
comfyanonymous	6678d5cf65	Fix regression.	2024-08-09 14:02:38 -04:00
TTPlanetPig	e172564eea	Update controlnet.py to fix the default controlnet weight as constant (#4285 )	2024-08-09 13:40:05 -04:00
comfyanonymous	a3cc326748	Better fix for lowvram issue.	2024-08-09 12:16:25 -04:00
comfyanonymous	86a97e91fc	Fix controlnet regression.	2024-08-09 12:08:58 -04:00
comfyanonymous	5acdadc9f3	Fix issue with some lowvram weights.	2024-08-09 03:58:28 -04:00
comfyanonymous	55ad9d5f8c	Fix regression.	2024-08-09 03:36:40 -04:00
comfyanonymous	a9f04edc58	Implement text encoder part of HunyuanDiT loras.	2024-08-09 03:21:10 -04:00
comfyanonymous	a475ec2300	Cleanup HunyuanDit controlnets. Use the: ControlNetApply SD3 and HunyuanDiT node.	2024-08-09 02:59:34 -04:00
来新璐	06eb9fb426	feat: add support for HunYuanDit ControlNet (#4245 ) * add support for HunYuanDit ControlNet * fix hunyuandit controlnet * fix typo in hunyuandit controlnet * fix typo in hunyuandit controlnet * fix code format style * add control_weight support for HunyuanDit Controlnet * use control_weights in HunyuanDit Controlnet * fix typo	2024-08-09 02:59:24 -04:00
comfyanonymous	413322645e	Raw torch is faster than einops?	2024-08-08 22:09:29 -04:00
comfyanonymous	11200de970	Cleaner code.	2024-08-08 20:07:09 -04:00
comfyanonymous	037c38eb0f	Try to improve inference speed on some machines.	2024-08-08 17:29:27 -04:00
comfyanonymous	1e11d2d1f5	Better prints.	2024-08-08 17:29:27 -04:00
comfyanonymous	66d4233210	Fix.	2024-08-08 15:16:51 -04:00
comfyanonymous	591010b7ef	Support diffusers text attention flux loras.	2024-08-08 14:45:52 -04:00
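
Note on 7b70b266d8 (generalized macOS version check): the message explains that `platform.mac_ver()[0]` can return a three-part string such as "14.6.1", which an exact match against "14.5" or "14.6" would miss. The sketch below shows one way such a range comparison could look. It is not the code from the commit; `parse_version` and `needs_upcast` are hypothetical names, and the bounds are assumptions taken from the messages of 7b70b266d8 (14.5 and later) and 4b2f0d9413 (up to 15.0.1).

```python
def parse_version(text):
    """Turn a dotted version string such as '14.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def needs_upcast(version_text, low=(14, 5), high=(15, 0, 1)):
    """Return True when the version falls inside the assumed affected range.

    The default bounds are illustrative assumptions, not values read from
    the ComfyUI source.
    """
    version = parse_version(version_text)
    # Tuples compare element by element, so (14, 6, 1) sorts after (14, 6).
    return bool(version) and low <= version <= high


# An exact string match misses patch releases; the tuple range does not.
exact_match = "14.6.1" in ("14.5", "14.6")
range_match = needs_upcast("14.6.1")
```

In a real program the argument would come from `platform.mac_ver()[0]`, which is an empty string on non-macOS systems; the `bool(version)` guard makes that case return False.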


1617 Commits
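
Note on 5cfe38f41c (Execution Model Inversion): the message describes replacing recursive node calls with a scheduled order, which lets a node skip inputs it does not need, and passing None for unevaluated inputs. The sketch below illustrates that idea with an explicit work stack instead of recursion. It is a simplified illustration, not the implementation from the PR: `Node`, `run`, and the `needed` hook are hypothetical names, the graph is assumed to be acyclic, and node expansion and caching are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Node:
    fn: Callable[..., Any]                                   # computes the output
    links: Dict[str, str] = field(default_factory=dict)      # input name -> upstream node id
    literals: Dict[str, Any] = field(default_factory=dict)   # constant inputs
    # Hypothetical hook: given the literals, list the linked inputs really needed.
    needed: Optional[Callable[[Dict[str, Any]], List[str]]] = None


def run(graph: Dict[str, Node], target: str):
    """Evaluate `target` without recursion; unneeded inputs are never computed."""
    results: Dict[str, Any] = {}
    order: List[str] = []
    stack = [target]
    while stack:
        node_id = stack[-1]
        if node_id in results:          # already evaluated via another path
            stack.pop()
            continue
        node = graph[node_id]
        wanted = node.needed(node.literals) if node.needed else list(node.links)
        pending = [node.links[n] for n in wanted if node.links[n] not in results]
        if pending:                     # evaluate dependencies first, then revisit
            stack.extend(pending)
            continue
        kwargs = dict(node.literals)
        for name, source in node.links.items():
            # Inputs that were skipped are passed as None.
            kwargs[name] = results[source] if name in wanted else None
        results[node_id] = node.fn(**kwargs)
        order.append(node_id)
        stack.pop()
    return results[target], order


def mix(a, b, factor):
    if factor == 0.0:
        return a
    if factor == 1.0:
        return b
    return a * (1 - factor) + b * factor


def mix_needed(literals):
    factor = literals["factor"]
    if factor == 0.0:
        return ["a"]
    if factor == 1.0:
        return ["b"]
    return ["a", "b"]


graph = {
    "img_a": Node(fn=lambda: 10.0),
    "img_b": Node(fn=lambda: 20.0),
    "mix": Node(fn=mix, links={"a": "img_a", "b": "img_b"},
                literals={"factor": 0.0}, needed=mix_needed),
}
value, order = run(graph, "mix")  # img_b is never evaluated when factor is 0.0
```

With a mix factor of exactly 0.0, only `img_a` and `mix` appear in `order`, which mirrors the "Mix Images" example in the commit message.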