ComfyUI/comfy
Rattus ca73329fcd mm: allow unload of the current model
In some workflows, its possible for a model to be used twice but with
different requirements for the inference VRAM.

Currently, once a model is loaded at a certain level of offload, it will
be preserved at that level of offload if it is used again. This will OOM
if there is a major change in the size of the inference VRAM. This happens
in your classic latent upscaling workflow where the same model is used twice
to generate and upscale.

This is very noticable for WAN in particlar.

Fix by two-passing the model VRAM unload process, firstly trying with
the existing list on idle models and then try again adding the actual
models that are about to be loaded. This will implement the partial
offload you need of your hot-in-VRAM model to make space for the bigger
inference.

Improve info messages regarding any unloads done.
2025-11-08 18:29:16 +10:00
..
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Replace print with logging (#6138) 2024-12-20 16:24:55 -05:00
comfy_types LoRA Trainer: LoRA training node in weight adapter scheme (#8446) 2025-06-13 19:25:59 -04:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) 2025-09-15 20:05:03 -04:00
ldm Fix qwen controlnet regression. (#10657) 2025-11-05 18:07:35 -05:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter Controlnet refactor. 2024-06-27 18:43:11 -04:00
taesd Improvements to the TAESD3 implementation. 2024-06-16 02:04:24 -04:00
text_encoders Implement gemma 3 as a text encoder. (#10241) 2025-10-06 22:08:08 -04:00
weight_adapter Fix LoRA Trainer bugs with FP8 models. (#9854) 2025-09-20 21:24:48 -04:00
checkpoint_pickle.py Remove pytorch_lightning dependency. 2023-06-13 10:11:33 -04:00
cli_args.py Enable pinned memory by default on Nvidia. (#10656) 2025-11-05 18:08:13 -05:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py USO style reference. (#9677) 2025-09-02 15:36:22 -04:00
clip_vision_config_g.json Add support for clip g vision model to CLIPVisionLoader. 2023-08-18 11:13:29 -04:00
clip_vision_config_h.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json support clip-vit-large-patch14-336 (#4042) 2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
conds.py Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
context_windows.py Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
controlnet.py Fix Race condition in --async-offload that can cause corruption (#10501) 2025-10-29 17:17:46 -04:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py Clamp output when rounding weight to prevent Nan. 2024-10-19 19:07:10 -04:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377) 2025-01-11 12:20:23 -05:00
latent_formats.py Add support for Chroma Radiance (#9682) 2025-09-13 17:58:43 -04:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
model_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_detection.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_management.py mm: allow unload of the current model 2025-11-08 18:29:16 +10:00
model_patcher.py Fix issue with pinned memory. (#10597) 2025-11-01 17:25:59 -04:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py This seems to slow things down slightly on Linux. (#10624) 2025-11-03 21:47:14 -05:00
options.py Only parse command line args when main.py is called. 2023-09-13 11:38:20 -04:00
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py More fp8 torch.compile regressions fixed. (#10625) 2025-11-03 22:14:20 -05:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py Added context window support to core sampling code (#9238) 2025-08-13 21:33:05 -04:00
samplers.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
sd.py Add RAM Pressure cache mode (#10454) 2025-10-30 17:39:02 -04:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
supported_models.py Lower wan memory estimation value a bit. (#9964) 2025-09-20 22:09:35 -04:00
utils.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00