ComfyUI/comfy
Rattus ca73329fcd mm: allow unload of the current model
In some workflows, it's possible for a model to be used twice but with
different VRAM requirements for inference.

Currently, once a model is loaded at a certain level of offload, it will
be preserved at that level if it is used again. This will OOM if there is
a major change in the amount of VRAM needed for inference, which happens
in the classic latent upscaling workflow where the same model is used
twice: once to generate and once to upscale.

This is very noticeable for WAN in particular.

Fix by two-passing the model VRAM unload process: first trying with the
existing list of idle models, then trying again with the models that are
about to be loaded added in. This implements the partial offload of the
hot-in-VRAM model needed to make space for the bigger inference.

Improve info messages regarding any unloads done.
2025-11-08 18:29:16 +10:00
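The two-pass unload described above can be sketched roughly as follows. This is a minimal illustration, not ComfyUI's actual `model_management.py` code; the class, method, and function names here are hypothetical.

```python
# Hypothetical sketch of the two-pass VRAM unload from the commit message.
# Names (LoadedModel, partially_unload, free_vram) are illustrative only,
# not ComfyUI's real API.

class LoadedModel:
    def __init__(self, name, vram_used):
        self.name = name
        self.vram_used = vram_used

    def partially_unload(self, amount):
        # Offload up to `amount` units of this model's weights out of VRAM.
        freed = min(self.vram_used, amount)
        self.vram_used -= freed
        return freed


def free_vram(required, idle_models, models_to_load):
    """Free `required` VRAM in two passes.

    Pass 1 only touches idle models (the existing behavior). Pass 2 also
    partially offloads the models about to be used, so a hot-in-VRAM model
    can make room for a larger inference pass instead of OOMing.
    """
    freed = 0
    for candidates in (idle_models, models_to_load):
        for m in candidates:
            if freed >= required:
                return freed
            freed += m.partially_unload(required - freed)
    return freed
```

For example, if an idle VAE holds 2 units and the hot model holds 10, requesting 5 units frees the VAE fully in pass 1 and then partially offloads 3 units of the hot model in pass 2, rather than failing outright.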
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Replace print with logging (#6138) 2024-12-20 16:24:55 -05:00
comfy_types LoRA Trainer: LoRA training node in weight adapter scheme (#8446) 2025-06-13 19:25:59 -04:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) 2025-09-15 20:05:03 -04:00
ldm Fix qwen controlnet regression. (#10657) 2025-11-05 18:07:35 -05:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter
taesd
text_encoders Implement gemma 3 as a text encoder. (#10241) 2025-10-06 22:08:08 -04:00
weight_adapter Fix LoRA Trainer bugs with FP8 models. (#9854) 2025-09-20 21:24:48 -04:00
checkpoint_pickle.py
cli_args.py Enable pinned memory by default on Nvidia. (#10656) 2025-11-05 18:08:13 -05:00
clip_config_bigg.json
clip_model.py USO style reference. (#9677) 2025-09-02 15:36:22 -04:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
conds.py Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
context_windows.py Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
controlnet.py Fix Race condition in --async-offload that can cause corruption (#10501) 2025-10-29 17:17:46 -04:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py
float.py
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377) 2025-01-11 12:20:23 -05:00
latent_formats.py Add support for Chroma Radiance (#9682) 2025-09-13 17:58:43 -04:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
model_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_detection.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_management.py mm: allow unload of the current model 2025-11-08 18:29:16 +10:00
model_patcher.py Fix issue with pinned memory. (#10597) 2025-11-01 17:25:59 -04:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py This seems to slow things down slightly on Linux. (#10624) 2025-11-03 21:47:14 -05:00
options.py
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py More fp8 torch.compile regressions fixed. (#10625) 2025-11-03 22:14:20 -05:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py Added context window support to core sampling code (#9238) 2025-08-13 21:33:05 -04:00
samplers.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
sd1_clip_config.json
sd1_clip.py Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
sd.py Add RAM Pressure cache mode (#10454) 2025-10-30 17:39:02 -04:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
supported_models.py Lower wan memory estimation value a bit. (#9964) 2025-09-20 22:09:35 -04:00
utils.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00