ComfyUI/comfy
Rattus 4bb34b85b7 mm: make model offloading deferred with weakrefs
RAMPressure caching may need to purge the same model that you are
currently trying to offload for VRAM freeing. In this case, the
RAMPressure cache takes priority and needs to be able to pull the
trigger on dumping the whole model and freeing the ModelPatcher in
question. To do this, defer the actual transfer of model weights from
GPU to RAM to model_management state rather than keeping it as part of
ModelPatcher. This is done as a list of weakrefs.

If the RAM cache decides to free the model you are currently unloading,
the ModelPatcher and its refs simply disappear in the middle of the
unloading process, and both RAM and VRAM will be freed.

The unpatcher now queues the individual leaf modules to be offloaded
one-by-one so that RAM levels can be monitored.

Note that the UnloadPartially that is potentially done as part of a
load will not be freeable this way; however, it shouldn't be anyway, as
that is the currently active model, and the RAM cache cannot save you
if you can't even fit the one model you are currently trying to use.
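The mechanism above can be sketched as follows. This is an illustrative stand-in, not the actual ComfyUI code: `Leaf`, `offload_queue`, `queue_offload`, and `process_offload_queue` are hypothetical names, and `Leaf` stands in for a torch.nn leaf module. The point is that the queue holds weakrefs in module-level state, so a model purged by the RAM cache mid-unload simply yields dead refs that are skipped:

```python
import gc
import weakref

class Leaf:
    """Stand-in for a torch.nn leaf module holding weights on a device."""
    def __init__(self, name):
        self.name = name
        self.device = "cuda"
    def to(self, device):
        self.device = device

# Module-level state, analogous to keeping the deferred transfers in
# model_management rather than inside ModelPatcher.
offload_queue = []  # weakrefs to leaf modules awaiting GPU->RAM transfer

def queue_offload(leaves):
    # Defer the transfer: record weakrefs only, so holding a queue entry
    # never keeps the model (or its patcher) alive.
    for m in leaves:
        offload_queue.append(weakref.ref(m))

def process_offload_queue(ram_ok=lambda: True):
    # Drain one leaf at a time so RAM levels can be monitored between
    # transfers. A dead weakref means the RAM cache already purged the
    # whole model mid-unload: both RAM and VRAM are freed, skip it.
    moved = 0
    while offload_queue:
        if not ram_ok():
            break  # RAM pressure: stop and let the cache purge instead
        m = offload_queue.pop(0)()
        if m is None:
            continue  # model was freed out from under us; nothing to do
        m.to("cpu")
        moved += 1
    return moved
```

For example, if the RAM cache drops one of three queued leaves before the queue is drained, only the two surviving leaves are transferred; the dropped one is skipped via its dead weakref.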
2025-12-19 19:32:51 +10:00
audio_encoders
cldm
comfy_types
extra_samplers
image_encoders
k_diffusion Fix the last step with non-zero sigma in sa_solver (#11380) 2025-12-17 13:57:40 -05:00
ldm Diffusion model part of Qwen Image Layered. (#11408) 2025-12-18 20:21:14 -05:00
sd1_tokenizer
t2i_adapter
taesd Support video tiny VAEs (#10884) 2025-11-28 19:40:19 -05:00
text_encoders Fix qwen scaled fp8 not working with kandinsky. Make basic t2i wf work. (#11162) 2025-12-06 17:50:10 -08:00
weight_adapter
checkpoint_pickle.py
cli_args.py feat(preview): add per-queue live preview method override (#11261) 2025-12-15 15:57:39 -08:00
clip_config_bigg.json
clip_model.py
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip_384.json
clip_vision_siglip_512.json
clip_vision.py
conds.py
context_windows.py Add context windows callback for custom cond handling (#11208) 2025-12-15 16:06:32 -08:00
controlnet.py
diffusers_convert.py
diffusers_load.py
float.py
gligen.py
hooks.py
latent_formats.py Support video tiny VAEs (#10884) 2025-11-28 19:40:19 -05:00
lora_convert.py
lora.py Support "transformer." LoRA prefix for Z-Image (#11135) 2025-12-08 15:17:26 -05:00
model_base.py sd: Free RAM on main model load 2025-12-19 19:32:51 +10:00
model_detection.py Diffusion model part of Qwen Image Layered. (#11408) 2025-12-18 20:21:14 -05:00
model_management.py mm: make model offloading deferred with weakrefs 2025-12-19 19:32:51 +10:00
model_patcher.py mm: make model offloading deferred with weakrefs 2025-12-19 19:32:51 +10:00
model_sampling.py
nested_tensor.py
ops.py Fix pytorch warnings. (#11314) 2025-12-13 18:45:23 -05:00
options.py
patcher_extension.py
pixel_space_convert.py
quant_ops.py Fix nan issue when quantizing fp16 tensor. (#11213) 2025-12-09 17:03:21 -05:00
rmsnorm.py
sample.py
sampler_helpers.py skip_load_model -> force_full_load (#11390) 2025-12-17 23:29:32 -05:00
samplers.py Add exp_heun_2_x0 sampler series (#11360) 2025-12-16 23:35:43 -05:00
sd1_clip_config.json
sd1_clip.py Make old scaled fp8 format use the new mixed quant ops system. (#11000) 2025-12-05 14:35:42 -05:00
sd.py sd: Free RAM on main model load 2025-12-19 19:32:51 +10:00
sdxl_clip.py
supported_models_base.py Fix some custom nodes. (#11134) 2025-12-05 18:25:31 -05:00
supported_models.py Only enable fp16 on ZImage on newer pytorch. (#11344) 2025-12-15 22:33:27 -05:00
utils.py Update warning for old pytorch version. (#11319) 2025-12-14 04:02:50 -05:00