ComfyUI/comfy
Rattus 95e6413587 mp: Account for the VRAM cost of weight offloading
when checking the VRAM headroom, assume that the weight needs to be
offloaded, and only load if it has space for both the load and offload
 * the number of streams.

As the weights are ordered from largest to smallest by offload cost
this is guaranteed to fit in VRAM (tm), as all weights that follow
will be smaller.

Make the partial unload aware of this system as well by saving the
budget for offload VRAM to the model state and accounting accordingly.
Its possible that partial unload increases the size of the largest
offloaded weights, and thus needs to unload a little bit more than
asked to accomodate the bigger temp buffers.

Honor the existing codes floor on model weight loading of 128MB by
having the patcher honor this separately withough regard to offloading.
Otherwise when MM specifies its 128MB minimum, MP will see the biggest
weights, and budget that 128MB to only offload buffer and load nothing
which isnt the intent of these minimums. The same clamp applies in
case of partial offload of the currently loading model.
2025-11-27 15:39:50 +10:00
..
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types
extra_samplers
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) 2025-09-15 20:05:03 -04:00
ldm block info (#10841) 2025-11-26 20:28:44 -08:00
sd1_tokenizer
t2i_adapter
taesd
text_encoders Z Image model. (#10892) 2025-11-25 18:41:45 -05:00
weight_adapter Fix loras not working on mixed fp8. (#10899) 2025-11-26 00:07:58 -05:00
checkpoint_pickle.py
cli_args.py --disable-api-nodes now sets CSP header to force frontend offline. (#10829) 2025-11-21 17:51:55 -05:00
clip_config_bigg.json
clip_model.py USO style reference. (#9677) 2025-09-02 15:36:22 -04:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip_384.json
clip_vision_siglip_512.json
clip_vision.py Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
conds.py
context_windows.py Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
controlnet.py Fix Race condition in --async-offload that can cause corruption (#10501) 2025-10-29 17:17:46 -04:00
diffusers_convert.py
diffusers_load.py
float.py
gligen.py
hooks.py
latent_formats.py Add cheap latent preview for flux 2. (#10907) 2025-11-26 04:00:43 -05:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
model_base.py Fix Flux2 reference image mem estimation. (#10905) 2025-11-26 02:36:19 -05:00
model_detection.py Z Image model. (#10892) 2025-11-25 18:41:45 -05:00
model_management.py mm: remove 128MB minimum 2025-11-27 15:39:50 +10:00
model_patcher.py mp: Account for the VRAM cost of weight offloading 2025-11-27 15:39:50 +10:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py Fix loras not working on mixed fp8. (#10899) 2025-11-26 00:07:58 -05:00
options.py
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py Fix loras not working on mixed fp8. (#10899) 2025-11-26 00:07:58 -05:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py Added context window support to core sampling code (#9238) 2025-08-13 21:33:05 -04:00
samplers.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
sd1_clip_config.json
sd1_clip.py Lower vram usage for flux 2 text encoder. (#10887) 2025-11-25 14:58:39 -05:00
sd.py Z Image model. (#10892) 2025-11-25 18:41:45 -05:00
sdxl_clip.py
supported_models_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
supported_models.py Adjustments to Z Image. (#10893) 2025-11-25 19:02:51 -05:00
utils.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00