ComfyUI/comfy
rattus 783782d5d7
Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618)
* mm: Use Aimdo raw allocator for cast buffers

PyTorch manages allocation of growing buffers on streams poorly. PyTorch
has no Windows support for the expandable segments allocator (which is
the right tool for this job), while also segmenting the memory by
stream such that it cannot be generally re-used. So kick the problem to
Aimdo, which can just grow a virtual region that's freed per stream.
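The per-stream growing region described above can be sketched in plain Python. This is a hypothetical illustration, not ComfyUI's actual API: `StreamScratch` and `CastAllocator` are invented names, and a `bytearray` stands in for the raw virtual region that Aimdo would manage.

```python
# Hypothetical sketch of the cast-buffer strategy: keep one grow-only
# scratch region per stream instead of asking the framework allocator
# for each cast buffer, and free the whole region per stream.

class StreamScratch:
    """One growable scratch region, reused for every cast on a stream."""
    def __init__(self):
        self._buf = bytearray()

    def request(self, nbytes):
        # Allocate a larger region only when the request exceeds current
        # capacity (geometric growth); otherwise reuse the existing memory.
        if nbytes > len(self._buf):
            self._buf = bytearray(max(nbytes, 2 * len(self._buf)))
        return memoryview(self._buf)[:nbytes]

class CastAllocator:
    """Maps a stream id to its scratch region; regions are freed per stream."""
    def __init__(self):
        self._per_stream = {}

    def alloc(self, stream_id, nbytes):
        scratch = self._per_stream.setdefault(stream_id, StreamScratch())
        return scratch.request(nbytes)

    def free_stream(self, stream_id):
        # Drop the whole region for a stream in one shot.
        self._per_stream.pop(stream_id, None)
```

The point of the design is that repeated casts on one stream never fragment the allocator: the region only grows, and teardown is a single per-stream free.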

* plan

* ops: move cpu handler up to the caller

* ops: split up prefetch from weight prep

Split up the casting and the weight formatting/LoRA handling in prep for
arbitrary prefetch support.

* ops: implement block prefetching API

Allow a model to construct a prefetch list and operate it for increased
async offload.
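The prefetch-list idea can be sketched as a small pipeline: the model declares the order its blocks run in, and the runner loads block i+1 while block i computes, so transfers overlap compute. `load` and `compute` are plain callables standing in for the real weight upload and forward pass; `run_with_prefetch` is an invented name, not the API this commit adds.

```python
# Minimal block-prefetch sketch (assumed shape, not ComfyUI's real API).

def run_with_prefetch(blocks, load, compute, depth=1):
    loaded = []
    # Warm the pipeline: load the first `depth` blocks up front.
    for b in blocks[:depth]:
        loaded.append(load(b))
    outputs = []
    for i, b in enumerate(blocks):
        # Kick off the next block's load before computing the current one,
        # so the transfer runs concurrently with compute on the real system.
        nxt = i + depth
        if nxt < len(blocks):
            loaded.append(load(blocks[nxt]))
        outputs.append(compute(loaded[i]))
    return outputs
```

With `depth=1` each block's weights are requested one step ahead of its forward pass; a larger `depth` hides longer transfer latencies at the cost of holding more blocks resident.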

* ltxv2: Implement block prefetching

* Implement LoRA async offload

Implement async offload of LoRAs.
2026-05-02 19:23:24 -04:00
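A hedged sketch of the async LoRA idea: apply the patches off the critical path and synchronize only when the weights are needed. The real implementation works with streams on the device; here a background thread stands in for the side stream, and `AsyncLoraLoader` is an invented name for illustration.

```python
import threading

# Assumed-shape sketch: apply LoRA deltas on a worker so the main
# thread keeps running, then join before the weights are used.

class AsyncLoraLoader:
    def __init__(self):
        self._thread = None
        self._result = {}

    def start(self, base_weights, lora_deltas):
        def work():
            # Apply each delta off the critical path (add is a stand-in
            # for the real scaled low-rank update).
            for name, w in base_weights.items():
                self._result[name] = w + lora_deltas.get(name, 0.0)
        self._thread = threading.Thread(target=work)
        self._thread.start()

    def finish(self):
        # Synchronize, analogous to waiting on the side stream's event.
        self._thread.join()
        return self._result
```

Launching the patch work early and joining lazily is what lets the LoRA load overlap with other sampling setup instead of stalling it.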
audio_encoders Fix fp16 audio encoder models (#12811) 2026-03-06 18:20:07 -05:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types fix: use frontend-compatible format for Float gradient_stops (#12789) 2026-03-12 10:14:28 -07:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion ace15: Use dynamic_vram friendly trange (#12409) 2026-02-11 14:53:42 -05:00
ldm Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter
taesd Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00
text_encoders Cogvideox (#13402) 2026-04-29 19:30:08 -04:00
weight_adapter MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446) 2026-02-15 20:30:09 -05:00
cli_args.py Remove IPEX and clean up checks and add missing synchronize during empty cache. (#13653) 2026-05-01 14:16:41 -07:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip2_base_naflex.json Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00
conds.py Cleanups to the last PR. (#12646) 2026-02-26 01:30:31 -05:00
context_windows.py Add slice_cond and per-model context window cond resizing (#12645) 2026-03-19 20:42:42 -07:00
controlnet.py Add working Qwen 2512 ControlNet (Fun ControlNet) support (#12359) 2026-02-13 22:23:52 -05:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py feat: Support mxfp8 (#12907) 2026-03-14 18:36:29 -04:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py New Year ruff cleanup. (#11595) 2026-01-01 22:06:14 -05:00
latent_formats.py Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00
lora_convert.py Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
lora.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
memory_management.py Integrate RAM cache with model RAM management (#13173) 2026-03-27 21:34:16 -04:00
model_base.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
model_detection.py Cogvideox (#13402) 2026-04-29 19:30:08 -04:00
model_management.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
model_patcher.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
model_prefetch.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
model_sampling.py Cogvideox (#13402) 2026-04-29 19:30:08 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) 2026-05-02 19:23:24 -04:00
options.py
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pinned_memory.py dynamicVRAM + --cache-ram 2 (CORE-117) (#13603) 2026-04-28 19:15:02 -04:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py feat: Support mxfp8 (#12907) 2026-03-14 18:36:29 -04:00
rmsnorm.py Remove code to support RMSNorm on old pytorch. (#12499) 2026-02-16 20:09:24 -05:00
sample.py Fix fp16 intermediates giving different results. (#13100) 2026-03-21 17:53:25 -04:00
sampler_helpers.py Disable dynamic_vram when weight hooks applied (#12653) 2026-02-28 16:50:18 -05:00
samplers.py Fix sampling issue with fp16 intermediates. (#13099) 2026-03-21 17:47:42 -04:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py feat: Support Qwen3.5 text generation models (#12771) 2026-03-25 22:48:28 -04:00
sd.py Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Fix some custom nodes. (#11134) 2025-12-05 18:25:31 -05:00
supported_models.py Reformat models variable into multiline array CORE-59 (#13513) 2026-05-01 17:20:11 +08:00
utils.py Reduce tiled decode peak memory (#13050) 2026-03-19 13:29:34 -04:00
windows.py Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00