ComfyUI/comfy
rattus 73f5649196
Implement temporal rolling VAE (Major VRAM reductions in Hunyuan and Kandinsky) (#10995)
* hunyuan upsampler: rework imports

Remove the transitive import of VideoConv3d and Resnet and takes these
from actual implementation source.

* model: remove unused give_pre_end

According to git grep, this is not used now, and was not used in the
initial commit that introduced it (see below).

This semantic is difficult to implement temporal roll VAE for (and would
defeat the purpose). Rather than implement the complex if, just delete
the unused feature.

(venv) rattus@rattus-box2:~/ComfyUI$ git log --oneline
220afe33 (HEAD) Initial commit.
(venv) rattus@rattus-box2:~/ComfyUI$ git grep give_pre
comfy/ldm/modules/diffusionmodules/model.py:                 resolution, z_channels, give_pre_end=False, tanh_out=False, use_linear_attn=False,
comfy/ldm/modules/diffusionmodules/model.py:        self.give_pre_end = give_pre_end
comfy/ldm/modules/diffusionmodules/model.py:        if self.give_pre_end:

(venv) rattus@rattus-box2:~/ComfyUI$ git co origin/master
Previous HEAD position was 220afe33 Initial commit.
HEAD is now at 9d8a8179 Enable async offloading by default on Nvidia. (#10953)
(venv) rattus@rattus-box2:~/ComfyUI$ git grep give_pre
comfy/ldm/modules/diffusionmodules/model.py:                 resolution, z_channels, give_pre_end=False, tanh_out=False, use_linear_attn=False,
comfy/ldm/modules/diffusionmodules/model.py:        self.give_pre_end = give_pre_end
comfy/ldm/modules/diffusionmodules/model.py:        if self.give_pre_end:

* move refiner VAE temporal roller to core

Move the carrying conv op to the common VAE code and give it a better
name. Roll the carry implementation logic for Resnet into the base
class and scrap the Hunyuan specific subclass.

* model: Add temporal roll to main VAE decoder

If there are no attention layers, its a standard resnet and VideoConv3d
is asked for, substitute in the temporal rolloing VAE algorithm. This
reduces VAE usage by the temporal dimension (can be huge VRAM savings).

* model: Add temporal roll to main VAE encoder

If there are no attention layers, its a standard resnet and VideoConv3d
is asked for, substitute in the temporal rolling VAE algorithm. This
reduces VAE usage by the temporal dimension (can be huge VRAM savings).
2025-12-02 22:49:29 -05:00
..
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types LoRA Trainer: LoRA training node in weight adapter scheme (#8446) 2025-06-13 19:25:59 -04:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) 2025-09-15 20:05:03 -04:00
ldm Implement temporal rolling VAE (Major VRAM reductions in Hunyuan and Kandinsky) (#10995) 2025-12-02 22:49:29 -05:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter Controlnet refactor. 2024-06-27 18:43:11 -04:00
taesd Support video tiny VAEs (#10884) 2025-11-28 19:40:19 -05:00
text_encoders Implement the Ovis image model. (#11030) 2025-12-01 20:56:17 -05:00
weight_adapter Fix loras not working on mixed fp8. (#10899) 2025-11-26 00:07:58 -05:00
checkpoint_pickle.py Remove pytorch_lightning dependency. 2023-06-13 10:11:33 -04:00
cli_args.py feat: Support ComfyUI-Manager for pip version (#7555) 2025-12-01 22:32:52 -05:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py USO style reference. (#9677) 2025-09-02 15:36:22 -04:00
clip_vision_config_g.json Add support for clip g vision model to CLIPVisionLoader. 2023-08-18 11:13:29 -04:00
clip_vision_config_h.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json support clip-vit-large-patch14-336 (#4042) 2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
conds.py Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
context_windows.py Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
controlnet.py Fix Race condition in --async-offload that can cause corruption (#10501) 2025-10-29 17:17:46 -04:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py Clamp output when rounding weight to prevent Nan. 2024-10-19 19:07:10 -04:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377) 2025-01-11 12:20:23 -05:00
latent_formats.py Support video tiny VAEs (#10884) 2025-11-28 19:40:19 -05:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Add some missing z image lora layers. (#10980) 2025-11-28 23:55:00 -05:00
model_base.py Fix Flux2 reference image mem estimation. (#10905) 2025-11-26 02:36:19 -05:00
model_detection.py Implement the Ovis image model. (#11030) 2025-12-01 20:56:17 -05:00
model_management.py mm: wrap the raw stream in context manager (#10958) 2025-11-28 16:38:12 -05:00
model_patcher.py Account for the VRAM cost of weight offloading (#10733) 2025-11-27 01:03:03 -05:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py mm: wrap the raw stream in context manager (#10958) 2025-11-28 16:38:12 -05:00
options.py Only parse command line args when main.py is called. 2023-09-13 11:38:20 -04:00
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py fix QuantizedTensor.is_contiguous (#10956) (#10959) 2025-11-28 16:33:07 -05:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py Added context window support to core sampling code (#9238) 2025-08-13 21:33:05 -04:00
samplers.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py Lower vram usage for flux 2 text encoder. (#10887) 2025-11-25 14:58:39 -05:00
sd.py Implement the Ovis image model. (#11030) 2025-12-01 20:56:17 -05:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
supported_models.py Hack to make zimage work in fp16. (#11057) 2025-12-02 17:11:58 -05:00
utils.py Add some missing z image lora layers. (#10980) 2025-11-28 23:55:00 -05:00