ComfyUI/comfy
Rattus 2d69bdfabe ops: sync the offload stream with the consumption of w&b
This sync is nessacary as pytorch will queue cuda async frees on the
same stream as created to tensor. In the case of async offload, this
will be on the offload stream.

Weights and biases can go out of scope in python which then
triggers the pytorch garbage collector to queue the free operation on
the offload stream possible before the compute stream has used the
weight. This causes a use after free on weight data leading to total
corruption of some workflows.

So sync the offload stream with the compute stream after the weight
has been used so the free has to wait for the weight to be used.

The cast_bias_weight is extended in a backwards compatible way with
the new behaviour opt-in on a defaulted parameter. This handles
custom node packs calling cast_bias_weight and defeatures
async-offload for them (as they do not handle the race).

The pattern is now:

cast_bias_weight(... , offloadable=True) #This might be offloaded
thing(weight, bias, ...)
uncast_bias_weight(...)
2025-10-29 21:38:26 +10:00
..
audio_encoders Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
cldm Replace print with logging (#6138) 2024-12-20 16:24:55 -05:00
comfy_types LoRA Trainer: LoRA training node in weight adapter scheme (#8446) 2025-06-13 19:25:59 -04:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
k_diffusion Fix depending on asserts to raise an exception in BatchedBrownianTree and Flash attn module (#9884) 2025-09-15 20:05:03 -04:00
ldm Fix batch size above 1 giving bad output in chroma radiance. (#10394) 2025-10-18 23:15:34 -04:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter Controlnet refactor. 2024-06-27 18:43:11 -04:00
taesd Improvements to the TAESD3 implementation. 2024-06-16 02:04:24 -04:00
text_encoders Implement gemma 3 as a text encoder. (#10241) 2025-10-06 22:08:08 -04:00
weight_adapter Fix LoRA Trainer bugs with FP8 models. (#9854) 2025-09-20 21:24:48 -04:00
checkpoint_pickle.py Remove pytorch_lightning dependency. 2023-06-13 10:11:33 -04:00
cli_args.py Speed up offloading using pinned memory. (#10526) 2025-10-29 00:21:01 -04:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py USO style reference. (#9677) 2025-09-02 15:36:22 -04:00
clip_vision_config_g.json Add support for clip g vision model to CLIPVisionLoader. 2023-08-18 11:13:29 -04:00
clip_vision_config_h.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json support clip-vit-large-patch14-336 (#4042) 2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Some changes to the previous hunyuan PR. (#9725) 2025-09-04 20:39:02 -04:00
conds.py Add some warnings and prevent crash when cond devices don't match. (#9169) 2025-08-04 04:20:12 -04:00
context_windows.py Make step index detection much more robust (#9392) 2025-08-17 18:54:07 -04:00
controlnet.py Support qwen inpaint controlnet. (#9772) 2025-09-08 17:30:26 -04:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py Clamp output when rounding weight to prevent Nan. 2024-10-19 19:07:10 -04:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Hooks Part 2 - TransformerOptionsHook and AdditionalModelsHook (#6377) 2025-01-11 12:20:23 -05:00
latent_formats.py Add support for Chroma Radiance (#9682) 2025-09-13 17:58:43 -04:00
lora_convert.py Implement the USO subject identity lora. (#9674) 2025-09-01 18:54:02 -04:00
lora.py Support the omnigen2 umo lora. (#9886) 2025-09-15 18:10:55 -04:00
model_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_detection.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
model_management.py mm: factor out the current stream getter 2025-10-29 21:31:13 +10:00
model_patcher.py Speed up offloading using pinned memory. (#10526) 2025-10-29 00:21:01 -04:00
model_sampling.py Refactor model sampling sigmas code. (#10250) 2025-10-08 17:49:02 -04:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py ops: sync the offload stream with the consumption of w&b 2025-10-29 21:38:26 +10:00
options.py Only parse command line args when main.py is called. 2023-09-13 11:38:20 -04:00
patcher_extension.py Fix order of inputs nested merge_nested_dicts (#10362) 2025-10-15 16:47:26 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
rmsnorm.py Add warning when using old pytorch. (#9347) 2025-08-15 00:22:26 -04:00
sample.py Fix mistake. (#10484) 2025-10-25 23:07:29 -04:00
sampler_helpers.py Added context window support to core sampling code (#9238) 2025-08-13 21:33:05 -04:00
samplers.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py Disable prompt weights for qwen. (#9438) 2025-08-20 01:08:11 -04:00
sd.py Fix issue. (#10527) 2025-10-29 00:37:00 -04:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Mixed Precision Quantization System (#10498) 2025-10-28 16:20:53 -04:00
supported_models.py Lower wan memory estimation value a bit. (#9964) 2025-09-20 22:09:35 -04:00
utils.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00