ComfyUI/comfy
Kosinkadink a18dd219d5 Pass per-device model to multigpu control clones in pre_run_control
QwenFunControlNet.pre_run stashes model.diffusion_model into extra_args,
which the control_model then uses for forward passes (img_in, txt_in,
pe_embedder, time_text_embed). With multigpu, every per-device control
clone was being pre_run with the base model on GPU0, so secondary
devices would invoke those modules with parameters on GPU0 and inputs
on their own device, raising 'Expected all tensors to be on the same
device'. Build a device -> per-device BaseModel lookup from the
patcher's additional multigpu models and pass each clone the model on
its own device. Falls back to the base model when no per-device match
is found (single-GPU path and the case where cnet.multigpu_clones lags
the patcher's clone set).

Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082
Co-authored-by: Amp <amp@ampcode.com>
2026-05-21 11:40:49 -07:00
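
The lookup-with-fallback described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in samplers.py: `FakeModel`, `FakeControl`, and `pre_run_control_per_device` are hypothetical stand-ins, devices are plain strings instead of torch devices, and the only behavior taken from the commit message is "build a device -> model mapping, hand each clone the model on its own device, otherwise fall back to the base model".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeModel:
    """Hypothetical stand-in for a per-device BaseModel; tracks only its device."""
    device: str


@dataclass
class FakeControl:
    """Hypothetical stand-in for a control net and its per-device clones."""
    device: str
    multigpu_clones: Dict[str, "FakeControl"] = field(default_factory=dict)
    received_model: Optional[FakeModel] = None

    def pre_run(self, model: FakeModel) -> None:
        # The real pre_run stashes model.diffusion_model; here we only record it.
        self.received_model = model


def pre_run_control_per_device(
    control: FakeControl,
    base_model: FakeModel,
    additional_models: List[FakeModel],
) -> None:
    # Build the device -> model lookup from the additional (per-device) models.
    models_by_device = {m.device: m for m in additional_models}

    # The primary control keeps using the base model.
    control.pre_run(base_model)

    # Each clone gets the model on its own device; if no per-device model
    # exists (single-GPU path, or a stale clone set), fall back to the base.
    for clone in control.multigpu_clones.values():
        clone.pre_run(models_by_device.get(clone.device, base_model))


# Usage: one extra model on cuda:1, clones on cuda:1 and cuda:2.
base = FakeModel("cuda:0")
extra = [FakeModel("cuda:1")]
cnet = FakeControl(
    "cuda:0",
    multigpu_clones={
        "cuda:1": FakeControl("cuda:1"),
        "cuda:2": FakeControl("cuda:2"),
    },
)
pre_run_control_per_device(cnet, base, extra)
```

In this sketch the cuda:1 clone receives the cuda:1 model, while the cuda:2 clone has no matching entry and receives the base model, which mirrors the fallback case the commit message mentions.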
audio_encoders Fix fp16 audio encoder models (#12811) 2026-03-06 18:20:07 -05:00
background_removal Add support for BiRefNet background remove model (CORE-46) (#12747) 2026-05-08 17:59:24 +08:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types fix: use frontend-compatible format for Float gradient_stops (#12789) 2026-03-12 10:14:28 -07:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders feat: Support MoGe (CORE-168) (#13878) 2026-05-15 10:34:56 +08:00
k_diffusion feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
ldm Merge remote-tracking branch 'origin/master' into merge-master-into-worksplit-multigpu 2026-05-19 21:43:51 -07:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter Controlnet refactor. 2024-06-27 18:43:11 -04:00
taesd Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00
text_encoders Fix Qwen3.5 text generation with multiple input images (#13943) 2026-05-18 01:16:42 -04:00
weight_adapter MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446) 2026-02-15 20:30:09 -05:00
bg_removal_model.py Fix BiRefNet issue (#13966) 2026-05-19 05:03:22 +08:00
cli_args.py Document --cuda-device comma format and MultiGPU Options relative_speed gap 2026-05-20 20:48:59 -07:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_config_g.json Add support for clip g vision model to CLIPVisionLoader. 2023-08-18 11:13:29 -04:00
clip_vision_config_h.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json support clip-vit-large-patch14-336 (#4042) 2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_siglip2_base_naflex.json Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00
conds.py Cleanups to the last PR. (#12646) 2026-02-26 01:30:31 -05:00
context_windows.py feat: Context windows - add causal_window_fix to improve blending of context windows (CORE-100) (#13563) 2026-05-05 16:40:53 -07:00
controlnet.py Free QwenFunControlNet base_model reference in cleanup 2026-05-21 11:35:54 -07:00
deploy_environment.py Add deploy environment header (Comfy-Env) to partner node API calls (#13425) 2026-05-04 20:17:56 -07:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py feat: Support mxfp8 (#12907) 2026-03-14 18:36:29 -04:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Fix typos (#10986) 2026-05-08 17:14:45 +08:00
latent_formats.py Use temporal downscale to make empty audio latent nodes more reusable. (#13975) 2026-05-19 00:14:30 -04:00
lora_convert.py Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
lora.py Support anima TE lora kohya format. (#13847) 2026-05-11 20:01:52 -07:00
memory_management.py Integrate RAM cache with model RAM management (#13173) 2026-03-27 21:34:16 -04:00
model_base.py HiDream-O1: support area conditioning (#13944) 2026-05-18 01:17:05 -04:00
model_detection.py feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
model_management.py Fix get_all_torch_devices for XPU/NPU and guard remove() 2026-05-20 16:46:38 -07:00
model_patcher.py Restore prepare_state backward-compatible signature 2026-05-21 11:35:39 -07:00
model_prefetch.py prefetch: guard against no offload (#13703) 2026-05-04 12:56:05 -07:00
model_sampling.py feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
multigpu.py Prune inherited multigpu clones when max_gpus is lowered 2026-05-20 16:46:45 -07:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py Fix typo in ops.py (#11925) 2026-05-20 05:45:04 +08:00
options.py Only parse command line args when main.py is called. 2023-09-13 11:38:20 -04:00
patcher_extension.py Merge branch 'master' into worksplit-multigpu 2025-10-15 17:33:02 -07:00
pinned_memory.py dynamicVRAM + --cache-ram 2 (CORE-117) (#13603) 2026-04-28 19:15:02 -04:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py Enable triton comfy kitchen via cli-arg (#12730) 2026-05-03 14:07:21 -04:00
rmsnorm.py feat: Gemma4 text generation support (CORE-30) (#13376) 2026-05-02 22:46:15 -04:00
sample.py Initial work to make downscale_ratio_temporal work. (#13972) 2026-05-18 23:01:43 -04:00
sampler_helpers.py Merge remote-tracking branch 'origin/master' into merge-master-into-worksplit-multigpu 2026-05-19 21:43:51 -07:00
samplers.py Pass per-device model to multigpu control clones in pre_run_control 2026-05-21 11:40:49 -07:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py feat: Support Qwen3.5 text generation models (#12771) 2026-03-25 22:48:28 -04:00
sd.py Guard cached_patcher_init when output_model is False 2026-05-20 16:46:31 -07:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Fix some custom nodes. (#11134) 2025-12-05 18:25:31 -05:00
supported_models.py Better Hidream O1 mem usage factor for non dynamic vram. (#13864) 2026-05-12 20:55:38 -07:00
utils.py Fix void failing with RuntimeError: start (0) + length (464) exceeds dimension size (461). (#13873) 2026-05-13 12:37:30 -07:00
windows.py Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00