ComfyUI/comfy
huangfeice e29384be0d Add JoyImageEditPlus multi-image edit support (unify onto Plus-style forward)
JoyImageEditPlus is the multi-image (1-6 reference images) variant of
JoyImageEdit, trained from the same base. Its diffusers transformer shares
byte-identical weight structure with the single-image variant (894 keys, zero
rename) but injects references differently: instead of the single-image
slot-stack (stack refs + noise into a 6D tensor and rotate on the frame dim,
which forces all items to share resolution), each reference is independently
patchified and concatenated on the sequence dim with per-image temporal-offset
3D RoPE, allowing references at different resolutions.

Since the single-image port is not yet upstream, this unifies both variants
onto the Plus-style forward rather than keeping two paths; single-image is now
the ref=1 special case. Verified numerically: at ref=1 with equal resolution
the new path's RoPE is bit-identical to the old slot-stack layout, and the
transformer output matches the diffusers Plus reference (fp32, incl. the
different-resolution case).

ComfyUI runs cond/uncond in one forward with a shared reference configuration,
so the diffusers Plus batched RoPE, padding attention_mask, and dedicated
attention processor are unnecessary here: the unified forward reuses the
existing unbatched _apply_rotary_emb and JoyImageAttention. Confirmed
equivalent to the diffusers batched+mask path for a single sample.

- comfy/ldm/joyimage/model.py: forward takes ref_latents and builds
  components=[target, ref0, ...]; per-component patchify + temporal-offset
  RoPE; output keeps only the target segment. Old single-grid RoPE removed.
- comfy/model_base.py: JoyImage drops the slot-stack / frame-rotation /
  shape-equality path in _apply_model, passing ref_latents straight to the
  transformer. Guidance-rescale and the reference_latents requirement are kept.
- comfy/text_encoders/joyimage.py: the image template emits one vision block
  per reference (N = image count); N=1 is byte-for-byte the old template.
- comfy_extras/nodes_joyimage.py: add TextEncodeJoyImageEditPlus with optional
  image1..image6 inputs, each bucket-resized and VAE-encoded into the
  reference_latents list.

Detection, supported_models, and sd.py need no changes: the identical weight
structure routes both variants through image_model="joyimage".
2026-07-01 18:36:43 +08:00
..
audio_encoders Fix fp16 audio encoder models (#12811) 2026-03-06 18:20:07 -05:00
background_removal Some cast/dtype fixes for the birefnet and dino3 models. (#14217) 2026-06-01 14:35:26 -07:00
cldm Add better error message for common error. (#10846) 2025-11-23 04:55:22 -05:00
comfy_types Remove useless annotations imports. (#14105) 2026-05-25 19:23:29 -07:00
extra_samplers Uni pc sampler now works with audio and video models. 2025-01-18 05:27:58 -05:00
image_encoders Depth anything 3 (Core-135) (#13853) 2026-06-10 09:28:24 +08:00
k_diffusion feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
ldm Add JoyImageEditPlus multi-image edit support (unify onto Plus-style forward) 2026-07-01 18:36:43 +08:00
sd1_tokenizer Silence clip tokenizer warning. (#8934) 2025-07-16 14:42:07 -04:00
t2i_adapter Controlnet refactor. 2024-06-27 18:43:11 -04:00
taesd Add high quality preview support for Flux2 latents (#13496) 2026-04-29 19:37:30 -04:00
text_encoders Add JoyImageEditPlus multi-image edit support (unify onto Plus-style forward) 2026-07-01 18:36:43 +08:00
weight_adapter MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446) 2026-02-15 20:30:09 -05:00
bg_removal_model.py Fix background removal mask output shape (#14171) 2026-05-29 09:14:32 -07:00
cli_args.py Comfy Aimdo 0.4.10 + Dynamic --reserve-vram + --vram-headroom (#14480) 2026-06-15 07:54:36 -07:00
clip_config_bigg.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
clip_model.py Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_config_g.json Add support for clip g vision model to CLIPVisionLoader. 2023-08-18 11:13:29 -04:00
clip_vision_config_h.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_config_vitl_336_llava.json Support llava clip vision model. 2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json support clip-vit-large-patch14-336 (#4042) 2024-07-17 13:12:50 -04:00
clip_vision_config_vitl.json Add support for unCLIP SD2.x models. 2023-04-01 23:19:15 -04:00
clip_vision_siglip2_base_naflex.json Support the siglip 2 naflex model as a clip vision model. (#11831) 2026-01-12 17:05:54 -05:00
clip_vision_siglip_384.json Support new flux model variants. 2024-11-21 08:38:23 -05:00
clip_vision_siglip_512.json Support 512 siglip model. 2025-04-05 07:01:01 -04:00
clip_vision.py Some cast/dtype fixes for the birefnet and dino3 models. (#14217) 2026-06-01 14:35:26 -07:00
conds.py Cleanups to the last PR. (#12646) 2026-02-26 01:30:31 -05:00
context_windows.py feat: Context windows - add causal_window_fix to improve blending of context windows (CORE-100) (#13563) 2026-05-05 16:40:53 -07:00
controlnet.py MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063) 2026-05-25 18:26:40 -07:00
deploy_environment.py Add deploy environment header (Comfy-Env) to partner node API calls (#13425) 2026-05-04 20:17:56 -07:00
diffusers_convert.py Remove useless code. 2025-01-24 06:15:54 -05:00
diffusers_load.py load_unet -> load_diffusion_model with a model_options argument. 2024-08-12 23:20:57 -04:00
float.py float: use CK stochastic rounding cuda kernel (#13971) 2026-05-28 19:23:42 -07:00
gligen.py Remove some useless code. (#8812) 2025-07-06 07:07:39 -04:00
hooks.py Fix typos (#10986) 2026-05-08 17:14:45 +08:00
latent_formats.py Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359) 2026-06-08 18:00:20 -04:00
lora_convert.py Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
lora.py Add LoRA key mapping for LTXV/LTXAV models (#14349) 2026-06-09 09:57:58 -04:00
memory_management.py Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116) 2026-05-30 15:20:04 -04:00
model_base.py Add JoyImageEditPlus multi-image edit support (unify onto Plus-style forward) 2026-07-01 18:36:43 +08:00
model_detection.py Add JoyImageEdit native model support 2026-06-17 18:53:36 +08:00
model_management.py add --high-ram option (#14437) 2026-06-12 07:53:33 -07:00
model_patcher.py [Trainer/bug] Ensure model is not inference mode (CORE-72) (#13400) 2026-06-09 23:07:47 -04:00
model_prefetch.py Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116) 2026-05-30 15:20:04 -04:00
model_sampling.py feat: Support HiDream-O1-Image (CORE-187) (#13817) 2026-05-11 20:35:53 -07:00
multigpu.py fix (MultiGPU): prevent freeze on manual abort when using MultiGPU CFG Split (#14235) 2026-06-02 10:05:24 -07:00
nested_tensor.py WIP way to support multi multi dimensional latents. (#10456) 2025-10-23 21:21:14 -04:00
ops.py add --high-ram option (#14437) 2026-06-12 07:53:33 -07:00
options.py Only parse command line args when main.py is called. 2023-09-13 11:38:20 -04:00
patcher_extension.py Remove useless annotations imports. (#14105) 2026-05-25 19:23:29 -07:00
pinned_memory.py Fix interoperation with external source of pinned memory pressure (#14252) 2026-06-05 08:39:35 -07:00
pixel_space_convert.py Changes to the previous radiance commit. (#9851) 2025-09-13 18:03:34 -04:00
quant_ops.py Enable triton comfy kitchen via cli-arg (#12730) 2026-05-03 14:07:21 -04:00
rmsnorm.py feat: Gemma4 text generation support (CORE-30) (#13376) 2026-05-02 22:46:15 -04:00
sample.py Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359) 2026-06-08 18:00:20 -04:00
sampler_helpers.py MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063) 2026-05-25 18:26:40 -07:00
samplers.py fix(multigpu): replace hardcoded torch.cuda.set_device with device-agnostic set_torch_device (#14191) 2026-05-30 21:18:42 -04:00
sd1_clip_config.json Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
sd1_clip.py feat: Support Qwen3.5 text generation models (#12771) 2026-03-25 22:48:28 -04:00
sd.py Adapt JoyImageEdit text encoder onto upstream Qwen3-VL stack 2026-06-17 21:29:33 +08:00
sdxl_clip.py Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803) 2025-04-25 19:36:00 -04:00
supported_models_base.py Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359) 2026-06-08 18:00:20 -04:00
supported_models.py Add JoyImageEdit native model support 2026-06-17 18:53:36 +08:00
utils.py Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116) 2026-05-30 15:20:04 -04:00