EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-07-03 21:20:49 +08:00

huangfeice 5260e18cdf Add JoyImageEdit native model support

JoyImageEdit is an image-edit diffusion transformer from JD (jd-opensource), Apache 2.0. This adds native ComfyUI support so it loads and runs like other edit models (load checkpoint -> TextEncode + ReferenceLatent -> KSampler -> VAEDecode), with no diffusers dependency.

Architecture:
- Transformer (comfy/ldm/joyimage/model.py): dual-stream (img/txt) DiT with a Conv3d patch embed (patch_size [1,2,2]), Wan-style learnable modulation, and 3D RoPE (rope_dim_list [16,56,56]). All attention goes through comfy.ldm.modules.attention.optimized_attention.
- Text encoder (comfy/text_encoders/{qwen3_vl,joyimage}.py): a reusable Qwen3-VL multimodal stack (vision tower + LM) in qwen3_vl.py, plus a thin JoyImage-specific layer (prompt templates, drop_idx, tokenizer, te() factory) in joyimage.py that depends on it. text_dim 4096.
- VAE: reuses the existing Wan 2.1 latent format (AutoencoderKLWan), no new latent format.
- Edit conditioning: reuses the reference_latents mechanism. Reference and noise latents are stacked on a new n-slot dimension and rotated at the model boundary (model_base.JoyImage), so the transformer stays 5D-in/5D-out. Guidance-rescale is built into the CFG path.

Model wiring:
- model_base.JoyImage uses ModelType.FLOW with sampling_settings multiplier=1000 (the time embedding is trained on t in [0,1000]) and shift=1.5; FLOW's linear time_snr_shift matches the diffusers FlowMatchEuler sigma schedule.
- model_detection sniffs the transformer state-dict (double_blocks., condition_embedder., 5D img_in Conv3d) to route image_model="joyimage".
- supported_models.JoyImage and the CLIPLoader "joyimage" type register it.

User-facing node TextEncodeJoyImageEdit (comfy_extras/nodes_joyimage.py) bucket-resizes the input image to the nearest 1024-base bucket, encodes the prompt with the image, and emits both the conditioning and the bucketed image so the same pixels feed VAEEncode and the negative encode (JoyImage requires noise and reference latents to share spatial dims).		2026-06-17 18:53:36 +08:00
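
The sampling settings named in the commit message (shift=1.5, multiplier=1000) can be illustrated with a small sketch. It assumes the linear time shift takes the usual flow-matching form sigma = shift * t / (1 + (shift - 1) * t). The helper names `shifted_sigmas` and `model_timestep` and the uniform step grid are hypothetical, chosen for illustration only; they are not taken from the ComfyUI source, and the actual step spacing depends on the scheduler selected in the sampler.

```python
def time_snr_shift(alpha: float, t: float) -> float:
    """Linear time shift (assumed form): alpha * t / (1 + (alpha - 1) * t)."""
    if alpha == 1.0:
        return t
    return alpha * t / (1 + (alpha - 1) * t)


def shifted_sigmas(num_steps: int, shift: float = 1.5) -> list[float]:
    """Hypothetical helper: shift a uniform grid of t values from 1 down to 1/num_steps."""
    ts = [1.0 - i / num_steps for i in range(num_steps)]
    return [time_snr_shift(shift, t) for t in ts]


def model_timestep(sigma: float, multiplier: float = 1000.0) -> float:
    """Hypothetical helper: scale sigma in [0, 1] to the [0, 1000] range of the time embedding."""
    return sigma * multiplier


if __name__ == "__main__":
    for sigma in shifted_sigmas(4):
        print(f"sigma={sigma:.4f} -> timestep={model_timestep(sigma):.1f}")
```

With shift=1.5, the midpoint t=0.5 maps to sigma=0.6, so more of the schedule is spent at higher noise levels than with an unshifted grid, while the endpoints t=0 and t=1 are left unchanged.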
..
audio_encoders	Fix fp16 audio encoder models (#12811)	2026-03-06 18:20:07 -05:00
background_removal	Some cast/dtype fixes for the birefnet and dino3 models. (#14217)	2026-06-01 14:35:26 -07:00
cldm	Add better error message for common error. (#10846)	2025-11-23 04:55:22 -05:00
comfy_types	Remove useless annotations imports. (#14105)	2026-05-25 19:23:29 -07:00
extra_samplers
image_encoders	Depth anything 3 (Core-135) (#13853)	2026-06-10 09:28:24 +08:00
k_diffusion	feat: Support HiDream-O1-Image (CORE-187) (#13817)	2026-05-11 20:35:53 -07:00
ldm	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
sd1_tokenizer
t2i_adapter
taesd	Add high quality preview support for Flux2 latents (#13496)	2026-04-29 19:37:30 -04:00
text_encoders	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
weight_adapter	MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446)	2026-02-15 20:30:09 -05:00
bg_removal_model.py	Fix background removal mask output shape (#14171)	2026-05-29 09:14:32 -07:00
cli_args.py	Comfy Aimdo 0.4.10 + Dynamic --reserve-vram + --vram-headroom (#14480)	2026-06-15 07:54:36 -07:00
clip_config_bigg.json
clip_model.py	Support the siglip 2 naflex model as a clip vision model. (#11831)	2026-01-12 17:05:54 -05:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip2_base_naflex.json	Support the siglip 2 naflex model as a clip vision model. (#11831)	2026-01-12 17:05:54 -05:00
clip_vision_siglip_384.json
clip_vision_siglip_512.json
clip_vision.py	Some cast/dtype fixes for the birefnet and dino3 models. (#14217)	2026-06-01 14:35:26 -07:00
conds.py	Cleanups to the last PR. (#12646)	2026-02-26 01:30:31 -05:00
context_windows.py	feat: Context windows - add causal_window_fix to improve blending of context windows (CORE-100) (#13563)	2026-05-05 16:40:53 -07:00
controlnet.py	MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063)	2026-05-25 18:26:40 -07:00
deploy_environment.py	Add deploy environment header (Comfy-Env) to partner node API calls (#13425)	2026-05-04 20:17:56 -07:00
diffusers_convert.py
diffusers_load.py
float.py	float: use CK stochastic rounding cuda kernel (#13971)	2026-05-28 19:23:42 -07:00
gligen.py
hooks.py	Fix typos (#10986)	2026-05-08 17:14:45 +08:00
latent_formats.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
lora_convert.py	Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432)	2026-02-13 15:35:13 -05:00
lora.py	Add LoRA key mapping for LTXV/LTXAV models (#14349)	2026-06-09 09:57:58 -04:00
memory_management.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00
model_base.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
model_detection.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
model_management.py	add --high-ram option (#14437)	2026-06-12 07:53:33 -07:00
model_patcher.py	[Trainer/bug] Ensure model is not inference mode (CORE-72) (#13400)	2026-06-09 23:07:47 -04:00
model_prefetch.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00
model_sampling.py	feat: Support HiDream-O1-Image (CORE-187) (#13817)	2026-05-11 20:35:53 -07:00
multigpu.py	fix (MultiGPU): prevent freeze on manual abort when using MultiGPU CFG Split (#14235)	2026-06-02 10:05:24 -07:00
nested_tensor.py	WIP way to support multi multi dimensional latents. (#10456)	2025-10-23 21:21:14 -04:00
ops.py	add --high-ram option (#14437)	2026-06-12 07:53:33 -07:00
options.py
patcher_extension.py	Remove useless annotations imports. (#14105)	2026-05-25 19:23:29 -07:00
pinned_memory.py	Fix interoperation with external source of pinned memory pressure (#14252)	2026-06-05 08:39:35 -07:00
pixel_space_convert.py	Changes to the previous radiance commit. (#9851)	2025-09-13 18:03:34 -04:00
quant_ops.py	Enable triton comfy kitchen via cli-arg (#12730)	2026-05-03 14:07:21 -04:00
rmsnorm.py	feat: Gemma4 text generation support (CORE-30) (#13376)	2026-05-02 22:46:15 -04:00
sample.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
sampler_helpers.py	MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063)	2026-05-25 18:26:40 -07:00
samplers.py	fix(multigpu): replace hardcoded torch.cuda.set_device with device-agnostic set_torch_device (#14191)	2026-05-30 21:18:42 -04:00
sd1_clip_config.json
sd1_clip.py	feat: Support Qwen3.5 text generation models (#12771)	2026-03-25 22:48:28 -04:00
sd.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
sdxl_clip.py
supported_models_base.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
supported_models.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
utils.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00