EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-07-03 21:20:49 +08:00


huangfeice e96bd48e2d Adapt JoyImageEdit text encoder onto upstream Qwen3-VL stack

Upstream merged native Qwen3-VL support (#14298), adding comfy/text_encoders/qwen3vl.py plus helpers in qwen_vl.py / llama.py / qwen35.py. The JoyImage port previously shipped its own duplicate Qwen3-VL implementation (comfy/text_encoders/qwen3_vl.py); that duplication is now removed and the JoyImage text encoder rides on the upstream stack.

- Delete comfy/text_encoders/qwen3_vl.py.
- Rewrite comfy/text_encoders/joyimage.py to subclass upstream comfy.text_encoders.qwen3vl. The JoyImage checkpoint is a stock qwen3vl_8b, so only JoyImage-specific behavior is overridden:
  * Qwen3VL8B_JoyImage.forward builds the 3D MRoPE position ids and injects deepstack visual features on the conditioning path. Upstream Qwen3VL only does this inside generate() via build_image_inputs; SDClipModel.forward never passes those kwargs. The JoyImage node feeds an image through the encoder (clip.tokenize(prompt, images=[..])), so the override reuses build_image_inputs to reproduce the multimodal conditioning that Llama2_.forward already accepts kwargs for.
  * preprocess_embed keeps JoyImage's bicubic+clamp image preprocessing (process_qwen3vl_image) instead of upstream's bilinear path, to preserve validated DiT numerics.
  * JoyImageTokenizer keeps the JoyImage system-prompt templates, suppresses the Qwen3 <think> block, and raises on image-placeholder count mismatch.
  * JoyImageTEModel keeps the drop_idx=34 system-prompt strip and the pre-final-norm layer tap (layer="hidden", layer_idx=-1).
- sd.py QWEN3VL_8B_JOYIMAGE branch: apply the same state-dict prefix remap the sibling QWEN3VL branch uses (model.language_model. -> model., model.visual. -> visual., lm_head. -> model.lm_head.) so the checkpoint loads into the upstream Qwen3VL namespace, then use the module-level llama_detect. Detection ordering is preserved: the JoyImage discriminator is checked before the generic Qwen3-VL deepstack key.

No changes to llama.py / qwen3vl.py / qwen_vl.py / qwen35.py.

2026-06-17 21:29:33 +08:00
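
The state-dict prefix remap described in the commit message can be illustrated with a small standalone sketch. This is not the code from sd.py: the function name `remap_prefixes`, the constant `PREFIX_MAP`, and the sample keys are hypothetical and exist only for illustration. Only the three prefix pairs are taken from the commit message.

```python
# Illustrative sketch only; not the actual sd.py implementation.
# The three prefix pairs come from the commit message; everything else
# (names, sample keys) is an assumption made for this example.
PREFIX_MAP = {
    "model.language_model.": "model.",
    "model.visual.": "visual.",
    "lm_head.": "model.lm_head.",
}


def remap_prefixes(state_dict, prefix_map):
    """Return a new dict with the first matching key prefix replaced."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if key.startswith(old):
                new_key = new + key[len(old):]
                # Stop after the first match so an already-remapped key
                # (which now starts with "model.") is not rewritten again.
                break
        remapped[new_key] = value
    return remapped


# Hypothetical checkpoint keys; real checkpoints hold tensors, not ints.
checkpoint = {
    "model.language_model.layers.0.weight": 1,
    "model.visual.blocks.0.weight": 2,
    "lm_head.weight": 3,
    "other.bias": 4,
}
remapped = remap_prefixes(checkpoint, PREFIX_MAP)
```

Keys that match none of the prefixes pass through unchanged, so a partial checkpoint is left untouched instead of raising an error.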
..
audio_encoders	Fix fp16 audio encoder models (#12811)	2026-03-06 18:20:07 -05:00
background_removal	Some cast/dtype fixes for the birefnet and dino3 models. (#14217)	2026-06-01 14:35:26 -07:00
cldm	Add better error message for common error. (#10846)	2025-11-23 04:55:22 -05:00
comfy_types	Remove useless annotations imports. (#14105)	2026-05-25 19:23:29 -07:00
extra_samplers
image_encoders	Depth anything 3 (Core-135) (#13853)	2026-06-10 09:28:24 +08:00
k_diffusion	feat: Support HiDream-O1-Image (CORE-187) (#13817)	2026-05-11 20:35:53 -07:00
ldm	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
sd1_tokenizer	Silence clip tokenizer warning. (#8934)	2025-07-16 14:42:07 -04:00
t2i_adapter
taesd	Add high quality preview support for Flux2 latents (#13496)	2026-04-29 19:37:30 -04:00
text_encoders	Adapt JoyImageEdit text encoder onto upstream Qwen3-VL stack	2026-06-17 21:29:33 +08:00
weight_adapter	MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446)	2026-02-15 20:30:09 -05:00
bg_removal_model.py	Fix background removal mask output shape (#14171)	2026-05-29 09:14:32 -07:00
cli_args.py	Comfy Aimdo 0.4.10 + Dynamic --reserve-vram + --vram-headroom (#14480)	2026-06-15 07:54:36 -07:00
clip_config_bigg.json
clip_model.py	Support the siglip 2 naflex model as a clip vision model. (#11831)	2026-01-12 17:05:54 -05:00
clip_vision_config_g.json
clip_vision_config_h.json
clip_vision_config_vitl_336_llava.json	Support llava clip vision model.	2025-03-06 00:24:43 -05:00
clip_vision_config_vitl_336.json
clip_vision_config_vitl.json
clip_vision_siglip2_base_naflex.json	Support the siglip 2 naflex model as a clip vision model. (#11831)	2026-01-12 17:05:54 -05:00
clip_vision_siglip_384.json
clip_vision_siglip_512.json	Support 512 siglip model.	2025-04-05 07:01:01 -04:00
clip_vision.py	Some cast/dtype fixes for the birefnet and dino3 models. (#14217)	2026-06-01 14:35:26 -07:00
conds.py	Cleanups to the last PR. (#12646)	2026-02-26 01:30:31 -05:00
context_windows.py	feat: Context windows - add causal_window_fix to improve blending of context windows (CORE-100) (#13563)	2026-05-05 16:40:53 -07:00
controlnet.py	MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063)	2026-05-25 18:26:40 -07:00
deploy_environment.py	Add deploy environment header (Comfy-Env) to partner node API calls (#13425)	2026-05-04 20:17:56 -07:00
diffusers_convert.py
diffusers_load.py
float.py	float: use CK stochastic rounding cuda kernel (#13971)	2026-05-28 19:23:42 -07:00
gligen.py	Remove some useless code. (#8812)	2025-07-06 07:07:39 -04:00
hooks.py	Fix typos (#10986)	2026-05-08 17:14:45 +08:00
latent_formats.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
lora_convert.py	Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432)	2026-02-13 15:35:13 -05:00
lora.py	Add LoRA key mapping for LTXV/LTXAV models (#14349)	2026-06-09 09:57:58 -04:00
memory_management.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00
model_base.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
model_detection.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
model_management.py	add --high-ram option (#14437)	2026-06-12 07:53:33 -07:00
model_patcher.py	[Trainer/bug] Ensure model is not inference mode (CORE-72) (#13400)	2026-06-09 23:07:47 -04:00
model_prefetch.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00
model_sampling.py	feat: Support HiDream-O1-Image (CORE-187) (#13817)	2026-05-11 20:35:53 -07:00
multigpu.py	fix (MultiGPU): prevent freeze on manual abort when using MultiGPU CFG Split (#14235)	2026-06-02 10:05:24 -07:00
nested_tensor.py	WIP way to support multi multi dimensional latents. (#10456)	2025-10-23 21:21:14 -04:00
ops.py	add --high-ram option (#14437)	2026-06-12 07:53:33 -07:00
options.py
patcher_extension.py	Remove useless annotations imports. (#14105)	2026-05-25 19:23:29 -07:00
pinned_memory.py	Fix interoperation with external source of pinned memory pressure (#14252)	2026-06-05 08:39:35 -07:00
pixel_space_convert.py	Changes to the previous radiance commit. (#9851)	2025-09-13 18:03:34 -04:00
quant_ops.py	Enable triton comfy kitchen via cli-arg (#12730)	2026-05-03 14:07:21 -04:00
rmsnorm.py	feat: Gemma4 text generation support (CORE-30) (#13376)	2026-05-02 22:46:15 -04:00
sample.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
sampler_helpers.py	MultiGPU Work Units For Accelerated Sampling (CORE-184) (#7063)	2026-05-25 18:26:40 -07:00
samplers.py	fix(multigpu): replace hardcoded torch.cuda.set_device with device-agnostic set_torch_device (#14191)	2026-05-30 21:18:42 -04:00
sd1_clip_config.json
sd1_clip.py	feat: Support Qwen3.5 text generation models (#12771)	2026-03-25 22:48:28 -04:00
sd.py	Adapt JoyImageEdit text encoder onto upstream Qwen3-VL stack	2026-06-17 21:29:33 +08:00
sdxl_clip.py	Add a T5TokenizerOptions node to set options for the T5 tokenizer. (#7803)	2025-04-25 19:36:00 -04:00
supported_models_base.py	Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359)	2026-06-08 18:00:20 -04:00
supported_models.py	Add JoyImageEdit native model support	2026-06-17 18:53:36 +08:00
utils.py	Threaded Loader performance fixes / improvements (+ Aimdo 0.4.6) (#14116)	2026-05-30 15:20:04 -04:00