mirror of
https://github.com/comfyanonymous/ComfyUI.git
synced 2026-04-25 18:02:37 +08:00
98 lines
4.1 KiB
Python
import re
from comfy import sd1_clip

SAM3_CLIP_CONFIG = {
    "architectures": ["CLIPTextModel"],
    "hidden_act": "quick_gelu",
    "hidden_size": 1024,
    "intermediate_size": 4096,
    "num_attention_heads": 16,
    "num_hidden_layers": 24,
    "max_position_embeddings": 32,
    "projection_dim": 512,
    "vocab_size": 49408,
    "layer_norm_eps": 1e-5,
    "eos_token_id": 49407,
}


class SAM3ClipModel(sd1_clip.SDClipModel):
    def __init__(self, device="cpu", dtype=None, model_options={}):
        super().__init__(device=device, dtype=dtype, max_length=32, layer="last", textmodel_json_config=SAM3_CLIP_CONFIG, special_tokens={"start": 49406, "end": 49407, "pad": 0}, return_projected_pooled=False, return_attention_masks=True, enable_attention_masks=True, model_options=model_options)


class SAM3Tokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
        super().__init__(max_length=32, pad_with_end=False, pad_token=0, embedding_directory=embedding_directory, embedding_size=1024, embedding_key="sam3_clip", tokenizer_data=tokenizer_data)
        self.disable_weights = True


def _parse_prompts(text):
    """Split comma-separated prompts, each with an optional ":N" maximum detection count."""
    text = text.replace("(", "").replace(")", "")
    parts = [p.strip() for p in text.split(",") if p.strip()]
    result = []
    for part in parts:
        # The number pattern accepts "3", "3.", "3.5" and ".5" only, so malformed
        # values such as "1.2.3" fall through to the plain-prompt branch instead
        # of raising ValueError in float().
        m = re.match(r'^(.+?)\s*:\s*(\d+\.?\d*|\.\d+)\s*$', part)
        if m:
            text_part = m.group(1).strip()
            val = m.group(2)
            max_det = max(1, round(float(val)))
            result.append((text_part, max_det))
        else:
            result.append((part, 1))
    return result


class SAM3TokenizerWrapper(sd1_clip.SD1Tokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, clip_name="l", tokenizer=SAM3Tokenizer, name="sam3_clip")

    def tokenize_with_weights(self, text: str, return_word_ids=False, **kwargs):
        parsed = _parse_prompts(text)
        if len(parsed) <= 1 and (not parsed or parsed[0][1] == 1):
            return super().tokenize_with_weights(text, return_word_ids, **kwargs)
        # Tokenize each prompt part separately, store per-part batches and metadata
        inner = getattr(self, self.clip)
        per_prompt = []
        for prompt_text, max_det in parsed:
            batches = inner.tokenize_with_weights(prompt_text, return_word_ids, **kwargs)
            per_prompt.append((batches, max_det))
        # Main output uses first prompt's tokens (for compatibility)
        out = {self.clip_name: per_prompt[0][0], "sam3_per_prompt": per_prompt}
        return out


class SAM3ClipModelWrapper(sd1_clip.SD1ClipModel):
    def __init__(self, device="cpu", dtype=None, model_options={}, **kwargs):
        super().__init__(device=device, dtype=dtype, model_options=model_options, clip_name="l", clip_model=SAM3ClipModel, name="sam3_clip")

    def encode_token_weights(self, token_weight_pairs):
        per_prompt = token_weight_pairs.pop("sam3_per_prompt", None)
        if per_prompt is None:
            return super().encode_token_weights(token_weight_pairs)

        # Encode each prompt separately, pack into extra dict
        inner = getattr(self, self.clip)
        multi_cond = []
        first_pooled = None
        for batches, max_det in per_prompt:
            out = inner.encode_token_weights(batches)
            cond, pooled = out[0], out[1]
            extra = out[2] if len(out) > 2 else {}
            if first_pooled is None:
                first_pooled = pooled
            multi_cond.append({
                "cond": cond,
                "attention_mask": extra.get("attention_mask"),
                "max_detections": max_det,
            })

        # Return first prompt as main (for non-SAM3 consumers), all prompts in metadata
        main = multi_cond[0]
        main_extra = {}
        if main["attention_mask"] is not None:
            main_extra["attention_mask"] = main["attention_mask"]
        main_extra["sam3_multi_cond"] = multi_cond
        return (main["cond"], first_pooled, main_extra)
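The prompt syntax that `_parse_prompts` handles (comma-separated categories, each with an optional `:N` detection cap) can be tried without ComfyUI installed. The block below is a minimal standalone sketch of that syntax, not the module's own code: `parse_prompts_sketch` and `_PROMPT_RE` are hypothetical names introduced here, and the number pattern is an assumption of this sketch.

```python
import re

# Hypothetical stand-in for the module's parser (assumption: the count must look
# like "3", "3.", "3.5" or ".5"; anything else is treated as plain prompt text).
_PROMPT_RE = re.compile(r'^(.+?)\s*:\s*(\d+\.?\d*|\.\d+)\s*$')


def parse_prompts_sketch(text):
    """Return [(prompt, max_detections), ...] for a comma-separated prompt string."""
    # Parentheses (used elsewhere for prompt weighting) are dropped before parsing.
    text = text.replace("(", "").replace(")", "")
    result = []
    for part in (p.strip() for p in text.split(",")):
        if not part:
            continue  # skip empty segments such as "a,,b"
        m = _PROMPT_RE.match(part)
        if m:
            # Counts are rounded to the nearest integer and clamped to at least 1.
            count = max(1, round(float(m.group(2))))
            result.append((m.group(1).strip(), count))
        else:
            result.append((part, 1))
    return result


print(parse_prompts_sketch("cat:3, dog, (person):2.6"))
# → [('cat', 3), ('dog', 1), ('person', 3)]
print(parse_prompts_sketch("bird:0"))
# → [('bird', 1)]
```

A single prompt with no count (or a count of 1) yields one entry with `max_detections == 1`, which is the case where `SAM3TokenizerWrapper.tokenize_with_weights` defers to the regular tokenizer path rather than building per-prompt batches.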