ComfyUI/comfy_extras/nodes_ace.py
Jedrzej Kosinski 1b96430c60
Some checks are pending
Build package / Build Test (3.10) (push) Waiting to run
Build package / Build Test (3.12) (push) Waiting to run
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Build package / Build Test (3.11) (push) Waiting to run
Build package / Build Test (3.13) (push) Waiting to run
Build package / Build Test (3.14) (push) Waiting to run
Merge master into worksplit-multigpu (#13546)
* fix: pin SQLAlchemy>=2.0 in requirements.txt (fixes #13036) (#13316)

* Refactor io to IO in nodes_ace.py (#13485)

* Bump comfyui-frontend-package to 1.42.12 (#13489)

* Make the ltx audio vae more native. (#13486)

* feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465)

* Support standalone LTXV audio VAEs (#13499)

* [Partner Nodes]  added 4K resolution for Veo models; added Veo 3 Lite model (#13330)

* feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* increase poll_interval from 5 to 9

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

* Bump comfyui-frontend-package to 1.42.14 (#13493)

* Add gpt-image-2 as version option (#13501)

* Allow logging in comfy app files. (#13505)

* chore: update workflow templates to v0.9.59 (#13507)

* fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504)

The tooltip on the resolution input states that 4K is not available for
veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the
lite combination. Selecting 4K with veo-3.0-generate-001 or
veo-3.0-fast-generate-001 would fall through and hit the upstream API
with an invalid request.

Broaden the guard to match the documented behavior and update the error
message accordingly.

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>

* feat: RIFE and FILM frame interpolation model support (CORE-29) (#13258)

* initial RIFE support

* Also support FILM

* Better RAM usage, reduce FILM VRAM peak

* Add model folder placeholder

* Fix oom fallback frame loss

* Remove torch.compile for now

* Rename model input

* Shorter input type name

---------

* fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518)

On Windows with aimdo enabled, disable_weight_init.Linear uses lazy
initialization that sets weight and bias to None to avoid unnecessary
memory allocation. This caused a crash when copy_() was called on the
None weight attribute in Stable_Zero123.__init__.

Replace copy_() with direct torch.nn.Parameter assignment, which works
correctly on both Windows (aimdo enabled) and other platforms.

* Derive InterruptProcessingException from BaseException (#13523)

* bump manager version to 4.2.1 (#13516)

* ModelPatcherDynamic: force cast stray weights on comfy layers (#13487)

the mixed_precision ops can have input_scale parameters that are used
in tensor math but arent a weight or bias so dont get proper VRAM
management. Treat these as force-castable parameters like the non comfy
weight, random params are buffers already are.

* Update logging level for invalid version format (#13526)

* [Partner Nodes] add SD2 real human support (#13509)

* feat(api-nodes): add SD2 real human support

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* fix: add validation before uploading Assets

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* Add asset_id and group_id displaying on the node

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* extend poll_op to use instead of custom async cycle

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* added the polling for the "Active" status after asset creation

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* updated tooltip for group_id

* allow usage of real human in the ByteDance2FirstLastFrame node

* add reference count limits

* corrected price in status when input assets contain video

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* feat: SAM (segment anything) 3.1 support (CORE-34) (#13408)

* [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519)

* fix(api-nodes): fixed price badges, add new resolutions

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* proper calculate the total run cost when "n > 1"

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* chore: update workflow templates to v0.9.61 (#13533)

* chore: update embedded docs to v0.4.4 (#13535)

* add 4K resolution to Kling nodes (#13536)

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* Fix LTXV Reference Audio node (#13531)

* comfy-aimdo 0.2.14: Hotfix async allocator estimations (#13534)

This was doing an over-estimate of VRAM used by the async allocator when lots
of little small tensors were in play.

Also change the versioning scheme to == so we can roll forward aimdo without
worrying about stable regressions downstream in comfyUI core.

* Disable sageattention for SAM3 (#13529)

Causes Nans

* execution: Add anti-cycle validation (#13169)

Currently if the graph contains a cycle, the just inifitiate recursions,
hits a catch all then throws a generic error against the output node
that seeded the validation. Instead, fail the offending cycling mode
chain and handlng it as an error in its own right.

Co-authored-by: guill <jacob.e.segal@gmail.com>

* chore: update workflow templates to v0.9.62 (#13539)

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Co-authored-by: AustinMroz <austin@comfy.org>
Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com>
Co-authored-by: Matt Miller <matt@miller-media.com>
Co-authored-by: blepping <157360029+blepping@users.noreply.github.com>
Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>
2026-04-23 19:20:14 -07:00

146 lines
6.8 KiB
Python

import torch
from typing_extensions import override
import comfy.model_management
import node_helpers
from comfy_api.latest import ComfyExtension, IO
class TextEncodeAceStepAudio(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="TextEncodeAceStepAudio",
category="conditioning",
inputs=[
IO.Clip.Input("clip"),
IO.String.Input("tags", multiline=True, dynamic_prompts=True),
IO.String.Input("lyrics", multiline=True, dynamic_prompts=True),
IO.Float.Input("lyrics_strength", default=1.0, min=0.0, max=10.0, step=0.01),
],
outputs=[IO.Conditioning.Output()],
)
@classmethod
def execute(cls, clip, tags, lyrics, lyrics_strength) -> IO.NodeOutput:
tokens = clip.tokenize(tags, lyrics=lyrics)
conditioning = clip.encode_from_tokens_scheduled(tokens)
conditioning = node_helpers.conditioning_set_values(conditioning, {"lyrics_strength": lyrics_strength})
return IO.NodeOutput(conditioning)
class TextEncodeAceStepAudio15(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="TextEncodeAceStepAudio1.5",
category="conditioning",
inputs=[
IO.Clip.Input("clip"),
IO.String.Input("tags", multiline=True, dynamic_prompts=True),
IO.String.Input("lyrics", multiline=True, dynamic_prompts=True),
IO.Int.Input("seed", default=0, min=0, max=0xffffffffffffffff, control_after_generate=True),
IO.Int.Input("bpm", default=120, min=10, max=300),
IO.Float.Input("duration", default=120.0, min=0.0, max=2000.0, step=0.1),
IO.Combo.Input("timesignature", options=['2', '3', '4', '6']),
IO.Combo.Input("language", options=["en", "ja", "zh", "es", "de", "fr", "pt", "ru", "it", "nl", "pl", "tr", "vi", "cs", "fa", "id", "ko", "uk", "hu", "ar", "sv", "ro", "el"]),
IO.Combo.Input("keyscale", options=[f"{root} {quality}" for quality in ["major", "minor"] for root in ["C", "C#", "Db", "D", "D#", "Eb", "E", "F", "F#", "Gb", "G", "G#", "Ab", "A", "A#", "Bb", "B"]]),
IO.Boolean.Input("generate_audio_codes", default=True, tooltip="Enable the LLM that generates audio codes. This can be slow but will increase the quality of the generated audio. Turn this off if you are giving the model an audio reference.", advanced=True),
IO.Float.Input("cfg_scale", default=2.0, min=0.0, max=100.0, step=0.1, advanced=True),
IO.Float.Input("temperature", default=0.85, min=0.0, max=2.0, step=0.01, advanced=True),
IO.Float.Input("top_p", default=0.9, min=0.0, max=2000.0, step=0.01, advanced=True),
IO.Int.Input("top_k", default=0, min=0, max=100, advanced=True),
IO.Float.Input("min_p", default=0.000, min=0.0, max=1.0, step=0.001, advanced=True),
],
outputs=[IO.Conditioning.Output()],
)
@classmethod
def execute(cls, clip, tags, lyrics, seed, bpm, duration, timesignature, language, keyscale, generate_audio_codes, cfg_scale, temperature, top_p, top_k, min_p) -> IO.NodeOutput:
tokens = clip.tokenize(tags, lyrics=lyrics, bpm=bpm, duration=duration, timesignature=int(timesignature), language=language, keyscale=keyscale, seed=seed, generate_audio_codes=generate_audio_codes, cfg_scale=cfg_scale, temperature=temperature, top_p=top_p, top_k=top_k, min_p=min_p)
conditioning = clip.encode_from_tokens_scheduled(tokens)
return IO.NodeOutput(conditioning)
class EmptyAceStepLatentAudio(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="EmptyAceStepLatentAudio",
display_name="Empty Ace Step 1.0 Latent Audio",
category="latent/audio",
inputs=[
IO.Float.Input("seconds", default=120.0, min=1.0, max=1000.0, step=0.1),
IO.Int.Input(
"batch_size", default=1, min=1, max=4096, tooltip="The number of latent images in the batch."
),
],
outputs=[IO.Latent.Output()],
)
@classmethod
def execute(cls, seconds, batch_size) -> IO.NodeOutput:
length = int(seconds * 44100 / 512 / 8)
latent = torch.zeros([batch_size, 8, 16, length], device=comfy.model_management.intermediate_device(), dtype=comfy.model_management.intermediate_dtype())
return IO.NodeOutput({"samples": latent, "type": "audio"})
class EmptyAceStep15LatentAudio(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="EmptyAceStep1.5LatentAudio",
display_name="Empty Ace Step 1.5 Latent Audio",
category="latent/audio",
inputs=[
IO.Float.Input("seconds", default=120.0, min=1.0, max=1000.0, step=0.01),
IO.Int.Input(
"batch_size", default=1, min=1, max=4096, tooltip="The number of latent images in the batch."
),
],
outputs=[IO.Latent.Output()],
)
@classmethod
def execute(cls, seconds, batch_size) -> IO.NodeOutput:
length = round((seconds * 48000 / 1920))
latent = torch.zeros([batch_size, 64, length], device=comfy.model_management.intermediate_device(), dtype=comfy.model_management.intermediate_dtype())
return IO.NodeOutput({"samples": latent, "type": "audio"})
class ReferenceAudio(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="ReferenceTimbreAudio",
display_name="Reference Audio",
category="advanced/conditioning/audio",
is_experimental=True,
description="This node sets the reference audio for ace step 1.5",
inputs=[
IO.Conditioning.Input("conditioning"),
IO.Latent.Input("latent", optional=True),
],
outputs=[
IO.Conditioning.Output(),
]
)
@classmethod
def execute(cls, conditioning, latent=None) -> IO.NodeOutput:
if latent is not None:
conditioning = node_helpers.conditioning_set_values(conditioning, {"reference_audio_timbre_latents": [latent["samples"]]}, append=True)
return IO.NodeOutput(conditioning)
class AceExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
return [
TextEncodeAceStepAudio,
EmptyAceStepLatentAudio,
TextEncodeAceStepAudio15,
EmptyAceStep15LatentAudio,
ReferenceAudio,
]
async def comfy_entrypoint() -> AceExtension:
return AceExtension()