Mirror of https://github.com/comfyanonymous/ComfyUI.git, synced 2026-04-25 09:52:35 +08:00
* fix: pin SQLAlchemy>=2.0 in requirements.txt (fixes #13036) (#13316)
* Refactor io to IO in nodes_ace.py (#13485)
* Bump comfyui-frontend-package to 1.42.12 (#13489)
* Make the ltx audio vae more native. (#13486)
* feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465)
* Support standalone LTXV audio VAEs (#13499)
* [Partner Nodes] added 4K resolution for Veo models; added Veo 3 Lite model (#13330)
  * feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model
  * increase poll_interval from 5 to 9
  Signed-off-by: bigcat88 <bigcat88@icloud.com>
  Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Bump comfyui-frontend-package to 1.42.14 (#13493)
* Add gpt-image-2 as version option (#13501)
* Allow logging in comfy app files. (#13505)
* chore: update workflow templates to v0.9.59 (#13507)
* fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504)
  The tooltip on the resolution input states that 4K is not available for veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the lite combination. Selecting 4K with veo-3.0-generate-001 or veo-3.0-fast-generate-001 would fall through and hit the upstream API with an invalid request. Broaden the guard to match the documented behavior and update the error message accordingly.
  Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* feat: RIFE and FILM frame interpolation model support (CORE-29) (#13258)
  * initial RIFE support
  * Also support FILM
  * Better RAM usage, reduce FILM VRAM peak
  * Add model folder placeholder
  * Fix OOM fallback frame loss
  * Remove torch.compile for now
  * Rename model input
  * Shorter input type name
* fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518)
  On Windows with aimdo enabled, disable_weight_init.Linear uses lazy initialization that sets weight and bias to None to avoid unnecessary memory allocation. This caused a crash when copy_() was called on the None weight attribute in Stable_Zero123.__init__. Replace copy_() with direct torch.nn.Parameter assignment, which works correctly on both Windows (aimdo enabled) and other platforms. (See the first sketch after this changelog.)
* Derive InterruptProcessingException from BaseException (#13523) (see the second sketch after this changelog)
* bump manager version to 4.2.1 (#13516)
* ModelPatcherDynamic: force cast stray weights on comfy layers (#13487)
  The mixed_precision ops can have input_scale parameters that are used in tensor math but aren't a weight or bias, so they don't get proper VRAM management. Treat these as force-castable parameters like the non-comfy weights; random params that are buffers are already handled.
* Update logging level for invalid version format (#13526)
* [Partner Nodes] add SD2 real human support (#13509)
  * feat(api-nodes): add SD2 real human support
  * fix: add validation before uploading Assets
  * Add asset_id and group_id displaying on the node
  * extend poll_op and use it instead of a custom async loop
  * added polling for the "Active" status after asset creation
  * updated tooltip for group_id
  * allow usage of real human in the ByteDance2FirstLastFrame node
  * add reference count limits
  * corrected price in status when input assets contain video
  Signed-off-by: bigcat88 <bigcat88@icloud.com>
* feat: SAM (segment anything) 3.1 support (CORE-34) (#13408)
* [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519)
  * fix(api-nodes): fixed price badges, add new resolutions
  * properly calculate the total run cost when "n > 1"
  Signed-off-by: bigcat88 <bigcat88@icloud.com>
* chore: update workflow templates to v0.9.61 (#13533)
* chore: update embedded docs to v0.4.4 (#13535)
* add 4K resolution to Kling nodes (#13536)
  Signed-off-by: bigcat88 <bigcat88@icloud.com>
* Fix LTXV Reference Audio node (#13531)
* comfy-aimdo 0.2.14: Hotfix async allocator estimations (#13534)
  This was over-estimating the VRAM used by the async allocator when lots of small tensors were in play. Also change the versioning scheme to == so we can roll aimdo forward without worrying about stable regressions downstream in ComfyUI core.
* Disable sageattention for SAM3 (#13529)
  Causes NaNs.
* execution: Add anti-cycle validation (#13169)
  Currently, if the graph contains a cycle, validation just recurses infinitely, hits a catch-all, and throws a generic error against the output node that seeded the validation. Instead, fail the offending cycling node chain and handle it as an error in its own right. (See the third sketch after this changelog.)
  Co-authored-by: guill <jacob.e.segal@gmail.com>
* chore: update workflow templates to v0.9.62 (#13539)

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Co-authored-by: AustinMroz <austin@comfy.org>
Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com>
Co-authored-by: Matt Miller <matt@miller-media.com>
Co-authored-by: blepping <157360029+blepping@users.noreply.github.com>
Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>
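First sketch: the Stable_Zero123 fix above replaces an in-place copy_() with direct Parameter assignment. A minimal sketch of that pattern, assuming a lazily-initialized Linear whose weight starts as None; LazyLinear and the shapes here are illustrative stand-ins, not the actual ComfyUI classes:

    import torch

    class LazyLinear(torch.nn.Linear):
        # Stand-in for a lazily-initialized Linear: weight/bias are left as
        # None until real weights arrive, so no memory is allocated up front.
        def reset_parameters(self):
            self.weight = None
            self.bias = None

    cc_projection = LazyLinear(772, 768)
    pretrained = torch.zeros(768, 772)  # placeholder for the checkpoint tensor

    # Crashes on the lazy path: weight is None, so there is nothing to copy into.
    # cc_projection.weight.data.copy_(pretrained)

    # Works on both paths: assign a fresh Parameter instead of copying in place.
    cc_projection.weight = torch.nn.Parameter(pretrained, requires_grad=False)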
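Second sketch: the InterruptProcessingException entry gives no rationale, but the usual reason for deriving an interrupt from BaseException (as KeyboardInterrupt does) is that broad `except Exception` handlers can no longer swallow it. A minimal illustration of that Python behavior, with a toy run_node standing in for real node code:

    # Deriving from BaseException means generic `except Exception` blocks,
    # common in node and cleanup code, cannot accidentally eat the interrupt.
    class InterruptProcessingException(BaseException):
        pass

    def run_node():
        try:
            raise InterruptProcessingException()
        except Exception:
            return "swallowed"  # never reached: not an Exception subclass

    try:
        run_node()
    except InterruptProcessingException:
        print("interrupt propagated")  # this is what gets printed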
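Third sketch: the anti-cycle validation entry argues a cycle should be failed as its own error rather than left to recurse until a catch-all fires. A minimal illustration of that idea on a toy adjacency map (not the actual ComfyUI execution code): an iterative three-color depth-first search that reports the offending node chain directly.

    def find_cycle(graph):
        """Return one list of nodes forming a cycle, or None.

        Assumes every node referenced as a child has an entry in `graph`.
        """
        WHITE, GRAY, BLACK = 0, 1, 2
        color = {node: WHITE for node in graph}
        for start in graph:
            if color[start] != WHITE:
                continue
            stack = [(start, iter(graph[start]))]
            color[start] = GRAY
            path = [start]
            while stack:
                node, children = stack[-1]
                child = next(children, None)
                if child is None:
                    color[node] = BLACK  # fully explored, leave the path
                    stack.pop()
                    path.pop()
                elif color[child] == GRAY:
                    # Back edge: the current path from the first occurrence
                    # of `child` onward is the offending cycle.
                    return path[path.index(child):]
                elif color[child] == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(graph[child])))
                    path.append(child)
        return None

    # Inputs point at upstream nodes: a -> b -> c -> a is a cycle.
    print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"], "d": []}))  # ['a', 'b', 'c']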
226 lines · 7.7 KiB · Python
import folder_paths
import comfy.utils
import comfy.sd  # used for comfy.sd.VAE, CLIPType, and load_clip below
import comfy.model_management
import torch

from comfy_api.latest import ComfyExtension, io
from comfy_extras.nodes_audio import VAEEncodeAudio

class LTXVAudioVAELoader(io.ComfyNode):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="LTXVAudioVAELoader",
            display_name="LTXV Audio VAE Loader",
            category="audio",
            inputs=[
                io.Combo.Input(
                    "ckpt_name",
                    options=folder_paths.get_filename_list("checkpoints"),
                    tooltip="Audio VAE checkpoint to load.",
                )
            ],
            outputs=[io.Vae.Output(display_name="Audio VAE")],
        )

    @classmethod
    def execute(cls, ckpt_name: str) -> io.NodeOutput:
        ckpt_path = folder_paths.get_full_path_or_raise("checkpoints", ckpt_name)
        sd, metadata = comfy.utils.load_torch_file(ckpt_path, return_metadata=True)
        # Keep only the audio VAE and vocoder weights, remapping the
        # "audio_vae." prefix to the "autoencoder." prefix the VAE loader expects.
        sd = comfy.utils.state_dict_prefix_replace(sd, {"audio_vae.": "autoencoder.", "vocoder.": "vocoder."}, filter_keys=True)
        vae = comfy.sd.VAE(sd=sd, metadata=metadata)
        vae.throw_exception_if_invalid()

        return io.NodeOutput(vae)


class LTXVAudioVAEEncode(VAEEncodeAudio):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="LTXVAudioVAEEncode",
            display_name="LTXV Audio VAE Encode",
            category="audio",
            inputs=[
                io.Audio.Input("audio", tooltip="The audio to be encoded."),
                io.Vae.Input(
                    id="audio_vae",
                    display_name="Audio VAE",
                    tooltip="The Audio VAE model to use for encoding.",
                ),
            ],
            outputs=[io.Latent.Output(display_name="Audio Latent")],
        )

    @classmethod
    def execute(cls, audio, audio_vae) -> io.NodeOutput:
        # The parent VAEEncodeAudio.execute takes (vae, audio), hence the swap.
        return super().execute(audio_vae, audio)


class LTXVAudioVAEDecode(io.ComfyNode):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="LTXVAudioVAEDecode",
            display_name="LTXV Audio VAE Decode",
            category="audio",
            inputs=[
                io.Latent.Input("samples", tooltip="The latent to be decoded."),
                io.Vae.Input(
                    id="audio_vae",
                    display_name="Audio VAE",
                    tooltip="The Audio VAE model used for decoding the latent.",
                ),
            ],
            outputs=[io.Audio.Output(display_name="Audio")],
        )

    @classmethod
    def execute(cls, samples, audio_vae) -> io.NodeOutput:
        audio_latent = samples["samples"]
        # A nested tensor packs multiple latent streams together; the audio
        # latent is the last component.
        if audio_latent.is_nested:
            audio_latent = audio_latent.unbind()[-1]
        audio = audio_vae.decode(audio_latent).movedim(-1, 1).to(audio_latent.device)
        output_audio_sample_rate = audio_vae.first_stage_model.output_sample_rate
        return io.NodeOutput(
            {
                "waveform": audio,
                "sample_rate": int(output_audio_sample_rate),
            }
        )


class LTXVEmptyLatentAudio(io.ComfyNode):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="LTXVEmptyLatentAudio",
            display_name="LTXV Empty Latent Audio",
            category="latent/audio",
            inputs=[
                io.Int.Input(
                    "frames_number",
                    default=97,
                    min=1,
                    max=1000,
                    step=1,
                    display_mode=io.NumberDisplay.number,
                    tooltip="Number of frames.",
                ),
                io.Int.Input(
                    "frame_rate",
                    default=25,
                    min=1,
                    max=1000,
                    step=1,
                    display_mode=io.NumberDisplay.number,
                    tooltip="Number of frames per second.",
                ),
                io.Int.Input(
                    "batch_size",
                    default=1,
                    min=1,
                    max=4096,
                    display_mode=io.NumberDisplay.number,
                    tooltip="The number of latent audio samples in the batch.",
                ),
                io.Vae.Input(
                    id="audio_vae",
                    display_name="Audio VAE",
                    tooltip="The Audio VAE model to get configuration from.",
                ),
            ],
            outputs=[io.Latent.Output(display_name="Latent")],
        )

    @classmethod
    def execute(
        cls,
        frames_number: int,
        frame_rate: int,
        batch_size: int,
        audio_vae,
    ) -> io.NodeOutput:
        """Generate empty audio latents matching the reference pipeline structure."""
        assert audio_vae is not None, "Audio VAE model is required"

        z_channels = audio_vae.latent_channels
        audio_freq = audio_vae.first_stage_model.latent_frequency_bins
        sampling_rate = int(audio_vae.first_stage_model.sample_rate)

        # The VAE maps a video frame count and frame rate to the matching
        # number of audio latent frames.
        num_audio_latents = audio_vae.first_stage_model.num_of_latents_from_frames(frames_number, frame_rate)

        audio_latents = torch.zeros(
            (batch_size, z_channels, num_audio_latents, audio_freq),
            device=comfy.model_management.intermediate_device(),
        )

        return io.NodeOutput(
            {
                "samples": audio_latents,
                "sample_rate": sampling_rate,
                "type": "audio",
            }
        )


class LTXAVTextEncoderLoader(io.ComfyNode):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="LTXAVTextEncoderLoader",
            display_name="LTXV Audio Text Encoder Loader",
            category="advanced/loaders",
            description="[Recipes]\n\nltxav: gemma 3 12B",
            inputs=[
                io.Combo.Input(
                    "text_encoder",
                    options=folder_paths.get_filename_list("text_encoders"),
                ),
                io.Combo.Input(
                    "ckpt_name",
                    options=folder_paths.get_filename_list("checkpoints"),
                ),
                io.Combo.Input(
                    "device",
                    options=comfy.model_management.get_gpu_device_options(),
                    advanced=True,
                ),
            ],
            outputs=[io.Clip.Output()],
        )

    @classmethod
    def execute(cls, text_encoder, ckpt_name, device="default"):
        clip_type = comfy.sd.CLIPType.LTXV

        clip_path1 = folder_paths.get_full_path_or_raise("text_encoders", text_encoder)
        clip_path2 = folder_paths.get_full_path_or_raise("checkpoints", ckpt_name)

        model_options = {}
        resolved = comfy.model_management.resolve_gpu_device_option(device)
        if resolved is not None:
            if resolved.type == "cpu":
                # On CPU, the load and offload devices are the same.
                model_options["load_device"] = model_options["offload_device"] = resolved
            else:
                model_options["load_device"] = resolved

        clip = comfy.sd.load_clip(ckpt_paths=[clip_path1, clip_path2], embedding_directory=folder_paths.get_folder_paths("embeddings"), clip_type=clip_type, model_options=model_options)
        return io.NodeOutput(clip)


class LTXVAudioExtension(ComfyExtension):
    async def get_node_list(self) -> list[type[io.ComfyNode]]:
        return [
            LTXVAudioVAELoader,
            LTXVAudioVAEEncode,
            LTXVAudioVAEDecode,
            LTXVEmptyLatentAudio,
            LTXAVTextEncoderLoader,
        ]


async def comfy_entrypoint() -> ComfyExtension:
    return LTXVAudioExtension()