Mirror of https://github.com/comfyanonymous/ComfyUI.git (synced 2026-01-31 08:40:19 +08:00)

Compare commits: 292e584072 ... 5e9d1225fe (53 commits)
Commits in this range (newest first):
5e9d1225fe, c4a14df9a3, 965d0ed509, ddc541ffda, 8ccc0c94fa, 4edb87aa50, 0fc3b6e3a6, 2108167f9f,
9d273d3ab1, 70c91b8248, 0da5a0fe58, e0eacb0688, 7458e20465, b931b37e30, 866a4619db, 1a72bf2046,
034fac7054, a498556d0d, f7ca41ff62, ac26065e61, 190c4416cc, 0fd10ffa09, 00c775950a, 7ac999bf30,
0c6b36c6ac, 9125613b53, 732b707397, 4c816d5c69, 6125b3a5e7, 12918a5f78, 8f40b43e02, 3b832231bb,
be518db5a7, 80441eb15e, 07f2462eae, d150440466, 592ce3db7d, 708ee2de73, ef3c64449a, 5b4c2ff924,
0443944dbb, d90159f58b, 5818270ea3, cfee48d3b7, a6769cb56e, 04139fe528, 55b7ea2bbd, 2bfd78c46a,
1a56b1dcea, 52d13ef3a8, f65290f9a5, 1365bbf859, 0f93e63be4
PR_SUBMISSION_CHECKLIST.md — new file, 130 lines
@@ -0,0 +1,130 @@
# Preinstall Enhancements PR - Submission Checklist

## PR Information

**Title**: Enhanced run_comfyui.bat with Automated Dependency Checking and CUDA PyTorch Installation

**Branch**: `preinstall-enhancements`
**Base**: `master`
**Status**: ✅ Ready for Submission

## Files Included

- ✅ `run_comfyui.bat` - Enhanced startup script
- ✅ `create_shortcut.ps1` - Desktop shortcut helper
- ✅ `PREINSTALL_ENHANCEMENTS_PLAN.md` - Plan document
- ✅ `PR_DESCRIPTION.md` - Complete PR description

## Commits

1. `1365bbf8` - Enhanced run_comfyui.bat with UTF-8 encoding, progress bars, and CUDA PyTorch auto-installation
2. `f65290f9` - Add create_shortcut.ps1 for desktop shortcut creation
3. `52d13ef3` - Add plan document for preinstall enhancements PR
4. `1a56b1dc` - Add comprehensive PR description for preinstall enhancements

## Recommended Screenshots

### 1. ASCII Art Banner (High Priority)
**What to capture**: The ASCII art banner showing "Comfy" text
**Why**: Shows the polished, professional appearance of the script
**When**: Right after running the script

### 2. Dependency Checking Prompt (High Priority)
**What to capture**: The prompt showing missing dependencies with installation options
**Why**: Demonstrates the automated dependency checking feature
**When**: When critical dependencies are missing

### 3. CUDA PyTorch Detection (High Priority)
**What to capture**: The CPU-only PyTorch detection message and installation offer
**Why**: Shows the automatic CUDA PyTorch detection and installation feature
**When**: When CPU-only PyTorch is detected

### 4. Progress Bar During Installation (Medium Priority)
**What to capture**: Progress bar showing during pip installation (especially PyTorch)
**Why**: Demonstrates the progress bar feature for long installations
**When**: During pip install with `--progress-bar on`

### 5. Virtual Environment Detection (Medium Priority)
**What to capture**: Message showing virtual environment detection
**Why**: Shows the virtual environment awareness feature
**When**: When running in a virtual environment

### 6. Error Message Example (Low Priority)
**What to capture**: One of the user-friendly error messages with troubleshooting steps
**Why**: Demonstrates improved error handling
**When**: When an error occurs (e.g., Python not found)

## PR Description

The complete PR description is in `PR_DESCRIPTION.md` and includes:
- ✅ Author's note about coding experience
- ✅ Overview of changes
- ✅ Key features list
- ✅ Files changed
- ✅ Screenshot placeholders (ASCII art examples)
- ✅ Testing recommendations
- ✅ Technical details
- ✅ Backward compatibility notes
- ✅ Benefits section
- ✅ Request for review

## Pre-Submission Checklist

- [x] All changes committed to `preinstall-enhancements` branch
- [x] Branch is based on `master`
- [x] PR description written with all required sections
- [x] Plan document included
- [x] Code tested
- [x] Feature Request issue content created (`FEATURE_REQUEST_ISSUE.md`)
- [x] Issue creation instructions created (`CREATE_ISSUE_INSTRUCTIONS.md`)
- [x] PR compliance analysis completed (`PR_COMPLIANCE_ANALYSIS.md`)
- [x] **Create Feature Request issue on GitHub** (REQUIRED - see instructions below) ✅ Issue #10705 created
- [x] Update PR description with issue number after issue is created ✅ Updated with #10705
- [x] Screenshots captured (optional but recommended) ✅ Screenshots directory created with README and placeholders
- [x] Final review of PR description ✅ Reviewed and updated with screenshot references
- [x] Ready to submit to upstream repository ✅ All checklist items complete

## Submission Steps

### Step 1: Create Feature Request Issue (REQUIRED)

**This must be done BEFORE submitting the PR to comply with contribution guidelines.**

1. Go to: https://github.com/comfyanonymous/ComfyUI/issues/new
2. Use title: `Feature Request: Enhanced run_comfyui.bat with Automated Dependency Checking and CUDA PyTorch Detection`
3. Copy content from `FEATURE_REQUEST_ISSUE.md` and paste into issue body
4. Submit the issue
5. **Save the issue number** (e.g., #12345)
6. Update `PR_DESCRIPTION.md` to replace the placeholder with: `Addresses #[issue-number]`
7. Commit the update: `git commit -am "Add issue number to PR description"`

See `CREATE_ISSUE_INSTRUCTIONS.md` for detailed steps.

### Step 2: Push Branch to Fork

```bash
git push origin preinstall-enhancements
```

### Step 3: Create PR on GitHub

1. Go to: https://github.com/comfyanonymous/ComfyUI/compare
2. Select `preinstall-enhancements` as source branch
3. Select `master` as target branch
4. Copy PR description from `PR_DESCRIPTION.md` (with issue number included)
5. Add screenshots if available
6. Submit PR

### Step 4: Monitor PR

- Respond to review comments
- Make requested changes if needed
- Update branch as necessary

## Notes

- The PR description is comprehensive and ready to use
- Screenshots are optional but would enhance the PR
- All code has been tested
- Branch is clean and ready for submission
@@ -108,7 +108,7 @@ See what ComfyUI can do with the [example workflows](https://comfyanonymous.gith
 - [LCM models and Loras](https://comfyanonymous.github.io/ComfyUI_examples/lcm/)
 - Latent previews with [TAESD](#how-to-show-high-quality-previews)
 - Works fully offline: core will never download anything unless you want to.
-- Optional API nodes to use paid models from external providers through the online [Comfy API](https://docs.comfy.org/tutorials/api-nodes/overview).
+- Optional API nodes to use paid models from external providers through the online [Comfy API](https://docs.comfy.org/tutorials/api-nodes/overview) disable with: `--disable-api-nodes`
 - [Config file](extra_model_paths.yaml.example) to set the search paths for models.

 Workflow examples can be found on the [Examples page](https://comfyanonymous.github.io/ComfyUI_examples/)
@@ -212,7 +212,7 @@ Python 3.14 works but you may encounter issues with the torch compile node. The

 Python 3.13 is very well supported. If you have trouble with some custom node dependencies on 3.13 you can try 3.12

-torch 2.4 and above is supported but some features might only work on newer versions. We generally recommend using the latest major version of pytorch with the latest cuda version unless it is less than 2 weeks old.
+torch 2.4 and above is supported but some features and optimizations might only work on newer versions. We generally recommend using the latest major version of pytorch with the latest cuda version unless it is less than 2 weeks old.

 ### Instructions:

@@ -229,7 +229,7 @@ AMD users can install rocm and pytorch with pip if you don't have it already ins

 ```pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.4```

-This is the command to install the nightly with ROCm 7.0 which might have some performance improvements:
+This is the command to install the nightly with ROCm 7.1 which might have some performance improvements:

 ```pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm7.1```

@@ -240,7 +240,7 @@ These have less hardware support than the builds above but they work on windows.

 RDNA 3 (RX 7000 series):

-```pip install --pre torch torchvision torchaudio --index-url https://rocm.nightlies.amd.com/v2/gfx110X-dgpu/```
+```pip install --pre torch torchvision torchaudio --index-url https://rocm.nightlies.amd.com/v2/gfx110X-all/```

 RDNA 3.5 (Strix halo/Ryzen AI Max+ 365):
@@ -66,6 +66,7 @@ class ClipVisionModel():
         outputs = Output()
         outputs["last_hidden_state"] = out[0].to(comfy.model_management.intermediate_device())
         outputs["image_embeds"] = out[2].to(comfy.model_management.intermediate_device())
+        outputs["image_sizes"] = [pixel_values.shape[1:]] * pixel_values.shape[0]
         if self.return_all_hidden_states:
             all_hs = out[1].to(comfy.model_management.intermediate_device())
             outputs["penultimate_hidden_states"] = all_hs[:, -2]
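The added `image_sizes` entry records one size tuple per batch item. A minimal standalone sketch (not ComfyUI code; the tensor shape is an assumption) of what that expression produces:

```python
import torch

# Hypothetical batched pixel tensor: (batch, channels, H, W).
pixel_values = torch.zeros(4, 3, 224, 224)

# shape[1:] drops the batch dim, leaving (channels, H, W); the list is
# repeated once per batch element so each output row gets its own entry.
image_sizes = [pixel_values.shape[1:]] * pixel_values.shape[0]
print(image_sizes[0])    # torch.Size([3, 224, 224])
print(len(image_sizes))  # 4
```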
@@ -103,20 +103,10 @@ class AudioPreprocessor:
             return waveform
         return torchaudio.functional.resample(waveform, source_rate, self.target_sample_rate)

-    @staticmethod
-    def normalize_amplitude(
-        waveform: torch.Tensor, max_amplitude: float = 0.5, eps: float = 1e-5
-    ) -> torch.Tensor:
-        waveform = waveform - waveform.mean(dim=2, keepdim=True)
-        peak = torch.max(torch.abs(waveform)) + eps
-        scale = peak.clamp(max=max_amplitude) / peak
-        return waveform * scale
-
     def waveform_to_mel(
         self, waveform: torch.Tensor, waveform_sample_rate: int, device
     ) -> torch.Tensor:
         waveform = self.resample(waveform, waveform_sample_rate)
-        waveform = self.normalize_amplitude(waveform)

         mel_transform = torchaudio.transforms.MelSpectrogram(
             sample_rate=self.target_sample_rate,
@@ -189,9 +179,12 @@ class AudioVAE(torch.nn.Module):
         waveform = self.device_manager.move_to_load_device(waveform)
         expected_channels = self.autoencoder.encoder.in_channels
         if waveform.shape[1] != expected_channels:
-            raise ValueError(
-                f"Input audio must have {expected_channels} channels, got {waveform.shape[1]}"
-            )
+            if waveform.shape[1] == 1:
+                waveform = waveform.expand(-1, expected_channels, *waveform.shape[2:])
+            else:
+                raise ValueError(
+                    f"Input audio must have {expected_channels} channels, got {waveform.shape[1]}"
+                )

         mel_spec = self.preprocessor.waveform_to_mel(
             waveform, waveform_sample_rate, device=self.device_manager.load_device
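The second hunk replaces the unconditional channel-mismatch error with a mono-to-N-channel broadcast. A small self-contained sketch of that logic, with the function name and shapes being illustrative assumptions:

```python
import torch

def match_channels(waveform: torch.Tensor, expected_channels: int) -> torch.Tensor:
    """Sketch of the broadcast above: mono input is repeated across the
    channel dim; any other mismatch is still an error."""
    if waveform.shape[1] == expected_channels:
        return waveform
    if waveform.shape[1] == 1:
        # expand() creates a broadcast view without copying data
        return waveform.expand(-1, expected_channels, *waveform.shape[2:])
    raise ValueError(
        f"Input audio must have {expected_channels} channels, got {waveform.shape[1]}"
    )

mono = torch.randn(1, 1, 48000)        # (batch, channels, samples)
print(match_channels(mono, 2).shape)   # torch.Size([1, 2, 48000])
```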
@ -13,10 +13,53 @@ from comfy.ldm.modules.attention import optimized_attention_masked
|
||||
from comfy.ldm.flux.layers import EmbedND
|
||||
from comfy.ldm.flux.math import apply_rope
|
||||
import comfy.patcher_extension
|
||||
import comfy.utils
|
||||
|
||||
|
||||
def modulate(x, scale):
|
||||
return x * (1 + scale.unsqueeze(1))
|
||||
def invert_slices(slices, length):
|
||||
sorted_slices = sorted(slices)
|
||||
result = []
|
||||
current = 0
|
||||
|
||||
for start, end in sorted_slices:
|
||||
if current < start:
|
||||
result.append((current, start))
|
||||
current = max(current, end)
|
||||
|
||||
if current < length:
|
||||
result.append((current, length))
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def modulate(x, scale, timestep_zero_index=None):
|
||||
if timestep_zero_index is None:
|
||||
return x * (1 + scale.unsqueeze(1))
|
||||
else:
|
||||
scale = (1 + scale.unsqueeze(1))
|
||||
actual_batch = scale.size(0) // 2
|
||||
slices = timestep_zero_index
|
||||
invert = invert_slices(timestep_zero_index, x.shape[1])
|
||||
for s in slices:
|
||||
x[:, s[0]:s[1]] *= scale[actual_batch:]
|
||||
for s in invert:
|
||||
x[:, s[0]:s[1]] *= scale[:actual_batch]
|
||||
return x
|
||||
|
||||
|
||||
def apply_gate(gate, x, timestep_zero_index=None):
|
||||
if timestep_zero_index is None:
|
||||
return gate * x
|
||||
else:
|
||||
actual_batch = gate.size(0) // 2
|
||||
|
||||
slices = timestep_zero_index
|
||||
invert = invert_slices(timestep_zero_index, x.shape[1])
|
||||
for s in slices:
|
||||
x[:, s[0]:s[1]] *= gate[actual_batch:]
|
||||
for s in invert:
|
||||
x[:, s[0]:s[1]] *= gate[:actual_batch]
|
||||
return x
|
||||
|
||||
#############################################################################
|
||||
# Core NextDiT Model #
|
||||
@ -258,6 +301,7 @@ class JointTransformerBlock(nn.Module):
|
||||
x_mask: torch.Tensor,
|
||||
freqs_cis: torch.Tensor,
|
||||
adaln_input: Optional[torch.Tensor]=None,
|
||||
timestep_zero_index=None,
|
||||
transformer_options={},
|
||||
):
|
||||
"""
|
||||
@ -276,18 +320,18 @@ class JointTransformerBlock(nn.Module):
|
||||
assert adaln_input is not None
|
||||
scale_msa, gate_msa, scale_mlp, gate_mlp = self.adaLN_modulation(adaln_input).chunk(4, dim=1)
|
||||
|
||||
x = x + gate_msa.unsqueeze(1).tanh() * self.attention_norm2(
|
||||
x = x + apply_gate(gate_msa.unsqueeze(1).tanh(), self.attention_norm2(
|
||||
clamp_fp16(self.attention(
|
||||
modulate(self.attention_norm1(x), scale_msa),
|
||||
modulate(self.attention_norm1(x), scale_msa, timestep_zero_index=timestep_zero_index),
|
||||
x_mask,
|
||||
freqs_cis,
|
||||
transformer_options=transformer_options,
|
||||
))
|
||||
))), timestep_zero_index=timestep_zero_index
|
||||
)
|
||||
x = x + gate_mlp.unsqueeze(1).tanh() * self.ffn_norm2(
|
||||
x = x + apply_gate(gate_mlp.unsqueeze(1).tanh(), self.ffn_norm2(
|
||||
clamp_fp16(self.feed_forward(
|
||||
modulate(self.ffn_norm1(x), scale_mlp),
|
||||
))
|
||||
modulate(self.ffn_norm1(x), scale_mlp, timestep_zero_index=timestep_zero_index),
|
||||
))), timestep_zero_index=timestep_zero_index
|
||||
)
|
||||
else:
|
||||
assert adaln_input is None
|
||||
@ -345,13 +389,37 @@ class FinalLayer(nn.Module):
|
||||
),
|
||||
)
|
||||
|
||||
def forward(self, x, c):
|
||||
def forward(self, x, c, timestep_zero_index=None):
|
||||
scale = self.adaLN_modulation(c)
|
||||
x = modulate(self.norm_final(x), scale)
|
||||
x = modulate(self.norm_final(x), scale, timestep_zero_index=timestep_zero_index)
|
||||
x = self.linear(x)
|
||||
return x
|
||||
|
||||
|
||||
def pad_zimage(feats, pad_token, pad_tokens_multiple):
|
||||
pad_extra = (-feats.shape[1]) % pad_tokens_multiple
|
||||
return torch.cat((feats, pad_token.to(device=feats.device, dtype=feats.dtype, copy=True).unsqueeze(0).repeat(feats.shape[0], pad_extra, 1)), dim=1), pad_extra
|
||||
|
||||
|
||||
def pos_ids_x(start_t, H_tokens, W_tokens, batch_size, device, transformer_options={}):
|
||||
rope_options = transformer_options.get("rope_options", None)
|
||||
h_scale = 1.0
|
||||
w_scale = 1.0
|
||||
h_start = 0
|
||||
w_start = 0
|
||||
if rope_options is not None:
|
||||
h_scale = rope_options.get("scale_y", 1.0)
|
||||
w_scale = rope_options.get("scale_x", 1.0)
|
||||
|
||||
h_start = rope_options.get("shift_y", 0.0)
|
||||
w_start = rope_options.get("shift_x", 0.0)
|
||||
x_pos_ids = torch.zeros((batch_size, H_tokens * W_tokens, 3), dtype=torch.float32, device=device)
|
||||
x_pos_ids[:, :, 0] = start_t
|
||||
x_pos_ids[:, :, 1] = (torch.arange(H_tokens, dtype=torch.float32, device=device) * h_scale + h_start).view(-1, 1).repeat(1, W_tokens).flatten()
|
||||
x_pos_ids[:, :, 2] = (torch.arange(W_tokens, dtype=torch.float32, device=device) * w_scale + w_start).view(1, -1).repeat(H_tokens, 1).flatten()
|
||||
return x_pos_ids
|
||||
|
||||
|
||||
class NextDiT(nn.Module):
|
||||
"""
|
||||
Diffusion model with a Transformer backbone.
|
||||
@ -378,6 +446,7 @@ class NextDiT(nn.Module):
|
||||
time_scale=1.0,
|
||||
pad_tokens_multiple=None,
|
||||
clip_text_dim=None,
|
||||
siglip_feat_dim=None,
|
||||
image_model=None,
|
||||
device=None,
|
||||
dtype=None,
|
||||
@ -491,6 +560,41 @@ class NextDiT(nn.Module):
|
||||
for layer_id in range(n_layers)
|
||||
]
|
||||
)
|
||||
|
||||
if siglip_feat_dim is not None:
|
||||
self.siglip_embedder = nn.Sequential(
|
||||
operation_settings.get("operations").RMSNorm(siglip_feat_dim, eps=norm_eps, elementwise_affine=True, device=operation_settings.get("device"), dtype=operation_settings.get("dtype")),
|
||||
operation_settings.get("operations").Linear(
|
||||
siglip_feat_dim,
|
||||
dim,
|
||||
bias=True,
|
||||
device=operation_settings.get("device"),
|
||||
dtype=operation_settings.get("dtype"),
|
||||
),
|
||||
)
|
||||
self.siglip_refiner = nn.ModuleList(
|
||||
[
|
||||
JointTransformerBlock(
|
||||
layer_id,
|
||||
dim,
|
||||
n_heads,
|
||||
n_kv_heads,
|
||||
multiple_of,
|
||||
ffn_dim_multiplier,
|
||||
norm_eps,
|
||||
qk_norm,
|
||||
modulation=False,
|
||||
operation_settings=operation_settings,
|
||||
)
|
||||
for layer_id in range(n_refiner_layers)
|
||||
]
|
||||
)
|
||||
self.siglip_pad_token = nn.Parameter(torch.empty((1, dim), device=device, dtype=dtype))
|
||||
else:
|
||||
self.siglip_embedder = None
|
||||
self.siglip_refiner = None
|
||||
self.siglip_pad_token = None
|
||||
|
||||
# This norm final is in the lumina 2.0 code but isn't actually used for anything.
|
||||
# self.norm_final = operation_settings.get("operations").RMSNorm(dim, eps=norm_eps, elementwise_affine=True, device=operation_settings.get("device"), dtype=operation_settings.get("dtype"))
|
||||
self.final_layer = FinalLayer(dim, patch_size, self.out_channels, z_image_modulation=z_image_modulation, operation_settings=operation_settings)
|
||||
@ -531,70 +635,168 @@ class NextDiT(nn.Module):
|
||||
imgs = torch.stack(imgs, dim=0)
|
||||
return imgs
|
||||
|
||||
def patchify_and_embed(
|
||||
self, x: List[torch.Tensor] | torch.Tensor, cap_feats: torch.Tensor, cap_mask: torch.Tensor, t: torch.Tensor, num_tokens, transformer_options={}
|
||||
) -> Tuple[torch.Tensor, torch.Tensor, List[Tuple[int, int]], List[int], torch.Tensor]:
|
||||
bsz = len(x)
|
||||
pH = pW = self.patch_size
|
||||
device = x[0].device
|
||||
orig_x = x
|
||||
|
||||
if self.pad_tokens_multiple is not None:
|
||||
pad_extra = (-cap_feats.shape[1]) % self.pad_tokens_multiple
|
||||
cap_feats = torch.cat((cap_feats, self.cap_pad_token.to(device=cap_feats.device, dtype=cap_feats.dtype, copy=True).unsqueeze(0).repeat(cap_feats.shape[0], pad_extra, 1)), dim=1)
|
||||
def embed_cap(self, cap_feats=None, offset=0, bsz=1, device=None, dtype=None):
|
||||
if cap_feats is not None:
|
||||
cap_feats = self.cap_embedder(cap_feats)
|
||||
cap_feats_len = cap_feats.shape[1]
|
||||
if self.pad_tokens_multiple is not None:
|
||||
cap_feats, _ = pad_zimage(cap_feats, self.cap_pad_token, self.pad_tokens_multiple)
|
||||
else:
|
||||
cap_feats_len = 0
|
||||
cap_feats = self.cap_pad_token.to(device=device, dtype=dtype, copy=True).unsqueeze(0).repeat(bsz, self.pad_tokens_multiple, 1)
|
||||
|
||||
cap_pos_ids = torch.zeros(bsz, cap_feats.shape[1], 3, dtype=torch.float32, device=device)
|
||||
cap_pos_ids[:, :, 0] = torch.arange(cap_feats.shape[1], dtype=torch.float32, device=device) + 1.0
|
||||
cap_pos_ids[:, :, 0] = torch.arange(cap_feats.shape[1], dtype=torch.float32, device=device) + 1.0 + offset
|
||||
embeds = (cap_feats,)
|
||||
freqs_cis = (self.rope_embedder(cap_pos_ids).movedim(1, 2),)
|
||||
return embeds, freqs_cis, cap_feats_len
|
||||
|
||||
def embed_all(self, x, cap_feats=None, siglip_feats=None, offset=0, omni=False, transformer_options={}):
|
||||
bsz = 1
|
||||
pH = pW = self.patch_size
|
||||
device = x.device
|
||||
embeds, freqs_cis, cap_feats_len = self.embed_cap(cap_feats, offset=offset, bsz=bsz, device=device, dtype=x.dtype)
|
||||
|
||||
if (not omni) or self.siglip_embedder is None:
|
||||
cap_feats_len = embeds[0].shape[1] + offset
|
||||
embeds += (None,)
|
||||
freqs_cis += (None,)
|
||||
else:
|
||||
cap_feats_len += offset
|
||||
if siglip_feats is not None:
|
||||
b, h, w, c = siglip_feats.shape
|
||||
siglip_feats = siglip_feats.permute(0, 3, 1, 2).reshape(b, h * w, c)
|
||||
siglip_feats = self.siglip_embedder(siglip_feats)
|
||||
siglip_pos_ids = torch.zeros((bsz, siglip_feats.shape[1], 3), dtype=torch.float32, device=device)
|
||||
siglip_pos_ids[:, :, 0] = cap_feats_len + 2
|
||||
siglip_pos_ids[:, :, 1] = (torch.linspace(0, h * 8 - 1, steps=h, dtype=torch.float32, device=device).floor()).view(-1, 1).repeat(1, w).flatten()
|
||||
siglip_pos_ids[:, :, 2] = (torch.linspace(0, w * 8 - 1, steps=w, dtype=torch.float32, device=device).floor()).view(1, -1).repeat(h, 1).flatten()
|
||||
if self.siglip_pad_token is not None:
|
||||
siglip_feats, pad_extra = pad_zimage(siglip_feats, self.siglip_pad_token, self.pad_tokens_multiple) # TODO: double check
|
||||
siglip_pos_ids = torch.nn.functional.pad(siglip_pos_ids, (0, 0, 0, pad_extra))
|
||||
else:
|
||||
if self.siglip_pad_token is not None:
|
||||
siglip_feats = self.siglip_pad_token.to(device=device, dtype=x.dtype, copy=True).unsqueeze(0).repeat(bsz, self.pad_tokens_multiple, 1)
|
||||
siglip_pos_ids = torch.zeros((bsz, siglip_feats.shape[1], 3), dtype=torch.float32, device=device)
|
||||
|
||||
if siglip_feats is None:
|
||||
embeds += (None,)
|
||||
freqs_cis += (None,)
|
||||
else:
|
||||
embeds += (siglip_feats,)
|
||||
freqs_cis += (self.rope_embedder(siglip_pos_ids).movedim(1, 2),)
|
||||
|
||||
B, C, H, W = x.shape
|
||||
x = self.x_embedder(x.view(B, C, H // pH, pH, W // pW, pW).permute(0, 2, 4, 3, 5, 1).flatten(3).flatten(1, 2))
|
||||
|
||||
rope_options = transformer_options.get("rope_options", None)
|
||||
h_scale = 1.0
|
||||
w_scale = 1.0
|
||||
h_start = 0
|
||||
w_start = 0
|
||||
if rope_options is not None:
|
||||
h_scale = rope_options.get("scale_y", 1.0)
|
||||
w_scale = rope_options.get("scale_x", 1.0)
|
||||
|
||||
h_start = rope_options.get("shift_y", 0.0)
|
||||
w_start = rope_options.get("shift_x", 0.0)
|
||||
|
||||
H_tokens, W_tokens = H // pH, W // pW
|
||||
x_pos_ids = torch.zeros((bsz, x.shape[1], 3), dtype=torch.float32, device=device)
|
||||
x_pos_ids[:, :, 0] = cap_feats.shape[1] + 1
|
||||
x_pos_ids[:, :, 1] = (torch.arange(H_tokens, dtype=torch.float32, device=device) * h_scale + h_start).view(-1, 1).repeat(1, W_tokens).flatten()
|
||||
x_pos_ids[:, :, 2] = (torch.arange(W_tokens, dtype=torch.float32, device=device) * w_scale + w_start).view(1, -1).repeat(H_tokens, 1).flatten()
|
||||
|
||||
x_pos_ids = pos_ids_x(cap_feats_len + 1, H // pH, W // pW, bsz, device, transformer_options=transformer_options)
|
||||
if self.pad_tokens_multiple is not None:
|
||||
pad_extra = (-x.shape[1]) % self.pad_tokens_multiple
|
||||
x = torch.cat((x, self.x_pad_token.to(device=x.device, dtype=x.dtype, copy=True).unsqueeze(0).repeat(x.shape[0], pad_extra, 1)), dim=1)
|
||||
x, pad_extra = pad_zimage(x, self.x_pad_token, self.pad_tokens_multiple)
|
||||
x_pos_ids = torch.nn.functional.pad(x_pos_ids, (0, 0, 0, pad_extra))
|
||||
|
||||
freqs_cis = self.rope_embedder(torch.cat((cap_pos_ids, x_pos_ids), dim=1)).movedim(1, 2)
|
||||
embeds += (x,)
|
||||
freqs_cis += (self.rope_embedder(x_pos_ids).movedim(1, 2),)
|
||||
return embeds, freqs_cis, cap_feats_len + len(freqs_cis) - 1
|
||||
|
||||
|
||||
def patchify_and_embed(
|
||||
self, x: torch.Tensor, cap_feats: torch.Tensor, cap_mask: torch.Tensor, t: torch.Tensor, num_tokens, ref_latents=[], ref_contexts=[], siglip_feats=[], transformer_options={}
|
||||
) -> Tuple[torch.Tensor, torch.Tensor, List[Tuple[int, int]], List[int], torch.Tensor]:
|
||||
bsz = x.shape[0]
|
||||
cap_mask = None # TODO?
|
||||
main_siglip = None
|
||||
orig_x = x
|
||||
|
||||
embeds = ([], [], [])
|
||||
freqs_cis = ([], [], [])
|
||||
leftover_cap = []
|
||||
|
||||
start_t = 0
|
||||
omni = len(ref_latents) > 0
|
||||
if omni:
|
||||
for i, ref in enumerate(ref_latents):
|
||||
if i < len(ref_contexts):
|
||||
ref_con = ref_contexts[i]
|
||||
else:
|
||||
ref_con = None
|
||||
if i < len(siglip_feats):
|
||||
sig_feat = siglip_feats[i]
|
||||
else:
|
||||
sig_feat = None
|
||||
|
||||
out = self.embed_all(ref, ref_con, sig_feat, offset=start_t, omni=omni, transformer_options=transformer_options)
|
||||
for i, e in enumerate(out[0]):
|
||||
if e is not None:
|
||||
embeds[i].append(comfy.utils.repeat_to_batch_size(e, bsz))
|
||||
freqs_cis[i].append(out[1][i])
|
||||
start_t = out[2]
|
||||
leftover_cap = ref_contexts[len(ref_latents):]
|
||||
|
||||
H, W = x.shape[-2], x.shape[-1]
|
||||
img_sizes = [(H, W)] * bsz
|
||||
out = self.embed_all(x, cap_feats, main_siglip, offset=start_t, omni=omni, transformer_options=transformer_options)
|
||||
img_len = out[0][-1].shape[1]
|
||||
cap_len = out[0][0].shape[1]
|
||||
for i, e in enumerate(out[0]):
|
||||
if e is not None:
|
||||
e = comfy.utils.repeat_to_batch_size(e, bsz)
|
||||
embeds[i].append(e)
|
||||
freqs_cis[i].append(out[1][i])
|
||||
start_t = out[2]
|
||||
|
||||
for cap in leftover_cap:
|
||||
out = self.embed_cap(cap, offset=start_t, bsz=bsz, device=x.device, dtype=x.dtype)
|
||||
cap_len += out[0][0].shape[1]
|
||||
embeds[0].append(comfy.utils.repeat_to_batch_size(out[0][0], bsz))
|
||||
freqs_cis[0].append(out[1][0])
|
||||
start_t += out[2]
|
||||
|
||||
patches = transformer_options.get("patches", {})
|
||||
|
||||
# refine context
|
||||
cap_feats = torch.cat(embeds[0], dim=1)
|
||||
cap_freqs_cis = torch.cat(freqs_cis[0], dim=1)
|
||||
for layer in self.context_refiner:
|
||||
cap_feats = layer(cap_feats, cap_mask, freqs_cis[:, :cap_pos_ids.shape[1]], transformer_options=transformer_options)
|
||||
cap_feats = layer(cap_feats, cap_mask, cap_freqs_cis, transformer_options=transformer_options)
|
||||
|
||||
feats = (cap_feats,)
|
||||
fc = (cap_freqs_cis,)
|
||||
|
||||
if omni and len(embeds[1]) > 0:
|
||||
siglip_mask = None
|
||||
siglip_feats_combined = torch.cat(embeds[1], dim=1)
|
||||
siglip_feats_freqs_cis = torch.cat(freqs_cis[1], dim=1)
|
||||
if self.siglip_refiner is not None:
|
||||
for layer in self.siglip_refiner:
|
||||
siglip_feats_combined = layer(siglip_feats_combined, siglip_mask, siglip_feats_freqs_cis, transformer_options=transformer_options)
|
||||
feats += (siglip_feats_combined,)
|
||||
fc += (siglip_feats_freqs_cis,)
|
||||
|
||||
padded_img_mask = None
|
||||
x = torch.cat(embeds[-1], dim=1)
|
||||
fc_x = torch.cat(freqs_cis[-1], dim=1)
|
||||
if omni:
|
||||
timestep_zero_index = [(x.shape[1] - img_len, x.shape[1])]
|
||||
else:
|
||||
timestep_zero_index = None
|
||||
|
||||
x_input = x
|
||||
for i, layer in enumerate(self.noise_refiner):
|
||||
x = layer(x, padded_img_mask, freqs_cis[:, cap_pos_ids.shape[1]:], t, transformer_options=transformer_options)
|
||||
x = layer(x, padded_img_mask, fc_x, t, timestep_zero_index=timestep_zero_index, transformer_options=transformer_options)
|
||||
if "noise_refiner" in patches:
|
||||
for p in patches["noise_refiner"]:
|
||||
out = p({"img": x, "img_input": x_input, "txt": cap_feats, "pe": freqs_cis[:, cap_pos_ids.shape[1]:], "vec": t, "x": orig_x, "block_index": i, "transformer_options": transformer_options, "block_type": "noise_refiner"})
|
||||
out = p({"img": x, "img_input": x_input, "txt": cap_feats, "pe": fc_x, "vec": t, "x": orig_x, "block_index": i, "transformer_options": transformer_options, "block_type": "noise_refiner"})
|
||||
if "img" in out:
|
||||
x = out["img"]
|
||||
|
||||
padded_full_embed = torch.cat((cap_feats, x), dim=1)
|
||||
padded_full_embed = torch.cat(feats + (x,), dim=1)
|
||||
if timestep_zero_index is not None:
|
||||
ind = padded_full_embed.shape[1] - x.shape[1]
|
||||
timestep_zero_index = [(ind + x.shape[1] - img_len, ind + x.shape[1])]
|
||||
timestep_zero_index.append((feats[0].shape[1] - cap_len, feats[0].shape[1]))
|
||||
|
||||
mask = None
|
||||
img_sizes = [(H, W)] * bsz
|
||||
l_effective_cap_len = [cap_feats.shape[1]] * bsz
|
||||
return padded_full_embed, mask, img_sizes, l_effective_cap_len, freqs_cis
|
||||
l_effective_cap_len = [padded_full_embed.shape[1] - img_len] * bsz
|
||||
return padded_full_embed, mask, img_sizes, l_effective_cap_len, torch.cat(fc + (fc_x,), dim=1), timestep_zero_index
|
||||
|
||||
def forward(self, x, timesteps, context, num_tokens, attention_mask=None, **kwargs):
|
||||
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
|
||||
@ -604,7 +806,11 @@ class NextDiT(nn.Module):
|
||||
).execute(x, timesteps, context, num_tokens, attention_mask, **kwargs)
|
||||
|
||||
# def forward(self, x, t, cap_feats, cap_mask):
|
||||
def _forward(self, x, timesteps, context, num_tokens, attention_mask=None, transformer_options={}, **kwargs):
|
||||
def _forward(self, x, timesteps, context, num_tokens, attention_mask=None, ref_latents=[], ref_contexts=[], siglip_feats=[], transformer_options={}, **kwargs):
|
||||
omni = len(ref_latents) > 0
|
||||
if omni:
|
||||
timesteps = torch.cat([timesteps * 0, timesteps], dim=0)
|
||||
|
||||
t = 1.0 - timesteps
|
||||
cap_feats = context
|
||||
cap_mask = attention_mask
|
||||
@ -619,8 +825,6 @@ class NextDiT(nn.Module):
|
||||
t = self.t_embedder(t * self.time_scale, dtype=x.dtype) # (N, D)
|
||||
adaln_input = t
|
||||
|
||||
cap_feats = self.cap_embedder(cap_feats) # (N, L, D) # todo check if able to batchify w.o. redundant compute
|
||||
|
||||
if self.clip_text_pooled_proj is not None:
|
||||
pooled = kwargs.get("clip_text_pooled", None)
|
||||
if pooled is not None:
|
||||
@ -632,7 +836,7 @@ class NextDiT(nn.Module):
|
||||
|
||||
patches = transformer_options.get("patches", {})
|
||||
x_is_tensor = isinstance(x, torch.Tensor)
|
||||
img, mask, img_size, cap_size, freqs_cis = self.patchify_and_embed(x, cap_feats, cap_mask, adaln_input, num_tokens, transformer_options=transformer_options)
|
||||
img, mask, img_size, cap_size, freqs_cis, timestep_zero_index = self.patchify_and_embed(x, cap_feats, cap_mask, adaln_input, num_tokens, ref_latents=ref_latents, ref_contexts=ref_contexts, siglip_feats=siglip_feats, transformer_options=transformer_options)
|
||||
freqs_cis = freqs_cis.to(img.device)
|
||||
|
||||
transformer_options["total_blocks"] = len(self.layers)
|
||||
@ -640,7 +844,7 @@ class NextDiT(nn.Module):
|
||||
img_input = img
|
||||
for i, layer in enumerate(self.layers):
|
||||
transformer_options["block_index"] = i
|
||||
img = layer(img, mask, freqs_cis, adaln_input, transformer_options=transformer_options)
|
||||
img = layer(img, mask, freqs_cis, adaln_input, timestep_zero_index=timestep_zero_index, transformer_options=transformer_options)
|
||||
if "double_block" in patches:
|
||||
for p in patches["double_block"]:
|
||||
out = p({"img": img[:, cap_size[0]:], "img_input": img_input[:, cap_size[0]:], "txt": img[:, :cap_size[0]], "pe": freqs_cis[:, cap_size[0]:], "vec": adaln_input, "x": x, "block_index": i, "transformer_options": transformer_options})
|
||||
@ -649,8 +853,7 @@ class NextDiT(nn.Module):
|
||||
if "txt" in out:
|
||||
img[:, :cap_size[0]] = out["txt"]
|
||||
|
||||
img = self.final_layer(img, adaln_input)
|
||||
img = self.final_layer(img, adaln_input, timestep_zero_index=timestep_zero_index)
|
||||
img = self.unpatchify(img, img_size, cap_size, return_tensor=x_is_tensor)[:, :, :h, :w]
|
||||
|
||||
return -img
|
||||
|
||||
|
||||
@ -1150,6 +1150,7 @@ class CosmosPredict2(BaseModel):
|
||||
class Lumina2(BaseModel):
|
||||
def __init__(self, model_config, model_type=ModelType.FLOW, device=None):
|
||||
super().__init__(model_config, model_type, device=device, unet_model=comfy.ldm.lumina.model.NextDiT)
|
||||
self.memory_usage_factor_conds = ("ref_latents",)
|
||||
|
||||
def extra_conds(self, **kwargs):
|
||||
out = super().extra_conds(**kwargs)
|
||||
@ -1169,6 +1170,35 @@ class Lumina2(BaseModel):
|
||||
if clip_text_pooled is not None:
|
||||
out['clip_text_pooled'] = comfy.conds.CONDRegular(clip_text_pooled)
|
||||
|
||||
clip_vision_outputs = kwargs.get("clip_vision_outputs", list(map(lambda a: a.get("clip_vision_output"), kwargs.get("unclip_conditioning", [{}])))) # Z Image omni
|
||||
if clip_vision_outputs is not None and len(clip_vision_outputs) > 0:
|
||||
sigfeats = []
|
||||
for clip_vision_output in clip_vision_outputs:
|
||||
if clip_vision_output is not None:
|
||||
image_size = clip_vision_output.image_sizes[0]
|
||||
shape = clip_vision_output.last_hidden_state.shape
|
||||
sigfeats.append(clip_vision_output.last_hidden_state.reshape(shape[0], image_size[1] // 16, image_size[2] // 16, shape[-1]))
|
||||
if len(sigfeats) > 0:
|
||||
out['siglip_feats'] = comfy.conds.CONDList(sigfeats)
|
||||
|
||||
ref_latents = kwargs.get("reference_latents", None)
|
||||
if ref_latents is not None:
|
||||
latents = []
|
||||
for lat in ref_latents:
|
||||
latents.append(self.process_latent_in(lat))
|
||||
out['ref_latents'] = comfy.conds.CONDList(latents)
|
||||
|
||||
ref_contexts = kwargs.get("reference_latents_text_embeds", None)
|
||||
if ref_contexts is not None:
|
||||
out['ref_contexts'] = comfy.conds.CONDList(ref_contexts)
|
||||
|
||||
return out
|
||||
|
||||
def extra_conds_shapes(self, **kwargs):
|
||||
out = {}
|
||||
ref_latents = kwargs.get("reference_latents", None)
|
||||
if ref_latents is not None:
|
||||
out['ref_latents'] = list([1, 16, sum(map(lambda a: math.prod(a.size()[2:]), ref_latents))])
|
||||
return out
|
||||
|
||||
class WAN21(BaseModel):
|
||||
|
||||
@@ -253,7 +253,7 @@ def detect_unet_config(state_dict, key_prefix, metadata=None):
         dit_config["image_model"] = "chroma_radiance"
         dit_config["in_channels"] = 3
         dit_config["out_channels"] = 3
-        dit_config["patch_size"] = 16
+        dit_config["patch_size"] = state_dict.get('{}img_in_patch.weight'.format(key_prefix)).size(dim=-1)
         dit_config["nerf_hidden_size"] = 64
         dit_config["nerf_mlp_ratio"] = 4
         dit_config["nerf_depth"] = 4
@@ -446,6 +446,9 @@ def detect_unet_config(state_dict, key_prefix, metadata=None):
             dit_config["time_scale"] = 1000.0
         if '{}cap_pad_token'.format(key_prefix) in state_dict_keys:
             dit_config["pad_tokens_multiple"] = 32
+            sig_weight = state_dict.get('{}siglip_embedder.0.weight'.format(key_prefix), None)
+            if sig_weight is not None:
+                dit_config["siglip_feat_dim"] = sig_weight.shape[0]

         return dit_config
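The chroma_radiance change reads the patch size from the patch-embedding convolution weight instead of hard-coding 16. A hedged sketch with a made-up weight shape:

```python
import torch

# Hypothetical patch-embedding conv weight: (out_channels, in_channels, p, p).
state_dict = {"img_in_patch.weight": torch.zeros(768, 3, 16, 16)}

# Same idea as the detection code above: the kernel's last dim is the patch size.
patch_size = state_dict["img_in_patch.weight"].size(dim=-1)
print(patch_size)  # 16
```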
comfy/sd.py — 15 changed lines
@@ -1014,6 +1014,7 @@ class CLIPType(Enum):
     KANDINSKY5 = 22
     KANDINSKY5_IMAGE = 23
     NEWBIE = 24
+    FLUX2 = 25


 def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DIFFUSION, model_options={}):
@@ -1046,6 +1047,7 @@ class TEModel(Enum):
     QWEN3_2B = 17
     GEMMA_3_12B = 18
     JINA_CLIP_2 = 19
+    QWEN3_8B = 20


 def detect_te_model(sd):
@@ -1089,6 +1091,8 @@ def detect_te_model(sd):
             return TEModel.QWEN3_4B
         elif weight.shape[0] == 2048:
             return TEModel.QWEN3_2B
+        elif weight.shape[0] == 4096:
+            return TEModel.QWEN3_8B
     if weight.shape[0] == 5120:
         if "model.layers.39.post_attention_layernorm.weight" in sd:
             return TEModel.MISTRAL3_24B
@@ -1214,11 +1218,18 @@ def load_text_encoder_state_dicts(state_dicts=[], embedding_directory=None, clip
             clip_target.tokenizer = comfy.text_encoders.flux.Flux2Tokenizer
             tokenizer_data["tekken_model"] = clip_data[0].get("tekken_model", None)
         elif te_model == TEModel.QWEN3_4B:
-            clip_target.clip = comfy.text_encoders.z_image.te(**llama_detect(clip_data))
-            clip_target.tokenizer = comfy.text_encoders.z_image.ZImageTokenizer
+            if clip_type == CLIPType.FLUX or clip_type == CLIPType.FLUX2:
+                clip_target.clip = comfy.text_encoders.flux.klein_te(**llama_detect(clip_data), model_type="qwen3_4b")
+                clip_target.tokenizer = comfy.text_encoders.flux.KleinTokenizer
+            else:
+                clip_target.clip = comfy.text_encoders.z_image.te(**llama_detect(clip_data))
+                clip_target.tokenizer = comfy.text_encoders.z_image.ZImageTokenizer
         elif te_model == TEModel.QWEN3_2B:
             clip_target.clip = comfy.text_encoders.ovis.te(**llama_detect(clip_data))
             clip_target.tokenizer = comfy.text_encoders.ovis.OvisTokenizer
+        elif te_model == TEModel.QWEN3_8B:
+            clip_target.clip = comfy.text_encoders.flux.klein_te(**llama_detect(clip_data), model_type="qwen3_8b")
+            clip_target.tokenizer = comfy.text_encoders.flux.KleinTokenizer8B
         elif te_model == TEModel.JINA_CLIP_2:
             clip_target.clip = comfy.text_encoders.jina_clip_2.JinaClip2TextModelWrapper
             clip_target.tokenizer = comfy.text_encoders.jina_clip_2.JinaClip2TokenizerWrapper
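`detect_te_model` tells the Qwen3 variants apart purely by the hidden width of a weight in the checkpoint (2048 vs 4096 in the hunk above). A minimal sketch of that kind of shape-based detection; the key name and helper are hypothetical:

```python
import torch

def detect_qwen3_variant(sd: dict) -> str:
    """Sketch of width-based detection: pick a representative weight and
    branch on its hidden dimension. Key and thresholds are illustrative."""
    weight = sd["model.norm.weight"]  # hypothetical key
    if weight.shape[0] == 2048:
        return "qwen3_2b"
    if weight.shape[0] == 4096:
        return "qwen3_8b"
    return "unknown"

print(detect_qwen3_variant({"model.norm.weight": torch.zeros(4096)}))  # qwen3_8b
```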
@@ -763,7 +763,7 @@ class Flux2(Flux):

     def __init__(self, unet_config):
         super().__init__(unet_config)
-        self.memory_usage_factor = self.memory_usage_factor * (2.0 * 2.0) * 2.36
+        self.memory_usage_factor = self.memory_usage_factor * (2.0 * 2.0) * (unet_config['hidden_size'] / 2604)

     def get_model(self, state_dict, prefix="", device=None):
         out = model_base.Flux2(self, device=device)
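The replacement ties the memory factor to `hidden_size` instead of the hard-coded 2.36. As a quick sanity check (6144 is an assumed hidden size, not stated in this diff), the two expressions agree for a model of roughly that width:

```python
# Hypothetical check: 6144 is an assumed hidden size, not taken from this diff.
hidden_size = 6144
print(hidden_size / 2604)  # ~2.359, close to the old hard-coded 2.36
```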
@ -3,7 +3,7 @@ import comfy.text_encoders.t5
|
||||
import comfy.text_encoders.sd3_clip
|
||||
import comfy.text_encoders.llama
|
||||
import comfy.model_management
|
||||
from transformers import T5TokenizerFast, LlamaTokenizerFast
|
||||
from transformers import T5TokenizerFast, LlamaTokenizerFast, Qwen2Tokenizer
|
||||
import torch
|
||||
import os
|
||||
import json
|
||||
@ -172,3 +172,60 @@ def flux2_te(dtype_llama=None, llama_quantization_metadata=None, pruned=False):
|
||||
model_options["num_layers"] = 30
|
||||
super().__init__(device=device, dtype=dtype, model_options=model_options)
|
||||
return Flux2TEModel_
|
||||
|
||||
class Qwen3Tokenizer(sd1_clip.SDTokenizer):
|
||||
def __init__(self, embedding_directory=None, tokenizer_data={}):
|
||||
tokenizer_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "qwen25_tokenizer")
|
||||
super().__init__(tokenizer_path, pad_with_end=False, embedding_size=2560, embedding_key='qwen3_4b', tokenizer_class=Qwen2Tokenizer, has_start_token=False, has_end_token=False, pad_to_max_length=False, max_length=99999999, min_length=512, pad_token=151643, tokenizer_data=tokenizer_data)
|
||||
|
||||
class Qwen3Tokenizer8B(sd1_clip.SDTokenizer):
|
||||
def __init__(self, embedding_directory=None, tokenizer_data={}):
|
||||
tokenizer_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "qwen25_tokenizer")
|
||||
super().__init__(tokenizer_path, pad_with_end=False, embedding_size=4096, embedding_key='qwen3_8b', tokenizer_class=Qwen2Tokenizer, has_start_token=False, has_end_token=False, pad_to_max_length=False, max_length=99999999, min_length=512, pad_token=151643, tokenizer_data=tokenizer_data)
|
||||
|
||||
class KleinTokenizer(sd1_clip.SD1Tokenizer):
|
||||
def __init__(self, embedding_directory=None, tokenizer_data={}, name="qwen3_4b"):
|
||||
if name == "qwen3_4b":
|
||||
tokenizer = Qwen3Tokenizer
|
||||
elif name == "qwen3_8b":
|
||||
tokenizer = Qwen3Tokenizer8B
|
||||
|
||||
super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, name=name, tokenizer=tokenizer)
|
||||
self.llama_template = "<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
|
||||
|
||||
def tokenize_with_weights(self, text, return_word_ids=False, llama_template=None, **kwargs):
|
||||
if llama_template is None:
|
||||
llama_text = self.llama_template.format(text)
|
||||
else:
|
||||
llama_text = llama_template.format(text)
|
||||
|
||||
tokens = super().tokenize_with_weights(llama_text, return_word_ids=return_word_ids, disable_weights=True, **kwargs)
|
||||
return tokens
|
||||
|
||||
class KleinTokenizer8B(KleinTokenizer):
|
||||
def __init__(self, embedding_directory=None, tokenizer_data={}, name="qwen3_8b"):
|
||||
super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, name=name)
|
||||
|
||||
class Qwen3_4BModel(sd1_clip.SDClipModel):
|
||||
def __init__(self, device="cpu", layer=[9, 18, 27], layer_idx=None, dtype=None, attention_mask=True, model_options={}):
|
||||
super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config={}, dtype=dtype, special_tokens={"pad": 151643}, layer_norm_hidden_state=False, model_class=comfy.text_encoders.llama.Qwen3_4B, enable_attention_masks=attention_mask, return_attention_masks=attention_mask, model_options=model_options)
|
||||
|
||||
class Qwen3_8BModel(sd1_clip.SDClipModel):
|
||||
def __init__(self, device="cpu", layer=[9, 18, 27], layer_idx=None, dtype=None, attention_mask=True, model_options={}):
|
||||
super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config={}, dtype=dtype, special_tokens={"pad": 151643}, layer_norm_hidden_state=False, model_class=comfy.text_encoders.llama.Qwen3_8B, enable_attention_masks=attention_mask, return_attention_masks=attention_mask, model_options=model_options)
|
||||
|
||||
def klein_te(dtype_llama=None, llama_quantization_metadata=None, model_type="qwen3_4b"):
|
||||
if model_type == "qwen3_4b":
|
||||
model = Qwen3_4BModel
|
||||
elif model_type == "qwen3_8b":
|
||||
model = Qwen3_8BModel
|
||||
|
||||
class Flux2TEModel_(Flux2TEModel):
|
||||
def __init__(self, device="cpu", dtype=None, model_options={}):
|
||||
if llama_quantization_metadata is not None:
|
||||
model_options = model_options.copy()
|
||||
model_options["quantization_metadata"] = llama_quantization_metadata
|
||||
if dtype_llama is not None:
|
||||
dtype = dtype_llama
|
||||
super().__init__(device=device, dtype=dtype, name=model_type, model_options=model_options, clip_model=model)
|
||||
return Flux2TEModel_
|
||||
|
||||
@ -99,6 +99,28 @@ class Qwen3_4BConfig:
|
||||
rope_scale = None
|
||||
final_norm: bool = True
|
||||
|
||||
@dataclass
|
||||
class Qwen3_8BConfig:
|
||||
vocab_size: int = 151936
|
||||
hidden_size: int = 4096
|
||||
intermediate_size: int = 12288
|
||||
num_hidden_layers: int = 36
|
||||
num_attention_heads: int = 32
|
||||
num_key_value_heads: int = 8
|
||||
max_position_embeddings: int = 40960
|
||||
rms_norm_eps: float = 1e-6
|
||||
rope_theta: float = 1000000.0
|
||||
transformer_type: str = "llama"
|
||||
head_dim = 128
|
||||
rms_norm_add = False
|
||||
mlp_activation = "silu"
|
||||
qkv_bias = False
|
||||
rope_dims = None
|
||||
q_norm = "gemma3"
|
||||
k_norm = "gemma3"
|
||||
rope_scale = None
|
||||
final_norm: bool = True
|
||||
|
||||
@dataclass
|
||||
class Ovis25_2BConfig:
|
||||
vocab_size: int = 151936
|
||||
@ -628,6 +650,15 @@ class Qwen3_4B(BaseLlama, torch.nn.Module):
|
||||
self.model = Llama2_(config, device=device, dtype=dtype, ops=operations)
|
||||
self.dtype = dtype
|
||||
|
||||
class Qwen3_8B(BaseLlama, torch.nn.Module):
|
||||
def __init__(self, config_dict, dtype, device, operations):
|
||||
super().__init__()
|
||||
config = Qwen3_8BConfig(**config_dict)
|
||||
self.num_layers = config.num_hidden_layers
|
||||
|
||||
self.model = Llama2_(config, device=device, dtype=dtype, ops=operations)
|
||||
self.dtype = dtype
|
||||
|
||||
class Ovis25_2B(BaseLlama, torch.nn.Module):
|
||||
def __init__(self, config_dict, dtype, device, operations):
|
||||
super().__init__()
|
||||
|
||||
@@ -118,8 +118,9 @@ class LTXAVTEModel(torch.nn.Module):
         sdo = comfy.utils.state_dict_prefix_replace(sd, {"text_embedding_projection.aggregate_embed.weight": "text_embedding_projection.weight", "model.diffusion_model.video_embeddings_connector.": "video_embeddings_connector.", "model.diffusion_model.audio_embeddings_connector.": "audio_embeddings_connector."}, filter_keys=True)
         if len(sdo) == 0:
             sdo = sd

-        return self.load_state_dict(sdo, strict=False)
+        missing, unexpected = self.load_state_dict(sdo, strict=False)
+        missing = [k for k in missing if not k.startswith("gemma3_12b.")] # filter out keys that belong to the main gemma model
+        return (missing, unexpected)

     def memory_estimation_function(self, token_weight_pairs, device=None):
         constant = 6.0
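`load_state_dict(strict=False)` returns the missing and unexpected keys instead of raising, which is what allows the filter above to drop the `gemma3_12b.` entries. A standalone sketch with a toy module:

```python
import torch

class Head(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(4, 4)

model = Head()
# strict=False reports which keys were missing / unexpected instead of raising.
result = model.load_state_dict({"other.weight": torch.zeros(4, 4)}, strict=False)
missing = [k for k in result.missing_keys if not k.startswith("gemma3_12b.")]
print(missing)                 # ['proj.weight', 'proj.bias']
print(result.unexpected_keys)  # ['other.weight']
```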
@@ -61,6 +61,7 @@ def te(dtype_llama=None, llama_quantization_metadata=None):
             if dtype_llama is not None:
                 dtype = dtype_llama
             if llama_quantization_metadata is not None:
+                model_options = model_options.copy()
                 model_options["quantization_metadata"] = llama_quantization_metadata
             super().__init__(device=device, dtype=dtype, model_options=model_options)
     return OvisTEModel_
@@ -40,6 +40,7 @@ def te(dtype_llama=None, llama_quantization_metadata=None):
             if dtype_llama is not None:
                 dtype = dtype_llama
             if llama_quantization_metadata is not None:
+                model_options = model_options.copy()
                 model_options["quantization_metadata"] = llama_quantization_metadata
             super().__init__(device=device, dtype=dtype, model_options=model_options)
     return ZImageTEModel_
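Both `te()` helpers now copy `model_options` before inserting `quantization_metadata`, so the dict shared through the default argument (or passed in by the caller) is not mutated. A small illustration of the hazard being avoided:

```python
def configure(options={}, metadata=None):
    # Copy before mutating: otherwise the shared default dict (or the
    # caller's dict) would keep the key across unrelated calls.
    if metadata is not None:
        options = options.copy()
        options["quantization_metadata"] = metadata
    return options

print(configure(metadata="fp8"))  # {'quantization_metadata': 'fp8'}
print(configure())                # {} -- the default dict stayed clean
```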
@@ -639,6 +639,8 @@ def flux_to_diffusers(mmdit_config, output_prefix=""):
         "proj_out.bias": "linear2.bias",
         "attn.norm_q.weight": "norm.query_norm.scale",
         "attn.norm_k.weight": "norm.key_norm.scale",
+        "attn.to_qkv_mlp_proj.weight": "linear1.weight", # Flux 2
+        "attn.to_out.weight": "linear2.weight", # Flux 2
     }

     for k in block_map:
@@ -929,7 +931,9 @@ def bislerp(samples, width, height):
     return result.to(orig_dtype)

 def lanczos(samples, width, height):
-    images = [Image.fromarray(np.clip(255. * image.movedim(0, -1).cpu().numpy(), 0, 255).astype(np.uint8)) for image in samples]
+    #the below API is strict and expects grayscale to be squeezed
+    samples = samples.squeeze(1) if samples.shape[1] == 1 else samples.movedim(1, -1)
+    images = [Image.fromarray(np.clip(255. * image.cpu().numpy(), 0, 255).astype(np.uint8)) for image in samples]
     images = [image.resize((width, height), resample=Image.Resampling.LANCZOS) for image in images]
     images = [torch.from_numpy(np.array(image).astype(np.float32) / 255.0).movedim(-1, 0) for image in images]
     result = torch.stack(images)
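The `lanczos` change exists because `PIL.Image.fromarray` infers the image mode from the array shape, so single-channel batches need the channel squeezed while multi-channel ones need channels moved last. A self-contained sketch of that shape handling (the helper name is illustrative):

```python
import numpy as np
import torch
from PIL import Image

def to_pil(samples: torch.Tensor) -> list:
    """(B,1,H,W) -> 2-D grayscale arrays, (B,C,H,W) -> (H,W,C) arrays,
    both scaled to uint8 before handing them to PIL."""
    if samples.shape[1] == 1:
        samples = samples.squeeze(1)      # (B, H, W) -> mode "L"
    else:
        samples = samples.movedim(1, -1)  # (B, H, W, C) -> mode "RGB"
    return [Image.fromarray(np.clip(255. * s.cpu().numpy(), 0, 255).astype(np.uint8))
            for s in samples]

print(to_pil(torch.rand(2, 1, 64, 64))[0].mode)  # L
print(to_pil(torch.rand(2, 3, 64, 64))[0].mode)  # RGB
```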
@@ -374,7 +374,7 @@ class VideoFromComponents(VideoInput):
             if audio_stream and self.__components.audio:
                 waveform = self.__components.audio['waveform']
                 waveform = waveform[:, :, :math.ceil((audio_sample_rate / frame_rate) * self.__components.images.shape[0])]
-                frame = av.AudioFrame.from_ndarray(waveform.movedim(2, 1).reshape(1, -1).float().numpy(), format='flt', layout='mono' if waveform.shape[1] == 1 else 'stereo')
+                frame = av.AudioFrame.from_ndarray(waveform.movedim(2, 1).reshape(1, -1).float().cpu().numpy(), format='flt', layout='mono' if waveform.shape[1] == 1 else 'stereo')
                 frame.sample_rate = audio_sample_rate
                 frame.pts = 0
                 output.mux(audio_stream.encode(frame))
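The only change here is the added `.cpu()`: `.numpy()` works only on CPU tensors, so a CUDA-resident waveform must be moved to host memory first. A tiny illustration:

```python
import torch

waveform = torch.randn(1, 2, 4)
if torch.cuda.is_available():
    waveform = waveform.cuda()

# .numpy() raises a TypeError on a CUDA tensor, so move to host memory first.
arr = waveform.movedim(2, 1).reshape(1, -1).float().cpu().numpy()
print(arr.shape)  # (1, 8)
```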
@ -153,7 +153,7 @@ class Input(_IO_V3):
|
||||
'''
|
||||
Base class for a V3 Input.
|
||||
'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__()
|
||||
self.id = id
|
||||
self.display_name = display_name
|
||||
@ -162,6 +162,7 @@ class Input(_IO_V3):
|
||||
self.lazy = lazy
|
||||
self.extra_dict = extra_dict if extra_dict is not None else {}
|
||||
self.rawLink = raw_link
|
||||
self.advanced = advanced
|
||||
|
||||
def as_dict(self):
|
||||
return prune_dict({
|
||||
@ -170,6 +171,7 @@ class Input(_IO_V3):
|
||||
"tooltip": self.tooltip,
|
||||
"lazy": self.lazy,
|
||||
"rawLink": self.rawLink,
|
||||
"advanced": self.advanced,
|
||||
}) | prune_dict(self.extra_dict)
|
||||
|
||||
def get_io_type(self):
|
||||
@ -184,8 +186,8 @@ class WidgetInput(Input):
|
||||
'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
default: Any=None,
|
||||
socketless: bool=None, widget_type: str=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
|
||||
socketless: bool=None, widget_type: str=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
|
||||
self.default = default
|
||||
self.socketless = socketless
|
||||
self.widget_type = widget_type
|
||||
@ -242,8 +244,8 @@ class Boolean(ComfyTypeIO):
|
||||
'''Boolean input.'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
default: bool=None, label_on: str=None, label_off: str=None,
|
||||
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
|
||||
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
|
||||
self.label_on = label_on
|
||||
self.label_off = label_off
|
||||
self.default: bool
|
||||
@ -262,8 +264,8 @@ class Int(ComfyTypeIO):
|
||||
'''Integer input.'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
default: int=None, min: int=None, max: int=None, step: int=None, control_after_generate: bool=None,
|
||||
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
|
||||
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
|
||||
self.min = min
|
||||
self.max = max
|
||||
self.step = step
|
||||
@ -288,8 +290,8 @@ class Float(ComfyTypeIO):
|
||||
'''Float input.'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
default: float=None, min: float=None, max: float=None, step: float=None, round: float=None,
|
||||
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
|
||||
display_mode: NumberDisplay=None, socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
|
||||
self.min = min
|
||||
self.max = max
|
||||
self.step = step
|
||||
@ -314,8 +316,8 @@ class String(ComfyTypeIO):
|
||||
'''String input.'''
|
||||
def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
multiline=False, placeholder: str=None, default: str=None, dynamic_prompts: bool=None,
|
||||
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link)
|
||||
socketless: bool=None, force_input: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, force_input, extra_dict, raw_link, advanced)
|
||||
self.multiline = multiline
|
||||
self.placeholder = placeholder
|
||||
self.dynamic_prompts = dynamic_prompts
|
||||
@ -350,12 +352,13 @@ class Combo(ComfyTypeIO):
|
||||
socketless: bool=None,
|
||||
extra_dict=None,
|
||||
raw_link: bool=None,
|
||||
advanced: bool=None,
|
||||
):
|
||||
if isinstance(options, type) and issubclass(options, Enum):
|
||||
options = [v.value for v in options]
|
||||
if isinstance(default, Enum):
|
||||
default = default.value
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link)
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link, advanced)
|
||||
self.multiselect = False
|
||||
self.options = options
|
||||
self.control_after_generate = control_after_generate
|
||||
@ -387,8 +390,8 @@ class MultiCombo(ComfyTypeI):
|
||||
class Input(Combo.Input):
|
||||
def __init__(self, id: str, options: list[str], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None,
|
||||
default: list[str]=None, placeholder: str=None, chip: bool=None, control_after_generate: bool=None,
|
||||
socketless: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, options, display_name, optional, tooltip, lazy, default, control_after_generate, socketless=socketless, extra_dict=extra_dict, raw_link=raw_link)
|
||||
socketless: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, options, display_name, optional, tooltip, lazy, default, control_after_generate, socketless=socketless, extra_dict=extra_dict, raw_link=raw_link, advanced=advanced)
|
||||
self.multiselect = True
|
||||
self.placeholder = placeholder
|
||||
self.chip = chip
|
||||
@ -421,9 +424,9 @@ class Webcam(ComfyTypeIO):
|
||||
Type = str
|
||||
def __init__(
|
||||
self, id: str, display_name: str=None, optional=False,
|
||||
tooltip: str=None, lazy: bool=None, default: str=None, socketless: bool=None, extra_dict=None, raw_link: bool=None
|
||||
tooltip: str=None, lazy: bool=None, default: str=None, socketless: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None
|
||||
):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link)
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, default, socketless, None, None, extra_dict, raw_link, advanced)
|
||||
|
||||
|
||||
@comfytype(io_type="MASK")
|
||||
@ -776,7 +779,7 @@ class MultiType:
|
||||
'''
|
||||
Input that permits more than one input type; if `id` is an instance of `ComfyType.Input`, then that input will be used to create a widget (if applicable) with overridden values.
|
||||
'''
|
||||
def __init__(self, id: str | Input, types: list[type[_ComfyType] | _ComfyType], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
def __init__(self, id: str | Input, types: list[type[_ComfyType] | _ComfyType], display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
# if id is an Input, then use that Input with overridden values
|
||||
self.input_override = None
|
||||
if isinstance(id, Input):
|
||||
@ -789,7 +792,7 @@ class MultiType:
|
||||
# if is a widget input, make sure widget_type is set appropriately
|
||||
if isinstance(self.input_override, WidgetInput):
|
||||
self.input_override.widget_type = self.input_override.get_io_type()
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
|
||||
self._io_types = types
|
||||
|
||||
@property
|
||||
@@ -843,8 +846,8 @@ class MatchType(ComfyTypeIO):
|
||||
|
||||
class Input(Input):
|
||||
def __init__(self, id: str, template: MatchType.Template,
|
||||
display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link)
|
||||
display_name: str=None, optional=False, tooltip: str=None, lazy: bool=None, extra_dict=None, raw_link: bool=None, advanced: bool=None):
|
||||
super().__init__(id, display_name, optional, tooltip, lazy, extra_dict, raw_link, advanced)
|
||||
self.template = template
|
||||
|
||||
def as_dict(self):
|
||||
@@ -997,20 +1000,38 @@ class Autogrow(ComfyTypeI):
|
||||
names = [f"{prefix}{i}" for i in range(max)]
|
||||
# need to create a new input based on the contents of input
|
||||
template_input = None
|
||||
for _, dict_input in input.items():
|
||||
# for now, get just the first value from dict_input
|
||||
template_required = True
|
||||
for _input_type, dict_input in input.items():
|
||||
# for now, get just the first value from dict_input; if not required, min can be ignored
|
||||
if len(dict_input) == 0:
|
||||
continue
|
||||
template_input = list(dict_input.values())[0]
|
||||
template_required = _input_type == "required"
|
||||
break
|
||||
if template_input is None:
|
||||
raise Exception("template_input could not be determined from required or optional; this should never happen.")
|
||||
new_dict = {}
|
||||
new_dict_added_to = False
|
||||
# first, add possible inputs into out_dict
|
||||
for i, name in enumerate(names):
|
||||
expected_id = finalize_prefix(curr_prefix, name)
|
||||
# required
|
||||
if i < min and template_required:
|
||||
out_dict["required"][expected_id] = template_input
|
||||
type_dict = new_dict.setdefault("required", {})
|
||||
# optional
|
||||
else:
|
||||
out_dict["optional"][expected_id] = template_input
|
||||
type_dict = new_dict.setdefault("optional", {})
|
||||
if expected_id in live_inputs:
|
||||
# required
|
||||
if i < min:
|
||||
type_dict = new_dict.setdefault("required", {})
|
||||
# optional
|
||||
else:
|
||||
type_dict = new_dict.setdefault("optional", {})
|
||||
# NOTE: prefix gets added in parse_class_inputs
|
||||
type_dict[name] = template_input
|
||||
new_dict_added_to = True
|
||||
# account for the edge case that all inputs are optional and no values are received
|
||||
if not new_dict_added_to:
|
||||
finalized_prefix = finalize_prefix(curr_prefix)
|
||||
out_dict["dynamic_paths"][finalized_prefix] = finalized_prefix
|
||||
out_dict["dynamic_paths_default_value"][finalized_prefix] = DynamicPathsDefaultValue.EMPTY_DICT
|
||||
parse_class_inputs(out_dict, live_inputs, new_dict, curr_prefix)
|
||||
|
||||
@comfytype(io_type="COMFY_DYNAMICCOMBO_V3")
|
||||
@@ -1119,8 +1140,8 @@ class ImageCompare(ComfyTypeI):

    class Input(WidgetInput):
        def __init__(self, id: str, display_name: str=None, optional=False, tooltip: str=None,
                     socketless: bool=True):
            super().__init__(id, display_name, optional, tooltip, None, None, socketless)
                     socketless: bool=True, advanced: bool=None):
            super().__init__(id, display_name, optional, tooltip, None, None, socketless, None, None, None, None, advanced)

        def as_dict(self):
            return super().as_dict()
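The hunks above all do the same thing: thread a new optional `advanced` keyword through the widget-style inputs (Combo, MultiCombo, Webcam, MultiType, MatchType, ImageCompare) and forward it to the base `WidgetInput` initializer. A minimal sketch of what this enables on the node-definition side, using the `IO` schema API that appears later in this PR; the input names and the exact frontend behaviour of the flag are assumptions, not taken from this diff:

```python
from comfy_api.latest import IO

# Both forms are accepted after this change; `advanced` defaults to None, so
# existing node definitions keep working unchanged.
basic = IO.Combo.Input("mode", options=["fast", "quality"])
tucked_away = IO.Combo.Input("scheduler", options=["simple", "karras"], advanced=True)

# The flag rides along through WidgetInput.__init__, so it should surface in the
# serialized input options (as_dict) the same way tooltip/default/socketless do.
print(tucked_away.as_dict())
```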
@@ -1148,6 +1169,8 @@ class V3Data(TypedDict):
|
||||
'Dictionary where the keys are the hidden input ids and the values are the values of the hidden inputs.'
|
||||
dynamic_paths: dict[str, Any]
|
||||
'Dictionary where the keys are the input ids and the values dictate how to turn the inputs into a nested dictionary.'
|
||||
dynamic_paths_default_value: dict[str, Any]
|
||||
'Dictionary where the keys are the input ids and the values are a string from DynamicPathsDefaultValue for the inputs if value is None.'
|
||||
create_dynamic_tuple: bool
|
||||
'When True, the value of the dynamic input will be in the format (value, path_key).'
|
||||
|
||||
@@ -1501,6 +1524,7 @@ def get_finalized_class_inputs(d: dict[str, Any], live_inputs: dict[str, Any], i
|
||||
"required": {},
|
||||
"optional": {},
|
||||
"dynamic_paths": {},
|
||||
"dynamic_paths_default_value": {},
|
||||
}
|
||||
d = d.copy()
|
||||
# ignore hidden for parsing
|
||||
@@ -1510,8 +1534,12 @@ def get_finalized_class_inputs(d: dict[str, Any], live_inputs: dict[str, Any], i
|
||||
out_dict["hidden"] = hidden
|
||||
v3_data = {}
|
||||
dynamic_paths = out_dict.pop("dynamic_paths", None)
|
||||
if dynamic_paths is not None:
|
||||
if dynamic_paths is not None and len(dynamic_paths) > 0:
|
||||
v3_data["dynamic_paths"] = dynamic_paths
|
||||
# this list is used for autogrow, in the case all inputs are optional and no values are passed
|
||||
dynamic_paths_default_value = out_dict.pop("dynamic_paths_default_value", None)
|
||||
if dynamic_paths_default_value is not None and len(dynamic_paths_default_value) > 0:
|
||||
v3_data["dynamic_paths_default_value"] = dynamic_paths_default_value
|
||||
return out_dict, hidden, v3_data
|
||||
|
||||
def parse_class_inputs(out_dict: dict[str, Any], live_inputs: dict[str, Any], curr_dict: dict[str, Any], curr_prefix: list[str] | None=None) -> None:
|
||||
@@ -1548,11 +1576,16 @@ def add_to_dict_v1(i: Input, d: dict):
|
||||
def add_to_dict_v3(io: Input | Output, d: dict):
|
||||
d[io.id] = (io.get_io_type(), io.as_dict())
|
||||
|
||||
class DynamicPathsDefaultValue:
|
||||
EMPTY_DICT = "empty_dict"
|
||||
|
||||
def build_nested_inputs(values: dict[str, Any], v3_data: V3Data):
|
||||
paths = v3_data.get("dynamic_paths", None)
|
||||
default_value_dict = v3_data.get("dynamic_paths_default_value", {})
|
||||
if paths is None:
|
||||
return values
|
||||
values = values.copy()
|
||||
|
||||
result = {}
|
||||
|
||||
create_tuple = v3_data.get("create_dynamic_tuple", False)
|
||||
@@ -1566,6 +1599,11 @@ def build_nested_inputs(values: dict[str, Any], v3_data: V3Data):
|
||||
|
||||
if is_last:
|
||||
value = values.pop(key, None)
|
||||
if value is None:
|
||||
# see if a default value was provided for this key
|
||||
default_option = default_value_dict.get(key, None)
|
||||
if default_option == DynamicPathsDefaultValue.EMPTY_DICT:
|
||||
value = {}
|
||||
if create_tuple:
|
||||
value = (value, key)
|
||||
current[p] = value
|
||||
|
||||
@@ -1,65 +0,0 @@
|
||||
# ComfyUI API Nodes
|
||||
|
||||
## Introduction
|
||||
|
||||
Below are a collection of nodes that work by calling external APIs. More information available in our [docs](https://docs.comfy.org/tutorials/api-nodes/overview).
|
||||
|
||||
## Development
|
||||
|
||||
While developing, you should be testing against the Staging environment. To test against staging:
|
||||
|
||||
**Install ComfyUI_frontend**
|
||||
|
||||
Follow the instructions [here](https://github.com/Comfy-Org/ComfyUI_frontend) to start the frontend server. By default, it will connect to Staging authentication.
|
||||
|
||||
> **Hint:** If you use the --front-end-version argument for ComfyUI, it will use production authentication.
|
||||
|
||||
```bash
|
||||
python main.py --comfy-api-base https://stagingapi.comfy.org
|
||||
```
|
||||
|
||||
To authenticate to staging, please log in and then ask someone on the Comfy Org team to whitelist you for staging access.
|
||||
|
||||
API stubs are generated through automatic codegen tools from OpenAPI definitions. Since the Comfy Org OpenAPI definition contains many things from the Comfy Registry as well, we use redocly/cli to filter out only the paths relevant for API nodes.
|
||||
|
||||
### Redocly Instructions
|
||||
|
||||
**Tip**
|
||||
When developing locally, use the `redocly-dev.yaml` file to generate pydantic models. This lets you use stubs for APIs that are not marked `Released` yet.
|
||||
|
||||
Before your API node PR merges, make sure to add the `Released` tag to the `openapi.yaml` file and test in staging.
|
||||
|
||||
```bash
|
||||
# Download the OpenAPI file from staging server.
|
||||
curl -o openapi.yaml https://stagingapi.comfy.org/openapi
|
||||
|
||||
# Filter out unneeded API definitions.
|
||||
npm install -g @redocly/cli
|
||||
redocly bundle openapi.yaml --output filtered-openapi.yaml --config comfy_api_nodes/redocly-dev.yaml --remove-unused-components
|
||||
|
||||
# Generate the pydantic datamodels for validation.
|
||||
datamodel-codegen --use-subclass-enum --field-constraints --strict-types bytes --input filtered-openapi.yaml --output comfy_api_nodes/apis/__init__.py --output-model-type pydantic_v2.BaseModel
|
||||
|
||||
```
|
||||
|
||||
|
||||
# Merging to Master
|
||||
|
||||
Before merging to comfyanonymous/ComfyUI master, follow these steps:
|
||||
|
||||
1. Add the "Released" tag to the ComfyUI OpenAPI yaml file for each endpoint you are using in the nodes.
|
||||
1. Make sure the ComfyUI API is deployed to prod with your changes.
|
||||
1. Run the code generation again with `redocly.yaml` and the production OpenAPI yaml file.
|
||||
|
||||
```bash
|
||||
# Download the OpenAPI file from prod server.
|
||||
curl -o openapi.yaml https://api.comfy.org/openapi
|
||||
|
||||
# Filter out unneeded API definitions.
|
||||
npm install -g @redocly/cli
|
||||
redocly bundle openapi.yaml --output filtered-openapi.yaml --config comfy_api_nodes/redocly.yaml --remove-unused-components
|
||||
|
||||
# Generate the pydantic datamodels for validation.
|
||||
datamodel-codegen --use-subclass-enum --field-constraints --strict-types bytes --input filtered-openapi.yaml --output comfy_api_nodes/apis/__init__.py --output-model-type pydantic_v2.BaseModel
|
||||
|
||||
```
|
||||
61
comfy_api_nodes/apis/bria.py
Normal file
@@ -0,0 +1,61 @@
|
||||
from typing import TypedDict
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
|
||||
class InputModerationSettings(TypedDict):
|
||||
prompt_content_moderation: bool
|
||||
visual_input_moderation: bool
|
||||
visual_output_moderation: bool
|
||||
|
||||
|
||||
class BriaEditImageRequest(BaseModel):
|
||||
instruction: str | None = Field(...)
|
||||
structured_instruction: str | None = Field(
|
||||
...,
|
||||
description="Use this instead of instruction for precise, programmatic control.",
|
||||
)
|
||||
images: list[str] = Field(
|
||||
...,
|
||||
description="Required. Publicly available URL or Base64-encoded. Must contain exactly one item.",
|
||||
)
|
||||
mask: str | None = Field(
|
||||
None,
|
||||
description="Mask image (black and white). Black areas will be preserved, white areas will be edited. "
|
||||
"If omitted, the edit applies to the entire image. "
|
||||
"The input image and the the input mask must be of the same size.",
|
||||
)
|
||||
negative_prompt: str | None = Field(None)
|
||||
guidance_scale: float = Field(...)
|
||||
model_version: str = Field(...)
|
||||
steps_num: int = Field(...)
|
||||
seed: int = Field(...)
|
||||
ip_signal: bool = Field(
|
||||
False,
|
||||
description="If true, returns a warning for potential IP content in the instruction.",
|
||||
)
|
||||
prompt_content_moderation: bool = Field(
|
||||
False, description="If true, returns 422 on instruction moderation failure."
|
||||
)
|
||||
visual_input_content_moderation: bool = Field(
|
||||
False, description="If true, returns 422 on images or mask moderation failure."
|
||||
)
|
||||
visual_output_content_moderation: bool = Field(
|
||||
False, description="If true, returns 422 on visual output moderation failure."
|
||||
)
|
||||
|
||||
|
||||
class BriaStatusResponse(BaseModel):
|
||||
request_id: str = Field(...)
|
||||
status_url: str = Field(...)
|
||||
warning: str | None = Field(None)
|
||||
|
||||
|
||||
class BriaResult(BaseModel):
|
||||
structured_prompt: str = Field(...)
|
||||
image_url: str = Field(...)
|
||||
|
||||
|
||||
class BriaResponse(BaseModel):
|
||||
status: str = Field(...)
|
||||
result: BriaResult | None = Field(None)
|
||||
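A minimal sketch of how these Bria models might be used, assuming pydantic v2 (`model_dump`/`model_validate`) and illustrative values; the mapping of node inputs onto these fields is defined in `nodes_bria.py` further down, not here:

```python
from comfy_api_nodes.apis.bria import BriaEditImageRequest, BriaResponse

# All Field(...) entries are required, even when their type allows None.
request = BriaEditImageRequest(
    instruction="Replace the sky with a warm sunset",
    structured_instruction=None,               # use either instruction or structured_instruction
    images=["https://example.com/input.png"],  # exactly one item, URL or base64
    guidance_scale=3.0,
    model_version="FIBO",                      # illustrative; the node exposes "FIBO" as its model option
    steps_num=50,
    seed=1,
)
payload = request.model_dump(exclude_none=True)

# Parsing a hypothetical polling reply once the task reports completion:
response = BriaResponse.model_validate(
    {"status": "completed",
     "result": {"structured_prompt": "{}", "image_url": "https://example.com/out.png"}}
)
```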
@@ -65,11 +65,13 @@ class TaskImageContent(BaseModel):
class Text2VideoTaskCreationRequest(BaseModel):
    model: str = Field(...)
    content: list[TaskTextContent] = Field(..., min_length=1)
    generate_audio: bool | None = Field(...)


class Image2VideoTaskCreationRequest(BaseModel):
    model: str = Field(...)
    content: list[TaskTextContent | TaskImageContent] = Field(..., min_length=2)
    generate_audio: bool | None = Field(...)


class TaskCreationResponse(BaseModel):
@@ -141,4 +143,9 @@ VIDEO_TASKS_EXECUTION_TIME = {
        "720p": 65,
        "1080p": 100,
    },
    "seedance-1-5-pro-251215": {
        "480p": 80,
        "720p": 100,
        "1080p": 150,
    },
}
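The second hunk registers rough execution-time estimates for the new seedance-1-5-pro model. A small illustration of the lookup shape; the dict literal below only mirrors the entry added above, and the time unit (presumably seconds, likely used to scale polling/progress estimates) is an assumption this excerpt does not state:

```python
# Mirrors the entry added in the hunk above.
VIDEO_TASKS_EXECUTION_TIME = {
    "seedance-1-5-pro-251215": {"480p": 80, "720p": 100, "1080p": 150},
}

estimate = VIDEO_TASKS_EXECUTION_TIME["seedance-1-5-pro-251215"]["720p"]  # 100
```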
292
comfy_api_nodes/apis/ideogram.py
Normal file
@@ -0,0 +1,292 @@
|
||||
from enum import Enum
|
||||
from typing import Optional, List, Dict, Any, Union
|
||||
from datetime import datetime
|
||||
|
||||
from pydantic import BaseModel, Field, RootModel, StrictBytes
|
||||
|
||||
|
||||
class IdeogramColorPalette1(BaseModel):
|
||||
name: str = Field(..., description='Name of the preset color palette')
|
||||
|
||||
|
||||
class Member(BaseModel):
|
||||
color: Optional[str] = Field(
|
||||
None, description='Hexadecimal color code', pattern='^#[0-9A-Fa-f]{6}$'
|
||||
)
|
||||
weight: Optional[float] = Field(
|
||||
None, description='Optional weight for the color (0-1)', ge=0.0, le=1.0
|
||||
)
|
||||
|
||||
|
||||
class IdeogramColorPalette2(BaseModel):
|
||||
members: List[Member] = Field(
|
||||
..., description='Array of color definitions with optional weights'
|
||||
)
|
||||
|
||||
|
||||
class IdeogramColorPalette(
|
||||
RootModel[Union[IdeogramColorPalette1, IdeogramColorPalette2]]
|
||||
):
|
||||
root: Union[IdeogramColorPalette1, IdeogramColorPalette2] = Field(
|
||||
...,
|
||||
description='A color palette specification that can either use a preset name or explicit color definitions with weights',
|
||||
)
|
||||
|
||||
|
||||
class ImageRequest(BaseModel):
|
||||
aspect_ratio: Optional[str] = Field(
|
||||
None,
|
||||
description="Optional. The aspect ratio (e.g., 'ASPECT_16_9', 'ASPECT_1_1'). Cannot be used with resolution. Defaults to 'ASPECT_1_1' if unspecified.",
|
||||
)
|
||||
color_palette: Optional[Dict[str, Any]] = Field(
|
||||
None, description='Optional. Color palette object. Only for V_2, V_2_TURBO.'
|
||||
)
|
||||
magic_prompt_option: Optional[str] = Field(
|
||||
None, description="Optional. MagicPrompt usage ('AUTO', 'ON', 'OFF')."
|
||||
)
|
||||
model: str = Field(..., description="The model used (e.g., 'V_2', 'V_2A_TURBO')")
|
||||
negative_prompt: Optional[str] = Field(
|
||||
None,
|
||||
description='Optional. Description of what to exclude. Only for V_1, V_1_TURBO, V_2, V_2_TURBO.',
|
||||
)
|
||||
num_images: Optional[int] = Field(
|
||||
1,
|
||||
description='Optional. Number of images to generate (1-8). Defaults to 1.',
|
||||
ge=1,
|
||||
le=8,
|
||||
)
|
||||
prompt: str = Field(
|
||||
..., description='Required. The prompt to use to generate the image.'
|
||||
)
|
||||
resolution: Optional[str] = Field(
|
||||
None,
|
||||
description="Optional. Resolution (e.g., 'RESOLUTION_1024_1024'). Only for model V_2. Cannot be used with aspect_ratio.",
|
||||
)
|
||||
seed: Optional[int] = Field(
|
||||
None,
|
||||
description='Optional. A number between 0 and 2147483647.',
|
||||
ge=0,
|
||||
le=2147483647,
|
||||
)
|
||||
style_type: Optional[str] = Field(
|
||||
None,
|
||||
description="Optional. Style type ('AUTO', 'GENERAL', 'REALISTIC', 'DESIGN', 'RENDER_3D', 'ANIME'). Only for models V_2 and above.",
|
||||
)
|
||||
|
||||
|
||||
class IdeogramGenerateRequest(BaseModel):
|
||||
image_request: ImageRequest = Field(
|
||||
..., description='The image generation request parameters.'
|
||||
)
|
||||
|
||||
|
||||
class Datum(BaseModel):
|
||||
is_image_safe: Optional[bool] = Field(
|
||||
None, description='Indicates whether the image is considered safe.'
|
||||
)
|
||||
prompt: Optional[str] = Field(
|
||||
None, description='The prompt used to generate this image.'
|
||||
)
|
||||
resolution: Optional[str] = Field(
|
||||
None, description="The resolution of the generated image (e.g., '1024x1024')."
|
||||
)
|
||||
seed: Optional[int] = Field(
|
||||
None, description='The seed value used for this generation.'
|
||||
)
|
||||
style_type: Optional[str] = Field(
|
||||
None,
|
||||
description="The style type used for generation (e.g., 'REALISTIC', 'ANIME').",
|
||||
)
|
||||
url: Optional[str] = Field(None, description='URL to the generated image.')
|
||||
|
||||
|
||||
class IdeogramGenerateResponse(BaseModel):
|
||||
created: Optional[datetime] = Field(
|
||||
None, description='Timestamp when the generation was created.'
|
||||
)
|
||||
data: Optional[List[Datum]] = Field(
|
||||
None, description='Array of generated image information.'
|
||||
)
|
||||
|
||||
|
||||
class StyleCode(RootModel[str]):
|
||||
root: str = Field(..., pattern='^[0-9A-Fa-f]{8}$')
|
||||
|
||||
|
||||
class Datum1(BaseModel):
|
||||
is_image_safe: Optional[bool] = None
|
||||
prompt: Optional[str] = None
|
||||
resolution: Optional[str] = None
|
||||
seed: Optional[int] = None
|
||||
style_type: Optional[str] = None
|
||||
url: Optional[str] = None
|
||||
|
||||
|
||||
class IdeogramV3IdeogramResponse(BaseModel):
|
||||
created: Optional[datetime] = None
|
||||
data: Optional[List[Datum1]] = None
|
||||
|
||||
|
||||
class RenderingSpeed1(str, Enum):
|
||||
TURBO = 'TURBO'
|
||||
DEFAULT = 'DEFAULT'
|
||||
QUALITY = 'QUALITY'
|
||||
|
||||
|
||||
class IdeogramV3ReframeRequest(BaseModel):
|
||||
color_palette: Optional[Dict[str, Any]] = None
|
||||
image: Optional[StrictBytes] = None
|
||||
num_images: Optional[int] = Field(None, ge=1, le=8)
|
||||
rendering_speed: Optional[RenderingSpeed1] = None
|
||||
resolution: str
|
||||
seed: Optional[int] = Field(None, ge=0, le=2147483647)
|
||||
style_codes: Optional[List[str]] = None
|
||||
style_reference_images: Optional[List[StrictBytes]] = None
|
||||
|
||||
|
||||
class MagicPrompt(str, Enum):
|
||||
AUTO = 'AUTO'
|
||||
ON = 'ON'
|
||||
OFF = 'OFF'
|
||||
|
||||
|
||||
class StyleType(str, Enum):
|
||||
AUTO = 'AUTO'
|
||||
GENERAL = 'GENERAL'
|
||||
REALISTIC = 'REALISTIC'
|
||||
DESIGN = 'DESIGN'
|
||||
|
||||
|
||||
class IdeogramV3RemixRequest(BaseModel):
|
||||
aspect_ratio: Optional[str] = None
|
||||
color_palette: Optional[Dict[str, Any]] = None
|
||||
image: Optional[StrictBytes] = None
|
||||
image_weight: Optional[int] = Field(50, ge=1, le=100)
|
||||
magic_prompt: Optional[MagicPrompt] = None
|
||||
negative_prompt: Optional[str] = None
|
||||
num_images: Optional[int] = Field(None, ge=1, le=8)
|
||||
prompt: str
|
||||
rendering_speed: Optional[RenderingSpeed1] = None
|
||||
resolution: Optional[str] = None
|
||||
seed: Optional[int] = Field(None, ge=0, le=2147483647)
|
||||
style_codes: Optional[List[str]] = None
|
||||
style_reference_images: Optional[List[StrictBytes]] = None
|
||||
style_type: Optional[StyleType] = None
|
||||
|
||||
|
||||
class IdeogramV3ReplaceBackgroundRequest(BaseModel):
|
||||
color_palette: Optional[Dict[str, Any]] = None
|
||||
image: Optional[StrictBytes] = None
|
||||
magic_prompt: Optional[MagicPrompt] = None
|
||||
num_images: Optional[int] = Field(None, ge=1, le=8)
|
||||
prompt: str
|
||||
rendering_speed: Optional[RenderingSpeed1] = None
|
||||
seed: Optional[int] = Field(None, ge=0, le=2147483647)
|
||||
style_codes: Optional[List[str]] = None
|
||||
style_reference_images: Optional[List[StrictBytes]] = None
|
||||
|
||||
|
||||
class ColorPalette(BaseModel):
|
||||
name: str = Field(..., description='Name of the color palette', examples=['PASTEL'])
|
||||
|
||||
|
||||
class MagicPrompt2(str, Enum):
|
||||
ON = 'ON'
|
||||
OFF = 'OFF'
|
||||
|
||||
|
||||
class StyleType1(str, Enum):
|
||||
AUTO = 'AUTO'
|
||||
GENERAL = 'GENERAL'
|
||||
REALISTIC = 'REALISTIC'
|
||||
DESIGN = 'DESIGN'
|
||||
FICTION = 'FICTION'
|
||||
|
||||
|
||||
class RenderingSpeed(str, Enum):
|
||||
DEFAULT = 'DEFAULT'
|
||||
TURBO = 'TURBO'
|
||||
QUALITY = 'QUALITY'
|
||||
|
||||
|
||||
class IdeogramV3EditRequest(BaseModel):
|
||||
color_palette: Optional[IdeogramColorPalette] = None
|
||||
image: Optional[StrictBytes] = Field(
|
||||
None,
|
||||
description='The image being edited (max size 10MB); only JPEG, WebP and PNG formats are supported at this time.',
|
||||
)
|
||||
magic_prompt: Optional[str] = Field(
|
||||
None,
|
||||
description='Determine if MagicPrompt should be used in generating the request or not.',
|
||||
)
|
||||
mask: Optional[StrictBytes] = Field(
|
||||
None,
|
||||
description='A black and white image of the same size as the image being edited (max size 10MB). Black regions in the mask should match up with the regions of the image that you would like to edit; only JPEG, WebP and PNG formats are supported at this time.',
|
||||
)
|
||||
num_images: Optional[int] = Field(
|
||||
None, description='The number of images to generate.'
|
||||
)
|
||||
prompt: str = Field(
|
||||
..., description='The prompt used to describe the edited result.'
|
||||
)
|
||||
rendering_speed: RenderingSpeed
|
||||
seed: Optional[int] = Field(
|
||||
None, description='Random seed. Set for reproducible generation.'
|
||||
)
|
||||
style_codes: Optional[List[StyleCode]] = Field(
|
||||
None,
|
||||
description='A list of 8 character hexadecimal codes representing the style of the image. Cannot be used in conjunction with style_reference_images or style_type.',
|
||||
)
|
||||
style_reference_images: Optional[List[StrictBytes]] = Field(
|
||||
None,
|
||||
description='A set of images to use as style references (maximum total size 10MB across all style references). The images should be in JPEG, PNG or WebP format.',
|
||||
)
|
||||
character_reference_images: Optional[List[str]] = Field(
|
||||
None,
|
||||
description='Generations with character reference are subject to the character reference pricing. A set of images to use as character references (maximum total size 10MB across all character references), currently only supports 1 character reference image. The images should be in JPEG, PNG or WebP format.'
|
||||
)
|
||||
character_reference_images_mask: Optional[List[str]] = Field(
|
||||
None,
|
||||
description='Optional masks for character reference images. When provided, must match the number of character_reference_images. Each mask should be a grayscale image of the same dimensions as the corresponding character reference image. The images should be in JPEG, PNG or WebP format.'
|
||||
)
|
||||
|
||||
|
||||
class IdeogramV3Request(BaseModel):
|
||||
aspect_ratio: Optional[str] = Field(
|
||||
None, description='Aspect ratio in format WxH', examples=['1x3']
|
||||
)
|
||||
color_palette: Optional[ColorPalette] = None
|
||||
magic_prompt: Optional[MagicPrompt2] = Field(
|
||||
None, description='Whether to enable magic prompt enhancement'
|
||||
)
|
||||
negative_prompt: Optional[str] = Field(
|
||||
None, description='Text prompt specifying what to avoid in the generation'
|
||||
)
|
||||
num_images: Optional[int] = Field(
|
||||
None, description='Number of images to generate', ge=1
|
||||
)
|
||||
prompt: str = Field(..., description='The text prompt for image generation')
|
||||
rendering_speed: RenderingSpeed
|
||||
resolution: Optional[str] = Field(
|
||||
None, description='Image resolution in format WxH', examples=['1280x800']
|
||||
)
|
||||
seed: Optional[int] = Field(
|
||||
None, description='Seed value for reproducible generation'
|
||||
)
|
||||
style_codes: Optional[List[StyleCode]] = Field(
|
||||
None, description='Array of style codes in hexadecimal format'
|
||||
)
|
||||
style_reference_images: Optional[List[str]] = Field(
|
||||
None, description='Array of reference image URLs or identifiers'
|
||||
)
|
||||
style_type: Optional[StyleType1] = Field(
|
||||
None, description='The type of style to apply'
|
||||
)
|
||||
character_reference_images: Optional[List[str]] = Field(
|
||||
None,
|
||||
description='Generations with character reference are subject to the character reference pricing. A set of images to use as character references (maximum total size 10MB across all character references), currently only supports 1 character reference image. The images should be in JPEG, PNG or WebP format.'
|
||||
)
|
||||
character_reference_images_mask: Optional[List[str]] = Field(
|
||||
None,
|
||||
description='Optional masks for character reference images. When provided, must match the number of character_reference_images. Each mask should be a grayscale image of the same dimensions as the corresponding character reference image. The images should be in JPEG, PNG or WebP format.'
|
||||
)
|
||||
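A minimal sketch of building an Ideogram V3 generation request from these models; values are illustrative and pydantic v2 is assumed:

```python
from comfy_api_nodes.apis.ideogram import (
    IdeogramV3Request,
    RenderingSpeed,
    StyleType1,
)

request = IdeogramV3Request(
    prompt="A lighthouse on a cliff at dawn, gouache style",
    rendering_speed=RenderingSpeed.DEFAULT,  # required; only prompt and rendering_speed lack defaults
    aspect_ratio="16x9",                     # format WxH per the field description; value is illustrative
    num_images=1,
    style_type=StyleType1.GENERAL,
)
payload = request.model_dump(exclude_none=True)
```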
160
comfy_api_nodes/apis/meshy.py
Normal file
@@ -0,0 +1,160 @@
|
||||
from typing import TypedDict
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from comfy_api.latest import Input
|
||||
|
||||
|
||||
class InputShouldRemesh(TypedDict):
|
||||
should_remesh: str
|
||||
topology: str
|
||||
target_polycount: int
|
||||
|
||||
|
||||
class InputShouldTexture(TypedDict):
|
||||
should_texture: str
|
||||
enable_pbr: bool
|
||||
texture_prompt: str
|
||||
texture_image: Input.Image | None
|
||||
|
||||
|
||||
class MeshyTaskResponse(BaseModel):
|
||||
result: str = Field(...)
|
||||
|
||||
|
||||
class MeshyTextToModelRequest(BaseModel):
|
||||
mode: str = Field("preview")
|
||||
prompt: str = Field(..., max_length=600)
|
||||
art_style: str = Field(..., description="'realistic' or 'sculpture'")
|
||||
ai_model: str = Field(...)
|
||||
topology: str | None = Field(..., description="'quad' or 'triangle'")
|
||||
target_polycount: int | None = Field(..., ge=100, le=300000)
|
||||
should_remesh: bool = Field(
|
||||
True,
|
||||
description="False returns the original mesh, ignoring topology and polycount.",
|
||||
)
|
||||
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
|
||||
pose_mode: str = Field(...)
|
||||
seed: int = Field(...)
|
||||
moderation: bool = Field(False)
|
||||
|
||||
|
||||
class MeshyRefineTask(BaseModel):
|
||||
mode: str = Field("refine")
|
||||
preview_task_id: str = Field(...)
|
||||
enable_pbr: bool | None = Field(...)
|
||||
texture_prompt: str | None = Field(...)
|
||||
texture_image_url: str | None = Field(...)
|
||||
ai_model: str = Field(...)
|
||||
moderation: bool = Field(False)
|
||||
|
||||
|
||||
class MeshyImageToModelRequest(BaseModel):
|
||||
image_url: str = Field(...)
|
||||
ai_model: str = Field(...)
|
||||
topology: str | None = Field(..., description="'quad' or 'triangle'")
|
||||
target_polycount: int | None = Field(..., ge=100, le=300000)
|
||||
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
|
||||
should_remesh: bool = Field(
|
||||
True,
|
||||
description="False returns the original mesh, ignoring topology and polycount.",
|
||||
)
|
||||
should_texture: bool = Field(...)
|
||||
enable_pbr: bool | None = Field(...)
|
||||
pose_mode: str = Field(...)
|
||||
texture_prompt: str | None = Field(None, max_length=600)
|
||||
texture_image_url: str | None = Field(None)
|
||||
seed: int = Field(...)
|
||||
moderation: bool = Field(False)
|
||||
|
||||
|
||||
class MeshyMultiImageToModelRequest(BaseModel):
|
||||
image_urls: list[str] = Field(...)
|
||||
ai_model: str = Field(...)
|
||||
topology: str | None = Field(..., description="'quad' or 'triangle'")
|
||||
target_polycount: int | None = Field(..., ge=100, le=300000)
|
||||
symmetry_mode: str = Field(..., description="'auto', 'off' or 'on'")
|
||||
should_remesh: bool = Field(
|
||||
True,
|
||||
description="False returns the original mesh, ignoring topology and polycount.",
|
||||
)
|
||||
should_texture: bool = Field(...)
|
||||
enable_pbr: bool | None = Field(...)
|
||||
pose_mode: str = Field(...)
|
||||
texture_prompt: str | None = Field(None, max_length=600)
|
||||
texture_image_url: str | None = Field(None)
|
||||
seed: int = Field(...)
|
||||
moderation: bool = Field(False)
|
||||
|
||||
|
||||
class MeshyRiggingRequest(BaseModel):
|
||||
input_task_id: str = Field(...)
|
||||
height_meters: float = Field(...)
|
||||
texture_image_url: str | None = Field(...)
|
||||
|
||||
|
||||
class MeshyAnimationRequest(BaseModel):
|
||||
rig_task_id: str = Field(...)
|
||||
action_id: int = Field(...)
|
||||
|
||||
|
||||
class MeshyTextureRequest(BaseModel):
|
||||
input_task_id: str = Field(...)
|
||||
ai_model: str = Field(...)
|
||||
enable_original_uv: bool = Field(...)
|
||||
enable_pbr: bool = Field(...)
|
||||
text_style_prompt: str | None = Field(...)
|
||||
image_style_url: str | None = Field(...)
|
||||
|
||||
|
||||
class MeshyModelsUrls(BaseModel):
|
||||
glb: str = Field("")
|
||||
|
||||
|
||||
class MeshyRiggedModelsUrls(BaseModel):
|
||||
rigged_character_glb_url: str = Field("")
|
||||
|
||||
|
||||
class MeshyAnimatedModelsUrls(BaseModel):
|
||||
animation_glb_url: str = Field("")
|
||||
|
||||
|
||||
class MeshyResultTextureUrls(BaseModel):
|
||||
base_color: str = Field(...)
|
||||
metallic: str | None = Field(None)
|
||||
normal: str | None = Field(None)
|
||||
roughness: str | None = Field(None)
|
||||
|
||||
|
||||
class MeshyTaskError(BaseModel):
|
||||
message: str | None = Field(None)
|
||||
|
||||
|
||||
class MeshyModelResult(BaseModel):
|
||||
id: str = Field(...)
|
||||
type: str = Field(...)
|
||||
model_urls: MeshyModelsUrls = Field(MeshyModelsUrls())
|
||||
thumbnail_url: str = Field(...)
|
||||
video_url: str | None = Field(None)
|
||||
status: str = Field(...)
|
||||
progress: int = Field(0)
|
||||
texture_urls: list[MeshyResultTextureUrls] | None = Field([])
|
||||
task_error: MeshyTaskError | None = Field(None)
|
||||
|
||||
|
||||
class MeshyRiggedResult(BaseModel):
|
||||
id: str = Field(...)
|
||||
type: str = Field(...)
|
||||
status: str = Field(...)
|
||||
progress: int = Field(0)
|
||||
result: MeshyRiggedModelsUrls = Field(MeshyRiggedModelsUrls())
|
||||
task_error: MeshyTaskError | None = Field(None)
|
||||
|
||||
|
||||
class MeshyAnimationResult(BaseModel):
|
||||
id: str = Field(...)
|
||||
type: str = Field(...)
|
||||
status: str = Field(...)
|
||||
progress: int = Field(0)
|
||||
result: MeshyAnimatedModelsUrls = Field(MeshyAnimatedModelsUrls())
|
||||
task_error: MeshyTaskError | None = Field(None)
|
||||
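A minimal sketch of a Meshy text-to-model request built from these definitions; several string fields only document their allowed values in the Field descriptions, so the literals below are illustrative assumptions:

```python
from comfy_api_nodes.apis.meshy import MeshyTextToModelRequest

request = MeshyTextToModelRequest(
    prompt="A weathered wooden treasure chest, low poly",
    art_style="realistic",     # 'realistic' or 'sculpture'
    ai_model="meshy-5",        # illustrative; this excerpt does not pin down model ids
    topology="triangle",       # 'quad' or 'triangle'
    target_polycount=30000,    # 100..300000
    symmetry_mode="auto",      # 'auto', 'off' or 'on'
    pose_mode="t-pose",        # required by the schema; allowed values are not shown here
    seed=42,
)
payload = request.model_dump(exclude_none=True)
```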
152
comfy_api_nodes/apis/moonvalley.py
Normal file
@@ -0,0 +1,152 @@
|
||||
from enum import Enum
|
||||
from typing import Optional, Dict, Any
|
||||
|
||||
from pydantic import BaseModel, Field, StrictBytes
|
||||
|
||||
|
||||
class MoonvalleyPromptResponse(BaseModel):
|
||||
error: Optional[Dict[str, Any]] = None
|
||||
frame_conditioning: Optional[Dict[str, Any]] = None
|
||||
id: Optional[str] = None
|
||||
inference_params: Optional[Dict[str, Any]] = None
|
||||
meta: Optional[Dict[str, Any]] = None
|
||||
model_params: Optional[Dict[str, Any]] = None
|
||||
output_url: Optional[str] = None
|
||||
prompt_text: Optional[str] = None
|
||||
status: Optional[str] = None
|
||||
|
||||
|
||||
class MoonvalleyTextToVideoInferenceParams(BaseModel):
|
||||
add_quality_guidance: Optional[bool] = Field(
|
||||
True, description='Whether to add quality guidance'
|
||||
)
|
||||
caching_coefficient: Optional[float] = Field(
|
||||
0.3, description='Caching coefficient for optimization'
|
||||
)
|
||||
caching_cooldown: Optional[int] = Field(
|
||||
3, description='Number of caching cooldown steps'
|
||||
)
|
||||
caching_warmup: Optional[int] = Field(
|
||||
3, description='Number of caching warmup steps'
|
||||
)
|
||||
clip_value: Optional[float] = Field(
|
||||
3, description='CLIP value for generation control'
|
||||
)
|
||||
conditioning_frame_index: Optional[int] = Field(
|
||||
0, description='Index of the conditioning frame'
|
||||
)
|
||||
cooldown_steps: Optional[int] = Field(
|
||||
75, description='Number of cooldown steps (calculated based on num_frames)'
|
||||
)
|
||||
fps: Optional[int] = Field(
|
||||
24, description='Frames per second of the generated video'
|
||||
)
|
||||
guidance_scale: Optional[float] = Field(
|
||||
10, description='Guidance scale for generation control'
|
||||
)
|
||||
height: Optional[int] = Field(
|
||||
1080, description='Height of the generated video in pixels'
|
||||
)
|
||||
negative_prompt: Optional[str] = Field(None, description='Negative prompt text')
|
||||
num_frames: Optional[int] = Field(64, description='Number of frames to generate')
|
||||
seed: Optional[int] = Field(
|
||||
None, description='Random seed for generation (default: random)'
|
||||
)
|
||||
shift_value: Optional[float] = Field(
|
||||
3, description='Shift value for generation control'
|
||||
)
|
||||
steps: Optional[int] = Field(80, description='Number of denoising steps')
|
||||
use_guidance_schedule: Optional[bool] = Field(
|
||||
True, description='Whether to use guidance scheduling'
|
||||
)
|
||||
use_negative_prompts: Optional[bool] = Field(
|
||||
False, description='Whether to use negative prompts'
|
||||
)
|
||||
use_timestep_transform: Optional[bool] = Field(
|
||||
True, description='Whether to use timestep transformation'
|
||||
)
|
||||
warmup_steps: Optional[int] = Field(
|
||||
0, description='Number of warmup steps (calculated based on num_frames)'
|
||||
)
|
||||
width: Optional[int] = Field(
|
||||
1920, description='Width of the generated video in pixels'
|
||||
)
|
||||
|
||||
|
||||
class MoonvalleyTextToVideoRequest(BaseModel):
|
||||
image_url: Optional[str] = None
|
||||
inference_params: Optional[MoonvalleyTextToVideoInferenceParams] = None
|
||||
prompt_text: Optional[str] = None
|
||||
webhook_url: Optional[str] = None
|
||||
|
||||
|
||||
class MoonvalleyUploadFileRequest(BaseModel):
|
||||
file: Optional[StrictBytes] = None
|
||||
|
||||
|
||||
class MoonvalleyUploadFileResponse(BaseModel):
|
||||
access_url: Optional[str] = None
|
||||
|
||||
|
||||
class MoonvalleyVideoToVideoInferenceParams(BaseModel):
|
||||
add_quality_guidance: Optional[bool] = Field(
|
||||
True, description='Whether to add quality guidance'
|
||||
)
|
||||
caching_coefficient: Optional[float] = Field(
|
||||
0.3, description='Caching coefficient for optimization'
|
||||
)
|
||||
caching_cooldown: Optional[int] = Field(
|
||||
3, description='Number of caching cooldown steps'
|
||||
)
|
||||
caching_warmup: Optional[int] = Field(
|
||||
3, description='Number of caching warmup steps'
|
||||
)
|
||||
clip_value: Optional[float] = Field(
|
||||
3, description='CLIP value for generation control'
|
||||
)
|
||||
conditioning_frame_index: Optional[int] = Field(
|
||||
0, description='Index of the conditioning frame'
|
||||
)
|
||||
cooldown_steps: Optional[int] = Field(
|
||||
36, description='Number of cooldown steps (calculated based on num_frames)'
|
||||
)
|
||||
guidance_scale: Optional[float] = Field(
|
||||
15, description='Guidance scale for generation control'
|
||||
)
|
||||
negative_prompt: Optional[str] = Field(None, description='Negative prompt text')
|
||||
seed: Optional[int] = Field(
|
||||
None, description='Random seed for generation (default: random)'
|
||||
)
|
||||
shift_value: Optional[float] = Field(
|
||||
3, description='Shift value for generation control'
|
||||
)
|
||||
steps: Optional[int] = Field(80, description='Number of denoising steps')
|
||||
use_guidance_schedule: Optional[bool] = Field(
|
||||
True, description='Whether to use guidance scheduling'
|
||||
)
|
||||
use_negative_prompts: Optional[bool] = Field(
|
||||
False, description='Whether to use negative prompts'
|
||||
)
|
||||
use_timestep_transform: Optional[bool] = Field(
|
||||
True, description='Whether to use timestep transformation'
|
||||
)
|
||||
warmup_steps: Optional[int] = Field(
|
||||
24, description='Number of warmup steps (calculated based on num_frames)'
|
||||
)
|
||||
|
||||
|
||||
class ControlType(str, Enum):
|
||||
motion_control = 'motion_control'
|
||||
pose_control = 'pose_control'
|
||||
|
||||
|
||||
class MoonvalleyVideoToVideoRequest(BaseModel):
|
||||
control_type: ControlType = Field(
|
||||
..., description='Supported types for video control'
|
||||
)
|
||||
inference_params: Optional[MoonvalleyVideoToVideoInferenceParams] = None
|
||||
prompt_text: str = Field(..., description='Describes the video to generate')
|
||||
video_url: str = Field(..., description='Url to control video')
|
||||
webhook_url: Optional[str] = Field(
|
||||
None, description='Optional webhook URL for notifications'
|
||||
)
|
||||
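A minimal sketch of a Moonvalley text-to-video request; every field in these models is optional, so only the interesting ones are set (values illustrative, pydantic v2 assumed):

```python
from comfy_api_nodes.apis.moonvalley import (
    MoonvalleyTextToVideoInferenceParams,
    MoonvalleyTextToVideoRequest,
)

params = MoonvalleyTextToVideoInferenceParams(
    width=1920,
    height=1080,
    num_frames=64,
    steps=80,
    negative_prompt="low quality, artifacts",
)
request = MoonvalleyTextToVideoRequest(
    prompt_text="A slow aerial shot drifting over a foggy pine forest at sunrise",
    inference_params=params,
)
payload = request.model_dump(exclude_none=True)
```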
170
comfy_api_nodes/apis/openai.py
Normal file
@@ -0,0 +1,170 @@
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
|
||||
class Datum2(BaseModel):
|
||||
b64_json: str | None = Field(None, description="Base64 encoded image data")
|
||||
revised_prompt: str | None = Field(None, description="Revised prompt")
|
||||
url: str | None = Field(None, description="URL of the image")
|
||||
|
||||
|
||||
class InputTokensDetails(BaseModel):
|
||||
image_tokens: int | None = Field(None)
|
||||
text_tokens: int | None = Field(None)
|
||||
|
||||
|
||||
class Usage(BaseModel):
|
||||
input_tokens: int | None = Field(None)
|
||||
input_tokens_details: InputTokensDetails | None = Field(None)
|
||||
output_tokens: int | None = Field(None)
|
||||
total_tokens: int | None = Field(None)
|
||||
|
||||
|
||||
class OpenAIImageGenerationResponse(BaseModel):
|
||||
data: list[Datum2] | None = Field(None)
|
||||
usage: Usage | None = Field(None)
|
||||
|
||||
|
||||
class OpenAIImageEditRequest(BaseModel):
|
||||
background: str | None = Field(None, description="Background transparency")
|
||||
model: str = Field(...)
|
||||
moderation: str | None = Field(None)
|
||||
n: int | None = Field(None, description="The number of images to generate")
|
||||
output_compression: int | None = Field(None, description="Compression level for JPEG or WebP (0-100)")
|
||||
output_format: str | None = Field(None)
|
||||
prompt: str = Field(...)
|
||||
quality: str | None = Field(None, description="The quality of the generated image")
|
||||
size: str | None = Field(None, description="Size of the output image")
|
||||
|
||||
|
||||
class OpenAIImageGenerationRequest(BaseModel):
|
||||
background: str | None = Field(None, description="Background transparency")
|
||||
model: str | None = Field(None)
|
||||
moderation: str | None = Field(None)
|
||||
n: int | None = Field(
|
||||
None,
|
||||
description="The number of images to generate.",
|
||||
)
|
||||
output_compression: int | None = Field(None, description="Compression level for JPEG or WebP (0-100)")
|
||||
output_format: str | None = Field(None)
|
||||
prompt: str = Field(...)
|
||||
quality: str | None = Field(None, description="The quality of the generated image")
|
||||
size: str | None = Field(None, description="Size of the image (e.g., 1024x1024, 1536x1024, auto)")
|
||||
style: str | None = Field(None, description="Style of the image (only for dall-e-3)")
|
||||
|
||||
|
||||
class ModelResponseProperties(BaseModel):
|
||||
instructions: str | None = Field(None)
|
||||
max_output_tokens: int | None = Field(None)
|
||||
model: str | None = Field(None)
|
||||
temperature: float | None = Field(1, description="Controls randomness in the response", ge=0.0, le=2.0)
|
||||
top_p: float | None = Field(
|
||||
1,
|
||||
description="Controls diversity of the response via nucleus sampling",
|
||||
ge=0.0,
|
||||
le=1.0,
|
||||
)
|
||||
truncation: str | None = Field("disabled", description="Allowed values: 'auto' or 'disabled'")
|
||||
|
||||
|
||||
class ResponseProperties(BaseModel):
|
||||
instructions: str | None = Field(None)
|
||||
max_output_tokens: int | None = Field(None)
|
||||
model: str | None = Field(None)
|
||||
previous_response_id: str | None = Field(None)
|
||||
truncation: str | None = Field("disabled", description="Allowed values: 'auto' or 'disabled'")
|
||||
|
||||
|
||||
class ResponseError(BaseModel):
|
||||
code: str = Field(...)
|
||||
message: str = Field(...)
|
||||
|
||||
|
||||
class OutputTokensDetails(BaseModel):
|
||||
reasoning_tokens: int = Field(..., description="The number of reasoning tokens.")
|
||||
|
||||
|
||||
class CachedTokensDetails(BaseModel):
|
||||
cached_tokens: int = Field(
|
||||
...,
|
||||
description="The number of tokens that were retrieved from the cache.",
|
||||
)
|
||||
|
||||
|
||||
class ResponseUsage(BaseModel):
|
||||
input_tokens: int = Field(..., description="The number of input tokens.")
|
||||
input_tokens_details: CachedTokensDetails = Field(...)
|
||||
output_tokens: int = Field(..., description="The number of output tokens.")
|
||||
output_tokens_details: OutputTokensDetails = Field(...)
|
||||
total_tokens: int = Field(..., description="The total number of tokens used.")
|
||||
|
||||
|
||||
class InputTextContent(BaseModel):
|
||||
text: str = Field(..., description="The text input to the model.")
|
||||
type: str = Field("input_text")
|
||||
|
||||
|
||||
class OutputContent(BaseModel):
|
||||
type: str = Field(..., description="The type of output content")
|
||||
text: str | None = Field(None, description="The text content")
|
||||
data: str | None = Field(None, description="Base64-encoded audio data")
|
||||
transcript: str | None = Field(None, description="Transcript of the audio")
|
||||
|
||||
|
||||
class OutputMessage(BaseModel):
|
||||
type: str = Field(..., description="The type of output item")
|
||||
content: list[OutputContent] | None = Field(None, description="The content of the message")
|
||||
role: str | None = Field(None, description="The role of the message")
|
||||
|
||||
|
||||
class OpenAIResponse(ModelResponseProperties, ResponseProperties):
|
||||
created_at: float | None = Field(
|
||||
None,
|
||||
description="Unix timestamp (in seconds) of when this Response was created.",
|
||||
)
|
||||
error: ResponseError | None = Field(None)
|
||||
id: str | None = Field(None, description="Unique identifier for this Response.")
|
||||
object: str | None = Field(None, description="The object type of this resource - always set to `response`.")
|
||||
output: list[OutputMessage] | None = Field(None)
|
||||
parallel_tool_calls: bool | None = Field(True)
|
||||
status: str | None = Field(
|
||||
None,
|
||||
description="One of `completed`, `failed`, `in_progress`, or `incomplete`.",
|
||||
)
|
||||
usage: ResponseUsage | None = Field(None)
|
||||
|
||||
|
||||
class InputImageContent(BaseModel):
|
||||
detail: str = Field(..., description="One of `high`, `low`, or `auto`. Defaults to `auto`.")
|
||||
file_id: str | None = Field(None)
|
||||
image_url: str | None = Field(None)
|
||||
type: str = Field(..., description="The type of the input item. Always `input_image`.")
|
||||
|
||||
|
||||
class InputFileContent(BaseModel):
|
||||
file_data: str | None = Field(None)
|
||||
file_id: str | None = Field(None)
|
||||
filename: str | None = Field(None, description="The name of the file to be sent to the model.")
|
||||
type: str = Field(..., description="The type of the input item. Always `input_file`.")
|
||||
|
||||
|
||||
class InputMessage(BaseModel):
|
||||
content: list[InputTextContent | InputImageContent | InputFileContent] = Field(
|
||||
...,
|
||||
description="A list of one or many input items to the model, containing different content types.",
|
||||
)
|
||||
role: str | None = Field(None)
|
||||
type: str | None = Field(None)
|
||||
|
||||
|
||||
class OpenAICreateResponse(ModelResponseProperties, ResponseProperties):
|
||||
include: str | None = Field(None)
|
||||
input: list[InputMessage] = Field(...)
|
||||
parallel_tool_calls: bool | None = Field(
|
||||
True, description="Whether to allow the model to run tool calls in parallel."
|
||||
)
|
||||
store: bool | None = Field(
|
||||
True,
|
||||
description="Whether to store the generated model response for later retrieval via API.",
|
||||
)
|
||||
stream: bool | None = Field(False)
|
||||
usage: ResponseUsage | None = Field(None)
|
||||
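A minimal sketch of the image-generation request/response pair from this module; the model name and the JSON payload are illustrative, and pydantic v2's model_dump/model_validate are assumed:

```python
from comfy_api_nodes.apis.openai import (
    OpenAIImageGenerationRequest,
    OpenAIImageGenerationResponse,
)

request = OpenAIImageGenerationRequest(
    model="gpt-image-1",   # illustrative; the schema leaves the model unset by default
    prompt="A watercolor fox in a snowy birch forest",
    n=1,
    size="1024x1024",
)
payload = request.model_dump(exclude_none=True)

# Parsing a hypothetical API reply; data and usage are both optional on the response.
response = OpenAIImageGenerationResponse.model_validate(
    {"data": [{"url": "https://example.com/out.png"}], "usage": None}
)
first_url = response.data[0].url if response.data else None
```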
@@ -1,52 +0,0 @@
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
|
||||
class Datum2(BaseModel):
|
||||
b64_json: str | None = Field(None, description="Base64 encoded image data")
|
||||
revised_prompt: str | None = Field(None, description="Revised prompt")
|
||||
url: str | None = Field(None, description="URL of the image")
|
||||
|
||||
|
||||
class InputTokensDetails(BaseModel):
|
||||
image_tokens: int | None = None
|
||||
text_tokens: int | None = None
|
||||
|
||||
|
||||
class Usage(BaseModel):
|
||||
input_tokens: int | None = None
|
||||
input_tokens_details: InputTokensDetails | None = None
|
||||
output_tokens: int | None = None
|
||||
total_tokens: int | None = None
|
||||
|
||||
|
||||
class OpenAIImageGenerationResponse(BaseModel):
|
||||
data: list[Datum2] | None = None
|
||||
usage: Usage | None = None
|
||||
|
||||
|
||||
class OpenAIImageEditRequest(BaseModel):
|
||||
background: str | None = Field(None, description="Background transparency")
|
||||
model: str = Field(...)
|
||||
moderation: str | None = Field(None)
|
||||
n: int | None = Field(None, description="The number of images to generate")
|
||||
output_compression: int | None = Field(None, description="Compression level for JPEG or WebP (0-100)")
|
||||
output_format: str | None = Field(None)
|
||||
prompt: str = Field(...)
|
||||
quality: str | None = Field(None, description="Size of the image (e.g., 1024x1024, 1536x1024, auto)")
|
||||
size: str | None = Field(None, description="Size of the output image")
|
||||
|
||||
|
||||
class OpenAIImageGenerationRequest(BaseModel):
|
||||
background: str | None = Field(None, description="Background transparency")
|
||||
model: str | None = Field(None)
|
||||
moderation: str | None = Field(None)
|
||||
n: int | None = Field(
|
||||
None,
|
||||
description="The number of images to generate.",
|
||||
)
|
||||
output_compression: int | None = Field(None, description="Compression level for JPEG or WebP (0-100)")
|
||||
output_format: str | None = Field(None)
|
||||
prompt: str = Field(...)
|
||||
quality: str | None = Field(None, description="The quality of the generated image")
|
||||
size: str | None = Field(None, description="Size of the image (e.g., 1024x1024, 1536x1024, auto)")
|
||||
style: str | None = Field(None, description="Style of the image (only for dall-e-3)")
|
||||
127
comfy_api_nodes/apis/runway.py
Normal file
@@ -0,0 +1,127 @@
|
||||
from enum import Enum
|
||||
from typing import Optional, List, Union
|
||||
from datetime import datetime
|
||||
|
||||
from pydantic import BaseModel, Field, RootModel
|
||||
|
||||
|
||||
class RunwayAspectRatioEnum(str, Enum):
|
||||
field_1280_720 = '1280:720'
|
||||
field_720_1280 = '720:1280'
|
||||
field_1104_832 = '1104:832'
|
||||
field_832_1104 = '832:1104'
|
||||
field_960_960 = '960:960'
|
||||
field_1584_672 = '1584:672'
|
||||
field_1280_768 = '1280:768'
|
||||
field_768_1280 = '768:1280'
|
||||
|
||||
|
||||
class Position(str, Enum):
|
||||
first = 'first'
|
||||
last = 'last'
|
||||
|
||||
|
||||
class RunwayPromptImageDetailedObject(BaseModel):
|
||||
position: Position = Field(
|
||||
...,
|
||||
description="The position of the image in the output video. 'last' is currently supported for gen3a_turbo only.",
|
||||
)
|
||||
uri: str = Field(
|
||||
..., description='A HTTPS URL or data URI containing an encoded image.'
|
||||
)
|
||||
|
||||
|
||||
class RunwayPromptImageObject(
|
||||
RootModel[Union[str, List[RunwayPromptImageDetailedObject]]]
|
||||
):
|
||||
root: Union[str, List[RunwayPromptImageDetailedObject]] = Field(
|
||||
...,
|
||||
description='Image(s) to use for the video generation. Can be a single URI or an array of image objects with positions.',
|
||||
)
|
||||
|
||||
|
||||
class RunwayModelEnum(str, Enum):
|
||||
gen4_turbo = 'gen4_turbo'
|
||||
gen3a_turbo = 'gen3a_turbo'
|
||||
|
||||
|
||||
class RunwayDurationEnum(int, Enum):
|
||||
integer_5 = 5
|
||||
integer_10 = 10
|
||||
|
||||
|
||||
class RunwayImageToVideoRequest(BaseModel):
|
||||
duration: RunwayDurationEnum
|
||||
model: RunwayModelEnum
|
||||
promptImage: RunwayPromptImageObject
|
||||
promptText: Optional[str] = Field(
|
||||
None, description='Text prompt for the generation', max_length=1000
|
||||
)
|
||||
ratio: RunwayAspectRatioEnum
|
||||
seed: int = Field(
|
||||
..., description='Random seed for generation', ge=0, le=4294967295
|
||||
)
|
||||
|
||||
|
||||
class RunwayImageToVideoResponse(BaseModel):
|
||||
id: Optional[str] = Field(None, description='Task ID')
|
||||
|
||||
|
||||
class RunwayTaskStatusEnum(str, Enum):
|
||||
SUCCEEDED = 'SUCCEEDED'
|
||||
RUNNING = 'RUNNING'
|
||||
FAILED = 'FAILED'
|
||||
PENDING = 'PENDING'
|
||||
CANCELLED = 'CANCELLED'
|
||||
THROTTLED = 'THROTTLED'
|
||||
|
||||
|
||||
class RunwayTaskStatusResponse(BaseModel):
|
||||
createdAt: datetime = Field(..., description='Task creation timestamp')
|
||||
id: str = Field(..., description='Task ID')
|
||||
output: Optional[List[str]] = Field(None, description='Array of output video URLs')
|
||||
progress: Optional[float] = Field(
|
||||
None,
|
||||
description='Float value between 0 and 1 representing the progress of the task. Only available if status is RUNNING.',
|
||||
ge=0.0,
|
||||
le=1.0,
|
||||
)
|
||||
status: RunwayTaskStatusEnum
|
||||
|
||||
|
||||
class Model4(str, Enum):
|
||||
gen4_image = 'gen4_image'
|
||||
|
||||
|
||||
class ReferenceImage(BaseModel):
|
||||
uri: Optional[str] = Field(
|
||||
None, description='A HTTPS URL or data URI containing an encoded image'
|
||||
)
|
||||
|
||||
|
||||
class RunwayTextToImageAspectRatioEnum(str, Enum):
|
||||
field_1920_1080 = '1920:1080'
|
||||
field_1080_1920 = '1080:1920'
|
||||
field_1024_1024 = '1024:1024'
|
||||
field_1360_768 = '1360:768'
|
||||
field_1080_1080 = '1080:1080'
|
||||
field_1168_880 = '1168:880'
|
||||
field_1440_1080 = '1440:1080'
|
||||
field_1080_1440 = '1080:1440'
|
||||
field_1808_768 = '1808:768'
|
||||
field_2112_912 = '2112:912'
|
||||
|
||||
|
||||
class RunwayTextToImageRequest(BaseModel):
|
||||
model: Model4 = Field(..., description='Model to use for generation')
|
||||
promptText: str = Field(
|
||||
..., description='Text prompt for the image generation', max_length=1000
|
||||
)
|
||||
ratio: RunwayTextToImageAspectRatioEnum
|
||||
referenceImages: Optional[List[ReferenceImage]] = Field(
|
||||
None, description='Array of reference images to guide the generation'
|
||||
)
|
||||
|
||||
|
||||
class RunwayTextToImageResponse(BaseModel):
|
||||
id: Optional[str] = Field(None, description='Task ID')
|
||||
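A minimal sketch of an image-to-video request from these Runway models; the prompt image is given as a plain URI (the RootModel also accepts a list of positioned images), and the values are illustrative:

```python
from comfy_api_nodes.apis.runway import (
    RunwayAspectRatioEnum,
    RunwayDurationEnum,
    RunwayImageToVideoRequest,
    RunwayModelEnum,
)

request = RunwayImageToVideoRequest(
    duration=RunwayDurationEnum.integer_5,
    model=RunwayModelEnum.gen4_turbo,
    promptImage="https://example.com/first_frame.png",  # validated into RunwayPromptImageObject
    promptText="Camera slowly pushes in, soft morning light",
    ratio=RunwayAspectRatioEnum.field_1280_720,
    seed=123456,
)
payload = request.model_dump(exclude_none=True)
```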
@@ -41,7 +41,7 @@ class Resolution(BaseModel):
    height: int = Field(...)


class CreateCreateVideoRequestSource(BaseModel):
class CreateVideoRequestSource(BaseModel):
    container: str = Field(...)
    size: int = Field(..., description="Size of the video file in bytes")
    duration: int = Field(..., description="Duration of the video file in seconds")
@@ -89,7 +89,7 @@ class Overrides(BaseModel):


class CreateVideoRequest(BaseModel):
    source: CreateCreateVideoRequestSource = Field(...)
    source: CreateVideoRequestSource = Field(...)
    filters: list[Union[VideoFrameInterpolationFilter, VideoEnhancementFilter]] = Field(...)
    output: OutputInformationVideo = Field(...)
    overrides: Overrides = Field(Overrides(isPaidDiffusion=True))
35
comfy_api_nodes/apis/wavespeed.py
Normal file
@@ -0,0 +1,35 @@
from pydantic import BaseModel, Field


class SeedVR2ImageRequest(BaseModel):
    image: str = Field(...)
    target_resolution: str = Field(...)
    output_format: str = Field("png")
    enable_sync_mode: bool = Field(False)


class FlashVSRRequest(BaseModel):
    target_resolution: str = Field(...)
    video: str = Field(...)
    duration: float = Field(...)


class TaskCreatedDataResponse(BaseModel):
    id: str = Field(...)


class TaskCreatedResponse(BaseModel):
    code: int = Field(...)
    message: str = Field(...)
    data: TaskCreatedDataResponse | None = Field(None)


class TaskResultDataResponse(BaseModel):
    status: str = Field(...)
    outputs: list[str] = Field([])


class TaskResultResponse(BaseModel):
    code: int = Field(...)
    message: str = Field(...)
    data: TaskResultDataResponse | None = Field(None)
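A minimal sketch of the WaveSpeed flow these models imply: create a task, then poll until the result data reports completion. Endpoint wiring, allowed resolution strings, and the exact status values are assumptions not shown in this excerpt:

```python
from comfy_api_nodes.apis.wavespeed import SeedVR2ImageRequest, TaskResultResponse

request = SeedVR2ImageRequest(
    image="https://example.com/low_res.png",  # illustrative input reference
    target_resolution="2k",                   # illustrative; allowed values are not listed here
)
payload = request.model_dump()

# Parsing a hypothetical polling reply:
result = TaskResultResponse.model_validate(
    {"code": 200, "message": "ok",
     "data": {"status": "completed", "outputs": ["https://example.com/upscaled.png"]}}
)
finished = result.data is not None and result.data.status == "completed"
outputs = result.data.outputs if finished else []
```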
@@ -1,10 +0,0 @@
import av

ver = av.__version__.split(".")
if int(ver[0]) < 14:
    raise Exception("INSTALL NEW VERSION OF PYAV TO USE API NODES.")

if int(ver[0]) == 14 and int(ver[1]) < 2:
    raise Exception("INSTALL NEW VERSION OF PYAV TO USE API NODES.")

NODE_CLASS_MAPPINGS = {}
@@ -1,116 +0,0 @@
|
||||
from enum import Enum
|
||||
|
||||
from pydantic.fields import FieldInfo
|
||||
from pydantic import BaseModel
|
||||
from pydantic_core import PydanticUndefined
|
||||
|
||||
from comfy.comfy_types.node_typing import IO, InputTypeOptions
|
||||
|
||||
NodeInput = tuple[IO, InputTypeOptions]
|
||||
|
||||
|
||||
def _create_base_config(field_info: FieldInfo) -> InputTypeOptions:
|
||||
config = {}
|
||||
if hasattr(field_info, "default") and field_info.default is not PydanticUndefined:
|
||||
config["default"] = field_info.default
|
||||
if hasattr(field_info, "description") and field_info.description is not None:
|
||||
config["tooltip"] = field_info.description
|
||||
return config
|
||||
|
||||
|
||||
def _get_number_constraints_config(field_info: FieldInfo) -> dict:
|
||||
config = {}
|
||||
if hasattr(field_info, "metadata"):
|
||||
metadata = field_info.metadata
|
||||
for constraint in metadata:
|
||||
if hasattr(constraint, "ge"):
|
||||
config["min"] = constraint.ge
|
||||
if hasattr(constraint, "le"):
|
||||
config["max"] = constraint.le
|
||||
if hasattr(constraint, "multiple_of"):
|
||||
config["step"] = constraint.multiple_of
|
||||
return config
|
||||
|
||||
|
||||
def _model_field_to_image_input(field_info: FieldInfo, **kwargs) -> NodeInput:
|
||||
return IO.IMAGE, {
|
||||
**_create_base_config(field_info),
|
||||
**kwargs,
|
||||
}
|
||||
|
||||
|
||||
def _model_field_to_string_input(field_info: FieldInfo, **kwargs) -> NodeInput:
|
||||
return IO.STRING, {
|
||||
**_create_base_config(field_info),
|
||||
**kwargs,
|
||||
}
|
||||
|
||||
|
||||
def _model_field_to_float_input(field_info: FieldInfo, **kwargs) -> NodeInput:
|
||||
return IO.FLOAT, {
|
||||
**_create_base_config(field_info),
|
||||
**_get_number_constraints_config(field_info),
|
||||
**kwargs,
|
||||
}
|
||||
|
||||
|
||||
def _model_field_to_int_input(field_info: FieldInfo, **kwargs) -> NodeInput:
|
||||
return IO.INT, {
|
||||
**_create_base_config(field_info),
|
||||
**_get_number_constraints_config(field_info),
|
||||
**kwargs,
|
||||
}
|
||||
|
||||
|
||||
def _model_field_to_combo_input(
|
||||
field_info: FieldInfo, enum_type: type[Enum] = None, **kwargs
|
||||
) -> NodeInput:
|
||||
combo_config = {}
|
||||
if enum_type is not None:
|
||||
combo_config["options"] = [option.value for option in enum_type]
|
||||
combo_config = {
|
||||
**combo_config,
|
||||
**_create_base_config(field_info),
|
||||
**kwargs,
|
||||
}
|
||||
return IO.COMBO, combo_config
|
||||
|
||||
|
||||
def model_field_to_node_input(
|
||||
input_type: IO, base_model: type[BaseModel], field_name: str, **kwargs
|
||||
) -> NodeInput:
|
||||
"""
|
||||
Maps a field from a Pydantic model to a Comfy node input.
|
||||
|
||||
Args:
|
||||
input_type: The type of the input.
|
||||
base_model: The Pydantic model to map the field from.
|
||||
field_name: The name of the field to map.
|
||||
**kwargs: Additional key/values to include in the input options.
|
||||
|
||||
Note:
|
||||
For combo inputs, pass an `Enum` to the `enum_type` keyword argument to populate the options automatically.
|
||||
|
||||
Example:
|
||||
>>> model_field_to_node_input(IO.STRING, MyModel, "my_field", multiline=True)
|
||||
>>> model_field_to_node_input(IO.COMBO, MyModel, "my_field", enum_type=MyEnum)
|
||||
>>> model_field_to_node_input(IO.FLOAT, MyModel, "my_field", slider=True)
|
||||
"""
|
||||
field_info: FieldInfo = base_model.model_fields[field_name]
|
||||
result: NodeInput
|
||||
|
||||
if input_type == IO.IMAGE:
|
||||
result = _model_field_to_image_input(field_info, **kwargs)
|
||||
elif input_type == IO.STRING:
|
||||
result = _model_field_to_string_input(field_info, **kwargs)
|
||||
elif input_type == IO.FLOAT:
|
||||
result = _model_field_to_float_input(field_info, **kwargs)
|
||||
elif input_type == IO.INT:
|
||||
result = _model_field_to_int_input(field_info, **kwargs)
|
||||
elif input_type == IO.COMBO:
|
||||
result = _model_field_to_combo_input(field_info, **kwargs)
|
||||
else:
|
||||
message = f"Invalid input type: {input_type}"
|
||||
raise ValueError(message)
|
||||
|
||||
return result
|
||||
@ -3,7 +3,7 @@ from pydantic import BaseModel
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis.bfl_api import (
|
||||
from comfy_api_nodes.apis.bfl import (
|
||||
BFLFluxExpandImageRequest,
|
||||
BFLFluxFillImageRequest,
|
||||
BFLFluxKontextProGenerateRequest,
|
||||
|
||||
198
comfy_api_nodes/nodes_bria.py
Normal file
198
comfy_api_nodes/nodes_bria.py
Normal file
@ -0,0 +1,198 @@
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis.bria import (
|
||||
BriaEditImageRequest,
|
||||
BriaResponse,
|
||||
BriaStatusResponse,
|
||||
InputModerationSettings,
|
||||
)
|
||||
from comfy_api_nodes.util import (
|
||||
ApiEndpoint,
|
||||
convert_mask_to_image,
|
||||
download_url_to_image_tensor,
|
||||
get_number_of_images,
|
||||
poll_op,
|
||||
sync_op,
|
||||
upload_images_to_comfyapi,
|
||||
)
|
||||
|
||||
|
||||
class BriaImageEditNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="BriaImageEditNode",
|
||||
display_name="Bria Image Edit",
|
||||
category="api node/image/Bria",
|
||||
description="Edit images using Bria latest model",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["FIBO"]),
|
||||
IO.Image.Input("image"),
|
||||
IO.String.Input(
|
||||
"prompt",
|
||||
multiline=True,
|
||||
default="",
|
||||
tooltip="Instruction to edit image",
|
||||
),
|
||||
IO.String.Input("negative_prompt", multiline=True, default=""),
|
||||
IO.String.Input(
|
||||
"structured_prompt",
|
||||
multiline=True,
|
||||
default="",
|
||||
tooltip="A string containing the structured edit prompt in JSON format. "
|
||||
"Use this instead of usual prompt for precise, programmatic control.",
|
||||
),
|
||||
IO.Int.Input(
|
||||
"seed",
|
||||
default=1,
|
||||
min=1,
|
||||
max=2147483647,
|
||||
step=1,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
),
|
||||
IO.Float.Input(
|
||||
"guidance_scale",
|
||||
default=3,
|
||||
min=3,
|
||||
max=5,
|
||||
step=0.01,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
tooltip="Higher value makes the image follow the prompt more closely.",
|
||||
),
|
||||
IO.Int.Input(
|
||||
"steps",
|
||||
default=50,
|
||||
min=20,
|
||||
max=50,
|
||||
step=1,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
),
|
||||
IO.DynamicCombo.Input(
|
||||
"moderation",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Boolean.Input(
|
||||
"prompt_content_moderation", default=False
|
||||
),
|
||||
IO.Boolean.Input(
|
||||
"visual_input_moderation", default=False
|
||||
),
|
||||
IO.Boolean.Input(
|
||||
"visual_output_moderation", default=True
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="Moderation settings",
|
||||
),
|
||||
IO.Mask.Input(
|
||||
"mask",
|
||||
tooltip="If omitted, the edit applies to the entire image.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.Image.Output(),
|
||||
IO.String.Output(display_name="structured_prompt"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.04}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
image: Input.Image,
|
||||
prompt: str,
|
||||
negative_prompt: str,
|
||||
structured_prompt: str,
|
||||
seed: int,
|
||||
guidance_scale: float,
|
||||
steps: int,
|
||||
moderation: InputModerationSettings,
|
||||
mask: Input.Image | None = None,
|
||||
) -> IO.NodeOutput:
|
||||
if not prompt and not structured_prompt:
|
||||
raise ValueError(
|
||||
"One of prompt or structured_prompt is required to be non-empty."
|
||||
)
|
||||
if get_number_of_images(image) != 1:
|
||||
raise ValueError("Exactly one input image is required.")
|
||||
mask_url = None
|
||||
if mask is not None:
|
||||
mask_url = (
|
||||
await upload_images_to_comfyapi(
|
||||
cls,
|
||||
convert_mask_to_image(mask),
|
||||
max_images=1,
|
||||
mime_type="image/png",
|
||||
wait_label="Uploading mask",
|
||||
)
|
||||
)[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
ApiEndpoint(path="proxy/bria/v2/image/edit", method="POST"),
|
||||
data=BriaEditImageRequest(
|
||||
instruction=prompt if prompt else None,
|
||||
structured_instruction=structured_prompt if structured_prompt else None,
|
||||
images=await upload_images_to_comfyapi(
|
||||
cls,
|
||||
image,
|
||||
max_images=1,
|
||||
mime_type="image/png",
|
||||
wait_label="Uploading image",
|
||||
),
|
||||
mask=mask_url,
|
||||
negative_prompt=negative_prompt if negative_prompt else None,
|
||||
guidance_scale=guidance_scale,
|
||||
seed=seed,
|
||||
model_version=model,
|
||||
steps_num=steps,
|
||||
prompt_content_moderation=moderation.get(
|
||||
"prompt_content_moderation", False
|
||||
),
|
||||
visual_input_content_moderation=moderation.get(
|
||||
"visual_input_moderation", False
|
||||
),
|
||||
visual_output_content_moderation=moderation.get(
|
||||
"visual_output_moderation", False
|
||||
),
|
||||
),
|
||||
response_model=BriaStatusResponse,
|
||||
)
|
||||
response = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/bria/v2/status/{response.request_id}"),
|
||||
status_extractor=lambda r: r.status,
|
||||
response_model=BriaResponse,
|
||||
)
|
||||
return IO.NodeOutput(
|
||||
await download_url_to_image_tensor(response.result.image_url),
|
||||
response.result.structured_prompt,
|
||||
)
|
||||
|
||||
|
||||
class BriaExtension(ComfyExtension):
|
||||
@override
|
||||
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
|
||||
return [
|
||||
BriaImageEditNode,
|
||||
]
|
||||
|
||||
|
||||
async def comfy_entrypoint() -> BriaExtension:
|
||||
return BriaExtension()
|
||||
@ -5,7 +5,7 @@ import torch
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis.bytedance_api import (
|
||||
from comfy_api_nodes.apis.bytedance import (
|
||||
RECOMMENDED_PRESETS,
|
||||
RECOMMENDED_PRESETS_SEEDREAM_4,
|
||||
VIDEO_TASKS_EXECUTION_TIME,
|
||||
@ -477,7 +477,12 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
|
||||
inputs=[
|
||||
IO.Combo.Input(
|
||||
"model",
|
||||
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-t2v-250428", "seedance-1-0-pro-fast-251015"],
|
||||
options=[
|
||||
"seedance-1-5-pro-251215",
|
||||
"seedance-1-0-pro-250528",
|
||||
"seedance-1-0-lite-t2v-250428",
|
||||
"seedance-1-0-pro-fast-251015",
|
||||
],
|
||||
default="seedance-1-0-pro-fast-251015",
|
||||
),
|
||||
IO.String.Input(
|
||||
@ -528,6 +533,12 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
|
||||
tooltip='Whether to add an "AI generated" watermark to the video.',
|
||||
optional=True,
|
||||
),
|
||||
IO.Boolean.Input(
|
||||
"generate_audio",
|
||||
default=False,
|
||||
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.Video.Output(),
|
||||
@ -552,7 +563,10 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
|
||||
seed: int,
|
||||
camera_fixed: bool,
|
||||
watermark: bool,
|
||||
generate_audio: bool = False,
|
||||
) -> IO.NodeOutput:
|
||||
if model == "seedance-1-5-pro-251215" and duration < 4:
|
||||
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
|
||||
validate_string(prompt, strip_whitespace=True, min_length=1)
|
||||
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
|
||||
|
||||
@ -567,7 +581,11 @@ class ByteDanceTextToVideoNode(IO.ComfyNode):
|
||||
)
|
||||
return await process_video_task(
|
||||
cls,
|
||||
payload=Text2VideoTaskCreationRequest(model=model, content=[TaskTextContent(text=prompt)]),
|
||||
payload=Text2VideoTaskCreationRequest(
|
||||
model=model,
|
||||
content=[TaskTextContent(text=prompt)],
|
||||
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
|
||||
),
|
||||
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
|
||||
)
|
||||
|
||||
@ -584,7 +602,12 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
|
||||
inputs=[
|
||||
IO.Combo.Input(
|
||||
"model",
|
||||
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-t2v-250428", "seedance-1-0-pro-fast-251015"],
|
||||
options=[
|
||||
"seedance-1-5-pro-251215",
|
||||
"seedance-1-0-pro-250528",
|
||||
"seedance-1-0-lite-i2v-250428",
|
||||
"seedance-1-0-pro-fast-251015",
|
||||
],
|
||||
default="seedance-1-0-pro-fast-251015",
|
||||
),
|
||||
IO.String.Input(
|
||||
@ -639,6 +662,12 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
|
||||
tooltip='Whether to add an "AI generated" watermark to the video.',
|
||||
optional=True,
|
||||
),
|
||||
IO.Boolean.Input(
|
||||
"generate_audio",
|
||||
default=False,
|
||||
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.Video.Output(),
|
||||
@ -664,7 +693,10 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
|
||||
seed: int,
|
||||
camera_fixed: bool,
|
||||
watermark: bool,
|
||||
generate_audio: bool = False,
|
||||
) -> IO.NodeOutput:
|
||||
if model == "seedance-1-5-pro-251215" and duration < 4:
|
||||
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
|
||||
validate_string(prompt, strip_whitespace=True, min_length=1)
|
||||
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
|
||||
validate_image_dimensions(image, min_width=300, min_height=300, max_width=6000, max_height=6000)
|
||||
@ -686,6 +718,7 @@ class ByteDanceImageToVideoNode(IO.ComfyNode):
|
||||
payload=Image2VideoTaskCreationRequest(
|
||||
model=model,
|
||||
content=[TaskTextContent(text=prompt), TaskImageContent(image_url=TaskImageContentUrl(url=image_url))],
|
||||
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
|
||||
),
|
||||
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
|
||||
)
|
||||
@ -703,7 +736,7 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
|
||||
inputs=[
|
||||
IO.Combo.Input(
|
||||
"model",
|
||||
options=["seedance-1-0-pro-250528", "seedance-1-0-lite-i2v-250428"],
|
||||
options=["seedance-1-5-pro-251215", "seedance-1-0-pro-250528", "seedance-1-0-lite-i2v-250428"],
|
||||
default="seedance-1-0-lite-i2v-250428",
|
||||
),
|
||||
IO.String.Input(
|
||||
@ -762,6 +795,12 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
|
||||
tooltip='Whether to add an "AI generated" watermark to the video.',
|
||||
optional=True,
|
||||
),
|
||||
IO.Boolean.Input(
|
||||
"generate_audio",
|
||||
default=False,
|
||||
tooltip="This parameter is ignored for any model except seedance-1-5-pro.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.Video.Output(),
|
||||
@ -788,7 +827,10 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
|
||||
seed: int,
|
||||
camera_fixed: bool,
|
||||
watermark: bool,
|
||||
generate_audio: bool = False,
|
||||
) -> IO.NodeOutput:
|
||||
if model == "seedance-1-5-pro-251215" and duration < 4:
|
||||
raise ValueError("Minimum supported duration for Seedance 1.5 Pro is 4 seconds.")
|
||||
validate_string(prompt, strip_whitespace=True, min_length=1)
|
||||
raise_if_text_params(prompt, ["resolution", "ratio", "duration", "seed", "camerafixed", "watermark"])
|
||||
for i in (first_frame, last_frame):
|
||||
@ -821,6 +863,7 @@ class ByteDanceFirstLastFrameNode(IO.ComfyNode):
|
||||
TaskImageContent(image_url=TaskImageContentUrl(url=str(download_urls[0])), role="first_frame"),
|
||||
TaskImageContent(image_url=TaskImageContentUrl(url=str(download_urls[1])), role="last_frame"),
|
||||
],
|
||||
generate_audio=generate_audio if model == "seedance-1-5-pro-251215" else None,
|
||||
),
|
||||
estimated_duration=max(1, math.ceil(VIDEO_TASKS_EXECUTION_TIME[model][resolution] * (duration / 10.0))),
|
||||
)
|
||||
@ -896,7 +939,41 @@ class ByteDanceImageReferenceNode(IO.ComfyNode):
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
price_badge=PRICE_BADGE_VIDEO,
|
||||
price_badge=IO.PriceBadge(
|
||||
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
|
||||
expr="""
|
||||
(
|
||||
$priceByModel := {
|
||||
"seedance-1-0-pro": {
|
||||
"480p":[0.23,0.24],
|
||||
"720p":[0.51,0.56]
|
||||
},
|
||||
"seedance-1-0-lite": {
|
||||
"480p":[0.17,0.18],
|
||||
"720p":[0.37,0.41]
|
||||
}
|
||||
};
|
||||
$model := widgets.model;
|
||||
$modelKey :=
|
||||
$contains($model, "seedance-1-0-pro") ? "seedance-1-0-pro" :
|
||||
"seedance-1-0-lite";
|
||||
$resolution := widgets.resolution;
|
||||
$resKey :=
|
||||
$contains($resolution, "720") ? "720p" :
|
||||
"480p";
|
||||
$modelPrices := $lookup($priceByModel, $modelKey);
|
||||
$baseRange := $lookup($modelPrices, $resKey);
|
||||
$min10s := $baseRange[0];
|
||||
$max10s := $baseRange[1];
|
||||
$scale := widgets.duration / 10;
|
||||
$minCost := $min10s * $scale;
|
||||
$maxCost := $max10s * $scale;
|
||||
($minCost = $maxCost)
|
||||
? {"type":"usd","usd": $minCost}
|
||||
: {"type":"range_usd","min_usd": $minCost, "max_usd": $maxCost}
|
||||
)
|
||||
""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
@ -967,10 +1044,15 @@ def raise_if_text_params(prompt: str, text_params: list[str]) -> None:
|
||||
|
||||
|
||||
PRICE_BADGE_VIDEO = IO.PriceBadge(
|
||||
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution"]),
|
||||
depends_on=IO.PriceBadgeDepends(widgets=["model", "duration", "resolution", "generate_audio"]),
|
||||
expr="""
|
||||
(
|
||||
$priceByModel := {
|
||||
"seedance-1-5-pro": {
|
||||
"480p":[0.12,0.12],
|
||||
"720p":[0.26,0.26],
|
||||
"1080p":[0.58,0.59]
|
||||
},
|
||||
"seedance-1-0-pro": {
|
||||
"480p":[0.23,0.24],
|
||||
"720p":[0.51,0.56],
|
||||
@ -989,6 +1071,7 @@ PRICE_BADGE_VIDEO = IO.PriceBadge(
|
||||
};
|
||||
$model := widgets.model;
|
||||
$modelKey :=
|
||||
$contains($model, "seedance-1-5-pro") ? "seedance-1-5-pro" :
|
||||
$contains($model, "seedance-1-0-pro-fast") ? "seedance-1-0-pro-fast" :
|
||||
$contains($model, "seedance-1-0-pro") ? "seedance-1-0-pro" :
|
||||
"seedance-1-0-lite";
|
||||
@ -1002,11 +1085,12 @@ PRICE_BADGE_VIDEO = IO.PriceBadge(
|
||||
$min10s := $baseRange[0];
|
||||
$max10s := $baseRange[1];
|
||||
$scale := widgets.duration / 10;
|
||||
$minCost := $min10s * $scale;
|
||||
$maxCost := $max10s * $scale;
|
||||
$audioMultiplier := ($modelKey = "seedance-1-5-pro" and widgets.generate_audio) ? 2 : 1;
|
||||
$minCost := $min10s * $scale * $audioMultiplier;
|
||||
$maxCost := $max10s * $scale * $audioMultiplier;
|
||||
($minCost = $maxCost)
|
||||
? {"type":"usd","usd": $minCost}
|
||||
: {"type":"range_usd","min_usd": $minCost, "max_usd": $maxCost}
|
||||
? {"type":"usd","usd": $minCost, "format": { "approximate": true }}
|
||||
: {"type":"range_usd","min_usd": $minCost, "max_usd": $maxCost, "format": { "approximate": true }}
|
||||
)
|
||||
""",
|
||||
)
|
||||
|
||||
@ -14,7 +14,7 @@ from typing_extensions import override
|
||||
|
||||
import folder_paths
|
||||
from comfy_api.latest import IO, ComfyExtension, Input, Types
|
||||
from comfy_api_nodes.apis.gemini_api import (
|
||||
from comfy_api_nodes.apis.gemini import (
|
||||
GeminiContent,
|
||||
GeminiFileData,
|
||||
GeminiGenerateContentRequest,
|
||||
|
||||
@ -4,7 +4,7 @@ from comfy_api.latest import IO, ComfyExtension
|
||||
from PIL import Image
|
||||
import numpy as np
|
||||
import torch
|
||||
from comfy_api_nodes.apis import (
|
||||
from comfy_api_nodes.apis.ideogram import (
|
||||
IdeogramGenerateRequest,
|
||||
IdeogramGenerateResponse,
|
||||
ImageRequest,
|
||||
|
||||
@ -49,7 +49,7 @@ from comfy_api_nodes.apis import (
|
||||
KlingCharacterEffectModelName,
|
||||
KlingSingleImageEffectModelName,
|
||||
)
|
||||
from comfy_api_nodes.apis.kling_api import (
|
||||
from comfy_api_nodes.apis.kling import (
|
||||
ImageToVideoWithAudioRequest,
|
||||
MotionControlRequest,
|
||||
OmniImageParamImage,
|
||||
|
||||
@ -4,7 +4,7 @@ import torch
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension
|
||||
from comfy_api_nodes.apis.luma_api import (
|
||||
from comfy_api_nodes.apis.luma import (
|
||||
LumaAspectRatio,
|
||||
LumaCharacterRef,
|
||||
LumaConceptChain,
|
||||
|
||||
790
comfy_api_nodes/nodes_meshy.py
Normal file
790
comfy_api_nodes/nodes_meshy.py
Normal file
@ -0,0 +1,790 @@
|
||||
import os
|
||||
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis.meshy import (
|
||||
InputShouldRemesh,
|
||||
InputShouldTexture,
|
||||
MeshyAnimationRequest,
|
||||
MeshyAnimationResult,
|
||||
MeshyImageToModelRequest,
|
||||
MeshyModelResult,
|
||||
MeshyMultiImageToModelRequest,
|
||||
MeshyRefineTask,
|
||||
MeshyRiggedResult,
|
||||
MeshyRiggingRequest,
|
||||
MeshyTaskResponse,
|
||||
MeshyTextToModelRequest,
|
||||
MeshyTextureRequest,
|
||||
)
|
||||
from comfy_api_nodes.util import (
|
||||
ApiEndpoint,
|
||||
download_url_to_bytesio,
|
||||
poll_op,
|
||||
sync_op,
|
||||
upload_images_to_comfyapi,
|
||||
validate_string,
|
||||
)
|
||||
from folder_paths import get_output_directory
|
||||
|
||||
|
||||
class MeshyTextToModelNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyTextToModelNode",
|
||||
display_name="Meshy: Text to Model",
|
||||
category="api node/3d/Meshy",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["latest"]),
|
||||
IO.String.Input("prompt", multiline=True, default=""),
|
||||
IO.Combo.Input("style", options=["realistic", "sculpture"]),
|
||||
IO.DynamicCombo.Input(
|
||||
"should_remesh",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Combo.Input("topology", options=["triangle", "quad"]),
|
||||
IO.Int.Input(
|
||||
"target_polycount",
|
||||
default=300000,
|
||||
min=100,
|
||||
max=300000,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="When set to false, returns an unprocessed triangular mesh.",
|
||||
),
|
||||
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
|
||||
IO.Combo.Input(
|
||||
"pose_mode",
|
||||
options=["", "A-pose", "T-pose"],
|
||||
tooltip="Specify the pose mode for the generated model.",
|
||||
),
|
||||
IO.Int.Input(
|
||||
"seed",
|
||||
default=0,
|
||||
min=0,
|
||||
max=2147483647,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
tooltip="Seed controls whether the node should re-run; "
|
||||
"results are non-deterministic regardless of seed.",
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.8}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
prompt: str,
|
||||
style: str,
|
||||
should_remesh: InputShouldRemesh,
|
||||
symmetry_mode: str,
|
||||
pose_mode: str,
|
||||
seed: int,
|
||||
) -> IO.NodeOutput:
|
||||
validate_string(prompt, field_name="prompt", min_length=1, max_length=600)
|
||||
response = await sync_op(
|
||||
cls,
|
||||
ApiEndpoint(path="/proxy/meshy/openapi/v2/text-to-3d", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyTextToModelRequest(
|
||||
prompt=prompt,
|
||||
art_style=style,
|
||||
ai_model=model,
|
||||
topology=should_remesh.get("topology", None),
|
||||
target_polycount=should_remesh.get("target_polycount", None),
|
||||
should_remesh=should_remesh["should_remesh"] == "true",
|
||||
symmetry_mode=symmetry_mode,
|
||||
pose_mode=pose_mode.lower(),
|
||||
seed=seed,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v2/text-to-3d/{response.result}"),
|
||||
response_model=MeshyModelResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyRefineNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyRefineNode",
|
||||
display_name="Meshy: Refine Draft Model",
|
||||
category="api node/3d/Meshy",
|
||||
description="Refine a previously created draft model.",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["latest"]),
|
||||
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
|
||||
IO.Boolean.Input(
|
||||
"enable_pbr",
|
||||
default=False,
|
||||
tooltip="Generate PBR Maps (metallic, roughness, normal) in addition to the base color. "
|
||||
"Note: this should be set to false when using Sculpture style, "
|
||||
"as Sculpture style generates its own set of PBR maps.",
|
||||
),
|
||||
IO.String.Input(
|
||||
"texture_prompt",
|
||||
default="",
|
||||
multiline=True,
|
||||
tooltip="Provide a text prompt to guide the texturing process. "
|
||||
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
|
||||
),
|
||||
IO.Image.Input(
|
||||
"texture_image",
|
||||
tooltip="Only one of 'texture_image' or 'texture_prompt' may be used at the same time.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.4}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
meshy_task_id: str,
|
||||
enable_pbr: bool,
|
||||
texture_prompt: str,
|
||||
texture_image: Input.Image | None = None,
|
||||
) -> IO.NodeOutput:
|
||||
if texture_prompt and texture_image is not None:
|
||||
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
|
||||
texture_image_url = None
|
||||
if texture_prompt:
|
||||
validate_string(texture_prompt, field_name="texture_prompt", max_length=600)
|
||||
if texture_image is not None:
|
||||
texture_image_url = (await upload_images_to_comfyapi(cls, texture_image, wait_label="Uploading texture"))[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v2/text-to-3d", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyRefineTask(
|
||||
preview_task_id=meshy_task_id,
|
||||
enable_pbr=enable_pbr,
|
||||
texture_prompt=texture_prompt if texture_prompt else None,
|
||||
texture_image_url=texture_image_url,
|
||||
ai_model=model,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v2/text-to-3d/{response.result}"),
|
||||
response_model=MeshyModelResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyImageToModelNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyImageToModelNode",
|
||||
display_name="Meshy: Image to Model",
|
||||
category="api node/3d/Meshy",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["latest"]),
|
||||
IO.Image.Input("image"),
|
||||
IO.DynamicCombo.Input(
|
||||
"should_remesh",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Combo.Input("topology", options=["triangle", "quad"]),
|
||||
IO.Int.Input(
|
||||
"target_polycount",
|
||||
default=300000,
|
||||
min=100,
|
||||
max=300000,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="When set to false, returns an unprocessed triangular mesh.",
|
||||
),
|
||||
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
|
||||
IO.DynamicCombo.Input(
|
||||
"should_texture",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Boolean.Input(
|
||||
"enable_pbr",
|
||||
default=False,
|
||||
tooltip="Generate PBR Maps (metallic, roughness, normal) "
|
||||
"in addition to the base color.",
|
||||
),
|
||||
IO.String.Input(
|
||||
"texture_prompt",
|
||||
default="",
|
||||
multiline=True,
|
||||
tooltip="Provide a text prompt to guide the texturing process. "
|
||||
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
|
||||
),
|
||||
IO.Image.Input(
|
||||
"texture_image",
|
||||
tooltip="Only one of 'texture_image' or 'texture_prompt' "
|
||||
"may be used at the same time.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="Determines whether textures are generated. "
|
||||
"Setting it to false skips the texture phase and returns a mesh without textures.",
|
||||
),
|
||||
IO.Combo.Input(
|
||||
"pose_mode",
|
||||
options=["", "A-pose", "T-pose"],
|
||||
tooltip="Specify the pose mode for the generated model.",
|
||||
),
|
||||
IO.Int.Input(
|
||||
"seed",
|
||||
default=0,
|
||||
min=0,
|
||||
max=2147483647,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
tooltip="Seed controls whether the node should re-run; "
|
||||
"results are non-deterministic regardless of seed.",
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
depends_on=IO.PriceBadgeDepends(widgets=["should_texture"]),
|
||||
expr="""
|
||||
(
|
||||
$prices := {"true": 1.2, "false": 0.8};
|
||||
{"type":"usd","usd": $lookup($prices, widgets.should_texture)}
|
||||
)
|
||||
""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
image: Input.Image,
|
||||
should_remesh: InputShouldRemesh,
|
||||
symmetry_mode: str,
|
||||
should_texture: InputShouldTexture,
|
||||
pose_mode: str,
|
||||
seed: int,
|
||||
) -> IO.NodeOutput:
|
||||
texture = should_texture["should_texture"] == "true"
|
||||
texture_image_url = texture_prompt = None
|
||||
if texture:
|
||||
if should_texture["texture_prompt"] and should_texture["texture_image"] is not None:
|
||||
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
|
||||
if should_texture["texture_prompt"]:
|
||||
validate_string(should_texture["texture_prompt"], field_name="texture_prompt", max_length=600)
|
||||
texture_prompt = should_texture["texture_prompt"]
|
||||
if should_texture["texture_image"] is not None:
|
||||
texture_image_url = (
|
||||
await upload_images_to_comfyapi(
|
||||
cls, should_texture["texture_image"], wait_label="Uploading texture"
|
||||
)
|
||||
)[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
ApiEndpoint(path="/proxy/meshy/openapi/v1/image-to-3d", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyImageToModelRequest(
|
||||
image_url=(await upload_images_to_comfyapi(cls, image, wait_label="Uploading base image"))[0],
|
||||
ai_model=model,
|
||||
topology=should_remesh.get("topology", None),
|
||||
target_polycount=should_remesh.get("target_polycount", None),
|
||||
symmetry_mode=symmetry_mode,
|
||||
should_remesh=should_remesh["should_remesh"] == "true",
|
||||
should_texture=texture,
|
||||
enable_pbr=should_texture.get("enable_pbr", None),
|
||||
pose_mode=pose_mode.lower(),
|
||||
texture_prompt=texture_prompt,
|
||||
texture_image_url=texture_image_url,
|
||||
seed=seed,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/image-to-3d/{response.result}"),
|
||||
response_model=MeshyModelResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyMultiImageToModelNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyMultiImageToModelNode",
|
||||
display_name="Meshy: Multi-Image to Model",
|
||||
category="api node/3d/Meshy",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["latest"]),
|
||||
IO.Autogrow.Input(
|
||||
"images",
|
||||
template=IO.Autogrow.TemplatePrefix(IO.Image.Input("image"), prefix="image", min=2, max=4),
|
||||
),
|
||||
IO.DynamicCombo.Input(
|
||||
"should_remesh",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Combo.Input("topology", options=["triangle", "quad"]),
|
||||
IO.Int.Input(
|
||||
"target_polycount",
|
||||
default=300000,
|
||||
min=100,
|
||||
max=300000,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="When set to false, returns an unprocessed triangular mesh.",
|
||||
),
|
||||
IO.Combo.Input("symmetry_mode", options=["auto", "on", "off"]),
|
||||
IO.DynamicCombo.Input(
|
||||
"should_texture",
|
||||
options=[
|
||||
IO.DynamicCombo.Option(
|
||||
"true",
|
||||
[
|
||||
IO.Boolean.Input(
|
||||
"enable_pbr",
|
||||
default=False,
|
||||
tooltip="Generate PBR Maps (metallic, roughness, normal) "
|
||||
"in addition to the base color.",
|
||||
),
|
||||
IO.String.Input(
|
||||
"texture_prompt",
|
||||
default="",
|
||||
multiline=True,
|
||||
tooltip="Provide a text prompt to guide the texturing process. "
|
||||
"Maximum 600 characters. Cannot be used at the same time as 'texture_image'.",
|
||||
),
|
||||
IO.Image.Input(
|
||||
"texture_image",
|
||||
tooltip="Only one of 'texture_image' or 'texture_prompt' "
|
||||
"may be used at the same time.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
),
|
||||
IO.DynamicCombo.Option("false", []),
|
||||
],
|
||||
tooltip="Determines whether textures are generated. "
|
||||
"Setting it to false skips the texture phase and returns a mesh without textures.",
|
||||
),
|
||||
IO.Combo.Input(
|
||||
"pose_mode",
|
||||
options=["", "A-pose", "T-pose"],
|
||||
tooltip="Specify the pose mode for the generated model.",
|
||||
),
|
||||
IO.Int.Input(
|
||||
"seed",
|
||||
default=0,
|
||||
min=0,
|
||||
max=2147483647,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
tooltip="Seed controls whether the node should re-run; "
|
||||
"results are non-deterministic regardless of seed.",
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MESHY_TASK_ID").Output(display_name="meshy_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
depends_on=IO.PriceBadgeDepends(widgets=["should_texture"]),
|
||||
expr="""
|
||||
(
|
||||
$prices := {"true": 0.6, "false": 0.2};
|
||||
{"type":"usd","usd": $lookup($prices, widgets.should_texture)}
|
||||
)
|
||||
""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
images: IO.Autogrow.Type,
|
||||
should_remesh: InputShouldRemesh,
|
||||
symmetry_mode: str,
|
||||
should_texture: InputShouldTexture,
|
||||
pose_mode: str,
|
||||
seed: int,
|
||||
) -> IO.NodeOutput:
|
||||
texture = should_texture["should_texture"] == "true"
|
||||
texture_image_url = texture_prompt = None
|
||||
if texture:
|
||||
if should_texture["texture_prompt"] and should_texture["texture_image"] is not None:
|
||||
raise ValueError("texture_prompt and texture_image cannot be used at the same time")
|
||||
if should_texture["texture_prompt"]:
|
||||
validate_string(should_texture["texture_prompt"], field_name="texture_prompt", max_length=600)
|
||||
texture_prompt = should_texture["texture_prompt"]
|
||||
if should_texture["texture_image"] is not None:
|
||||
texture_image_url = (
|
||||
await upload_images_to_comfyapi(
|
||||
cls, should_texture["texture_image"], wait_label="Uploading texture"
|
||||
)
|
||||
)[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
ApiEndpoint(path="/proxy/meshy/openapi/v1/multi-image-to-3d", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyMultiImageToModelRequest(
|
||||
image_urls=await upload_images_to_comfyapi(
|
||||
cls, list(images.values()), wait_label="Uploading base images"
|
||||
),
|
||||
ai_model=model,
|
||||
topology=should_remesh.get("topology", None),
|
||||
target_polycount=should_remesh.get("target_polycount", None),
|
||||
symmetry_mode=symmetry_mode,
|
||||
should_remesh=should_remesh["should_remesh"] == "true",
|
||||
should_texture=texture,
|
||||
enable_pbr=should_texture.get("enable_pbr", None),
|
||||
pose_mode=pose_mode.lower(),
|
||||
texture_prompt=texture_prompt,
|
||||
texture_image_url=texture_image_url,
|
||||
seed=seed,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/multi-image-to-3d/{response.result}"),
|
||||
response_model=MeshyModelResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyRigModelNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyRigModelNode",
|
||||
display_name="Meshy: Rig Model",
|
||||
category="api node/3d/Meshy",
|
||||
description="Provides a rigged character in standard formats. "
|
||||
"Auto-rigging is currently not suitable for untextured meshes, non-humanoid assets, "
|
||||
"or humanoid assets with unclear limb and body structure.",
|
||||
inputs=[
|
||||
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
|
||||
IO.Float.Input(
|
||||
"height_meters",
|
||||
min=0.1,
|
||||
max=15.0,
|
||||
default=1.7,
|
||||
tooltip="The approximate height of the character model in meters. "
|
||||
"This aids in scaling and rigging accuracy.",
|
||||
),
|
||||
IO.Image.Input(
|
||||
"texture_image",
|
||||
tooltip="The model's UV-unwrapped base color texture image.",
|
||||
optional=True,
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MESHY_RIGGED_TASK_ID").Output(display_name="rig_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.2}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
meshy_task_id: str,
|
||||
height_meters: float,
|
||||
texture_image: Input.Image | None = None,
|
||||
) -> IO.NodeOutput:
|
||||
texture_image_url = None
|
||||
if texture_image is not None:
|
||||
texture_image_url = (await upload_images_to_comfyapi(cls, texture_image, wait_label="Uploading texture"))[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/rigging", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyRiggingRequest(
|
||||
input_task_id=meshy_task_id,
|
||||
height_meters=height_meters,
|
||||
texture_image_url=texture_image_url,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/rigging/{response.result}"),
|
||||
response_model=MeshyRiggedResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(
|
||||
result.result.rigged_character_glb_url, os.path.join(get_output_directory(), model_file)
|
||||
)
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyAnimateModelNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyAnimateModelNode",
|
||||
display_name="Meshy: Animate Model",
|
||||
category="api node/3d/Meshy",
|
||||
description="Apply a specific animation action to a previously rigged character.",
|
||||
inputs=[
|
||||
IO.Custom("MESHY_RIGGED_TASK_ID").Input("rig_task_id"),
|
||||
IO.Int.Input(
|
||||
"action_id",
|
||||
default=0,
|
||||
min=0,
|
||||
max=696,
|
||||
tooltip="Visit https://docs.meshy.ai/en/api/animation-library for a list of available values.",
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.12}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
rig_task_id: str,
|
||||
action_id: int,
|
||||
) -> IO.NodeOutput:
|
||||
response = await sync_op(
|
||||
cls,
|
||||
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/animations", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyAnimationRequest(
|
||||
rig_task_id=rig_task_id,
|
||||
action_id=action_id,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/animations/{response.result}"),
|
||||
response_model=MeshyAnimationResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.result.animation_glb_url, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyTextureNode(IO.ComfyNode):
|
||||
|
||||
@classmethod
|
||||
def define_schema(cls):
|
||||
return IO.Schema(
|
||||
node_id="MeshyTextureNode",
|
||||
display_name="Meshy: Texture Model",
|
||||
category="api node/3d/Meshy",
|
||||
inputs=[
|
||||
IO.Combo.Input("model", options=["latest"]),
|
||||
IO.Custom("MESHY_TASK_ID").Input("meshy_task_id"),
|
||||
IO.Boolean.Input(
|
||||
"enable_original_uv",
|
||||
default=True,
|
||||
tooltip="Use the original UV of the model instead of generating new UVs. "
|
||||
"When enabled, Meshy preserves existing textures from the uploaded model. "
|
||||
"If the model has no original UV, the quality of the output might not be as good.",
|
||||
),
|
||||
IO.Boolean.Input("pbr", default=False),
|
||||
IO.String.Input(
|
||||
"text_style_prompt",
|
||||
default="",
|
||||
multiline=True,
|
||||
tooltip="Describe your desired texture style of the object using text. Maximum 600 characters."
|
||||
"Maximum 600 characters. Cannot be used at the same time as 'image_style'.",
|
||||
),
|
||||
IO.Image.Input(
|
||||
"image_style",
|
||||
optional=True,
|
||||
tooltip="A 2d image to guide the texturing process. "
|
||||
"Can not be used at the same time with 'text_style_prompt'.",
|
||||
),
|
||||
],
|
||||
outputs=[
|
||||
IO.String.Output(display_name="model_file"),
|
||||
IO.Custom("MODEL_TASK_ID").Output(display_name="meshy_task_id"),
|
||||
],
|
||||
hidden=[
|
||||
IO.Hidden.auth_token_comfy_org,
|
||||
IO.Hidden.api_key_comfy_org,
|
||||
IO.Hidden.unique_id,
|
||||
],
|
||||
is_api_node=True,
|
||||
is_output_node=True,
|
||||
price_badge=IO.PriceBadge(
|
||||
expr="""{"type":"usd","usd":0.4}""",
|
||||
),
|
||||
)
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
cls,
|
||||
model: str,
|
||||
meshy_task_id: str,
|
||||
enable_original_uv: bool,
|
||||
pbr: bool,
|
||||
text_style_prompt: str,
|
||||
image_style: Input.Image | None = None,
|
||||
) -> IO.NodeOutput:
|
||||
if text_style_prompt and image_style is not None:
|
||||
raise ValueError("text_style_prompt and image_style cannot be used at the same time")
|
||||
if not text_style_prompt and image_style is None:
|
||||
raise ValueError("Either text_style_prompt or image_style is required")
|
||||
image_style_url = None
|
||||
if image_style is not None:
|
||||
image_style_url = (await upload_images_to_comfyapi(cls, image_style, wait_label="Uploading style"))[0]
|
||||
response = await sync_op(
|
||||
cls,
|
||||
endpoint=ApiEndpoint(path="/proxy/meshy/openapi/v1/retexture", method="POST"),
|
||||
response_model=MeshyTaskResponse,
|
||||
data=MeshyTextureRequest(
|
||||
input_task_id=meshy_task_id,
|
||||
ai_model=model,
|
||||
enable_original_uv=enable_original_uv,
|
||||
enable_pbr=pbr,
|
||||
text_style_prompt=text_style_prompt if text_style_prompt else None,
|
||||
image_style_url=image_style_url,
|
||||
),
|
||||
)
|
||||
result = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"/proxy/meshy/openapi/v1/retexture/{response.result}"),
|
||||
response_model=MeshyModelResult,
|
||||
status_extractor=lambda r: r.status,
|
||||
progress_extractor=lambda r: r.progress,
|
||||
)
|
||||
model_file = f"meshy_model_{response.result}.glb"
|
||||
await download_url_to_bytesio(result.model_urls.glb, os.path.join(get_output_directory(), model_file))
|
||||
return IO.NodeOutput(model_file, response.result)
|
||||
|
||||
|
||||
class MeshyExtension(ComfyExtension):
|
||||
@override
|
||||
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
|
||||
return [
|
||||
MeshyTextToModelNode,
|
||||
MeshyRefineNode,
|
||||
MeshyImageToModelNode,
|
||||
MeshyMultiImageToModelNode,
|
||||
MeshyRigModelNode,
|
||||
MeshyAnimateModelNode,
|
||||
MeshyTextureNode,
|
||||
]
|
||||
|
||||
|
||||
async def comfy_entrypoint() -> MeshyExtension:
|
||||
return MeshyExtension()
|
||||
@ -4,7 +4,7 @@ import torch
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension
|
||||
from comfy_api_nodes.apis.minimax_api import (
|
||||
from comfy_api_nodes.apis.minimax import (
|
||||
MinimaxFileRetrieveResponse,
|
||||
MiniMaxModel,
|
||||
MinimaxTaskResultResponse,
|
||||
|
||||
@ -3,7 +3,7 @@ import logging
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis import (
|
||||
from comfy_api_nodes.apis.moonvalley import (
|
||||
MoonvalleyPromptResponse,
|
||||
MoonvalleyTextToVideoInferenceParams,
|
||||
MoonvalleyTextToVideoRequest,
|
||||
|
||||
@ -10,24 +10,18 @@ from typing_extensions import override
|
||||
|
||||
import folder_paths
|
||||
from comfy_api.latest import IO, ComfyExtension, Input
|
||||
from comfy_api_nodes.apis import (
|
||||
CreateModelResponseProperties,
|
||||
Detail,
|
||||
InputContent,
|
||||
from comfy_api_nodes.apis.openai import (
|
||||
InputFileContent,
|
||||
InputImageContent,
|
||||
InputMessage,
|
||||
InputMessageContentList,
|
||||
InputTextContent,
|
||||
Item,
|
||||
ModelResponseProperties,
|
||||
OpenAICreateResponse,
|
||||
OpenAIResponse,
|
||||
OutputContent,
|
||||
)
|
||||
from comfy_api_nodes.apis.openai_api import (
|
||||
OpenAIImageEditRequest,
|
||||
OpenAIImageGenerationRequest,
|
||||
OpenAIImageGenerationResponse,
|
||||
OpenAIResponse,
|
||||
OutputContent,
|
||||
)
|
||||
from comfy_api_nodes.util import (
|
||||
ApiEndpoint,
|
||||
@ -266,7 +260,7 @@ class OpenAIDalle3(IO.ComfyNode):
|
||||
"seed",
|
||||
default=0,
|
||||
min=0,
|
||||
max=2 ** 31 - 1,
|
||||
max=2**31 - 1,
|
||||
step=1,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
@ -384,7 +378,7 @@ class OpenAIGPTImage1(IO.ComfyNode):
|
||||
"seed",
|
||||
default=0,
|
||||
min=0,
|
||||
max=2 ** 31 - 1,
|
||||
max=2**31 - 1,
|
||||
step=1,
|
||||
display_mode=IO.NumberDisplay.number,
|
||||
control_after_generate=True,
|
||||
@ -500,8 +494,8 @@ class OpenAIGPTImage1(IO.ComfyNode):
|
||||
files = []
|
||||
batch_size = image.shape[0]
|
||||
for i in range(batch_size):
|
||||
single_image = image[i: i + 1]
|
||||
scaled_image = downscale_image_tensor(single_image, total_pixels=2048*2048).squeeze()
|
||||
single_image = image[i : i + 1]
|
||||
scaled_image = downscale_image_tensor(single_image, total_pixels=2048 * 2048).squeeze()
|
||||
|
||||
image_np = (scaled_image.numpy() * 255).astype(np.uint8)
|
||||
img = Image.fromarray(image_np)
|
||||
@ -523,7 +517,7 @@ class OpenAIGPTImage1(IO.ComfyNode):
|
||||
rgba_mask = torch.zeros(height, width, 4, device="cpu")
|
||||
rgba_mask[:, :, 3] = 1 - mask.squeeze().cpu()
|
||||
|
||||
scaled_mask = downscale_image_tensor(rgba_mask.unsqueeze(0), total_pixels=2048*2048).squeeze()
|
||||
scaled_mask = downscale_image_tensor(rgba_mask.unsqueeze(0), total_pixels=2048 * 2048).squeeze()
|
||||
|
||||
mask_np = (scaled_mask.numpy() * 255).astype(np.uint8)
|
||||
mask_img = Image.fromarray(mask_np)
|
||||
@ -696,29 +690,23 @@ class OpenAIChatNode(IO.ComfyNode):
|
||||
)
|
||||
|
||||
@classmethod
|
||||
def get_message_content_from_response(
|
||||
cls, response: OpenAIResponse
|
||||
) -> list[OutputContent]:
|
||||
def get_message_content_from_response(cls, response: OpenAIResponse) -> list[OutputContent]:
|
||||
"""Extract message content from the API response."""
|
||||
for output in response.output:
|
||||
if output.root.type == "message":
|
||||
return output.root.content
|
||||
if output.type == "message":
|
||||
return output.content
|
||||
raise TypeError("No output message found in response")
|
||||
|
||||
@classmethod
|
||||
def get_text_from_message_content(
|
||||
cls, message_content: list[OutputContent]
|
||||
) -> str:
|
||||
def get_text_from_message_content(cls, message_content: list[OutputContent]) -> str:
|
||||
"""Extract text content from message content."""
|
||||
for content_item in message_content:
|
||||
if content_item.root.type == "output_text":
|
||||
return str(content_item.root.text)
|
||||
if content_item.type == "output_text":
|
||||
return str(content_item.text)
|
||||
return "No text output found in response"
|
||||
|
||||
@classmethod
|
||||
def tensor_to_input_image_content(
|
||||
cls, image: torch.Tensor, detail_level: Detail = "auto"
|
||||
) -> InputImageContent:
|
||||
def tensor_to_input_image_content(cls, image: torch.Tensor, detail_level: str = "auto") -> InputImageContent:
|
||||
"""Convert a tensor to an input image content object."""
|
||||
return InputImageContent(
|
||||
detail=detail_level,
|
||||
@ -732,9 +720,9 @@ class OpenAIChatNode(IO.ComfyNode):
|
||||
prompt: str,
|
||||
image: torch.Tensor | None = None,
|
||||
files: list[InputFileContent] | None = None,
|
||||
) -> InputMessageContentList:
|
||||
) -> list[InputTextContent | InputImageContent | InputFileContent]:
|
||||
"""Create a list of input message contents from prompt and optional image."""
|
||||
content_list: list[InputContent | InputTextContent | InputImageContent | InputFileContent] = [
|
||||
content_list: list[InputTextContent | InputImageContent | InputFileContent] = [
|
||||
InputTextContent(text=prompt, type="input_text"),
|
||||
]
|
||||
if image is not None:
|
||||
@ -746,13 +734,9 @@ class OpenAIChatNode(IO.ComfyNode):
|
||||
type="input_image",
|
||||
)
|
||||
)
|
||||
|
||||
if files is not None:
|
||||
content_list.extend(files)
|
||||
|
||||
return InputMessageContentList(
|
||||
root=content_list,
|
||||
)
|
||||
return content_list
|
||||
|
||||
@classmethod
|
||||
async def execute(
|
||||
@ -762,7 +746,7 @@ class OpenAIChatNode(IO.ComfyNode):
|
||||
model: SupportedOpenAIModel = SupportedOpenAIModel.gpt_5.value,
|
||||
images: torch.Tensor | None = None,
|
||||
files: list[InputFileContent] | None = None,
|
||||
advanced_options: CreateModelResponseProperties | None = None,
|
||||
advanced_options: ModelResponseProperties | None = None,
|
||||
) -> IO.NodeOutput:
|
||||
validate_string(prompt, strip_whitespace=False)
|
||||
|
||||
@ -773,36 +757,28 @@ class OpenAIChatNode(IO.ComfyNode):
|
||||
response_model=OpenAIResponse,
|
||||
data=OpenAICreateResponse(
|
||||
input=[
|
||||
Item(
|
||||
root=InputMessage(
|
||||
content=cls.create_input_message_contents(
|
||||
prompt, images, files
|
||||
),
|
||||
role="user",
|
||||
)
|
||||
InputMessage(
|
||||
content=cls.create_input_message_contents(prompt, images, files),
|
||||
role="user",
|
||||
),
|
||||
],
|
||||
store=True,
|
||||
stream=False,
|
||||
model=model,
|
||||
previous_response_id=None,
|
||||
**(
|
||||
advanced_options.model_dump(exclude_none=True)
|
||||
if advanced_options
|
||||
else {}
|
||||
),
|
||||
**(advanced_options.model_dump(exclude_none=True) if advanced_options else {}),
|
||||
),
|
||||
)
|
||||
response_id = create_response.id
|
||||
|
||||
# Get result output
|
||||
result_response = await poll_op(
|
||||
cls,
|
||||
ApiEndpoint(path=f"{RESPONSES_ENDPOINT}/{response_id}"),
|
||||
response_model=OpenAIResponse,
|
||||
status_extractor=lambda response: response.status,
|
||||
completed_statuses=["incomplete", "completed"]
|
||||
)
|
||||
cls,
|
||||
ApiEndpoint(path=f"{RESPONSES_ENDPOINT}/{response_id}"),
|
||||
response_model=OpenAIResponse,
|
||||
status_extractor=lambda response: response.status,
|
||||
completed_statuses=["incomplete", "completed"],
|
||||
)
|
||||
return IO.NodeOutput(cls.get_text_from_message_content(cls.get_message_content_from_response(result_response)))
|
||||
|
||||
|
||||
@ -923,7 +899,7 @@ class OpenAIChatConfig(IO.ComfyNode):
|
||||
remove depending on model choice.
|
||||
"""
|
||||
return IO.NodeOutput(
|
||||
CreateModelResponseProperties(
|
||||
ModelResponseProperties(
|
||||
instructions=instructions,
|
||||
truncation=truncation,
|
||||
max_output_tokens=max_output_tokens,
|
||||
|
||||
@ -1,7 +1,7 @@
|
||||
import torch
|
||||
from typing_extensions import override
|
||||
from comfy_api.latest import IO, ComfyExtension
|
||||
from comfy_api_nodes.apis.pixverse_api import (
|
||||
from comfy_api_nodes.apis.pixverse import (
|
||||
PixverseTextVideoRequest,
|
||||
PixverseImageVideoRequest,
|
||||
PixverseTransitionVideoRequest,
|
||||
|
||||
@ -8,7 +8,7 @@ from typing_extensions import override
|
||||
|
||||
from comfy.utils import ProgressBar
|
||||
from comfy_api.latest import IO, ComfyExtension
|
||||
from comfy_api_nodes.apis.recraft_api import (
|
||||
from comfy_api_nodes.apis.recraft import (
|
||||
RecraftColor,
|
||||
RecraftColorChain,
|
||||
RecraftControls,
|
||||
|
||||
@ -14,7 +14,7 @@ from typing import Optional
|
||||
from io import BytesIO
|
||||
from typing_extensions import override
|
||||
from PIL import Image
|
||||
from comfy_api_nodes.apis.rodin_api import (
|
||||
from comfy_api_nodes.apis.rodin import (
|
||||
Rodin3DGenerateRequest,
|
||||
Rodin3DGenerateResponse,
|
||||
Rodin3DCheckStatusRequest,
|
||||
|
||||
@ -16,7 +16,7 @@ from enum import Enum
|
||||
from typing_extensions import override
|
||||
|
||||
from comfy_api.latest import IO, ComfyExtension, Input, InputImpl
from comfy_api_nodes.apis import (
from comfy_api_nodes.apis.runway import (
RunwayImageToVideoRequest,
RunwayImageToVideoResponse,
RunwayTaskStatusResponse as TaskStatusResponse,

@@ -3,7 +3,7 @@ from typing import Optional
from typing_extensions import override

from comfy_api.latest import ComfyExtension, Input, IO
from comfy_api_nodes.apis.stability_api import (
from comfy_api_nodes.apis.stability import (
StabilityUpscaleConservativeRequest,
StabilityUpscaleCreativeRequest,
StabilityAsyncResponse,

@@ -5,7 +5,24 @@ import aiohttp
from typing_extensions import override

from comfy_api.latest import IO, ComfyExtension, Input
from comfy_api_nodes.apis import topaz_api
from comfy_api_nodes.apis.topaz import (
CreateVideoRequest,
CreateVideoRequestSource,
CreateVideoResponse,
ImageAsyncTaskResponse,
ImageDownloadResponse,
ImageEnhanceRequest,
ImageStatusResponse,
OutputInformationVideo,
Resolution,
VideoAcceptResponse,
VideoCompleteUploadRequest,
VideoCompleteUploadRequestPart,
VideoCompleteUploadResponse,
VideoEnhancementFilter,
VideoFrameInterpolationFilter,
VideoStatusResponse,
)
from comfy_api_nodes.util import (
ApiEndpoint,
download_url_to_image_tensor,

@@ -153,13 +170,13 @@ class TopazImageEnhance(IO.ComfyNode):
if get_number_of_images(image) != 1:
raise ValueError("Only one input image is supported.")
download_url = await upload_images_to_comfyapi(
cls, image, max_images=1, mime_type="image/png", total_pixels=4096*4096
cls, image, max_images=1, mime_type="image/png", total_pixels=4096 * 4096
)
initial_response = await sync_op(
cls,
ApiEndpoint(path="/proxy/topaz/image/v1/enhance-gen/async", method="POST"),
response_model=topaz_api.ImageAsyncTaskResponse,
data=topaz_api.ImageEnhanceRequest(
response_model=ImageAsyncTaskResponse,
data=ImageEnhanceRequest(
model=model,
prompt=prompt,
subject_detection=subject_detection,

@@ -181,7 +198,7 @@ class TopazImageEnhance(IO.ComfyNode):
await poll_op(
cls,
poll_endpoint=ApiEndpoint(path=f"/proxy/topaz/image/v1/status/{initial_response.process_id}"),
response_model=topaz_api.ImageStatusResponse,
response_model=ImageStatusResponse,
status_extractor=lambda x: x.status,
progress_extractor=lambda x: getattr(x, "progress", 0),
price_extractor=lambda x: x.credits * 0.08,

@@ -193,7 +210,7 @@ class TopazImageEnhance(IO.ComfyNode):
results = await sync_op(
cls,
ApiEndpoint(path=f"/proxy/topaz/image/v1/download/{initial_response.process_id}"),
response_model=topaz_api.ImageDownloadResponse,
response_model=ImageDownloadResponse,
monitor_progress=False,
)
return IO.NodeOutput(await download_url_to_image_tensor(results.download_url))

@@ -331,7 +348,7 @@ class TopazVideoEnhance(IO.ComfyNode):
if target_height % 2 != 0:
target_height += 1
filters.append(
topaz_api.VideoEnhancementFilter(
VideoEnhancementFilter(
model=UPSCALER_MODELS_MAP[upscaler_model],
creativity=(upscaler_creativity if UPSCALER_MODELS_MAP[upscaler_model] == "slc-1" else None),
isOptimizedMode=(True if UPSCALER_MODELS_MAP[upscaler_model] == "slc-1" else None),

@@ -340,7 +357,7 @@ class TopazVideoEnhance(IO.ComfyNode):
if interpolation_enabled:
target_frame_rate = interpolation_frame_rate
filters.append(
topaz_api.VideoFrameInterpolationFilter(
VideoFrameInterpolationFilter(
model=interpolation_model,
slowmo=interpolation_slowmo,
fps=interpolation_frame_rate,

@@ -351,19 +368,19 @@ class TopazVideoEnhance(IO.ComfyNode):
initial_res = await sync_op(
cls,
ApiEndpoint(path="/proxy/topaz/video/", method="POST"),
response_model=topaz_api.CreateVideoResponse,
data=topaz_api.CreateVideoRequest(
source=topaz_api.CreateCreateVideoRequestSource(
response_model=CreateVideoResponse,
data=CreateVideoRequest(
source=CreateVideoRequestSource(
container="mp4",
size=get_fs_object_size(src_video_stream),
duration=int(duration_sec),
frameCount=video.get_frame_count(),
frameRate=src_frame_rate,
resolution=topaz_api.Resolution(width=src_width, height=src_height),
resolution=Resolution(width=src_width, height=src_height),
),
filters=filters,
output=topaz_api.OutputInformationVideo(
resolution=topaz_api.Resolution(width=target_width, height=target_height),
output=OutputInformationVideo(
resolution=Resolution(width=target_width, height=target_height),
frameRate=target_frame_rate,
audioCodec="AAC",
audioTransfer="Copy",

@@ -379,7 +396,7 @@ class TopazVideoEnhance(IO.ComfyNode):
path=f"/proxy/topaz/video/{initial_res.requestId}/accept",
method="PATCH",
),
response_model=topaz_api.VideoAcceptResponse,
response_model=VideoAcceptResponse,
wait_label="Preparing upload",
final_label_on_success="Upload started",
)

@@ -402,10 +419,10 @@ class TopazVideoEnhance(IO.ComfyNode):
path=f"/proxy/topaz/video/{initial_res.requestId}/complete-upload",
method="PATCH",
),
response_model=topaz_api.VideoCompleteUploadResponse,
data=topaz_api.VideoCompleteUploadRequest(
response_model=VideoCompleteUploadResponse,
data=VideoCompleteUploadRequest(
uploadResults=[
topaz_api.VideoCompleteUploadRequestPart(
VideoCompleteUploadRequestPart(
partNum=1,
eTag=upload_etag,
),

@@ -417,7 +434,7 @@ class TopazVideoEnhance(IO.ComfyNode):
final_response = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/topaz/video/{initial_res.requestId}/status"),
response_model=topaz_api.VideoStatusResponse,
response_model=VideoStatusResponse,
status_extractor=lambda x: x.status,
progress_extractor=lambda x: getattr(x, "progress", 0),
price_extractor=lambda x: (x.estimates.cost[0] * 0.08 if x.estimates and x.estimates.cost[0] else None),

@@ -5,7 +5,7 @@ import torch
from typing_extensions import override

from comfy_api.latest import IO, ComfyExtension
from comfy_api_nodes.apis.tripo_api import (
from comfy_api_nodes.apis.tripo import (
TripoAnimateRetargetRequest,
TripoAnimateRigRequest,
TripoConvertModelRequest,

@@ -4,7 +4,7 @@ from io import BytesIO
from typing_extensions import override

from comfy_api.latest import IO, ComfyExtension, Input, InputImpl
from comfy_api_nodes.apis.veo_api import (
from comfy_api_nodes.apis.veo import (
VeoGenVidPollRequest,
VeoGenVidPollResponse,
VeoGenVidRequest,
comfy_api_nodes/nodes_wavespeed.py (new file, 178 lines)
@@ -0,0 +1,178 @@
from typing_extensions import override

from comfy_api.latest import IO, ComfyExtension, Input
from comfy_api_nodes.apis.wavespeed import (
    FlashVSRRequest,
    TaskCreatedResponse,
    TaskResultResponse,
    SeedVR2ImageRequest,
)
from comfy_api_nodes.util import (
    ApiEndpoint,
    download_url_to_video_output,
    poll_op,
    sync_op,
    upload_video_to_comfyapi,
    validate_container_format_is_mp4,
    validate_video_duration,
    upload_images_to_comfyapi,
    get_number_of_images,
    download_url_to_image_tensor,
)


class WavespeedFlashVSRNode(IO.ComfyNode):
    @classmethod
    def define_schema(cls):
        return IO.Schema(
            node_id="WavespeedFlashVSRNode",
            display_name="FlashVSR Video Upscale",
            category="api node/video/WaveSpeed",
            description="Fast, high-quality video upscaler that "
            "boosts resolution and restores clarity for low-resolution or blurry footage.",
            inputs=[
                IO.Video.Input("video"),
                IO.Combo.Input("target_resolution", options=["720p", "1080p", "2K", "4K"]),
            ],
            outputs=[
                IO.Video.Output(),
            ],
            hidden=[
                IO.Hidden.auth_token_comfy_org,
                IO.Hidden.api_key_comfy_org,
                IO.Hidden.unique_id,
            ],
            is_api_node=True,
            price_badge=IO.PriceBadge(
                depends_on=IO.PriceBadgeDepends(widgets=["target_resolution"]),
                expr="""
                (
                    $price_for_1sec := {"720p": 0.012, "1080p": 0.018, "2k": 0.024, "4k": 0.032};
                    {
                        "type":"usd",
                        "usd": $lookup($price_for_1sec, widgets.target_resolution),
                        "format":{"suffix": "/second", "approximate": true}
                    }
                )
                """,
            ),
        )

    @classmethod
    async def execute(
        cls,
        video: Input.Video,
        target_resolution: str,
    ) -> IO.NodeOutput:
        validate_container_format_is_mp4(video)
        validate_video_duration(video, min_duration=5, max_duration=60 * 10)
        initial_res = await sync_op(
            cls,
            ApiEndpoint(path="/proxy/wavespeed/api/v3/wavespeed-ai/flashvsr", method="POST"),
            response_model=TaskCreatedResponse,
            data=FlashVSRRequest(
                target_resolution=target_resolution.lower(),
                video=await upload_video_to_comfyapi(cls, video),
                duration=video.get_duration(),
            ),
        )
        if initial_res.code != 200:
            raise ValueError(f"Task creation fails with code={initial_res.code} and message={initial_res.message}")
        final_response = await poll_op(
            cls,
            ApiEndpoint(path=f"/proxy/wavespeed/api/v3/predictions/{initial_res.data.id}/result"),
            response_model=TaskResultResponse,
            status_extractor=lambda x: "failed" if x.data is None else x.data.status,
            poll_interval=10.0,
            max_poll_attempts=480,
        )
        if final_response.code != 200:
            raise ValueError(
                f"Task processing failed with code={final_response.code} and message={final_response.message}"
            )
        return IO.NodeOutput(await download_url_to_video_output(final_response.data.outputs[0]))


class WavespeedImageUpscaleNode(IO.ComfyNode):
    @classmethod
    def define_schema(cls):
        return IO.Schema(
            node_id="WavespeedImageUpscaleNode",
            display_name="WaveSpeed Image Upscale",
            category="api node/image/WaveSpeed",
            description="Boost image resolution and quality, upscaling photos to 4K or 8K for sharp, detailed results.",
            inputs=[
                IO.Combo.Input("model", options=["SeedVR2", "Ultimate"]),
                IO.Image.Input("image"),
                IO.Combo.Input("target_resolution", options=["2K", "4K", "8K"]),
            ],
            outputs=[
                IO.Image.Output(),
            ],
            hidden=[
                IO.Hidden.auth_token_comfy_org,
                IO.Hidden.api_key_comfy_org,
                IO.Hidden.unique_id,
            ],
            is_api_node=True,
            price_badge=IO.PriceBadge(
                depends_on=IO.PriceBadgeDepends(widgets=["model"]),
                expr="""
                (
                    $prices := {"seedvr2": 0.01, "ultimate": 0.06};
                    {"type":"usd", "usd": $lookup($prices, widgets.model)}
                )
                """,
            ),
        )

    @classmethod
    async def execute(
        cls,
        model: str,
        image: Input.Image,
        target_resolution: str,
    ) -> IO.NodeOutput:
        if get_number_of_images(image) != 1:
            raise ValueError("Exactly one input image is required.")
        if model == "SeedVR2":
            model_path = "seedvr2/image"
        else:
            model_path = "ultimate-image-upscaler"
        initial_res = await sync_op(
            cls,
            ApiEndpoint(path=f"/proxy/wavespeed/api/v3/wavespeed-ai/{model_path}", method="POST"),
            response_model=TaskCreatedResponse,
            data=SeedVR2ImageRequest(
                target_resolution=target_resolution.lower(),
                image=(await upload_images_to_comfyapi(cls, image, max_images=1))[0],
            ),
        )
        if initial_res.code != 200:
            raise ValueError(f"Task creation fails with code={initial_res.code} and message={initial_res.message}")
        final_response = await poll_op(
            cls,
            ApiEndpoint(path=f"/proxy/wavespeed/api/v3/predictions/{initial_res.data.id}/result"),
            response_model=TaskResultResponse,
            status_extractor=lambda x: "failed" if x.data is None else x.data.status,
            poll_interval=10.0,
            max_poll_attempts=480,
        )
        if final_response.code != 200:
            raise ValueError(
                f"Task processing failed with code={final_response.code} and message={final_response.message}"
            )
        return IO.NodeOutput(await download_url_to_image_tensor(final_response.data.outputs[0]))


class WavespeedExtension(ComfyExtension):
    @override
    async def get_node_list(self) -> list[type[IO.ComfyNode]]:
        return [
            WavespeedFlashVSRNode,
            WavespeedImageUpscaleNode,
        ]


async def comfy_entrypoint() -> WavespeedExtension:
    return WavespeedExtension()
@@ -1,10 +0,0 @@
# This file is used to filter the Comfy Org OpenAPI spec for schemas related to API Nodes.
# This is used for development purposes to generate stubs for unreleased API endpoints.
apis:
filter:
root: openapi.yaml
decorators:
filter-in:
property: tags
value: ['API Nodes']
matchStrategy: all

@@ -1,10 +0,0 @@
# This file is used to filter the Comfy Org OpenAPI spec for schemas related to API Nodes.

apis:
filter:
root: openapi.yaml
decorators:
filter-in:
property: tags
value: ['API Nodes', 'Released']
matchStrategy: all
@@ -11,6 +11,7 @@ from .conversions import (
audio_input_to_mp3,
audio_to_base64_string,
bytesio_to_image_tensor,
convert_mask_to_image,
downscale_image_tensor,
image_tensor_pair_to_batch,
pil_to_bytesio,

@@ -72,6 +73,7 @@ __all__ = [
"audio_input_to_mp3",
"audio_to_base64_string",
"bytesio_to_image_tensor",
"convert_mask_to_image",
"downscale_image_tensor",
"image_tensor_pair_to_batch",
"pil_to_bytesio",

@@ -451,6 +451,12 @@ def resize_mask_to_image(
return mask


def convert_mask_to_image(mask: Input.Image) -> torch.Tensor:
"""Make mask have the expected amount of dims (4) and channels (3) to be recognized as an image."""
mask = mask.unsqueeze(-1)
return torch.cat([mask] * 3, dim=-1)


def text_filepath_to_base64_string(filepath: str) -> str:
"""Converts a text file to a base64 string."""
with open(filepath, "rb") as f:

@@ -43,7 +43,7 @@ class UploadResponse(BaseModel):

async def upload_images_to_comfyapi(
cls: type[IO.ComfyNode],
image: torch.Tensor,
image: torch.Tensor | list[torch.Tensor],
*,
max_images: int = 8,
mime_type: str | None = None,

@@ -55,15 +55,28 @@ async def upload_images_to_comfyapi(
Uploads images to ComfyUI API and returns download URLs.
To upload multiple images, stack them in the batch dimension first.
"""
tensors: list[torch.Tensor] = []
if isinstance(image, list):
for img in image:
is_batch = len(img.shape) > 3
if is_batch:
tensors.extend(img[i] for i in range(img.shape[0]))
else:
tensors.append(img)
else:
is_batch = len(image.shape) > 3
if is_batch:
tensors.extend(image[i] for i in range(image.shape[0]))
else:
tensors.append(image)

# if batched, try to upload each file if max_images is greater than 0
download_urls: list[str] = []
is_batch = len(image.shape) > 3
batch_len = image.shape[0] if is_batch else 1
num_to_upload = min(batch_len, max_images)
num_to_upload = min(len(tensors), max_images)
batch_start_ts = time.monotonic()

for idx in range(num_to_upload):
tensor = image[idx] if is_batch else image
tensor = tensors[idx]
img_io = tensor_to_bytesio(tensor, total_pixels=total_pixels, mime_type=mime_type)

effective_label = wait_label
comfy_extras/nodes_zimage.py (new file, 88 lines)
@@ -0,0 +1,88 @@
import node_helpers
from typing_extensions import override
from comfy_api.latest import ComfyExtension, io
import math
import comfy.utils


class TextEncodeZImageOmni(io.ComfyNode):
    @classmethod
    def define_schema(cls):
        return io.Schema(
            node_id="TextEncodeZImageOmni",
            category="advanced/conditioning",
            is_experimental=True,
            inputs=[
                io.Clip.Input("clip"),
                io.ClipVision.Input("image_encoder", optional=True),
                io.String.Input("prompt", multiline=True, dynamic_prompts=True),
                io.Boolean.Input("auto_resize_images", default=True),
                io.Vae.Input("vae", optional=True),
                io.Image.Input("image1", optional=True),
                io.Image.Input("image2", optional=True),
                io.Image.Input("image3", optional=True),
            ],
            outputs=[
                io.Conditioning.Output(),
            ],
        )

    @classmethod
    def execute(cls, clip, prompt, image_encoder=None, auto_resize_images=True, vae=None, image1=None, image2=None, image3=None) -> io.NodeOutput:
        ref_latents = []
        images = list(filter(lambda a: a is not None, [image1, image2, image3]))

        prompt_list = []
        template = None
        if len(images) > 0:
            prompt_list = ["<|im_start|>user\n<|vision_start|>"]
            prompt_list += ["<|vision_end|><|vision_start|>"] * (len(images) - 1)
            prompt_list += ["<|vision_end|><|im_end|>"]
            template = "<|vision_end|>{}<|im_end|>\n<|im_start|>assistant\n<|vision_start|>"

        encoded_images = []

        for i, image in enumerate(images):
            if image_encoder is not None:
                encoded_images.append(image_encoder.encode_image(image))

            if vae is not None:
                if auto_resize_images:
                    samples = image.movedim(-1, 1)
                    total = int(1024 * 1024)
                    scale_by = math.sqrt(total / (samples.shape[3] * samples.shape[2]))
                    width = round(samples.shape[3] * scale_by / 8.0) * 8
                    height = round(samples.shape[2] * scale_by / 8.0) * 8

                    image = comfy.utils.common_upscale(samples, width, height, "area", "disabled").movedim(1, -1)
                ref_latents.append(vae.encode(image))

        tokens = clip.tokenize(prompt, llama_template=template)
        conditioning = clip.encode_from_tokens_scheduled(tokens)

        extra_text_embeds = []
        for p in prompt_list:
            tokens = clip.tokenize(p, llama_template="{}")
            text_embeds = clip.encode_from_tokens_scheduled(tokens)
            extra_text_embeds.append(text_embeds[0][0])

        if len(ref_latents) > 0:
            conditioning = node_helpers.conditioning_set_values(conditioning, {"reference_latents": ref_latents}, append=True)
        if len(encoded_images) > 0:
            conditioning = node_helpers.conditioning_set_values(conditioning, {"clip_vision_outputs": encoded_images}, append=True)
        if len(extra_text_embeds) > 0:
            conditioning = node_helpers.conditioning_set_values(conditioning, {"reference_latents_text_embeds": extra_text_embeds}, append=True)

        return io.NodeOutput(conditioning)


class ZImageExtension(ComfyExtension):
    @override
    async def get_node_list(self) -> list[type[io.ComfyNode]]:
        return [
            TextEncodeZImageOmni,
        ]


async def comfy_entrypoint() -> ZImageExtension:
    return ZImageExtension()
@@ -1,3 +1,3 @@
# This file is automatically generated by the build process when version is
# updated in pyproject.toml.
__version__ = "0.9.1"
__version__ = "0.10.0"

nodes.py (34 lines changed)
@@ -5,6 +5,7 @@ import torch
import os
import sys
import json
import glob
import hashlib
import inspect
import traceback

@@ -788,6 +789,7 @@ class VAELoader:

#TODO: scale factor?
def load_vae(self, vae_name):
metadata = None
if vae_name == "pixel_space":
sd = {}
sd["pixel_space_vae"] = torch.tensor(1.0)

@@ -2371,6 +2373,7 @@ async def init_builtin_extra_nodes():
"nodes_kandinsky5.py",
"nodes_wanmove.py",
"nodes_image_compare.py",
"nodes_zimage.py",
]

import_failed = []

@@ -2383,37 +2386,12 @@ async def init_builtin_extra_nodes():

async def init_builtin_api_nodes():
api_nodes_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "comfy_api_nodes")
api_nodes_files = [
"nodes_ideogram.py",
"nodes_openai.py",
"nodes_minimax.py",
"nodes_veo2.py",
"nodes_kling.py",
"nodes_bfl.py",
"nodes_bytedance.py",
"nodes_ltxv.py",
"nodes_luma.py",
"nodes_recraft.py",
"nodes_pixverse.py",
"nodes_stability.py",
"nodes_runway.py",
"nodes_sora.py",
"nodes_topaz.py",
"nodes_tripo.py",
"nodes_moonvalley.py",
"nodes_rodin.py",
"nodes_gemini.py",
"nodes_vidu.py",
"nodes_wan.py",
]

if not await load_custom_node(os.path.join(api_nodes_dir, "canary.py"), module_parent="comfy_api_nodes"):
return api_nodes_files
api_nodes_files = sorted(glob.glob(os.path.join(api_nodes_dir, "nodes_*.py")))

import_failed = []
for node_file in api_nodes_files:
if not await load_custom_node(os.path.join(api_nodes_dir, node_file), module_parent="comfy_api_nodes"):
import_failed.append(node_file)
if not await load_custom_node(node_file, module_parent="comfy_api_nodes"):
import_failed.append(os.path.basename(node_file))

return import_failed
@@ -1,6 +1,6 @@
[project]
name = "ComfyUI"
version = "0.9.1"
version = "0.10.0"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.10"

@@ -1,5 +1,5 @@
comfyui-frontend-package==1.36.14
comfyui-workflow-templates==0.8.4
comfyui-frontend-package==1.37.11
comfyui-workflow-templates==0.8.15
comfyui-embedded-docs==0.4.0
torch
torchsde

@@ -21,7 +21,7 @@ psutil
alembic
SQLAlchemy
av>=14.2.0
comfy-kitchen>=0.2.6
comfy-kitchen>=0.2.7

#non essential dependencies:
kornia>=0.7.1
run_comfyui.bat (new file, 227 lines)
@ -0,0 +1,227 @@
|
||||
@echo off
|
||||
chcp 65001 >nul 2>&1
|
||||
cd /d "%~dp0"
|
||||
|
||||
echo.
|
||||
echo ComfyUI Windows launcher
|
||||
echo Performing quick preflight checks...
|
||||
echo.
|
||||
|
||||
REM Check Python availability
|
||||
python --version >nul 2>&1
|
||||
if errorlevel 1 (
|
||||
echo.
|
||||
echo ╔═══════════════════════════════════════════════════════════╗
|
||||
echo ║ Python Not Found ║
|
||||
echo ╚═══════════════════════════════════════════════════════════╝
|
||||
echo.
|
||||
echo ▓ ComfyUI needs Python to run, but we couldn't find it on your computer.
|
||||
echo.
|
||||
echo ▓ What to do:
|
||||
echo 1. Download Python from: https://www.python.org/downloads/
|
||||
echo 2. During installation, make sure to check "Add Python to PATH"
|
||||
echo 3. Restart your computer after installing
|
||||
echo 4. Try running this script again
|
||||
echo.
|
||||
pause
|
||||
exit /b 1
|
||||
)
|
||||
|
||||
REM Get Python environment information
|
||||
python -c "import sys, os; venv = os.environ.get('VIRTUAL_ENV', ''); is_venv = hasattr(sys, 'real_prefix') or (hasattr(sys, 'base_prefix') and sys.base_prefix != sys.prefix); env_type = 'VENV_DETECTED' if (venv or is_venv) else 'SYSTEM_PYTHON'; print(env_type); print('PYTHON_PATH=' + sys.executable)" > env_info.tmp
|
||||
for /f "tokens=1,* delims==" %%a in (env_info.tmp) do (
|
||||
if "%%a"=="VENV_DETECTED" set ENV_TYPE=VENV_DETECTED
|
||||
if "%%a"=="SYSTEM_PYTHON" set ENV_TYPE=SYSTEM_PYTHON
|
||||
if "%%a"=="PYTHON_PATH" set PYTHON_PATH=%%b
|
||||
)
|
||||
del env_info.tmp
|
||||
|
||||
REM ---------------------------------------------------------------
|
||||
REM Weekly full check logic (informational checks only)
|
||||
REM Force with: run_comfyui.bat --full-check
|
||||
REM ---------------------------------------------------------------
|
||||
set STATE_DIR=%LOCALAPPDATA%\ComfyUI\state
|
||||
if not exist "%STATE_DIR%" mkdir "%STATE_DIR%" >nul 2>&1
|
||||
set FULL_STAMP=%STATE_DIR%\last_full_check.stamp
|
||||
|
||||
set NEED_FULL=
|
||||
for %%A in (%*) do (
|
||||
if /i "%%~A"=="--full-check" set NEED_FULL=1
|
||||
)
|
||||
|
||||
if not defined NEED_FULL (
|
||||
if not exist "%FULL_STAMP%" (
|
||||
set NEED_FULL=1
|
||||
) else (
|
||||
forfiles /P "%STATE_DIR%" /M "last_full_check.stamp" /D -7 >nul 2>&1
|
||||
if errorlevel 1 set NEED_FULL=
|
||||
if not errorlevel 1 set NEED_FULL=1
|
||||
)
|
||||
)
|
||||
|
||||
REM Dependency presence check (informational only)
|
||||
if not defined NEED_FULL goto :check_pytorch
|
||||
python -c "import importlib.util as u; mods=['yaml','torch','torchvision','torchaudio','numpy','einops','transformers','tokenizers','sentencepiece','safetensors','aiohttp','yarl','PIL','scipy','tqdm','psutil','alembic','sqlalchemy','av']; missing=[m for m in mods if not u.find_spec(m)]; print('MISSING:' + (','.join(missing) if missing else 'NONE'))" > deps_check.tmp
|
||||
for /f "tokens=1,* delims=:" %%a in (deps_check.tmp) do (
|
||||
if "%%a"=="MISSING" set MISSING_CRITICAL=%%b
|
||||
)
|
||||
del deps_check.tmp
|
||||
|
||||
if not "%MISSING_CRITICAL%"=="NONE" (
|
||||
echo.
|
||||
echo Missing required Python packages:
|
||||
echo %MISSING_CRITICAL%
|
||||
echo.
|
||||
if "%ENV_TYPE%"=="SYSTEM_PYTHON" (
|
||||
echo Tip: Creating a virtual environment is recommended:
|
||||
echo python -m venv venv ^&^& venv\Scripts\activate
|
||||
)
|
||||
echo.
|
||||
echo Install the dependencies, then run this script again:
|
||||
echo python -m pip install -r requirements.txt
|
||||
echo.
|
||||
exit /b 1
|
||||
)
|
||||
type nul > "%FULL_STAMP%"
|
||||
goto :check_pytorch
|
||||
|
||||
:check_pytorch
|
||||
REM Fast path: read torch version without importing (import is slow)
|
||||
python -c "import sys; from importlib import util, metadata; s=util.find_spec('torch'); print('HAS_TORCH:' + ('1' if s else '0')); print('PYTORCH_VERSION:' + (metadata.version('torch') if s else 'NONE'))" > torch_meta.tmp 2>nul
|
||||
set HAS_TORCH=
|
||||
set PYTORCH_VERSION=NONE
|
||||
for /f "tokens=1,* delims=:" %%a in (torch_meta.tmp) do (
|
||||
if "%%a"=="HAS_TORCH" set HAS_TORCH=%%b
|
||||
if "%%a"=="PYTORCH_VERSION" set PYTORCH_VERSION=%%b
|
||||
)
|
||||
del torch_meta.tmp 2>nul
|
||||
|
||||
REM Default CUDA vars
|
||||
set CUDA_AVAILABLE=False
|
||||
set CUDA_VERSION=NONE
|
||||
|
||||
REM Only import torch to check CUDA if present and not CPU build
|
||||
if "%HAS_TORCH%"=="1" (
|
||||
echo %PYTORCH_VERSION% | findstr /C:"+cpu" >nul
|
||||
if errorlevel 1 (
|
||||
python -c "import torch; print('CUDA_AVAILABLE:' + str(torch.cuda.is_available())); print('CUDA_VERSION:' + (torch.version.cuda or 'NONE'))" > pytorch_check.tmp 2>nul
|
||||
if not errorlevel 1 (
|
||||
for /f "tokens=1,* delims=:" %%a in (pytorch_check.tmp) do (
|
||||
if "%%a"=="CUDA_AVAILABLE" set CUDA_AVAILABLE=%%b
|
||||
if "%%a"=="CUDA_VERSION" set CUDA_VERSION=%%b
|
||||
)
|
||||
)
|
||||
del pytorch_check.tmp 2>nul
|
||||
)
|
||||
)
|
||||
|
||||
REM Check if PyTorch version contains "+cpu" indicating CPU-only build
|
||||
echo %PYTORCH_VERSION% | findstr /C:"+cpu" >nul
|
||||
if not errorlevel 1 (
|
||||
echo.
|
||||
echo CPU-only PyTorch detected.
|
||||
echo ComfyUI requires a CUDA-enabled PyTorch build for GPU acceleration.
|
||||
echo.
|
||||
echo Install CUDA-enabled PyTorch, then run this script again. Example:
|
||||
echo python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130
|
||||
echo.
|
||||
exit /b 1
|
||||
)
|
||||
|
||||
REM Check if CUDA is not available but PyTorch doesn't have "+cpu" (might be CUDA build but no GPU)
|
||||
if "%CUDA_AVAILABLE%"=="False" (
|
||||
echo %PYTORCH_VERSION% | findstr /C:"+cpu" >nul
|
||||
if errorlevel 1 (
|
||||
echo.
|
||||
echo ╔═══════════════════════════════════════════════════════════╗
|
||||
echo ║ GPU Not Detected ║
|
||||
echo ╚═══════════════════════════════════════════════════════════╝
|
||||
echo.
|
||||
echo ▓ PyTorch has GPU support installed, but we couldn't find your graphics card.
|
||||
echo.
|
||||
echo ▓ This could mean:
|
||||
echo - You don't have an NVIDIA graphics card
|
||||
echo - Your graphics card drivers need to be updated
|
||||
echo - Your graphics card isn't properly connected
|
||||
echo.
|
||||
echo ▓ ComfyUI will run on your CPU instead, which will be slower.
|
||||
echo.
|
||||
set /p CONTINUE_CHOICE="Continue anyway? (Y/N): "
|
||||
if /i not "%CONTINUE_CHOICE%"=="Y" (
|
||||
echo.
|
||||
echo ▓ Exiting. Check your graphics card setup and try again.
|
||||
pause
|
||||
exit /b 0
|
||||
)
|
||||
)
|
||||
)
|
||||
|
||||
REM Proceed to launch
|
||||
goto :check_port
|
||||
|
||||
:check_port
|
||||
if "%COMFY_PORT%"=="" set COMFY_PORT=8188
|
||||
netstat -ano | findstr /r /c:":%COMFY_PORT% .*LISTENING" >nul
|
||||
if errorlevel 1 (
|
||||
goto :port_ok
|
||||
) else (
|
||||
for /l %%P in (8189,1,8199) do (
|
||||
netstat -ano | findstr /r /c:":%%P .*LISTENING" >nul
|
||||
if errorlevel 1 (
|
||||
set COMFY_PORT=%%P
|
||||
echo.
|
||||
echo ▓ Port 8188 is busy. Rolling to free port %COMFY_PORT% in 5 seconds...
|
||||
timeout /t 5 /nobreak >nul
|
||||
goto :port_ok
|
||||
)
|
||||
)
|
||||
echo.
|
||||
echo ▓ All fallback ports 8189-8199 appear busy. Please free a port and try again.
|
||||
echo.
|
||||
pause
|
||||
exit /b 1
|
||||
)
|
||||
|
||||
:port_ok
|
||||
goto :start_comfyui
|
||||
|
||||
:start_comfyui
|
||||
echo.
|
||||
echo ╔═══════════════════════════════════════════════════════════╗
|
||||
echo ║ Starting ComfyUI... ║
|
||||
echo ╚═══════════════════════════════════════════════════════════╝
|
||||
echo.
|
||||
set GUI_URL=http://127.0.0.1:%COMFY_PORT%
|
||||
REM Spawn a background helper that opens the browser when the server is ready
|
||||
start "" cmd /c "for /l %%i in (1,1,20) do (powershell -NoProfile -Command \"try{(Invoke-WebRequest -Uri '%GUI_URL%' -Method Head -TimeoutSec 1)>$null; exit 0}catch{exit 1}\" ^& if not errorlevel 1 goto open ^& timeout /t 1 ^>nul) ^& :open ^& start \"\" \"%GUI_URL%\""
|
||||
python main.py --port %COMFY_PORT%
|
||||
if errorlevel 1 (
|
||||
echo.
|
||||
echo ╔═══════════════════════════════════════════════════════════╗
|
||||
echo ║ ComfyUI Crashed ║
|
||||
echo ╚═══════════════════════════════════════════════════════════╝
|
||||
echo.
|
||||
echo ▓ ComfyUI encountered an error and stopped. Here's what might help:
|
||||
echo.
|
||||
echo ▓ Error: "Port already in use"
|
||||
echo Solution: Close other ComfyUI instances or let this script auto-select a free port.
|
||||
echo.
|
||||
echo ▓ Error: "Torch not compiled with CUDA enabled"
|
||||
echo Solution: You need to install the GPU version of PyTorch (see instructions above)
|
||||
echo.
|
||||
echo ▓ Error: "ModuleNotFoundError" or "No module named"
|
||||
echo Solution: Run this script again to install missing packages
|
||||
echo.
|
||||
echo ▓ Error: "CUDA out of memory" or "OOM"
|
||||
echo Solution: Your graphics card doesn't have enough memory. Try using smaller models.
|
||||
echo.
|
||||
echo ▓ For other errors, check the error message above for clues.
|
||||
echo You can also visit: https://github.com/comfyanonymous/ComfyUI/issues
|
||||
echo.
|
||||
echo ▓ The full error details are shown above.
|
||||
echo.
|
||||
)
|
||||
pause
|
||||
|
||||
|
||||
|
||||
screenshots/.gitkeep (new file, 3 lines)
@@ -0,0 +1,3 @@
# This file ensures the screenshots directory is tracked by git
# Add screenshot files here as they are captured
@@ -686,7 +686,10 @@ class PromptServer():

@routes.get("/object_info")
async def get_object_info(request):
seed_assets(["models"])
try:
seed_assets(["models"])
except Exception as e:
logging.error(f"Failed to seed assets: {e}")
with folder_paths.cache_helper:
out = {}
for x in nodes.NODE_CLASS_MAPPINGS:
@ -1,297 +0,0 @@
|
||||
from typing import Optional
|
||||
from enum import Enum
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from comfy.comfy_types.node_typing import IO
|
||||
from comfy_api_nodes.mapper_utils import model_field_to_node_input
|
||||
|
||||
|
||||
def test_model_field_to_float_input():
|
||||
"""Tests mapping a float field with constraints."""
|
||||
|
||||
class ModelWithFloatField(BaseModel):
|
||||
cfg_scale: Optional[float] = Field(
|
||||
default=0.5,
|
||||
description="Flexibility in video generation",
|
||||
ge=0.0,
|
||||
le=1.0,
|
||||
multiple_of=0.001,
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.FLOAT,
|
||||
{
|
||||
"default": 0.5,
|
||||
"tooltip": "Flexibility in video generation",
|
||||
"min": 0.0,
|
||||
"max": 1.0,
|
||||
"step": 0.001,
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.FLOAT, ModelWithFloatField, "cfg_scale"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_float_input_no_constraints():
|
||||
"""Tests mapping a float field with no constraints."""
|
||||
|
||||
class ModelWithFloatField(BaseModel):
|
||||
cfg_scale: Optional[float] = Field(default=0.5)
|
||||
|
||||
expected_output = (
|
||||
IO.FLOAT,
|
||||
{
|
||||
"default": 0.5,
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.FLOAT, ModelWithFloatField, "cfg_scale"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_int_input():
|
||||
"""Tests mapping an int field with constraints."""
|
||||
|
||||
class ModelWithIntField(BaseModel):
|
||||
num_frames: Optional[int] = Field(
|
||||
default=10,
|
||||
description="Number of frames to generate",
|
||||
ge=1,
|
||||
le=100,
|
||||
multiple_of=1,
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.INT,
|
||||
{
|
||||
"default": 10,
|
||||
"tooltip": "Number of frames to generate",
|
||||
"min": 1,
|
||||
"max": 100,
|
||||
"step": 1,
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(IO.INT, ModelWithIntField, "num_frames")
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_string_input():
|
||||
"""Tests mapping a string field."""
|
||||
|
||||
class ModelWithStringField(BaseModel):
|
||||
prompt: Optional[str] = Field(
|
||||
default="A beautiful sunset over a calm ocean",
|
||||
description="A prompt for the video generation",
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{
|
||||
"default": "A beautiful sunset over a calm ocean",
|
||||
"tooltip": "A prompt for the video generation",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(IO.STRING, ModelWithStringField, "prompt")
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_string_input_multiline():
|
||||
"""Tests mapping a string field."""
|
||||
|
||||
class ModelWithStringField(BaseModel):
|
||||
prompt: Optional[str] = Field(
|
||||
default="A beautiful sunset over a calm ocean",
|
||||
description="A prompt for the video generation",
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{
|
||||
"default": "A beautiful sunset over a calm ocean",
|
||||
"tooltip": "A prompt for the video generation",
|
||||
"multiline": True,
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.STRING, ModelWithStringField, "prompt", multiline=True
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_combo_input():
|
||||
"""Tests mapping a combo field."""
|
||||
|
||||
class MockEnum(str, Enum):
|
||||
option_1 = "option 1"
|
||||
option_2 = "option 2"
|
||||
option_3 = "option 3"
|
||||
|
||||
class ModelWithComboField(BaseModel):
|
||||
model_name: Optional[MockEnum] = Field("option 1", description="Model Name")
|
||||
|
||||
expected_output = (
|
||||
IO.COMBO,
|
||||
{
|
||||
"options": ["option 1", "option 2", "option 3"],
|
||||
"default": "option 1",
|
||||
"tooltip": "Model Name",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.COMBO, ModelWithComboField, "model_name", enum_type=MockEnum
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_combo_input_no_options():
|
||||
"""Tests mapping a combo field with no options."""
|
||||
|
||||
class ModelWithComboField(BaseModel):
|
||||
model_name: Optional[str] = Field(description="Model Name")
|
||||
|
||||
expected_output = (
|
||||
IO.COMBO,
|
||||
{
|
||||
"tooltip": "Model Name",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.COMBO, ModelWithComboField, "model_name"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_image_input():
|
||||
"""Tests mapping an image field."""
|
||||
|
||||
class ModelWithImageField(BaseModel):
|
||||
image: Optional[str] = Field(
|
||||
default=None,
|
||||
description="An image for the video generation",
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.IMAGE,
|
||||
{
|
||||
"default": None,
|
||||
"tooltip": "An image for the video generation",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(IO.IMAGE, ModelWithImageField, "image")
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_node_input_no_description():
|
||||
"""Tests mapping a field with no description."""
|
||||
|
||||
class ModelWithNoDescriptionField(BaseModel):
|
||||
field: Optional[str] = Field(default="default value")
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{
|
||||
"default": "default value",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.STRING, ModelWithNoDescriptionField, "field"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_node_input_no_default():
|
||||
"""Tests mapping a field with no default."""
|
||||
|
||||
class ModelWithNoDefaultField(BaseModel):
|
||||
field: Optional[str] = Field(description="A field with no default")
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{
|
||||
"tooltip": "A field with no default",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.STRING, ModelWithNoDefaultField, "field"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_node_input_no_metadata():
|
||||
"""Tests mapping a field with no metadata or properties defined on the schema."""
|
||||
|
||||
class ModelWithNoMetadataField(BaseModel):
|
||||
field: Optional[str] = Field()
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.STRING, ModelWithNoMetadataField, "field"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||
|
||||
|
||||
def test_model_field_to_node_input_default_is_none():
|
||||
"""
|
||||
Tests mapping a field with a default of `None`.
|
||||
I.e., the default field should be included as the schema explicitly sets it to `None`.
|
||||
"""
|
||||
|
||||
class ModelWithNoneDefaultField(BaseModel):
|
||||
field: Optional[str] = Field(
|
||||
default=None, description="A field with a default of None"
|
||||
)
|
||||
|
||||
expected_output = (
|
||||
IO.STRING,
|
||||
{
|
||||
"default": None,
|
||||
"tooltip": "A field with a default of None",
|
||||
},
|
||||
)
|
||||
|
||||
actual_output = model_field_to_node_input(
|
||||
IO.STRING, ModelWithNoneDefaultField, "field"
|
||||
)
|
||||
|
||||
assert actual_output[0] == expected_output[0]
|
||||
assert actual_output[1] == expected_output[1]
|
||||