diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..bd6a3e5e8
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,272 @@
+## Engineering Style
+
+- Keep changes small and direct. Most fixes should touch the narrowest code path
+ that explains the bug, performance issue, dtype issue, model-format issue, or
+ user-facing behavior.
+- Change the least amount of files possible. A change that touches many files is
+ more likely to be a bad change than a good one unless the broader scope is
+ directly required.
+- Prefer practical fixes over broad architecture work. Add abstractions only
+ when they remove real repeated logic or match an existing ComfyUI pattern.
+- Prefer fewer dependencies. Do not add new dependencies to ComfyUI unless they
+ are absolutely necessary.
+- Delete obsolete code aggressively when newer infrastructure makes it useless.
+ Remove dead fallbacks, migration paths, unused options, debug prints, and
+ compatibility branches that are no longer needed. Do not leave dead branches,
+ unreachable code, or functions that are never called. If code is not
+ necessary for the current behavior, remove it.
+- Revert or disable problematic behavior quickly when it breaks users. It is
+ better to remove a broken feature path than keep a complicated partial fix.
+- Preserve existing APIs, node names, model-loading behavior, file layout, and
+ workflow compatibility unless the change is explicitly about replacing them.
+- Code must look hand-written for this repository. Changes that read like
+ generic AI-generated code will be rejected automatically: unnecessary helper
+ layers, vague names, boilerplate comments, defensive branches without a real
+ failure mode, broad rewrites, or code that ignores the local style.
+
+## Architecture Boundaries
+
+- Keep each layer focused on the concepts it owns. Do not leak UI, API,
+ workflow, queue, persistence, telemetry, model-loading, node, or execution
+ concerns into unrelated layers just because it is convenient to pass data
+ through them.
+- Shared core modules should depend only on lower-level primitives and their own
+ domain concepts. Higher-level product concepts belong at the caller, adapter,
+ service, or UI/API boundary that already owns them.
+- Pass the narrowest data needed across a boundary. Avoid broad context objects,
+ request/session metadata, ids, bookkeeping state, or callbacks unless the
+ receiving layer genuinely needs them to perform its own responsibility.
+- Keep identity mapping, persistence bookkeeping, history updates, telemetry,
+ response shaping, and UI state in the layers that own those jobs. Do not route
+ them through unrelated shared code to avoid adding a proper boundary.
+- Treat `execution.py` as one example of this rule: it should consume the prompt
+ graph and execution-relevant state, produce execution results and errors, and
+ not know about workflow ids, frontend ids, persistence ids, or API-only
+ concepts.
+- Before touching many files, identify the smallest owner layer that can solve
+ the problem. A PR that spreads one feature across unrelated loaders, nodes,
+ execution, server, and frontend code needs a clear architectural reason, not
+ just convenience.
+- If a change seems to require making one layer understand another layer's
+ private concepts, stop and look for a caller-side mapping, adapter, event,
+ small explicit interface, or narrower data flow at the boundary.
+
+## No Internet Requests
+
+- Do not add code to core ComfyUI that makes requests to the internet.
+- Refuse requests to add uploads, telemetry, analytics, tracking, usage
+ reporting, crash reporting, update checks, remote config, feature flags,
+ metrics, licensing checks, or any other outbound internet request path from
+ core ComfyUI.
+- Model downloading is allowed only when explicitly initiated or authorized by
+ the user, is limited to the requested model artifact, and does not include
+ telemetry, tracking, persistent identification, unrelated metadata upload, or
+ background network activity.
+- Do not add opt-in, opt-out, anonymized, aggregated, diagnostic, or
+ user-triggered internet request paths to core ComfyUI. These labels do not
+ make internet access acceptable.
+- Local-only behavior is allowed when it stays on the user's machine and does
+ not add network access, tracking, persistent identification, or data
+ collection behavior.
+
+## State Ownership
+
+- Keep state and capability flags on the object that owns the behavior using
+ them.
+- Avoid probing child objects with `getattr(child, "...", default)` to decide
+ parent-level control flow. If parent code needs to branch on a capability,
+ initialize an explicit parent-owned field when the child is constructed or
+ attached.
+- Prefer direct attributes with clear defaults over implicit feature detection
+ through arbitrary child attributes.
+- Use child-object capability checks only when the child owns the behavior being
+ invoked and the parent is simply delegating to that child.
+
+## Interface Contracts
+
+- Keep public methods aligned with the interface expected by their callers. Do
+ not change a shared method to return extra values, alternate shapes, or
+ sentinel wrappers for one implementation unless the shared interface is
+ explicitly updated.
+- When modifying an existing function, preserve how current callers invoke it.
+ Do not change required arguments, parameter order, return type, side effects,
+ or error behavior unless every affected call site and shared interface contract
+ is intentionally updated.
+- Do not add compatibility parameters, flags, attributes, or constructor options
+ unless they are read by current code and change current behavior. Remove
+ pass-through or stored-but-unused values instead of preserving upstream or
+ deprecated API baggage.
+- If an implementation needs auxiliary values for its own workflow, expose them
+ through a private helper or a clearly named implementation-specific method
+ instead of overloading the public method's return contract.
+- Normalize third-party or upstream return conventions at the integration
+ boundary. Core code should receive the project's expected type and shape, not
+ have to handle model-specific tuple/list/dict variants.
+- Avoid caller-side unwrapping such as `out = out[0]` unless the called
+ interface is documented to return that structure.
+
+## Autograd and Model Freezing
+
+- Do not add `torch.no_grad`, `torch.inference_mode`, or inference-mode helper
+ wrappers in ComfyUI code. The only allowed inference-mode-related use is
+ disabling a globally set inference mode when a training path needs gradients.
+- Do not add freeze, unfreeze, or trainability toggles to model classes. ComfyUI
+ models are always treated as frozen for inference, so explicit freeze
+ functionality is redundant and should not be added.
+- Remove training-only behavior such as dropout from inference model code, but
+ preserve checkpoint and state-dict compatibility when doing so. If deleting a
+ module would change state-dict keys, module ordering, or checkpoint loading
+ behavior, replace it with a no-op such as `nn.Identity` instead of removing the
+ slot outright.
+
+## Python Style
+
+- Keep imports at module scope. Avoid inline imports unless they are already part
+ of an established optional-backend probe or are needed to avoid an import
+ cycle.
+- Do not add unnecessary `try`/`except` blocks. Use them for optional dependency,
+ platform, or backend capability detection only when the program has a useful
+ fallback. Prefer specific exception types when changing new code.
+- Remove any workarounds for PyTorch versions that ComfyUI no longer officially
+ supports. Deprecated workarounds include catching an exception and rerunning
+ the same op with the input cast to float. If a workaround does not have a
+ comment naming the exact PyTorch version or versions that still need it,
+ remove it.
+- Let unsupported model formats, invalid quantization metadata, and bad states
+ fail with clear errors instead of silently producing lower quality output.
+- Match the existing local style in the file you edit. This codebase tolerates
+ long lines, simple helper functions, module-level state, and direct tensor
+ operations when they make the code easier to follow.
+- Keep comments sparse and useful. Strip useless comments that restate the code
+ or describe obvious behavior. Short TODOs are fine when they name the concrete
+ missing follow-up.
+
+## Model, Device, and Memory Behavior
+
+- Treat dtype, device placement, VRAM usage, and offloading behavior as core
+ correctness concerns. Check CPU, CUDA, ROCm, MPS, DirectML, XPU, NPU, and low
+ VRAM implications when touching shared execution or loading code.
+- Prefer native ComfyUI formats and existing quantization/offload helpers over
+ adding parallel code paths. Use `comfy.quant_ops`, `comfy.model_management`,
+ `comfy.memory_management`, `comfy.pinned_memory`, `comfy_aimdo`, and
+ `comfy-kitchen` helpers where they already solve the problem.
+- Use optimized comfy-kitchen ops in places where they improve performance
+ without changing the expected dtype, device, memory, or interface behavior.
+- All models should use the optimized attention function selected by ComfyUI.
+ Treat optimized backend functions, dispatch helpers, and capability-selected
+ callables as opaque. Higher-level code must not inspect function identity,
+ names, modules, or implementation details to decide behavior.
+- Apply the same opacity rule to similar patterns beyond attention: callers
+ should depend on the documented interface and result contract, not on which
+ backend implementation was selected underneath.
+- Do not use custom inference ops that only duplicate an existing op while
+ upcasting to float32, such as custom RMSNorm variants. Use the generic ComfyUI
+ ops and/or native torch ops instead.
+- If a model class `__init__` has an `operations` parameter, assume
+ `operations` is never `None`. Do not add fallback branches or default torch
+ ops for a missing `operations` object.
+- Do not add unnecessary parameters to model, model block, or model ops related
+ classes. Constructor and forward signatures should carry only values that are
+ actually needed by that object for inference.
+- Reuse existing model classes, blocks, ops, and helper modules when appropriate.
+ Before implementing a new version of a model component, search the existing
+ model code for a class or helper that already provides the behavior.
+- Avoid adding `einops` usage in core inference code. Use native torch tensor
+ ops such as `reshape`, `view`, `permute`, `transpose`, `flatten`, `unflatten`,
+ `unsqueeze`, and `squeeze` instead.
+- Do not use tensors as general-purpose Python data structures. Keep metadata,
+ bookkeeping, counters, flags, shape math, padding math, index planning, memory
+ estimates, and control-flow decisions in plain Python values unless the data
+ must participate directly in tensor computation. Avoid creating temporary
+ tensors just to use tensor methods for scalar or structural calculations.
+- Avoid unnecessary casts and transfers. Preserve the intended compute dtype,
+ storage dtype, bias dtype, and original tensor shape metadata.
+- Assume inputs to the main model forward are already in the compute dtype by
+ default, except integer inputs such as some model timestep tensors. Do not add
+ defensive or convenience casts in model code; it is better for invalid dtype
+ plumbing to error clearly than to hide it with unnecessary casts.
+- Raw model parameters that are not owned by an op and may be initialized in a
+ dtype different from the compute dtype should be cast at use in forward or
+ inference code with `comfy.ops.cast_to_input` or
+ `comfy.model_management.cast_to` to avoid dtype mismatches.
+- Model code should not care what dtype it is initialized in, and model
+ `__init__` methods should not contain workarounds for specific dtypes. Dtype
+ workaround code, such as making a model work with fp16 compute, belongs in the
+ execution or model-management layer that owns compute policy.
+- Model code should not perform unnecessary device-to-CPU or CPU-to-device
+ transfers. New allocations must be created on the correct device and dtype;
+ never allocate on CPU and then move to GPU, or allocate in one dtype and then
+ convert to another.
+- Model code itself should not perform memory management. Loading, unloading,
+ offloading, device movement, VRAM policy, cache lifetime, and cleanup belong
+ in the relevant model-management and execution layers, not inside model
+ implementations.
+- Do not add global, module-level, class-level, singleton, or model-owned stores
+ for tensors or other large memory that persist across executions. Temporary
+ caches must be scoped to a single execution or forward/encode/decode call:
+ allocate them in the owning top-level call, pass them explicitly through the
+ call stack, and let them be discarded when that call returns.
+- Follow the Wan VAE temporal cache pattern for temporary caches: create a local
+ cache such as `feat_map` for the encode/decode operation, pass it into the
+ blocks that need it, and do not retain it on the model or in global state.
+- In model init code, prefer `torch.empty` for parameter/buffer placeholders
+ that are populated from the model state dict instead of zero-initializing with
+ `torch.zeros` or similar. If an allocation is not loaded from the state dict
+ and is useless for inference, do not include it.
+- `nn.Parameter` tensors that are stored in and populated from the model state
+ dict should be initialized with `torch.empty`, not with zero, random, or
+ otherwise meaningful initialization.
+- Model initialization should describe module structure, not fabricate
+ checkpoint-owned tensor contents. Parameters and buffers that are loaded from
+ the state dict must not be manually initialized, reassigned, or filled with
+ fallback values unless that value is actually used when no checkpoint key
+ exists.
+- When slicing large tensors, copy the slice if the sliced tensor's lifetime
+ exceeds the current function scope. Do not keep a long-lived view into a large
+ backing tensor when a smaller copy would release memory sooner.
+- Use fused or compound torch operations such as `addcmul` when they naturally
+ match the math. Reducing Python and torch dispatch overhead is a valid
+ optimization when it does not obscure the code or change dtype/device
+ behavior.
+- Avoid caches that persist across different executions as much as possible.
+ Persistent caches are acceptable only when they use a very minimal amount of
+ memory and have a clear ownership and invalidation story.
+- When optimizing, favor small measurable changes: fewer allocations, fewer
+ device transfers, less peak memory, better batching, or use of a faster
+ existing backend op.
+
+## Nodes and User-Facing Behavior
+
+- Follow existing node conventions: `INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`,
+ `CATEGORY`, and registration through the local mapping used by that file.
+- Keep node changes backward compatible by default. Add inputs with sensible
+ defaults and avoid changing output types unless the request requires it.
+- Model implementations should add the minimal number of ComfyUI nodes required
+ to run the model. Reuse existing nodes as much as possible; adapting the model
+ to work with existing nodes is strongly preferred over creating new nodes.
+- Node-level code must not patch model code directly. Any node behavior that
+ modifies, wraps, hooks, or changes model behavior must go through the model
+ patcher class instead of reaching into model internals.
+- The official mascot of ComfyUI is a very cute anime girl with massive fennec
+ ears, a big fluffy tail, long blonde wavy hair, and blue eyes. Feel free to
+ use her in ComfyUI materials, UI text, examples, tests, generated assets, or
+ comments, but do not disrespect her.
+- Warning and info messages should be short and actionable. Remove noisy or
+ misleading messages rather than adding more logging.
+- Documentation and README edits should be concise, factual, and tied to the
+ changed behavior.
+
+## Commit and Review Habits
+
+- If asked to write commit messages, use short direct subjects like the existing
+ history: `Fix ...`, `Add ...`, `Support ...`, `Remove ...`, `Update ...`,
+ `Make ...`, `Use ...`, `Disable ...`, `Bump ...`, or `Revert ...`.
+- Keep PR descriptions short and reviewable. State the problem, the behavioral
+ change, and the tests run; avoid long narrative explanations, implementation
+ diaries, or exhaustive file-by-file summaries unless the reviewer explicitly
+ needs that context.
+- Prefer one coherent behavioral change per commit. Dependency pins, tests, and
+ the code that needs them may be in the same commit when they are inseparable.
+- In reviews, prioritize real user impact: crashes, wrong dtype/device behavior,
+ memory regressions, broken model loading, workflow incompatibility, and noisy
+ or misleading user-facing output.
diff --git a/app/assets/api/routes.py b/app/assets/api/routes.py
index 7ef462f5c..53c84eff3 100644
--- a/app/assets/api/routes.py
+++ b/app/assets/api/routes.py
@@ -306,12 +306,15 @@ async def download_asset_content(request: web.Request) -> web.Response:
404, "FILE_NOT_FOUND", "Underlying file not found on disk."
)
- _DANGEROUS_MIME_TYPES = {
- "text/html", "text/html-sandboxed", "application/xhtml+xml",
- "text/javascript", "text/css",
- }
- if content_type in _DANGEROUS_MIME_TYPES:
+ # User-controlled asset content must never render inline in the app origin
+ # (stored XSS via SVG/HTML/XML). Force dangerous types to download and
+ # override any requested inline disposition. Centralised through
+ # folder_paths.is_dangerous_content_type so this can't drift from /view and
+ # /userdata (the previous inline set here omitted image/svg+xml and missed
+ # the charset/casing/+xml-dialect bypasses).
+ if folder_paths.is_dangerous_content_type(content_type):
content_type = "application/octet-stream"
+ disposition = "attachment"
safe_name = (filename or "").replace("\r", "").replace("\n", "")
encoded = urllib.parse.quote(safe_name)
diff --git a/app/model_manager.py b/app/model_manager.py
index 8f6e34b33..b0329ce17 100644
--- a/app/model_manager.py
+++ b/app/model_manager.py
@@ -50,21 +50,45 @@ class ModelFileManager:
@routes.get("/experiment/models/preview/{folder}/{path_index}/{filename:.*}")
async def get_model_preview(request):
folder_name = request.match_info.get("folder", None)
- path_index = int(request.match_info.get("path_index", None))
filename = request.match_info.get("filename", None)
if folder_name not in folder_paths.folder_names_and_paths:
return web.Response(status=404)
+ # The "{filename:.*}" capture also matches the empty string, which
+ # would resolve to the folder itself; reject it explicitly.
+ if not filename:
+ return web.Response(status=400)
+
+ try:
+ path_index = int(request.match_info.get("path_index", None))
+ except (TypeError, ValueError):
+ return web.Response(status=400)
+
folders = folder_paths.folder_names_and_paths[folder_name]
+ if path_index < 0 or path_index >= len(folders[0]):
+ return web.Response(status=404)
folder = folders[0][path_index]
- full_filename = os.path.join(folder, filename)
+ full_filename = os.path.normpath(os.path.join(folder, filename))
+
+ # Prevent path traversal: the requested file must stay within the
+ # configured model folder. `filename` is an unrestricted ".*" capture,
+ # so values like "../../../../etc/passwd" would otherwise escape it.
+ if not folder_paths.is_within_directory(folder, full_filename):
+ return web.Response(status=403)
previews = self.get_model_previews(full_filename)
default_preview = previews[0] if len(previews) > 0 else None
if default_preview is None or (isinstance(default_preview, str) and not os.path.isfile(default_preview)):
return web.Response(status=404)
+ # The preview is selected by a glob inside get_model_previews, so a
+ # companion file (e.g. "model.preview.png") could itself be a symlink
+ # resolving outside the model folder. Re-validate the file actually
+ # opened: is_within_directory realpaths it, catching symlink escape.
+ if isinstance(default_preview, str) and not folder_paths.is_within_directory(folder, default_preview):
+ return web.Response(status=403)
+
try:
with Image.open(default_preview) as img:
img_bytes = BytesIO()
diff --git a/app/user_manager.py b/app/user_manager.py
index 7b11e381c..de261ad39 100644
--- a/app/user_manager.py
+++ b/app/user_manager.py
@@ -6,6 +6,7 @@ import glob
import shutil
import logging
import tempfile
+import mimetypes
from aiohttp import web
from urllib import parse
from comfy.cli_args import args
@@ -336,7 +337,20 @@ class UserManager():
if not isinstance(path, str):
return path
- return web.FileResponse(path)
+ # User data files are arbitrary user-supplied content and are never
+ # meant to render inline. Disable MIME sniffing and force a download
+ # so uploaded markup/scripts can't execute in the app origin (stored
+ # XSS). Content-Disposition: attachment is the load-bearing guard;
+ # the content-type override and nosniff are defence in depth.
+ content_type = mimetypes.guess_type(path)[0] or 'application/octet-stream'
+ if folder_paths.is_dangerous_content_type(content_type):
+ content_type = 'application/octet-stream'
+
+ return web.FileResponse(path, headers={
+ "Content-Type": content_type,
+ "X-Content-Type-Options": "nosniff",
+ "Content-Disposition": "attachment",
+ })
@routes.post("/userdata/{file}")
async def post_userdata(request):
diff --git a/comfy/cli_args.py b/comfy/cli_args.py
index e3099a230..4bef096fb 100644
--- a/comfy/cli_args.py
+++ b/comfy/cli_args.py
@@ -240,6 +240,7 @@ database_default_path = os.path.abspath(
)
parser.add_argument("--database-url", type=str, default=f"sqlite:///{database_default_path}", help="Specify the database URL, e.g. for an in-memory database you can use 'sqlite:///:memory:'.")
parser.add_argument("--enable-assets", action="store_true", help="Enable the assets system (API routes, database synchronization, and background scanning).")
+parser.add_argument("--enable-asset-hashing", action="store_true", help="Compute blake3 content hashes when scanning assets. Hashing enables future asset-portability features (deduplication, cross-machine model resolution) but adds startup cost and per-output cost on large models directories. Off by default; enable to opt in.")
parser.add_argument("--feature-flag", type=str, action='append', default=[], metavar="KEY[=VALUE]", help="Set a server feature flag. Use KEY=VALUE to set an explicit value, or bare KEY to set it to true. Can be specified multiple times. Boolean values (true/false) and numbers are auto-converted. Examples: --feature-flag show_signin_button=true or --feature-flag show_signin_button")
parser.add_argument("--list-feature-flags", action="store_true", help="Print the registry of known CLI-settable feature flags as JSON and exit.")
diff --git a/comfy/text_encoders/qwen3vl.py b/comfy/text_encoders/qwen3vl.py
index 59c9aae6d..2082c42e7 100644
--- a/comfy/text_encoders/qwen3vl.py
+++ b/comfy/text_encoders/qwen3vl.py
@@ -167,7 +167,7 @@ class Qwen3VLTokenizer(sd1_clip.SD1Tokenizer):
embed_count = 0
for r in tokens[key_name]:
for i in range(len(r)):
- if r[i][0] == 151655: # <|image_pad|>
+ if isinstance(r[i][0], (int, float)) and r[i][0] == 151655: # <|image_pad|>
if len(images) > embed_count:
r[i] = ({"type": "image", "data": images[embed_count], "original_type": "image"},) + r[i][1:]
embed_count += 1
diff --git a/comfy_api_nodes/apis/gemini.py b/comfy_api_nodes/apis/gemini.py
index caaba8f36..7b2543270 100644
--- a/comfy_api_nodes/apis/gemini.py
+++ b/comfy_api_nodes/apis/gemini.py
@@ -121,6 +121,7 @@ class GeminiGenerationConfig(BaseModel):
topK: int | None = Field(None, ge=1)
topP: float | None = Field(None, ge=0.0, le=1.0)
thinkingConfig: GeminiThinkingConfig | None = Field(None)
+ responseModalities: list[str] | None = Field(None)
class GeminiImageOutputOptions(BaseModel):
diff --git a/comfy_api_nodes/apis/ideogram.py b/comfy_api_nodes/apis/ideogram.py
index c5ad9559f..ee3256e96 100644
--- a/comfy_api_nodes/apis/ideogram.py
+++ b/comfy_api_nodes/apis/ideogram.py
@@ -33,53 +33,6 @@ class IdeogramColorPalette(
)
-class ImageRequest(BaseModel):
- aspect_ratio: Optional[str] = Field(
- None,
- description="Optional. The aspect ratio (e.g., 'ASPECT_16_9', 'ASPECT_1_1'). Cannot be used with resolution. Defaults to 'ASPECT_1_1' if unspecified.",
- )
- color_palette: Optional[Dict[str, Any]] = Field(
- None, description='Optional. Color palette object. Only for V_2, V_2_TURBO.'
- )
- magic_prompt_option: Optional[str] = Field(
- None, description="Optional. MagicPrompt usage ('AUTO', 'ON', 'OFF')."
- )
- model: str = Field(..., description="The model used (e.g., 'V_2', 'V_2A_TURBO')")
- negative_prompt: Optional[str] = Field(
- None,
- description='Optional. Description of what to exclude. Only for V_1, V_1_TURBO, V_2, V_2_TURBO.',
- )
- num_images: Optional[int] = Field(
- 1,
- description='Optional. Number of images to generate (1-8). Defaults to 1.',
- ge=1,
- le=8,
- )
- prompt: str = Field(
- ..., description='Required. The prompt to use to generate the image.'
- )
- resolution: Optional[str] = Field(
- None,
- description="Optional. Resolution (e.g., 'RESOLUTION_1024_1024'). Only for model V_2. Cannot be used with aspect_ratio.",
- )
- seed: Optional[int] = Field(
- None,
- description='Optional. A number between 0 and 2147483647.',
- ge=0,
- le=2147483647,
- )
- style_type: Optional[str] = Field(
- None,
- description="Optional. Style type ('AUTO', 'GENERAL', 'REALISTIC', 'DESIGN', 'RENDER_3D', 'ANIME'). Only for models V_2 and above.",
- )
-
-
-class IdeogramGenerateRequest(BaseModel):
- image_request: ImageRequest = Field(
- ..., description='The image generation request parameters.'
- )
-
-
class Datum(BaseModel):
is_image_safe: Optional[bool] = Field(
None, description='Indicates whether the image is considered safe.'
@@ -113,20 +66,6 @@ class StyleCode(RootModel[str]):
root: str = Field(..., pattern='^[0-9A-Fa-f]{8}$')
-class Datum1(BaseModel):
- is_image_safe: Optional[bool] = None
- prompt: Optional[str] = None
- resolution: Optional[str] = None
- seed: Optional[int] = None
- style_type: Optional[str] = None
- url: Optional[str] = None
-
-
-class IdeogramV3IdeogramResponse(BaseModel):
- created: Optional[datetime] = None
- data: Optional[List[Datum1]] = None
-
-
class RenderingSpeed1(str, Enum):
TURBO = 'TURBO'
DEFAULT = 'DEFAULT'
diff --git a/comfy_api_nodes/nodes_gemini.py b/comfy_api_nodes/nodes_gemini.py
index a63625ada..aa992802d 100644
--- a/comfy_api_nodes/nodes_gemini.py
+++ b/comfy_api_nodes/nodes_gemini.py
@@ -13,7 +13,7 @@ import torch
from typing_extensions import override
import folder_paths
-from comfy_api.latest import IO, ComfyExtension, Input, Types
+from comfy_api.latest import IO, ComfyExtension, Input, InputImpl, Types
from comfy_api_nodes.apis.gemini import (
GeminiContent,
GeminiFileData,
@@ -37,6 +37,7 @@ from comfy_api_nodes.util import (
audio_to_base64_string,
bytesio_to_image_tensor,
download_url_to_image_tensor,
+ download_url_to_video_output,
get_number_of_images,
sync_op,
tensor_to_base64_string,
@@ -45,6 +46,7 @@ from comfy_api_nodes.util import (
upload_images_to_comfyapi,
upload_video_to_comfyapi,
validate_string,
+ validate_video_duration,
video_to_base64_string,
)
@@ -229,10 +231,29 @@ async def get_image_from_response(response: GeminiGenerateContentResponse, thoug
return torch.cat(image_tensors, dim=0)
+async def get_video_from_response(
+ response: GeminiGenerateContentResponse, cls: type[IO.ComfyNode] | None = None
+) -> InputImpl.VideoFromFile:
+ parts = get_parts_by_type(response, "video/*")
+ for part in parts:
+ if part.inlineData and part.inlineData.data:
+ return InputImpl.VideoFromFile(BytesIO(base64.b64decode(part.inlineData.data)))
+ if part.fileData and part.fileData.fileUri:
+ return await download_url_to_video_output(part.fileData.fileUri, cls=cls)
+ model_message = get_text_from_response(response).strip()
+ if model_message:
+ raise ValueError(f"Gemini did not generate a video. Model response: {model_message}")
+ raise ValueError(
+ "Gemini did not generate a video. Try rephrasing your prompt, "
+ "shortening the requested duration, or reducing the number of input images/videos."
+ )
+
+
def calculate_tokens_price(response: GeminiGenerateContentResponse) -> float | None:
if not response.modelVersion:
return None
# Define prices (Cost per 1,000,000 tokens), see https://cloud.google.com/vertex-ai/generative-ai/pricing
+ output_video_tokens_price = 0.0
if response.modelVersion == "gemini-2.5-pro":
input_tokens_price = 1.25
output_text_tokens_price = 10.0
@@ -249,18 +270,27 @@ def calculate_tokens_price(response: GeminiGenerateContentResponse) -> float | N
input_tokens_price = 2
output_text_tokens_price = 12.0
output_image_tokens_price = 0.0
- elif response.modelVersion == "gemini-3.1-flash-lite-preview":
+ elif response.modelVersion in ("gemini-3.1-flash-lite-preview", "gemini-3.1-flash-lite"):
input_tokens_price = 0.25
output_text_tokens_price = 1.50
output_image_tokens_price = 0.0
- elif response.modelVersion == "gemini-3-pro-image-preview":
+ elif response.modelVersion in ("gemini-3-pro-image-preview", "gemini-3-pro-image"):
input_tokens_price = 2
output_text_tokens_price = 12.0
output_image_tokens_price = 120.0
- elif response.modelVersion == "gemini-3.1-flash-image-preview":
+ elif response.modelVersion in ("gemini-3.1-flash-image-preview", "gemini-3.1-flash-image"):
input_tokens_price = 0.5
output_text_tokens_price = 3.0
output_image_tokens_price = 60.0
+ elif response.modelVersion == "gemini-3.1-flash-lite-image":
+ input_tokens_price = 0.25
+ output_text_tokens_price = 1.50
+ output_image_tokens_price = 30.0
+ elif response.modelVersion == "gemini-omni-flash-preview":
+ input_tokens_price = 2.145
+ output_text_tokens_price = 12.87
+ output_image_tokens_price = 0.0
+ output_video_tokens_price = 25.025
else:
return None
final_price = response.usageMetadata.promptTokenCount * input_tokens_price
@@ -268,6 +298,8 @@ def calculate_tokens_price(response: GeminiGenerateContentResponse) -> float | N
for i in response.usageMetadata.candidatesTokensDetails:
if i.modality == Modality.IMAGE:
final_price += output_image_tokens_price * i.tokenCount # for Nano Banana models
+ elif i.modality == Modality.VIDEO:
+ final_price += output_video_tokens_price * i.tokenCount # for Omni Flash
else:
final_price += output_text_tokens_price * i.tokenCount
if response.usageMetadata.thoughtsTokenCount:
@@ -1302,7 +1334,7 @@ class GeminiNanoBanana2(IO.ComfyNode):
)
-def _nano_banana_2_v2_model_inputs():
+def _nano_banana_2_v2_model_inputs(resolutions: list[str]):
return [
IO.Combo.Input(
"aspect_ratio",
@@ -1329,8 +1361,8 @@ def _nano_banana_2_v2_model_inputs():
),
IO.Combo.Input(
"resolution",
- options=["1K", "2K", "4K"],
- tooltip="Target output resolution. For 2K/4K the native Gemini upscaler is used.",
+ options=resolutions,
+ tooltip="Target output resolution.",
),
IO.Combo.Input(
"thinking_level",
@@ -1376,7 +1408,11 @@ class GeminiNanoBanana2V2(IO.ComfyNode):
options=[
IO.DynamicCombo.Option(
"Nano Banana 2 (Gemini 3.1 Flash Image)",
- _nano_banana_2_v2_model_inputs(),
+ _nano_banana_2_v2_model_inputs(resolutions=["1K", "2K", "4K"]),
+ ),
+ IO.DynamicCombo.Option(
+ "Nano Banana 2 Lite",
+ _nano_banana_2_v2_model_inputs(resolutions=["1K"]),
),
],
),
@@ -1445,9 +1481,13 @@ class GeminiNanoBanana2V2(IO.ComfyNode):
depends_on=IO.PriceBadgeDepends(widgets=["model", "model.resolution"]),
expr="""
(
- $r := $lookup(widgets, "model.resolution");
- $prices := {"1k": 0.0696, "2k": 0.1014, "4k": 0.154};
- {"type":"usd","usd": $lookup($prices, $r), "format":{"suffix":"/Image","approximate":true}}
+ $contains(widgets.model, "lite")
+ ? {"type":"usd","usd": 0.034, "format":{"suffix":"/Image","approximate":true}}
+ : (
+ $r := $lookup(widgets, "model.resolution");
+ $prices := {"1k": 0.0696, "2k": 0.1014, "4k": 0.154};
+ {"type":"usd","usd": $lookup($prices, $r), "format":{"suffix":"/Image","approximate":true}}
+ )
)
""",
),
@@ -1468,6 +1508,8 @@ class GeminiNanoBanana2V2(IO.ComfyNode):
model_choice = model["model"]
if model_choice == "Nano Banana 2 (Gemini 3.1 Flash Image)":
model_id = "gemini-3.1-flash-image-preview"
+ elif model_choice == "Nano Banana 2 Lite":
+ model_id = "gemini-3.1-flash-lite-image"
else:
model_id = model_choice
@@ -1517,6 +1559,149 @@ class GeminiNanoBanana2V2(IO.ComfyNode):
)
+OMNI_MAX_IMAGES = 14
+OMNI_MAX_VIDEOS = 3
+
+OMNI_MODELS: dict[str, str] = {
+ "Omni Flash": "gemini-omni-flash-preview",
+}
+
+
+def _omni_flash_inputs() -> list[Input]:
+ """Per-model inputs for the Omni video DynamicCombo (prompt + reference media + sampling)."""
+ return [
+ IO.String.Input(
+ "prompt",
+ multiline=True,
+ default="",
+ tooltip="Describe the video to generate. Specify the length and aspect ratio directly in the "
+ 'prompt, e.g. "a 6-second clip in 16:9". Length may be 3-10 seconds; the aspect ratio must be '
+ "16:9 (landscape) or 9:16 (portrait). The output is 720p, 24 FPS, with audio.",
+ ),
+ IO.Autogrow.Input(
+ "images",
+ template=IO.Autogrow.TemplateNames(
+ IO.Image.Input("image"),
+ names=[f"image_{i}" for i in range(1, OMNI_MAX_IMAGES + 1)],
+ min=0,
+ ),
+ tooltip=f"Optional reference image(s) to guide or animate the video. Up to {OMNI_MAX_IMAGES} images.",
+ ),
+ IO.Autogrow.Input(
+ "videos",
+ template=IO.Autogrow.TemplateNames(
+ IO.Video.Input("video"),
+ names=[f"video_{i}" for i in range(1, OMNI_MAX_VIDEOS + 1)],
+ min=0,
+ ),
+ tooltip=f"Optional reference video(s) to guide or edit. Up to {OMNI_MAX_VIDEOS} videos, "
+ f"each up to 10 seconds long.",
+ ),
+ IO.Float.Input(
+ "temperature",
+ default=1.0,
+ min=0.0,
+ max=2.0,
+ step=0.01,
+ tooltip="Controls randomness. Lower is more focused/deterministic, higher is more varied.",
+ advanced=True,
+ ),
+ IO.Float.Input(
+ "top_p",
+ default=0.95,
+ min=0.0,
+ max=1.0,
+ step=0.01,
+ tooltip="Nucleus sampling: sample from the smallest token set whose cumulative probability reaches top_p.",
+ advanced=True,
+ ),
+ ]
+
+
+class GeminiVideoOmni(IO.ComfyNode):
+
+ @classmethod
+ def define_schema(cls):
+ return IO.Schema(
+ node_id="GeminiVideoOmni",
+ display_name="Google Gemini Omni (Video)",
+ category="partner/video/Gemini",
+ essentials_category="Video Generation",
+ description="Generate a video with audio from a text prompt using Google's Gemini Omni Flash model. "
+ "Optionally provide reference images and/or videos to guide or edit the result. Describe the desired "
+ "length (3-10s) and aspect ratio (16:9 or 9:16) directly in the prompt.",
+ inputs=[
+ IO.DynamicCombo.Input(
+ "model",
+ options=[
+ IO.DynamicCombo.Option("Omni Flash", _omni_flash_inputs()),
+ ],
+ tooltip="The Gemini video model used to generate the video.",
+ ),
+ IO.Int.Input(
+ "seed",
+ default=42,
+ min=0,
+ max=2147483647,
+ control_after_generate=True,
+ tooltip="Seed controls whether the node should re-run; "
+ "results are non-deterministic regardless of seed.",
+ ),
+ ],
+ outputs=[
+ IO.Video.Output(),
+ IO.String.Output(),
+ ],
+ hidden=[
+ IO.Hidden.auth_token_comfy_org,
+ IO.Hidden.api_key_comfy_org,
+ IO.Hidden.unique_id,
+ ],
+ is_api_node=True,
+ price_badge=IO.PriceBadge(
+ expr='{"type":"usd","usd":0.146,"format":{"suffix":"/second","approximate":true}}'
+ ),
+ )
+
+ @classmethod
+ async def execute(cls, model: dict, seed: int) -> IO.NodeOutput:
+ prompt = model.get("prompt") or ""
+ validate_string(prompt, strip_whitespace=True, min_length=1)
+ model_id = OMNI_MODELS[model["model"]]
+
+ images = [t for t in (model.get("images") or {}).values() if t is not None]
+ videos = [v for v in (model.get("videos") or {}).values() if v is not None]
+ if sum(get_number_of_images(t) for t in images) > OMNI_MAX_IMAGES:
+ raise ValueError(f"The current maximum number of supported images is {OMNI_MAX_IMAGES}.")
+ if len(videos) > OMNI_MAX_VIDEOS:
+ raise ValueError(f"The current maximum number of supported videos is {OMNI_MAX_VIDEOS}.")
+ for video in videos:
+ validate_video_duration(video, max_duration=10)
+
+ parts: list[GeminiPart] = []
+ if images or videos:
+ parts.extend(await build_gemini_media_parts(cls, images, [], videos))
+ parts.append(GeminiPart(text=prompt))
+ response = await sync_op(
+ cls,
+ ApiEndpoint(path=f"{GEMINI_BASE_ENDPOINT}/{model_id}", method="POST"),
+ data=GeminiGenerateContentRequest(
+ contents=[GeminiContent(role=GeminiRole.user, parts=parts)],
+ generationConfig=GeminiGenerationConfig(
+ responseModalities=["TEXT", "VIDEO"],
+ temperature=model.get("temperature", 1.0),
+ topP=model.get("top_p", 0.95),
+ ),
+ ),
+ response_model=GeminiGenerateContentResponse,
+ price_extractor=calculate_tokens_price,
+ )
+ return IO.NodeOutput(
+ await get_video_from_response(response, cls=cls),
+ get_text_from_response(response),
+ )
+
+
class GeminiExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
@@ -1527,6 +1712,7 @@ class GeminiExtension(ComfyExtension):
GeminiImage2,
GeminiNanoBanana2,
GeminiNanoBanana2V2,
+ GeminiVideoOmni,
GeminiInputFiles,
]
diff --git a/comfy_api_nodes/nodes_ideogram.py b/comfy_api_nodes/nodes_ideogram.py
index 3b914a850..cc0467987 100644
--- a/comfy_api_nodes/nodes_ideogram.py
+++ b/comfy_api_nodes/nodes_ideogram.py
@@ -5,9 +5,7 @@ from PIL import Image
import numpy as np
import torch
from comfy_api_nodes.apis.ideogram import (
- IdeogramGenerateRequest,
IdeogramGenerateResponse,
- ImageRequest,
IdeogramV3Request,
IdeogramV3EditRequest,
IdeogramV4Request,
@@ -21,101 +19,6 @@ from comfy_api_nodes.util import (
validate_string,
)
-V1_V1_RES_MAP = {
- "Auto":"AUTO",
- "512 x 1536":"RESOLUTION_512_1536",
- "576 x 1408":"RESOLUTION_576_1408",
- "576 x 1472":"RESOLUTION_576_1472",
- "576 x 1536":"RESOLUTION_576_1536",
- "640 x 1024":"RESOLUTION_640_1024",
- "640 x 1344":"RESOLUTION_640_1344",
- "640 x 1408":"RESOLUTION_640_1408",
- "640 x 1472":"RESOLUTION_640_1472",
- "640 x 1536":"RESOLUTION_640_1536",
- "704 x 1152":"RESOLUTION_704_1152",
- "704 x 1216":"RESOLUTION_704_1216",
- "704 x 1280":"RESOLUTION_704_1280",
- "704 x 1344":"RESOLUTION_704_1344",
- "704 x 1408":"RESOLUTION_704_1408",
- "704 x 1472":"RESOLUTION_704_1472",
- "720 x 1280":"RESOLUTION_720_1280",
- "736 x 1312":"RESOLUTION_736_1312",
- "768 x 1024":"RESOLUTION_768_1024",
- "768 x 1088":"RESOLUTION_768_1088",
- "768 x 1152":"RESOLUTION_768_1152",
- "768 x 1216":"RESOLUTION_768_1216",
- "768 x 1232":"RESOLUTION_768_1232",
- "768 x 1280":"RESOLUTION_768_1280",
- "768 x 1344":"RESOLUTION_768_1344",
- "832 x 960":"RESOLUTION_832_960",
- "832 x 1024":"RESOLUTION_832_1024",
- "832 x 1088":"RESOLUTION_832_1088",
- "832 x 1152":"RESOLUTION_832_1152",
- "832 x 1216":"RESOLUTION_832_1216",
- "832 x 1248":"RESOLUTION_832_1248",
- "864 x 1152":"RESOLUTION_864_1152",
- "896 x 960":"RESOLUTION_896_960",
- "896 x 1024":"RESOLUTION_896_1024",
- "896 x 1088":"RESOLUTION_896_1088",
- "896 x 1120":"RESOLUTION_896_1120",
- "896 x 1152":"RESOLUTION_896_1152",
- "960 x 832":"RESOLUTION_960_832",
- "960 x 896":"RESOLUTION_960_896",
- "960 x 1024":"RESOLUTION_960_1024",
- "960 x 1088":"RESOLUTION_960_1088",
- "1024 x 640":"RESOLUTION_1024_640",
- "1024 x 768":"RESOLUTION_1024_768",
- "1024 x 832":"RESOLUTION_1024_832",
- "1024 x 896":"RESOLUTION_1024_896",
- "1024 x 960":"RESOLUTION_1024_960",
- "1024 x 1024":"RESOLUTION_1024_1024",
- "1088 x 768":"RESOLUTION_1088_768",
- "1088 x 832":"RESOLUTION_1088_832",
- "1088 x 896":"RESOLUTION_1088_896",
- "1088 x 960":"RESOLUTION_1088_960",
- "1120 x 896":"RESOLUTION_1120_896",
- "1152 x 704":"RESOLUTION_1152_704",
- "1152 x 768":"RESOLUTION_1152_768",
- "1152 x 832":"RESOLUTION_1152_832",
- "1152 x 864":"RESOLUTION_1152_864",
- "1152 x 896":"RESOLUTION_1152_896",
- "1216 x 704":"RESOLUTION_1216_704",
- "1216 x 768":"RESOLUTION_1216_768",
- "1216 x 832":"RESOLUTION_1216_832",
- "1232 x 768":"RESOLUTION_1232_768",
- "1248 x 832":"RESOLUTION_1248_832",
- "1280 x 704":"RESOLUTION_1280_704",
- "1280 x 720":"RESOLUTION_1280_720",
- "1280 x 768":"RESOLUTION_1280_768",
- "1280 x 800":"RESOLUTION_1280_800",
- "1312 x 736":"RESOLUTION_1312_736",
- "1344 x 640":"RESOLUTION_1344_640",
- "1344 x 704":"RESOLUTION_1344_704",
- "1344 x 768":"RESOLUTION_1344_768",
- "1408 x 576":"RESOLUTION_1408_576",
- "1408 x 640":"RESOLUTION_1408_640",
- "1408 x 704":"RESOLUTION_1408_704",
- "1472 x 576":"RESOLUTION_1472_576",
- "1472 x 640":"RESOLUTION_1472_640",
- "1472 x 704":"RESOLUTION_1472_704",
- "1536 x 512":"RESOLUTION_1536_512",
- "1536 x 576":"RESOLUTION_1536_576",
- "1536 x 640":"RESOLUTION_1536_640",
-}
-
-V1_V2_RATIO_MAP = {
- "1:1":"ASPECT_1_1",
- "4:3":"ASPECT_4_3",
- "3:4":"ASPECT_3_4",
- "16:9":"ASPECT_16_9",
- "9:16":"ASPECT_9_16",
- "2:1":"ASPECT_2_1",
- "1:2":"ASPECT_1_2",
- "3:2":"ASPECT_3_2",
- "2:3":"ASPECT_2_3",
- "4:5":"ASPECT_4_5",
- "5:4":"ASPECT_5_4",
-}
V3_RATIO_MAP = {
"1:3":"1x3",
@@ -229,298 +132,6 @@ async def download_and_process_images(image_urls):
return stacked_tensors
-class IdeogramV1(IO.ComfyNode):
-
- @classmethod
- def define_schema(cls):
- return IO.Schema(
- node_id="IdeogramV1",
- display_name="Ideogram V1",
- category="partner/image/Ideogram",
- description="Generates images using the Ideogram V1 model.",
- inputs=[
- IO.String.Input(
- "prompt",
- multiline=True,
- default="",
- tooltip="Prompt for the image generation",
- ),
- IO.Boolean.Input(
- "turbo",
- default=False,
- tooltip="Whether to use turbo mode (faster generation, potentially lower quality)",
- ),
- IO.Combo.Input(
- "aspect_ratio",
- options=list(V1_V2_RATIO_MAP.keys()),
- default="1:1",
- tooltip="The aspect ratio for image generation.",
- optional=True,
- ),
- IO.Combo.Input(
- "magic_prompt_option",
- options=["AUTO", "ON", "OFF"],
- default="AUTO",
- tooltip="Determine if MagicPrompt should be used in generation",
- optional=True,
- advanced=True,
- ),
- IO.Int.Input(
- "seed",
- default=0,
- min=0,
- max=2147483647,
- step=1,
- control_after_generate=True,
- display_mode=IO.NumberDisplay.number,
- optional=True,
- ),
- IO.String.Input(
- "negative_prompt",
- multiline=True,
- default="",
- tooltip="Description of what to exclude from the image",
- optional=True,
- ),
- IO.Int.Input(
- "num_images",
- default=1,
- min=1,
- max=8,
- step=1,
- display_mode=IO.NumberDisplay.number,
- optional=True,
- ),
- ],
- outputs=[
- IO.Image.Output(),
- ],
- hidden=[
- IO.Hidden.auth_token_comfy_org,
- IO.Hidden.api_key_comfy_org,
- IO.Hidden.unique_id,
- ],
- is_api_node=True,
- price_badge=IO.PriceBadge(
- depends_on=IO.PriceBadgeDepends(widgets=["num_images", "turbo"]),
- expr="""
- (
- $n := widgets.num_images;
- $base := (widgets.turbo = true) ? 0.0286 : 0.0858;
- {"type":"usd","usd": $round($base * $n, 2)}
- )
- """,
- ),
- )
-
- @classmethod
- async def execute(
- cls,
- prompt,
- turbo=False,
- aspect_ratio="1:1",
- magic_prompt_option="AUTO",
- seed=0,
- negative_prompt="",
- num_images=1,
- ):
- # Determine the model based on turbo setting
- aspect_ratio = V1_V2_RATIO_MAP.get(aspect_ratio, None)
- model = "V_1_TURBO" if turbo else "V_1"
-
- response = await sync_op(
- cls,
- ApiEndpoint(path="/proxy/ideogram/generate", method="POST"),
- response_model=IdeogramGenerateResponse,
- data=IdeogramGenerateRequest(
- image_request=ImageRequest(
- prompt=prompt,
- model=model,
- num_images=num_images,
- seed=seed,
- aspect_ratio=aspect_ratio if aspect_ratio != "ASPECT_1_1" else None,
- magic_prompt_option=(magic_prompt_option if magic_prompt_option != "AUTO" else None),
- negative_prompt=negative_prompt if negative_prompt else None,
- )
- ),
- max_retries=1,
- )
-
- if not response.data or len(response.data) == 0:
- raise Exception("No images were generated in the response")
-
- image_urls = [image_data.url for image_data in response.data if image_data.url]
- if not image_urls:
- raise Exception("No image URLs were generated in the response")
- return IO.NodeOutput(await download_and_process_images(image_urls))
-
-
-class IdeogramV2(IO.ComfyNode):
-
- @classmethod
- def define_schema(cls):
- return IO.Schema(
- node_id="IdeogramV2",
- display_name="Ideogram V2",
- category="partner/image/Ideogram",
- description="Generates images using the Ideogram V2 model.",
- inputs=[
- IO.String.Input(
- "prompt",
- multiline=True,
- default="",
- tooltip="Prompt for the image generation",
- ),
- IO.Boolean.Input(
- "turbo",
- default=False,
- tooltip="Whether to use turbo mode (faster generation, potentially lower quality)",
- ),
- IO.Combo.Input(
- "aspect_ratio",
- options=list(V1_V2_RATIO_MAP.keys()),
- default="1:1",
- tooltip="The aspect ratio for image generation. Ignored if resolution is not set to AUTO.",
- optional=True,
- ),
- IO.Combo.Input(
- "resolution",
- options=list(V1_V1_RES_MAP.keys()),
- default="Auto",
- tooltip="The resolution for image generation. "
- "If not set to AUTO, this overrides the aspect_ratio setting.",
- optional=True,
- ),
- IO.Combo.Input(
- "magic_prompt_option",
- options=["AUTO", "ON", "OFF"],
- default="AUTO",
- tooltip="Determine if MagicPrompt should be used in generation",
- optional=True,
- advanced=True,
- ),
- IO.Int.Input(
- "seed",
- default=0,
- min=0,
- max=2147483647,
- step=1,
- control_after_generate=True,
- display_mode=IO.NumberDisplay.number,
- optional=True,
- ),
- IO.Combo.Input(
- "style_type",
- options=["AUTO", "GENERAL", "REALISTIC", "DESIGN", "RENDER_3D", "ANIME"],
- default="NONE",
- tooltip="Style type for generation (V2 only)",
- optional=True,
- advanced=True,
- ),
- IO.String.Input(
- "negative_prompt",
- multiline=True,
- default="",
- tooltip="Description of what to exclude from the image",
- optional=True,
- ),
- IO.Int.Input(
- "num_images",
- default=1,
- min=1,
- max=8,
- step=1,
- display_mode=IO.NumberDisplay.number,
- optional=True,
- ),
- #"color_palette": (
- # IO.STRING,
- # {
- # "multiline": False,
- # "default": "",
- # "tooltip": "Color palette preset name or hex colors with weights",
- # },
- #),
- ],
- outputs=[
- IO.Image.Output(),
- ],
- hidden=[
- IO.Hidden.auth_token_comfy_org,
- IO.Hidden.api_key_comfy_org,
- IO.Hidden.unique_id,
- ],
- is_api_node=True,
- price_badge=IO.PriceBadge(
- depends_on=IO.PriceBadgeDepends(widgets=["num_images", "turbo"]),
- expr="""
- (
- $n := widgets.num_images;
- $base := (widgets.turbo = true) ? 0.0715 : 0.1144;
- {"type":"usd","usd": $round($base * $n, 2)}
- )
- """,
- ),
- )
-
- @classmethod
- async def execute(
- cls,
- prompt,
- turbo=False,
- aspect_ratio="1:1",
- resolution="Auto",
- magic_prompt_option="AUTO",
- seed=0,
- style_type="NONE",
- negative_prompt="",
- num_images=1,
- color_palette="",
- ):
- aspect_ratio = V1_V2_RATIO_MAP.get(aspect_ratio, None)
- resolution = V1_V1_RES_MAP.get(resolution, None)
- # Determine the model based on turbo setting
- model = "V_2_TURBO" if turbo else "V_2"
-
- # Handle resolution vs aspect_ratio logic
- # If resolution is not AUTO, it overrides aspect_ratio
- final_resolution = None
- final_aspect_ratio = None
-
- if resolution != "AUTO":
- final_resolution = resolution
- else:
- final_aspect_ratio = aspect_ratio if aspect_ratio != "ASPECT_1_1" else None
-
- response = await sync_op(
- cls,
- endpoint=ApiEndpoint(path="/proxy/ideogram/generate", method="POST"),
- response_model=IdeogramGenerateResponse,
- data=IdeogramGenerateRequest(
- image_request=ImageRequest(
- prompt=prompt,
- model=model,
- num_images=num_images,
- seed=seed,
- aspect_ratio=final_aspect_ratio,
- resolution=final_resolution,
- magic_prompt_option=(magic_prompt_option if magic_prompt_option != "AUTO" else None),
- style_type=style_type if style_type != "NONE" else None,
- negative_prompt=negative_prompt if negative_prompt else None,
- color_palette=color_palette if color_palette else None,
- )
- ),
- max_retries=1,
- )
- if not response.data or len(response.data) == 0:
- raise Exception("No images were generated in the response")
-
- image_urls = [image_data.url for image_data in response.data if image_data.url]
- if not image_urls:
- raise Exception("No image URLs were generated in the response")
- return IO.NodeOutput(await download_and_process_images(image_urls))
-
-
class IdeogramV3(IO.ComfyNode):
@classmethod
@@ -917,8 +528,6 @@ class IdeogramExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
return [
- IdeogramV1,
- IdeogramV2,
IdeogramV3,
IdeogramV4,
]
diff --git a/comfy_extras/nodes_cond.py b/comfy_extras/nodes_cond.py
index b745a43af..c8091b7a4 100644
--- a/comfy_extras/nodes_cond.py
+++ b/comfy_extras/nodes_cond.py
@@ -8,7 +8,8 @@ class CLIPTextEncodeControlnet(io.ComfyNode):
def define_schema(cls) -> io.Schema:
return io.Schema(
node_id="CLIPTextEncodeControlnet",
- category="experimental/conditioning",
+ display_name="CLIP Text Encode (Controlnet)",
+ category="model/conditioning",
inputs=[
io.Clip.Input("clip"),
io.Conditioning.Input("conditioning"),
@@ -35,11 +36,12 @@ class T5TokenizerOptions(io.ComfyNode):
def define_schema(cls) -> io.Schema:
return io.Schema(
node_id="T5TokenizerOptions",
- category="experimental/conditioning",
+ display_name="T5 Tokenizer Options",
+ category="model/conditioning",
inputs=[
io.Clip.Input("clip"),
- io.Int.Input("min_padding", default=0, min=0, max=10000, step=1, advanced=True),
- io.Int.Input("min_length", default=0, min=0, max=10000, step=1, advanced=True),
+ io.Int.Input("min_padding", default=0, min=0, max=10000, step=1),
+ io.Int.Input("min_length", default=0, min=0, max=10000, step=1),
],
outputs=[io.Clip.Output()],
is_experimental=True,
diff --git a/comfy_extras/nodes_custom_sampler.py b/comfy_extras/nodes_custom_sampler.py
index c9d7e06fc..56ef5f526 100644
--- a/comfy_extras/nodes_custom_sampler.py
+++ b/comfy_extras/nodes_custom_sampler.py
@@ -1070,7 +1070,7 @@ class AddNoise(io.ComfyNode):
def define_schema(cls):
return io.Schema(
node_id="AddNoise",
- category="experimental/custom_sampling/noise",
+ category="model/sampling/noise",
is_experimental=True,
inputs=[
io.Model.Input("model"),
@@ -1120,7 +1120,7 @@ class ManualSigmas(io.ComfyNode):
return io.Schema(
node_id="ManualSigmas",
search_aliases=["custom noise schedule", "define sigmas"],
- category="experimental/custom_sampling",
+ category="model/sampling/sigmas",
is_experimental=True,
inputs=[
io.String.Input("sigmas", default="1, 0.5", multiline=False)
diff --git a/comfy_extras/nodes_photomaker.py b/comfy_extras/nodes_photomaker.py
index 8a2248572..72fad1673 100644
--- a/comfy_extras/nodes_photomaker.py
+++ b/comfy_extras/nodes_photomaker.py
@@ -123,7 +123,8 @@ class PhotoMakerLoader(io.ComfyNode):
def define_schema(cls):
return io.Schema(
node_id="PhotoMakerLoader",
- category="experimental/photomaker",
+ display_name="Load PhotoMaker Model",
+ category="model/loaders",
inputs=[
io.Combo.Input("photomaker_model_name", options=folder_paths.get_filename_list("photomaker")),
],
@@ -149,7 +150,8 @@ class PhotoMakerEncode(io.ComfyNode):
def define_schema(cls):
return io.Schema(
node_id="PhotoMakerEncode",
- category="experimental/photomaker",
+ display_name="PhotoMaker Encode",
+ category="model/conditioning/photomaker",
inputs=[
io.Photomaker.Input("photomaker"),
io.Image.Input("image"),
diff --git a/comfy_extras/nodes_stable_cascade.py b/comfy_extras/nodes_stable_cascade.py
index 6a78ffb47..ddfb4f2b0 100644
--- a/comfy_extras/nodes_stable_cascade.py
+++ b/comfy_extras/nodes_stable_cascade.py
@@ -119,7 +119,7 @@ class StableCascade_SuperResolutionControlnet(io.ComfyNode):
def define_schema(cls):
return io.Schema(
node_id="StableCascade_SuperResolutionControlnet",
- category="experimental/stable_cascade",
+ category="experimental/stable cascade",
is_experimental=True,
inputs=[
io.Image.Input("image"),
diff --git a/comfy_extras/nodes_triposplat.py b/comfy_extras/nodes_triposplat.py
index 7bf4703fe..c892213e4 100644
--- a/comfy_extras/nodes_triposplat.py
+++ b/comfy_extras/nodes_triposplat.py
@@ -143,7 +143,7 @@ class VAEDecodeTripoSplat(IO.ComfyNode):
return IO.Schema(
node_id="VAEDecodeTripoSplat",
display_name="TripoSplat Decode",
- category="3d/latent",
+ category="model/latent/triposplat",
description="Decode the sampled TripoSplat latent into a 3D gaussian splat. "
"Modify the number of gaussians to vary the density.",
inputs=[
@@ -188,7 +188,7 @@ class TripoSplatSamplingPreview(IO.ComfyNode):
return IO.Schema(
node_id="TripoSplatSamplingPreview",
display_name="TripoSplat Sampling Preview",
- category="3d/latent",
+ category="model/latent/triposplat",
description="Patch the TripoSplat model for the standard Ksampler node to show a live decoded "
"gaussian splat preview at each step.",
inputs=[
diff --git a/comfyui_version.py b/comfyui_version.py
index f8db561ba..8e9967f1b 100644
--- a/comfyui_version.py
+++ b/comfyui_version.py
@@ -1,3 +1,3 @@
# This file is automatically generated by the build process when version is
# updated in pyproject.toml.
-__version__ = "0.26.0"
+__version__ = "0.27.0"
diff --git a/folder_paths.py b/folder_paths.py
index 7304e1b73..ee048b0f2 100644
--- a/folder_paths.py
+++ b/folder_paths.py
@@ -264,6 +264,59 @@ def annotated_filepath(name: str) -> tuple[str, str | None]:
return name, base_dir
+# Content types a browser may execute or render inline. File endpoints that
+# serve user-controlled content must force these to download (and ideally set
+# Content-Disposition: attachment) to avoid stored XSS. Centralised here so the
+# /view and /userdata handlers can't drift apart. mimetypes.guess_type may
+# return either the text/* or application/* spelling depending on platform, so
+# both are listed.
+DANGEROUS_CONTENT_TYPES = {
+ 'text/html', 'text/html-sandboxed', 'application/xhtml+xml',
+ 'text/javascript', 'application/javascript', 'application/x-javascript',
+ 'application/ecmascript', 'text/css',
+ 'image/svg+xml', 'application/xml', 'text/xml',
+ # message/rfc822 (.mht/.mhtml) can carry script in some browsers.
+ 'message/rfc822',
+}
+
+
+def is_dangerous_content_type(content_type: str | None) -> bool:
+ """Return True if a browser may execute or render `content_type` inline.
+
+ Normalises before matching so the check can't be slipped past with a
+ charset/boundary parameter (``text/html; charset=utf-8``) or casing
+ (``TEXT/HTML``). Any XML dialect (``*+xml`` or ``*/xml``) is treated as
+ dangerous because XML can carry inline script via stylesheet/entity tricks,
+ which also covers the ``application/{xslt,rss,atom,rdf}+xml`` family without
+ enumerating each one. Endpoints serving user-controlled content should route
+ a dangerous type to ``application/octet-stream`` + ``Content-Disposition:
+ attachment`` + ``X-Content-Type-Options: nosniff``.
+ """
+ if not content_type:
+ return False
+ normalized = content_type.split(';', 1)[0].strip().lower()
+ if normalized in DANGEROUS_CONTENT_TYPES:
+ return True
+ return normalized.endswith('+xml') or normalized.endswith('/xml')
+
+
+def is_within_directory(directory: str, target: str) -> bool:
+ """Return True if `target` resolves to a path inside `directory`.
+
+ Uses realpath on both operands so that a symlink placed inside `directory`
+ that points elsewhere cannot escape the containment check at open time.
+ """
+ try:
+ directory = os.path.realpath(directory)
+ target = os.path.realpath(target)
+ return os.path.commonpath((directory, target)) == directory
+ except ValueError:
+ # ValueError is raised by realpath() on a path with an embedded null
+ # byte, and by commonpath() on Windows when the paths are on different
+ # drives. In either case the target is not safely within the directory.
+ return False
+
+
def get_annotated_filepath(name: str, default_dir: str | None=None) -> str:
name, base_dir = annotated_filepath(name)
@@ -273,7 +326,12 @@ def get_annotated_filepath(name: str, default_dir: str | None=None) -> str:
else:
base_dir = get_input_directory() # fallback path
- return os.path.join(base_dir, name)
+ filepath = os.path.abspath(os.path.join(base_dir, name))
+ # Prevent path traversal: the resolved path must stay within base_dir.
+ # repr() the name in the message so a crafted value can't inject log lines.
+ if not is_within_directory(base_dir, filepath):
+ raise ValueError("Invalid file path: {!r}".format(name))
+ return filepath
def exists_annotated_filepath(name) -> bool:
@@ -282,7 +340,10 @@ def exists_annotated_filepath(name) -> bool:
if base_dir is None:
base_dir = get_input_directory() # fallback path
- filepath = os.path.join(base_dir, name)
+ filepath = os.path.abspath(os.path.join(base_dir, name))
+ # Treat traversal attempts as non-existent rather than probing the filesystem.
+ if not is_within_directory(base_dir, filepath):
+ return False
return os.path.exists(filepath)
diff --git a/main.py b/main.py
index aa4ee2adb..20ec83c9e 100644
--- a/main.py
+++ b/main.py
@@ -403,7 +403,7 @@ def prompt_worker(q, server_instance):
hook_breaker_ac10a0.restore_functions()
if not asset_seeder.is_disabled():
- asset_seeder.enqueue_enrich(roots=("output",), compute_hashes=True)
+ asset_seeder.enqueue_enrich(roots=("output",), compute_hashes=args.enable_asset_hashing)
asset_seeder.resume()
@@ -458,7 +458,7 @@ def setup_database():
if dependencies_available():
init_db()
if args.enable_assets:
- if asset_seeder.start(roots=("models", "input", "output"), prune_first=True, compute_hashes=True):
+ if asset_seeder.start(roots=("models", "input", "output"), prune_first=True, compute_hashes=args.enable_asset_hashing):
logging.info("Background asset scan initiated for models, input, output")
except Exception as e:
if "database is locked" in str(e):
diff --git a/nodes.py b/nodes.py
index 77c577b9a..9043a8d0a 100644
--- a/nodes.py
+++ b/nodes.py
@@ -349,7 +349,7 @@ class VAEDecodeTiled:
RETURN_TYPES = ("IMAGE",)
FUNCTION = "decode"
- CATEGORY = "experimental"
+ CATEGORY = "model/latent"
def decode(self, vae, samples, tile_size, overlap=64, temporal_size=64, temporal_overlap=8):
if tile_size < overlap * 4:
@@ -396,7 +396,7 @@ class VAEEncodeTiled:
RETURN_TYPES = ("LATENT",)
FUNCTION = "encode"
- CATEGORY = "experimental"
+ CATEGORY = "model/latent"
def encode(self, vae, pixels, tile_size, overlap, temporal_size=64, temporal_overlap=8):
t = vae.encode_tiled(pixels, tile_x=tile_size, tile_y=tile_size, overlap=overlap, tile_t=temporal_size, overlap_t=temporal_overlap)
@@ -514,7 +514,7 @@ class SaveLatent:
OUTPUT_NODE = True
- CATEGORY = "experimental"
+ CATEGORY = "model/latent"
def save(self, samples, filename_prefix="ComfyUI", prompt=None, extra_pnginfo=None):
full_output_folder, filename, counter, subfolder, filename_prefix = folder_paths.get_save_image_path(filename_prefix, self.output_dir)
@@ -559,7 +559,7 @@ class LoadLatent:
files = [f for f in os.listdir(input_dir) if os.path.isfile(os.path.join(input_dir, f)) and f.endswith(".latent")]
return {"required": {"latent": [sorted(files), ]}, }
- CATEGORY = "experimental"
+ CATEGORY = "model/latent"
RETURN_TYPES = ("LATENT", )
FUNCTION = "load"
@@ -2155,6 +2155,8 @@ NODE_DISPLAY_NAME_MAPPINGS = {
"GLIGENTextBoxApply": "Apply GLIGEN Text Box",
"ConditioningZeroOut": "Conditioning Zero Out",
# Latent
+ "LoadLatent": "Load Latent",
+ "SaveLatent": "Save Latent",
"VAEEncodeForInpaint": "VAE Encode (for Inpainting)",
"SetLatentNoiseMask": "Set Latent Noise Mask",
"VAEDecode": "VAE Decode",
@@ -2189,7 +2191,6 @@ NODE_DISPLAY_NAME_MAPPINGS = {
"ImageSharpen": "Sharpen Image",
"ImageScaleToTotalPixels": "Scale Image to Total Pixels",
"GetImageSize": "Get Image Size",
- # experimental
"VAEDecodeTiled": "VAE Decode (Tiled)",
"VAEEncodeTiled": "VAE Encode (Tiled)",
}
diff --git a/pyproject.toml b/pyproject.toml
index 2e8a85d3f..8c17e410e 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "ComfyUI"
-version = "0.26.0"
+version = "0.27.0"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.10"
diff --git a/requirements.txt b/requirements.txt
index eb7230b49..1d9fe4137 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
comfyui-frontend-package==1.45.20
-comfyui-workflow-templates==0.10.7
+comfyui-workflow-templates==0.11.1
comfyui-embedded-docs==0.5.6
torch
torchsde
@@ -22,7 +22,7 @@ alembic
SQLAlchemy>=2.0.0
filelock
av>=16.0.0
-comfy-kitchen==0.2.15
+comfy-kitchen==0.2.16
comfy-aimdo==0.4.10
requests
simpleeval>=1.0.0
diff --git a/server.py b/server.py
index 361850f38..461ebe2f6 100644
--- a/server.py
+++ b/server.py
@@ -127,6 +127,7 @@ def create_cors_middleware(allowed_origin: str):
return cors_middleware
+
def is_loopback(host):
if host is None:
return False
@@ -616,15 +617,30 @@ class PromptServer():
or 'application/octet-stream'
)
- # For security, force certain mimetypes to download instead of display
- if content_type in {'text/html', 'text/html-sandboxed', 'application/xhtml+xml', 'text/javascript', 'text/css'}:
- content_type = 'application/octet-stream' # Forces download
+ # For security, force renderable/active types (HTML, JS,
+ # CSS, SVG, XML — anything that can carry inline '
+ files = {"file": ("evil.svg", svg, "image/svg+xml")}
+ form_data = {
+ "tags": json.dumps(["models", "checkpoints", "unit-tests", "svgxss"]),
+ "name": "evil.svg",
+ }
+ up = http.post(api_base + "/api/assets", files=files, data=form_data, timeout=120)
+ body = up.json()
+ assert up.status_code in (200, 201), body
+ aid = body["id"]
+ try:
+ r = http.get(f"{api_base}/api/assets/{aid}/content?disposition=inline", timeout=120)
+ r.content
+ assert r.status_code == 200
+ ct = r.headers.get("Content-Type", "").lower()
+ cd = r.headers.get("Content-Disposition", "").lower()
+ assert "svg" not in ct, f"SVG served with a renderable content type: {ct!r}"
+ assert ct.startswith("application/octet-stream"), f"expected octet-stream, got {ct!r}"
+ assert "attachment" in cd, f"inline disposition not overridden to attachment: {cd!r}"
+ assert r.headers.get("X-Content-Type-Options", "").lower() == "nosniff"
+ finally:
+ with contextlib.suppress(Exception):
+ http.delete(f"{api_base}/api/assets/{aid}", timeout=30)
+
+
def test_download_attachment_and_inline(http: requests.Session, api_base: str, seeded_asset: dict):
aid = seeded_asset["id"]
diff --git a/tests-unit/comfy_test/folder_path_test.py b/tests-unit/comfy_test/folder_path_test.py
index 775e15c36..3b398e60b 100644
--- a/tests-unit/comfy_test/folder_path_test.py
+++ b/tests-unit/comfy_test/folder_path_test.py
@@ -53,8 +53,11 @@ def test_annotated_filepath():
def test_get_annotated_filepath():
default_dir = "/default/dir"
- assert folder_paths.get_annotated_filepath("test.txt", default_dir) == os.path.join(default_dir, "test.txt")
- assert folder_paths.get_annotated_filepath("test.txt [output]") == os.path.join(folder_paths.get_output_directory(), "test.txt")
+ # get_annotated_filepath now normalizes with os.path.abspath (part of the
+ # GHSA-779p traversal hardening), so compare against the normalized form —
+ # on Windows abspath also prepends the current drive letter.
+ assert folder_paths.get_annotated_filepath("test.txt", default_dir) == os.path.abspath(os.path.join(default_dir, "test.txt"))
+ assert folder_paths.get_annotated_filepath("test.txt [output]") == os.path.abspath(os.path.join(folder_paths.get_output_directory(), "test.txt"))
def test_add_model_folder_path_append(clear_folder_paths):
folder_paths.add_model_folder_path("test_folder", "/default/path", is_default=True)
diff --git a/tests-unit/security_test/__init__.py b/tests-unit/security_test/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/tests-unit/security_test/test_ghsa_779p_02_preview_traversal.py b/tests-unit/security_test/test_ghsa_779p_02_preview_traversal.py
new file mode 100644
index 000000000..f17fd26ea
--- /dev/null
+++ b/tests-unit/security_test/test_ghsa_779p_02_preview_traversal.py
@@ -0,0 +1,192 @@
+"""CI unit tests for FIX #2 of GHSA-779p-m5rp-r4h4.
+
+Path traversal / hardening in app/model_manager.py get_model_preview
+(route /experiment/models/preview/{folder}/{path_index}/{filename:.*}).
+
+Reference: https://github.com/Comfy-Org/ComfyUI/security/advisories/GHSA-779p-m5rp-r4h4
+"""
+import pytest
+import yarl
+from io import BytesIO
+from PIL import Image
+from aiohttp import web
+from unittest.mock import patch
+from app.model_manager import ModelFileManager
+
+pytestmark = (
+ pytest.mark.asyncio
+) # This applies the asyncio mark to all test functions in the module
+
+@pytest.fixture
+def model_manager():
+ return ModelFileManager()
+
+@pytest.fixture
+def app(model_manager):
+ app = web.Application()
+ routes = web.RouteTableDef()
+ model_manager.add_routes(routes)
+ app.add_routes(routes)
+ return app
+
+
+async def test_legit_preview_returns_200(aiohttp_client, app, tmp_path):
+ """Sanity: a real preview PNG inside the model folder is served as webp 200."""
+ img = Image.new('RGB', (16, 16), color=(255, 0, 128))
+ img.save(tmp_path / "test_model.png", format='PNG')
+
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get('/experiment/models/preview/test_folder/0/test_model.png')
+
+ assert response.status == 200
+ assert response.content_type == 'image/webp'
+
+ img_bytes = BytesIO(await response.read())
+ served = Image.open(img_bytes)
+ assert served.format
+ assert served.format.lower() == 'webp'
+ served.close()
+
+
+async def test_non_integer_path_index_returns_400(aiohttp_client, app, tmp_path):
+ """A non-integer path_index segment must be rejected with 400."""
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get('/experiment/models/preview/test_folder/abc/test_model.png')
+
+ assert response.status == 400
+
+
+async def test_out_of_range_path_index_returns_404(aiohttp_client, app, tmp_path):
+ """A path_index beyond the configured folder list must return 404."""
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get('/experiment/models/preview/test_folder/99/test_model.png')
+
+ assert response.status == 404
+
+
+async def test_empty_filename_returns_400(aiohttp_client, app, tmp_path):
+ """The "{filename:.*}" capture also matches the empty string (trailing
+ slash). It would resolve to the folder itself and must be rejected with 400."""
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get('/experiment/models/preview/test_folder/0/')
+
+ assert response.status == 400
+
+
+async def test_path_traversal_in_filename_returns_403(aiohttp_client, app, tmp_path):
+ """Path traversal in {filename} must be rejected with 403 and must NOT read
+ a file outside the configured model directory.
+
+ GOTCHA: aiohttp/yarl collapses literal ``../`` dot-segments out of the URL
+ path before it reaches the handler, which would make this test vacuously
+ pass (the request would hit a different/non-existent route). We percent-encode
+ the dots and slashes (``%2e%2e%2f``) and send the URL with
+ ``yarl.URL(..., encoded=True)`` so the bytes survive client-side normalization
+ untouched; aiohttp's router then percent-decodes them into ``match_info``,
+ delivering the literal ``../`` traversal to the handler's ``{filename:.*}``
+ capture.
+
+ Without the fix the handler computes
+ ``os.path.normpath(os.path.join(folder, "../../../../etc/hosts"))``, which
+ escapes ``tmp_path`` and would be passed straight to get_model_previews ->
+ Image.open, serving bytes from outside the model dir (200/served bytes). The
+ is_within_directory() containment check is the load-bearing fix that turns
+ that escape into a 403.
+ """
+ # Sanity-anchor: a legit preview exists inside tmp_path, so a 200 path is
+ # genuinely reachable — proving the 403 below is the containment check
+ # firing, not an unrelated 404.
+ img = Image.new('RGB', (16, 16), color=(255, 0, 128))
+ img.save(tmp_path / "test_model.png", format='PNG')
+
+ # Percent-encoded "../../../../etc/hosts" so yarl does not collapse the
+ # dot-segments before the request leaves the client.
+ encoded_traversal = '%2e%2e%2f' * 4 + 'etc%2fhosts'
+ raw_path = '/experiment/models/preview/test_folder/0/' + encoded_traversal
+ url = yarl.URL(raw_path, encoded=True)
+
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get(url)
+
+ # Confirm the traversal actually reached the handler intact: a 200 here
+ # would mean either normalization stripped the ``../`` (vacuous pass) or
+ # the containment check failed open and served outside-dir bytes.
+ assert response.status == 403, (
+ f"expected 403 from is_within_directory() containment check, "
+ f"got {response.status}; traversal may have been normalized away "
+ f"or the fix failed open"
+ )
+ body = await response.read()
+ assert body == b"", "403 response must not carry any file bytes"
+
+
+async def test_symlink_companion_preview_returns_403(aiohttp_client, app, tmp_path):
+ """A companion preview file is selected by a glob inside get_model_previews
+ and then opened. If that companion is a symlink whose path is in-dir but
+ whose target escapes the model folder, it must be rejected with 403 — not
+ served. The requested path itself stays in-dir (so the first containment
+ check passes); the load-bearing fix is the SECOND is_within_directory check
+ on the file actually opened.
+ """
+ model_dir = tmp_path / "models"
+ model_dir.mkdir()
+ secret_dir = tmp_path / "secret"
+ secret_dir.mkdir()
+ # A real image OUTSIDE the model dir — valid, so without the fix Image.open
+ # would succeed and its bytes would be served (200).
+ secret = secret_dir / "secret.png"
+ Image.new('RGB', (8, 8), color=(0, 0, 0)).save(secret, format='PNG')
+ # Companion preview, in-dir by name but a symlink escaping the model dir.
+ # (No real model file is needed — get_model_previews globs companions by
+ # basename, and omitting a .safetensors avoids the metadata-header read.)
+ companion = model_dir / "model.preview.png"
+ try:
+ companion.symlink_to(secret)
+ except (OSError, NotImplementedError):
+ pytest.skip("symlinks not supported on this platform/filesystem")
+
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(model_dir)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get('/experiment/models/preview/test_folder/0/model.safetensors')
+
+ assert response.status == 403, (
+ f"expected 403 — the globbed companion preview is a symlink resolving "
+ f"outside the model dir and must not be served; got {response.status}"
+ )
+ assert await response.read() == b""
+
+
+async def test_null_byte_in_filename_no_500(aiohttp_client, app, tmp_path):
+ """A NUL byte in the filename must yield a clean client rejection, not a 500
+ from an uncaught ValueError in is_within_directory's realpath() call."""
+ raw_path = '/experiment/models/preview/test_folder/0/' + 'a%00b'
+ url = yarl.URL(raw_path, encoded=True)
+
+ with patch('folder_paths.folder_names_and_paths', {
+ 'test_folder': ([str(tmp_path)], None)
+ }):
+ client = await aiohttp_client(app)
+ response = await client.get(url)
+
+ assert response.status != 500, (
+ f"NUL byte produced a 500 (uncaught ValueError); expected a clean "
+ f"4xx rejection, got {response.status}"
+ )
+ assert 400 <= response.status < 500
diff --git a/tests-unit/security_test/test_ghsa_779p_03_annotated_traversal.py b/tests-unit/security_test/test_ghsa_779p_03_annotated_traversal.py
new file mode 100644
index 000000000..88102760c
--- /dev/null
+++ b/tests-unit/security_test/test_ghsa_779p_03_annotated_traversal.py
@@ -0,0 +1,165 @@
+"""Security tests for GHSA-779p-m5rp-r4h4 — FIX #3.
+
+Path traversal in folder_paths.get_annotated_filepath / exists_annotated_filepath,
+plus the shared is_within_directory() containment helper.
+
+These are pure-function tests (no running server). The input/output/temp
+directories are pointed at tmp_path via the folder_paths setters, so a crafted
+name containing `../`, an absolute path, or a symlink that escapes the base
+directory must be rejected.
+
+Reference: https://github.com/Comfy-Org/ComfyUI/security/advisories/GHSA-779p-m5rp-r4h4
+"""
+import os
+
+import pytest
+
+import folder_paths
+from comfy.options import enable_args_parsing
+enable_args_parsing()
+
+
+@pytest.fixture
+def sandbox(tmp_path):
+ """Point folder_paths' input/output/temp dirs at a real temp sandbox.
+
+ Yields the realpath'd base, input, output and temp directories. The original
+ directory values are restored afterward so tests stay isolated.
+ """
+ base = os.path.realpath(str(tmp_path))
+ input_dir = os.path.join(base, "input")
+ output_dir = os.path.join(base, "output")
+ temp_dir = os.path.join(base, "temp")
+ for d in (input_dir, output_dir, temp_dir):
+ os.makedirs(d, exist_ok=True)
+
+ orig_input = folder_paths.get_input_directory()
+ orig_output = folder_paths.get_output_directory()
+ orig_temp = folder_paths.get_temp_directory()
+
+ folder_paths.set_input_directory(input_dir)
+ folder_paths.set_output_directory(output_dir)
+ folder_paths.set_temp_directory(temp_dir)
+
+ yield {
+ "base": base,
+ "input": input_dir,
+ "output": output_dir,
+ "temp": temp_dir,
+ }
+
+ folder_paths.set_input_directory(orig_input)
+ folder_paths.set_output_directory(orig_output)
+ folder_paths.set_temp_directory(orig_temp)
+
+
+# ---------------------------------------------------------------------------
+# is_within_directory() — the shared containment helper
+# ---------------------------------------------------------------------------
+
+def test_is_within_directory_legit_child(sandbox):
+ base = sandbox["input"]
+ child = os.path.join(base, "sub", "image.png")
+ assert folder_paths.is_within_directory(base, child) is True
+
+
+def test_is_within_directory_dotdot_escape(sandbox):
+ base = sandbox["input"]
+ escape = os.path.join(base, "..", "..", "etc", "passwd")
+ assert folder_paths.is_within_directory(base, escape) is False
+
+
+def test_is_within_directory_symlink_escape(sandbox):
+ """A symlink created INSIDE base that points OUTSIDE base must not pass.
+
+ This is the key new hardening: is_within_directory realpath()s both operands,
+ so a symlink planted in the base directory can't be used to read files
+ elsewhere. We create a real on-disk symlink and a real secret target to
+ verify the check actually resolves the link.
+ """
+ base = sandbox["input"]
+
+ # A directory living outside the base, holding a secret file.
+ outside = os.path.join(sandbox["base"], "outside_secret_dir")
+ os.makedirs(outside, exist_ok=True)
+ secret = os.path.join(outside, "secret.txt")
+ with open(secret, "w") as f:
+ f.write("top secret")
+
+ # Plant a symlink inside base that points at the outside directory.
+ # symlink creation can require elevated privileges / Developer Mode on
+ # Windows, so skip cleanly where it isn't available (same guard as the
+ # sibling test in test_ghsa_779p_02_preview_traversal.py).
+ link = os.path.join(base, "escape_link")
+ try:
+ os.symlink(outside, link)
+ except (OSError, NotImplementedError):
+ pytest.skip("symlinks not supported on this platform/filesystem")
+
+ # Accessing the secret "through" the in-base symlink must be rejected.
+ target_via_link = os.path.join(link, "secret.txt")
+ assert folder_paths.is_within_directory(base, target_via_link) is False
+
+
+# ---------------------------------------------------------------------------
+# get_annotated_filepath()
+# ---------------------------------------------------------------------------
+
+def test_get_annotated_filepath_legit_name(sandbox):
+ result = folder_paths.get_annotated_filepath("image.png")
+ assert result == os.path.join(sandbox["input"], "image.png")
+ assert folder_paths.is_within_directory(sandbox["input"], result)
+
+
+def test_get_annotated_filepath_input_annotation(sandbox):
+ result = folder_paths.get_annotated_filepath("image.png [input]")
+ assert result == os.path.join(sandbox["input"], "image.png")
+
+
+def test_get_annotated_filepath_output_annotation(sandbox):
+ result = folder_paths.get_annotated_filepath("image.png [output]")
+ assert result == os.path.join(sandbox["output"], "image.png")
+
+
+def test_get_annotated_filepath_temp_annotation(sandbox):
+ result = folder_paths.get_annotated_filepath("image.png [temp]")
+ assert result == os.path.join(sandbox["temp"], "image.png")
+
+
+def test_get_annotated_filepath_dotdot_raises(sandbox):
+ with pytest.raises(ValueError):
+ folder_paths.get_annotated_filepath("../etc/passwd")
+
+
+def test_get_annotated_filepath_dotdot_with_annotation_raises(sandbox):
+ with pytest.raises(ValueError):
+ folder_paths.get_annotated_filepath("../../etc/passwd [output]")
+
+
+def test_get_annotated_filepath_absolute_escape_raises(sandbox):
+ with pytest.raises(ValueError):
+ folder_paths.get_annotated_filepath("/etc/passwd")
+
+
+# ---------------------------------------------------------------------------
+# exists_annotated_filepath()
+# ---------------------------------------------------------------------------
+
+def test_exists_annotated_filepath_existing_legit_file(sandbox):
+ real = os.path.join(sandbox["input"], "real.png")
+ with open(real, "w") as f:
+ f.write("data")
+ assert folder_paths.exists_annotated_filepath("real.png") is True
+
+
+def test_exists_annotated_filepath_traversal_returns_false(sandbox):
+ """A traversal name must return False without raising and without probing
+ outside the base directory (must never reach os.path.exists for the escape).
+ """
+ # /etc/passwd exists on POSIX; the function must still report False because
+ # the resolved path escapes the input directory.
+ assert folder_paths.exists_annotated_filepath("../../../../../../etc/passwd") is False
+
+
+def test_exists_annotated_filepath_absolute_returns_false(sandbox):
+ assert folder_paths.exists_annotated_filepath("/etc/passwd") is False
diff --git a/tests-unit/security_test/test_ghsa_779p_04_userdata_xss.py b/tests-unit/security_test/test_ghsa_779p_04_userdata_xss.py
new file mode 100644
index 000000000..aa1250327
--- /dev/null
+++ b/tests-unit/security_test/test_ghsa_779p_04_userdata_xss.py
@@ -0,0 +1,147 @@
+"""
+CI unit tests for FIX #4 of GHSA-779p-m5rp-r4h4.
+
+Stored-XSS hardening on GET /userdata/{file} in app/user_manager.py.
+
+User data files are arbitrary user-supplied content and must never render
+inline in the app origin. The getuserdata handler:
+ - forces Content-Type to application/octet-stream for any type in
+ folder_paths.DANGEROUS_CONTENT_TYPES (text/html, image/svg+xml,
+ text/javascript, ...),
+ - sets X-Content-Type-Options: nosniff,
+ - sets Content-Disposition: attachment.
+
+These tests pre-create files in tmp_path and GET them back, asserting the
+secure response headers. They mirror the aiohttp_client pattern in
+tests-unit/prompt_server_test/user_manager_test.py.
+"""
+
+import pytest
+import os
+from aiohttp import web
+from app.user_manager import UserManager
+
+pytestmark = (
+ pytest.mark.asyncio
+) # This applies the asyncio mark to all test functions in the module
+
+
+@pytest.fixture
+def user_manager(tmp_path):
+ um = UserManager()
+ um.get_request_user_filepath = lambda req, file, **kwargs: os.path.join(
+ tmp_path, file
+ ) if file else tmp_path
+ return um
+
+
+@pytest.fixture
+def app(user_manager):
+ app = web.Application()
+ routes = web.RouteTableDef()
+ user_manager.add_routes(routes)
+ app.add_routes(routes)
+ return app
+
+
+async def test_html_served_as_octet_stream(aiohttp_client, app, tmp_path):
+ (tmp_path / "evil.html").write_text(
+ ""
+ )
+
+ client = await aiohttp_client(app)
+ resp = await client.get("/userdata/evil.html")
+
+ assert resp.status == 200
+ ct = resp.headers.get("Content-Type", "")
+ # The load-bearing assertion: a .html file must NOT be served as text/html.
+ assert "text/html" not in ct.lower(), (
+ f"Content-Type {ct!r} would let a browser render/execute the file (stored XSS)."
+ )
+ assert ct == "application/octet-stream"
+ assert resp.headers.get("X-Content-Type-Options") == "nosniff"
+ assert "attachment" in resp.headers.get("Content-Disposition", "")
+
+
+async def test_svg_served_as_octet_stream(aiohttp_client, app, tmp_path):
+ (tmp_path / "evil.svg").write_text(
+ ''
+ '"
+ )
+
+ client = await aiohttp_client(app)
+ resp = await client.get("/userdata/evil.svg")
+
+ assert resp.status == 200
+ ct = resp.headers.get("Content-Type", "")
+ # SVG can carry inline