* feat: add cloud-specific fields to OSS openapi.yaml as nullable
Add cross-runtime fields with x-runtime: [cloud] extension and [cloud-only]
description prefix per the convention established in BE-613. All new fields
are nullable and not in required arrays, so they are purely additive.
/api/features response:
- max_upload_size (integer, int64)
- free_tier_credits (integer, int32)
- posthog_api_host (string, uri)
- max_concurrent_jobs (integer, int32)
- workflow_templates_version (string)
- workflow_templates_source (string, enum)
PromptRequest schema:
- workflow_id (string, uuid)
- workflow_version_id (string, uuid)
POST /api/assets:
- id field (uuid) on multipart/form-data for idempotent creation
- application/json alternate content-type for URL-based uploads
POST /api/assets/from-hash:
- mime_type (string) to preserve type without re-inspection
PUT /api/assets/{id}:
- mime_type (string) for overriding auto-detection
GET /api/assets additional query parameters:
- job_ids (string) — filter by associated job UUIDs
- include_public (boolean) — include workspace-public assets
- asset_hash (string) — filter by exact content hash
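One of the listed fields might be declared like this (a sketch only — the description wording and exact placement are assumptions; the nullable + x-runtime shape follows the convention described above):

```yaml
free_tier_credits:
  type: integer
  format: int32
  nullable: true
  x-runtime: [cloud]
  description: >-
    [cloud-only] Remaining free-tier credits; null on local ComfyUI.
```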
Resolves: BE-613
Blocks: BE-364, BE-361, BE-363
Co-authored-by: Matt Miller <MillerMedia@users.noreply.github.com>
* fix(openapi): address CodeRabbit feedback (BE-613)
- max_upload_size is set in both runtimes via SERVER_FEATURE_FLAGS;
drop the cloud-only / nullable tagging.
- Require `url` on the application/json POST /api/assets body so the
contract is enforceable by validators and codegen.
---------
Co-authored-by: Matt Miller <MillerMedia@users.noreply.github.com>
* Add Spectral lint CI gate for openapi.yaml
Adds a blocking Spectral lint check that runs on PRs touching
openapi.yaml or the ruleset itself. The ruleset mirrors the one used
for other Comfy-Org service specs: spectral:oas plus conventions for
snake_case properties, camelCase operationIds, and response/schema
shape. Gate runs at --fail-severity=error, which the spec currently
passes with zero errors (a small number of non-blocking
warnings/hints remain for WebSocket 101 responses, the existing loose
error schema, and two snake_case wire fields).
* ci: set least-privilege contents:read permissions on openapi-lint workflow
Per CodeRabbit review on #13410. The job only checks out the repo and
runs Spectral, so contents:read is sufficient and avoids inheriting any
permissive repo/org default token scope.
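The least-privilege block, roughly as it would appear in the workflow file (a sketch; the job name and Spectral invocation details are assumptions):

```yaml
permissions:
  contents: read   # checkout + lint only; no write scopes needed

jobs:
  openapi-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @stoplight/spectral-cli lint openapi.yaml --fail-severity=error
```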
---------
Co-authored-by: guill <jacob.e.segal@gmail.com>
Add missing 'attachment;' directive to Content-Disposition headers in
server.py to ensure browsers properly download files instead of
attempting to display them inline.
Fixes 4 instances in the file download endpoint.
Co-authored-by: guill <jacob.e.segal@gmail.com>
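The fix pattern, as a minimal sketch (the helper name and quoting here are illustrative, not the actual server.py code):

```python
def content_disposition(filename: str) -> str:
    # Without the "attachment;" directive some browsers render the file
    # inline; with it, they download it under the suggested name.
    return f'attachment; filename="{filename}"'

# e.g. when building the download response headers:
headers = {"Content-Disposition": content_disposition("workflow.json")}
```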
* chore: update workflow templates to v0.9.69
* Update comfyui-workflow-templates to version 0.9.70
* Downgrade comfyui-workflow-templates to 0.9.69
---------
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Adds two optional, nullable UUID fields to PromptRequest for runtimes
that wrap workflow execution in a workflow-version entity (the
hosted-cloud runtime does this; local ComfyUI does not). Both fields
are tagged `x-runtime: [cloud]` to mark them as runtime-specific —
local ComfyUI returns `null` (or omits them entirely) and that's
correct behavior, not drift.
## Why these fields belong in the OSS spec
Hosted-cloud's frontend and backend share `openapi.yaml` as their
single source of truth via auto-generated client types. Without the
fields declared in the spec, the cloud runtime has to either:
1. Hand-edit a vendored copy of openapi.yaml (drift between vendor
and upstream — unsustainable).
2. Maintain a separate cloud-only spec file (forks the contract,
defeats the point of a shared OSS spec).
Both options have been tried and both produce maintenance pain. The
shape that scales is: cloud-only fields live in OSS spec under their
intended path, declared nullable, with an explicit `x-runtime` tag so
local-only readers can ignore them programmatically and human readers
can see what each runtime populates.
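The "ignore them programmatically" part can be sketched in a few lines (hypothetical helper; a real traversal would recurse through the whole spec, and the schema literal below is illustrative):

```python
prompt_request = {
    "type": "object",
    "properties": {
        "prompt": {"type": "object"},
        # cloud-only, nullable, per the x-runtime convention
        "workflow_id": {"type": ["string", "null"], "x-runtime": ["cloud"]},
    },
}

def strip_cloud_fields(schema: dict) -> dict:
    """Return a copy of an object schema without x-runtime: [cloud] properties."""
    kept = {
        name: prop
        for name, prop in schema.get("properties", {}).items()
        if "cloud" not in prop.get("x-runtime", [])
    }
    return {**schema, "properties": kept}

local_view = strip_cloud_fields(prompt_request)
assert list(local_view["properties"]) == ["prompt"]
```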
## About the `x-runtime` extension
This is the first use of `x-runtime` in this spec. Convention:
- `x-runtime: [cloud]` — only the hosted-cloud runtime populates the
field; local returns null or omits.
- `x-runtime: [local]` — only local populates; cloud returns null.
- Tag absent — both runtimes populate the field (the default).
This is a vendor extension (`x-` prefix) and is ignored by spec
validators that don't recognize it, including `kin-openapi`. Local
clients reading the spec see two extra optional nullable fields, which
is forward-compatible with all existing readers.
## What this does not change
- No Python code changes. `PromptRequest` already accepts arbitrary
optional fields (`extra_data: additionalProperties: true` on the
same schema is a stronger guarantee). The Python server already
silently accepts and ignores both fields today.
- No required-fields change. Both fields stay outside `required`,
so older clients that don't know about them keep validating.
- No nullability widening on existing fields.
## Verification
- YAML parses (`yaml.safe_load`).
- `kin-openapi` `loader.LoadFromFile` accepts the modified spec.
- `openapi3filter.ValidateRequest` on a PromptRequest with both
fields set to `null`, set to a valid UUID, or omitted — all pass.
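The three request shapes above can be mirrored in a toy validator (a sketch of the rule being exercised, not the actual kin-openapi validation):

```python
import uuid

def validate_workflow_id(body: dict) -> bool:
    # workflow_id is optional and nullable: absent and null are both legal;
    # when present and non-null it must be a valid UUID string.
    if "workflow_id" not in body or body["workflow_id"] is None:
        return True
    try:
        uuid.UUID(body["workflow_id"])
        return True
    except (ValueError, AttributeError):
        return False

assert validate_workflow_id({})                                  # omitted
assert validate_workflow_id({"workflow_id": None})               # null
assert validate_workflow_id({"workflow_id": str(uuid.uuid4())})  # valid UUID
assert not validate_workflow_id({"workflow_id": "not-a-uuid"})
```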
* fix(spec): mark DeviceStats.index and NodeInfo.essentials_category as nullable
Two fields in openapi.yaml are declared as required/non-nullable but
the Python implementation legitimately returns `null` for them, so any
client that response-validates against the spec will fail.
`DeviceStats.index` (used by GET /api/system_stats):
- server.py emits `"index": device.index` unconditionally
- For the CPU device (--cpu mode), `torch.device("cpu").index` is `None`
- → JSON response includes `"index": null` for CPU devices
`NodeInfo.essentials_category` (used by GET /api/object_info):
- The V3 schema-based path (comfy_api/latest/_io.py:1654) unconditionally
passes `essentials_category=self.essentials_category` into NodeInfoV1
and serializes via dataclasses.asdict(), so the key is always present
- Schema's `essentials_category` defaults to `None` for nodes that
don't set it in `define_schema` (e.g. the APG node)
- → JSON response includes `"essentials_category": null` for those nodes
- (The V1 path in server.py uses `hasattr` and so omits the key
entirely when not set, but the V3 path is the one that produces nulls)
Both fields keep their existing `required` status — they're always
present in the response, the value is just nullable. Descriptions
expanded to spell out when `null` is expected.
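The "always present, value nullable" shape for DeviceStats.index, sketched (illustrative function, not the real server.py emitter):

```python
import json

def device_stats(index):
    # torch.device("cpu").index is None in --cpu mode, so the key is
    # always present but its value may serialize as null.
    return {"name": "cpu", "index": index}

print(json.dumps(device_stats(None)))  # {"name": "cpu", "index": null}
```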
* docs(spec): clarify essentials_category presence rules
The previous description said "null for nodes that don't set
ESSENTIALS_CATEGORY (V1)" — that's wrong. server.py:739-740 uses
`hasattr` and OMITS the key when the V1 attribute isn't defined; null
only happens if the attribute is explicitly set to None. Spell out
all three legal shapes (string / null / absent) and which path
produces which.
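The hasattr-based V1 behavior can be sketched as follows (class and function names are illustrative, not the real server.py code):

```python
class V1Node:                 # ESSENTIALS_CATEGORY not defined at all
    pass

class V1NodeExplicitNone:
    ESSENTIALS_CATEGORY = None

def v1_info(node_cls):
    # Mirrors the hasattr pattern: the key is OMITTED when the attribute
    # is not defined, and null only when it is explicitly set to None.
    info = {}
    if hasattr(node_cls, "ESSENTIALS_CATEGORY"):
        info["essentials_category"] = node_cls.ESSENTIALS_CATEGORY
    return info

assert v1_info(V1Node) == {}                                         # absent
assert v1_info(V1NodeExplicitNone) == {"essentials_category": None}  # null
```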
If the same weight is used multiple times within the same prefetch
window, it should only apply compute state mutations once. Mark the
weight as fully resident on the first pass accordingly.
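The dedup rule, as a sketch with hypothetical names: within one prefetch window, a weight used several times must mutate compute state only once.

```python
def prefetch_window(weights):
    resident = set()
    mutations = 0
    for w in weights:
        if w in resident:
            continue           # already fully resident: skip the mutation
        # ...apply the compute-state mutation for w here...
        mutations += 1
        resident.add(w)        # marked fully resident on the first pass
    return mutations

# "q_proj" appears twice but mutates state only once:
assert prefetch_window(["q_proj", "k_proj", "q_proj"]) == 2
```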
* initial gemma4 support
* parity with reference implementation
Outputs can 100% match transformers with the same SDPA flags; checkpoint
this state and then optimize.
* Cleanup, video fixes
* cleanup, enable fused rms norm by default
* update comment
* Cleanup
* Update sd.py
* Various fixes
* Add fp8 scaled embedding support
* small fixes
* Translate think tokens
* Fix image encoder attention mask type
So it works with basic attention
* Handle thinking tokens differently only for Gemma4
* Code cleanup
* Update nodes_textgen.py
* Use embed scale class instead of buffer
Slight difference from HF, but technically more accurate and simpler code
* Default to fused rms_norm
* Update gemma4.py
* feat(api-nodes): add Topaz Astra 2 model
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* feat(api-nodes): make Astra 2 the default Topaz upscaler model
Reorder UPSCALER_MODELS_MAP and the upscaler_model dynamic combo so
"Astra 2" appears first, surfacing it as the default selection.
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Marwan Mostafa <marawan206@gmail.com>
* mm: Use Aimdo raw allocator for cast buffers
PyTorch manages allocation of growing buffers on streams poorly. It has
no Windows support for the expandable-segments allocator (which is the
right tool for this job), and it also segments memory by stream such
that it cannot be generally re-used. So kick the problem to Aimdo,
which can just grow a virtual region that's freed per stream.
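The grow-and-free-per-stream idea, as a conceptual sketch only (this stands in for the Aimdo raw allocator; names and mechanics are assumptions, not its real API):

```python
class StreamArena:
    """Per-stream arena: the region only grows, and is freed as a unit."""
    def __init__(self):
        self.size = 0       # capacity of the (virtual) region
        self.used = 0

    def alloc(self, n):
        if self.used + n > self.size:
            self.size = max(self.size * 2, self.used + n)  # grow, never shrink
        off = self.used
        self.used += n
        return off

    def free_all(self):     # freed per stream, in one shot
        self.used = 0

arena = StreamArena()
assert arena.alloc(64) == 0
assert arena.alloc(64) == 64
arena.free_all()
assert arena.alloc(16) == 0   # capacity is retained across frees
```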
* plan
* ops: move cpu handler up to the caller
* ops: split prefetch from weight prep ahead of the block prefetching API
Split up the casting and the weight formatting/LoRA handling in prep for
arbitrary prefetch support.
* ops: implement block prefetching API
Allow a model to construct a prefetch list and operate on it for
increased async offload.
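The prefetch-list idea, as a hypothetical sketch (function names are illustrative): the model builds an ordered list, and execution kicks off the next block's transfer while the current block computes.

```python
def run_with_prefetch(blocks, prefetch, compute):
    order = list(blocks)              # the model-constructed prefetch list
    for i, block in enumerate(order):
        if i + 1 < len(order):
            prefetch(order[i + 1])    # overlap next transfer with compute
        compute(block)

events = []
run_with_prefetch(
    ["b0", "b1", "b2"],
    prefetch=lambda b: events.append(("prefetch", b)),
    compute=lambda b: events.append(("compute", b)),
)
assert events == [("prefetch", "b1"), ("compute", "b0"),
                  ("prefetch", "b2"), ("compute", "b1"),
                  ("compute", "b2")]
```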
* ltxv2: Implement block prefetching
* Implement LoRA async offload
Implement async offload of LoRAs.