* pinned_memory: remove JIT RAM pressure release
This doesn't work, as freeing intermediates for pins needs to be
higher-priority than freeing pins-for-pins, if and when you are going
to do that. So this is too late, as pins-for-pins happens at model load
time and we don't have JIT pins-for-pins.
* caching: Add a filter to only free intermediates from inactive wfs
This is to get the freeing priorities among pins straight.
* mm: free inactive-ram from RAM cache first
Stuff from inactive workflows should be freed before anything else.
* caching: purge old ModelPatchers first
Don't try to score them; just dump them at the first sign of trouble
if they aren't part of the workflow.
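A minimal sketch of the ordering described above, assuming a hypothetical `CacheEntry` record and a hypothetical per-entry score (neither is taken from the actual cache code). Entries from inactive workflows are always chosen before anything that belongs to the active workflow:

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:  # hypothetical record, for illustration only
    name: str
    size_mb: int
    in_active_workflow: bool
    score: float  # hypothetical usefulness score; higher means more worth keeping

def pick_evictions(entries, need_mb):
    """Return names of entries to free, inactive-workflow entries first."""
    # False sorts before True, so inactive entries come first; ties break on score.
    ordered = sorted(entries, key=lambda e: (e.in_active_workflow, e.score))
    freed, chosen = 0, []
    for entry in ordered:
        if freed >= need_mb:
            break
        chosen.append(entry.name)
        freed += entry.size_mb
    return chosen

entries = [
    CacheEntry("active_latent", 400, True, 0.1),
    CacheEntry("stale_image", 300, False, 0.9),
    CacheEntry("stale_mask", 200, False, 0.5),
]
evicted = pick_evictions(entries, need_mb=450)
```

The active entry has the lowest score here but is still left alone, because workflow membership is the first sort key.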
On failure (e.g. animated WebP files), fall back to the old Pillow code.
This should fix the extra precision in high bit depth images (like 16 bit PNG) being discarded when loaded by Pillow and potentially add support for more image formats.
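The fallback behaviour can be sketched as follows. Both decoder functions are hypothetical stand-ins (the commit does not name the new loader here), so this only illustrates the try-then-fall-back shape, not the real loading code:

```python
def load_with_fallback(path, primary, fallback):
    """Try the primary decoder; on any failure, use the fallback decoder."""
    try:
        return primary(path)
    except Exception:
        return fallback(path)

def new_decoder(path):
    # Hypothetical stand-in for the new loader: pretend it rejects WebP files.
    if path.endswith(".webp"):
        raise ValueError("unsupported input")
    return ("new", path)

def pillow_decoder(path):
    # Hypothetical stand-in for the old Pillow code path.
    return ("pillow", path)

png_result = load_with_fallback("a.png", new_decoder, pillow_decoder)
webp_result = load_with_fallback("a.webp", new_decoder, pillow_decoder)
```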
Comfy-aimdo 0.3.0 contains several major new features.
multi-GPU support
ARM support
AMD support
Refactorings include:
Linkless architecture - linkage is now performed purely at runtime
to stop host library lookups completely and only interact with the
torch-loaded Nvidia stack.
Elimination of cudart integration on Linux. It's now consistent with
Windows.
Misc bugfixes and minor features.
SolidMask had a hardcoded device="cpu" while other nodes (e.g.
EmptyImage) follow intermediate_device(). This causes a RuntimeError
when MaskComposite combines masks from different device sources
under --gpu-only.
- SolidMask: use intermediate_device() instead of hardcoded "cpu"
- MaskComposite: align source device to destination before operating
Co-authored-by: Alexis Rolland <alexisrolland@hotmail.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Change save 3d model's filename prefix to 3d/ComfyUI
Since this node has already been renamed from `Save GLB` to `Save 3D Model`, the filename prefix `3d` fits better than `mesh`.
* use lowercase
---------
Fires on v* tag push (earlier than release.published, which can lag)
and triggers a repository_dispatch on Comfy-Org/cloud with event_type
comfyui_tag_pushed. Legacy desktop dispatch in release-webhook.yml
is left untouched.
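For reference, a repository_dispatch is a POST to the target repository's `/dispatches` endpoint in the GitHub REST API. The sketch below only builds the request and does not send it; the `client_payload` contents and the token value are assumptions, not what the workflow actually sends:

```python
import json
import urllib.request

def build_dispatch_request(token, tag):
    """Build (but do not send) a repository_dispatch request."""
    body = {
        "event_type": "comfyui_tag_pushed",
        "client_payload": {"tag": tag},  # assumed payload shape
    }
    return urllib.request.Request(
        "https://api.github.com/repos/Comfy-Org/cloud/dispatches",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_dispatch_request("dummy-token", "v0.0.0")
```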
Wires comfy-kitchen's TensorCoreAWQW4A16Layout (introduced on
feat/awq-w4a16-modulation) into ComfyUI's MixedPrecisionOps so checkpoints
that tag modulation linears with comfy_quant.format = "awq_w4a16" get
their (qweight, weight_scale, weight_zero) loaded into the kitchen layout
class instead of being dequantized to bf16 plain Linear at conversion time.
quant_ops.py:
- detect TensorCoreAWQW4A16Layout availability and stub it out for the
no-kitchen fallback (mirrors the SVDQuant W4A4 pattern)
- register the layout class + add "awq_w4a16" to QUANT_ALGOS
(storage_t = int8 packed uint4, parameters = {weight_scale, weight_zero},
default group_size = 64)
ops.py: add the awq_w4a16 branch in MixedPrecisionOps.Linear._load_from_state_dict
that constructs Params(scale, zeros, group_size, ...) and wraps qweight
into a QuantizedTensor — F.linear then dispatches to ck.gemv_awq_w4a16
via the layout's aten handlers.
Pairs with comfy-kitchen feat/awq-w4a16-modulation. Targets the ~10 GB
inflation in Qwen-Image-Edit kitchen-native checkpoints, where the
modulation linears (img_mod.1 / txt_mod.1) currently dominate disk + VRAM
because they're materialized as plain bf16 Linear during conversion.
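To make the storage layout concrete, here is a pure-Python sketch of group-wise 4-bit dequantization. It assumes the common AWQ convention `w = (q - zero) * scale` and a low-nibble-first packing order; the real kitchen kernels may pack and compute differently, and `unpack_uint4` / `dequantize_row` are illustrative names only:

```python
def unpack_uint4(packed):
    """Split each stored byte into two 4-bit values (low nibble first, assumed)."""
    out = []
    for byte in packed:
        byte &= 0xFF  # reinterpret signed int8 storage as an unsigned byte
        out.append(byte & 0x0F)
        out.append(byte >> 4)
    return out

def dequantize_row(packed, scales, zeros, group_size=64):
    """Apply per-group scale and zero point to the unpacked 4-bit values."""
    values = unpack_uint4(packed)
    return [
        (v - zeros[i // group_size]) * scales[i // group_size]
        for i, v in enumerate(values)
    ]

# Tiny example with group_size=2 so that two groups fit in two bytes.
row = dequantize_row([0x21, 0x43], scales=[0.5, 2.0], zeros=[1, 3], group_size=2)
```

Keeping the weights in this packed form is what avoids the bf16 inflation: each weight costs 4 bits plus a shared scale and zero per group, instead of 16 bits.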
quant_ops.py: register TensorCoreSVDQuantW4A4Layout when comfy-kitchen exposes
it; gate the kitchen CUDA backend on cuda >= 13 (the optimized kitchen CUDA
ops are validated against cu13+ runtimes; on older cu the backend falls back
to eager).
ops.py: handle svdquant_w4a4 quant_format by loading weight_scale / proj_down /
proj_up / smooth_factor into TensorCoreSVDQuantW4A4Layout.Params, with the
img_mlp.net.2 / txt_mlp.net.2 fallback for act_unsigned. Targets the row-major
kitchen-native kernels on feat/svdquant-w4a4-kitchen-native; the verbatim
zgemm path is a sibling branch.
* Add OpenAPI 3.1 specification for ComfyUI API
Adds a comprehensive OpenAPI 3.1 spec documenting all HTTP endpoints
exposed by ComfyUI's server, including prompt execution, queue management,
file uploads, userdata, settings, system stats, object info, assets,
and internal routes.
The spec was validated against the source code with adversarial review
from multiple models, and passes Spectral linting with zero errors.
Also removes openapi.yaml from .gitignore so the spec is tracked.
* Mark /api/history endpoints as deprecated
Address Jacob's review feedback on PR #13397 by explicitly marking the
three /api/history operations as deprecated in the OpenAPI spec:
* GET /api/history -> superseded by GET /api/jobs
* POST /api/history -> superseded by /api/jobs management
* GET /api/history/{prompt_id} -> superseded by GET /api/jobs/{job_id}
Each operation gains deprecated: true plus a description that names the
replacement. A formal sunset timeline (RFC 9745 Deprecation and RFC 8594
Sunset headers, minimum-runway policy) is being defined separately and
will be applied as a follow-up.
* Address Spectral lint findings in openapi.yaml
- Add operation descriptions to 52 endpoints (prompt, queue, upload,
view, models, userdata, settings, assets, internal, etc.)
- Add schema descriptions to 22 component schemas
- Add parameter descriptions to 8 path parameters that were missing them
- Remove 6 unused component schemas: TaskOutput, EmbeddingsResponse,
ExtensionsResponse, LogRawResponse, UserInfo, UserDataFullInfo
No wire/shape changes. Reduces Spectral findings from 92 to 4. The
remaining 4 are real issues (WebSocket 101 on /ws, loose error schema,
and two snake_case warnings on real wire field names) and are worth
addressing separately.
* fix(openapi): address jtreminio oneOf review on /api/userdata
Restructure the UserData response schemas to address the review feedback
on the `oneOf` without a discriminator, and fix two accuracy bugs found
while doing it.
Changes
- GET /api/userdata response: extract the inline `oneOf` to a named
schema (`ListUserdataResponse`) and add the missing third variant
returned when `split=true` and `full_info=false` (array of
`[relative_path, ...path_components]`). Previously only two of the
three actual server response shapes were described.
- UserDataResponse (POST endpoints): correct the description — this
schema is a single item, not a list — and point at the canonical
`GetUserDataResponseFullFile` schema instead of the duplicate
`UserDataResponseFull`. Also removes the malformed blank line in
`UserDataResponseShort`.
- Delete the now-unused `UserDataResponseFull` and
`UserDataResponseShort` schemas (replaced by reuse of
`GetUserDataResponseFullFile` and an inline string variant).
- Add an `x-variant-selector` vendor extension to both `oneOf` sites
documenting which query-parameter combination selects which branch,
since a true OpenAPI `discriminator` is not applicable (the variants
are type-disjoint and the selector lives in the request, not the
response body).
This keeps the shapes the server actually emits (no wire-breaking
change) while making the selection rule explicit for SDK generators
and readers.
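The selection rule can be sketched as a function of the two query parameters. Only the `split=true` and `full_info=false` shape is stated above; the object fields in the `full_info` branch and the precedence between the two flags are assumptions made for illustration:

```python
def list_userdata(paths, split=False, full_info=False):
    """Pick the response shape from the request's query parameters."""
    if full_info:
        # Assumed: one object per file; the field name is hypothetical.
        return [{"path": p} for p in paths]
    if split:
        # Array of [relative_path, ...path_components].
        return [[p, *p.split("/")] for p in paths]
    # Default: plain list of relative paths.
    return list(paths)

split_result = list_userdata(["workflows/a.json"], split=True)
plain_result = list_userdata(["workflows/a.json"])
```

Because the selector lives in the request, a client can know which branch to parse before it reads the body.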
---------
Co-authored-by: guill <jacob.e.segal@gmail.com>
Currently, if the graph contains a cycle, validation just recurses
infinitely, hits a catch-all, then throws a generic error against the
output node that seeded the validation. Instead, fail the offending
cyclic node chain and handle it as an error in its own right.
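A minimal sketch of finding the offending chain with a depth-first search. The graph representation (node id mapped to a list of input node ids) and the name `find_cycle` are assumptions; this is not the validation code itself:

```python
def find_cycle(graph):
    """Return the nodes forming a cycle (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: report the chain back to it.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

cycle = find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})
```

The returned chain names every node involved, so the error can be reported against those nodes rather than the output node that started validation. The sketch is recursive for brevity, so a very deep graph would need an iterative version.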
Co-authored-by: guill <jacob.e.segal@gmail.com>
This was over-estimating the VRAM used by the async allocator when lots
of small tensors were in play.
Also change the versioning scheme to == so we can roll forward aimdo without
worrying about stable regressions downstream in ComfyUI core.
* feat(api-nodes): add SD2 real human support
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* fix: add validation before uploading Assets
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* Display asset_id and group_id on the node
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* extend poll_op and use it instead of the custom async cycle
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* added the polling for the "Active" status after asset creation
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* updated tooltip for group_id
* allow usage of real human in the ByteDance2FirstLastFrame node
* add reference count limits
* corrected price in status when input assets contain video
Signed-off-by: bigcat88 <bigcat88@icloud.com>
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
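The polling for the "Active" status mentioned above follows the usual poll-until-ready pattern. The sketch below is generic: `poll_until` and its parameters are hypothetical and do not reflect the real `poll_op` signature:

```python
import asyncio

async def poll_until(fetch_status, target="Active", interval=0.01, max_attempts=10):
    """Call fetch_status repeatedly until it returns the target status."""
    for _ in range(max_attempts):
        status = await fetch_status()
        if status == target:
            return status
        await asyncio.sleep(interval)
    raise TimeoutError(f"status never reached {target!r}")

# Fake status source standing in for the asset API (assumption).
statuses = iter(["Processing", "Processing", "Active"])

async def fake_fetch():
    return next(statuses)

result = asyncio.run(poll_until(fake_fetch))
```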
The mixed_precision ops can have input_scale parameters that are used
in tensor math but aren't a weight or bias, so they don't get proper VRAM
management. Treat these as force-castable parameters, as non-comfy
weights, random params and buffers already are.
On Windows with aimdo enabled, disable_weight_init.Linear uses lazy
initialization that sets weight and bias to None to avoid unnecessary
memory allocation. This caused a crash when copy_() was called on the
None weight attribute in Stable_Zero123.__init__.
Replace copy_() with direct torch.nn.Parameter assignment, which works
correctly on both Windows (aimdo enabled) and other platforms.
* initial RIFE support
* Also support FILM
* Better RAM usage, reduce FILM VRAM peak
* Add model folder placeholder
* Fix OOM fallback frame loss
* Remove torch.compile for now
* Rename model input
* Shorter input type name
---------