* pinned_memory: remove JIT RAM pressure release
This doesn't work: freeing intermediates for pins needs to be
higher priority than freeing pins-for-pins, if and when that is
done. So this is too late, as pins-for-pins happens at model load time
and we don't have JIT pins-for-pins.
* caching: Add a filter to only free intermediates from inactive workflows
This gets the priorities among pins straight.
* mm: free inactive-ram from RAM cache first
Stuff from inactive workflows should be freed before anything else.
* caching: purge old ModelPatchers first
Don't try to score them, just dump them at the first sign of trouble
if they aren't part of the workflow.
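The ordering described in the caching commits above could be sketched roughly as follows. This is only an illustration: `CacheEntrySketch`, its `score` field, and `eviction_order` are hypothetical names, not the actual ComfyUI cache code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheEntrySketch:
    name: str
    active: bool   # assumption: True if the entry belongs to the running workflow
    score: float   # assumption: hypothetical cost of dropping, lower is freed earlier


def eviction_order(entries: List[CacheEntrySketch]) -> List[CacheEntrySketch]:
    """Return entries in the order they would be freed under memory pressure."""
    # Entries from inactive workflows go first and are not scored at all.
    inactive = [e for e in entries if not e.active]
    # Only entries of the active workflow are ranked by score.
    active = sorted((e for e in entries if e.active), key=lambda e: e.score)
    return inactive + active
```

With this shape, an inactive entry is always freed before any active one, however cheap the active one is to drop.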
On failure (e.g. animated WebP files), fall back to the old Pillow code.
This should fix the extra precision in high-bit-depth images (such as 16-bit PNG) being discarded when loaded by Pillow, and may add support for more image formats.
Comfy-aimdo 0.3.0 contains several major new features:
- Multi-GPU support
- ARM support
- AMD support
Refactorings include:
- Linkless architecture: linkage is now performed purely at runtime
to stop host library lookups completely and only interact with the
torch-loaded Nvidia stack.
- Elimination of cudart integration on Linux, for consistency with
Windows.
Misc bugfixes and minor features.
SolidMask had a hardcoded device="cpu" while other nodes (e.g.
EmptyImage) follow intermediate_device(). This causes a RuntimeError
when MaskComposite combines masks from different device sources
under --gpu-only.
- SolidMask: use intermediate_device() instead of hardcoded "cpu"
- MaskComposite: align source device to destination before operating
Co-authored-by: Alexis Rolland <alexisrolland@hotmail.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Change save 3d model's filename prefix to 3d/ComfyUI
As this node has already changed from `Save GLB` to `Save 3D Model`, using the filename prefix `3d` will be better than `mesh`
* use lowercase
---------
Fires on v* tag push (earlier than release.published, which can lag)
and triggers a repository_dispatch on Comfy-Org/cloud with event_type
comfyui_tag_pushed. Legacy desktop dispatch in release-webhook.yml
is left untouched.
* Add OpenAPI 3.1 specification for ComfyUI API
Adds a comprehensive OpenAPI 3.1 spec documenting all HTTP endpoints
exposed by ComfyUI's server, including prompt execution, queue management,
file uploads, userdata, settings, system stats, object info, assets,
and internal routes.
The spec was validated against the source code with adversarial review
from multiple models, and passes Spectral linting with zero errors.
Also removes openapi.yaml from .gitignore so the spec is tracked.
* Mark /api/history endpoints as deprecated
Address Jacob's review feedback on PR #13397 by explicitly marking the
three /api/history operations as deprecated in the OpenAPI spec:
* GET /api/history -> superseded by GET /api/jobs
* POST /api/history -> superseded by /api/jobs management
* GET /api/history/{prompt_id} -> superseded by GET /api/jobs/{job_id}
Each operation gains deprecated: true plus a description that names the
replacement. A formal sunset timeline (RFC 9745 Deprecation and RFC 8594
Sunset headers, minimum-runway policy) is being defined separately and
will be applied as a follow-up.
* Address Spectral lint findings in openapi.yaml
- Add operation descriptions to 52 endpoints (prompt, queue, upload,
view, models, userdata, settings, assets, internal, etc.)
- Add schema descriptions to 22 component schemas
- Add parameter descriptions to 8 path parameters that were missing them
- Remove 6 unused component schemas: TaskOutput, EmbeddingsResponse,
ExtensionsResponse, LogRawResponse, UserInfo, UserDataFullInfo
No wire/shape changes. Reduces Spectral findings from 92 to 4. The
remaining 4 are real issues (WebSocket 101 on /ws, loose error schema,
and two snake_case warnings on real wire field names) and are worth
addressing separately.
* fix(openapi): address jtreminio oneOf review on /api/userdata
Restructure the UserData response schemas to address the review feedback
on the `oneOf` without a discriminator, and fix two accuracy bugs found
while doing it.
Changes
- GET /api/userdata response: extract the inline `oneOf` to a named
schema (`ListUserdataResponse`) and add the missing third variant
returned when `split=true` and `full_info=false` (array of
`[relative_path, ...path_components]`). Previously only two of the
three actual server response shapes were described.
- UserDataResponse (POST endpoints): correct the description — this
schema is a single item, not a list — and point at the canonical
`GetUserDataResponseFullFile` schema instead of the duplicate
`UserDataResponseFull`. Also removes the malformed blank line in
`UserDataResponseShort`.
- Delete the now-unused `UserDataResponseFull` and
`UserDataResponseShort` schemas (replaced by reuse of
`GetUserDataResponseFullFile` and an inline string variant).
- Add an `x-variant-selector` vendor extension to both `oneOf` sites
documenting which query-parameter combination selects which branch,
since a true OpenAPI `discriminator` is not applicable (the variants
are type-disjoint and the selector lives in the request, not the
response body).
This keeps the shapes the server actually emits (no wire-breaking
change) while making the selection rule explicit for SDK generators
and readers.
---------
Co-authored-by: guill <jacob.e.segal@gmail.com>
* fix: pin SQLAlchemy>=2.0 in requirements.txt (fixes #13036) (#13316)
* Refactor io to IO in nodes_ace.py (#13485)
* Bump comfyui-frontend-package to 1.42.12 (#13489)
* Make the ltx audio vae more native. (#13486)
* feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465)
* Support standalone LTXV audio VAEs (#13499)
* [Partner Nodes] added 4K resolution for Veo models; added Veo 3 Lite model (#13330)
* feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* increase poll_interval from 5 to 9
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Bump comfyui-frontend-package to 1.42.14 (#13493)
* Add gpt-image-2 as version option (#13501)
* Allow logging in comfy app files. (#13505)
* chore: update workflow templates to v0.9.59 (#13507)
* fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504)
The tooltip on the resolution input states that 4K is not available for
veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the
lite combination. Selecting 4K with veo-3.0-generate-001 or
veo-3.0-fast-generate-001 would fall through and hit the upstream API
with an invalid request.
Broaden the guard to match the documented behavior and update the error
message accordingly.
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
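The broadened guard described above might look roughly like this. It is a sketch under assumptions: `check_veo_resolution`, the `"4K"` resolution string, and the substring tests on the model name are hypothetical and are not taken from the actual Veo3VideoGenerationNode code.

```python
def check_veo_resolution(model: str, resolution: str) -> None:
    """Raise ValueError if 4K is requested for a model that does not offer it."""
    # Assumption: lite models and every veo-3.0 model lack 4K support,
    # matching what the tooltip documents.
    no_4k = "lite" in model or model.startswith("veo-3.0")
    if resolution == "4K" and no_4k:
        raise ValueError(
            f"4K resolution is not available for {model} "
            "(veo-3.1-lite and veo-3.0 models do not support it)."
        )
```

Rejecting the combination locally means the invalid request never reaches the upstream API.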
* feat: RIFE and FILM frame interpolation model support (CORE-29) (#13258)
* initial RIFE support
* Also support FILM
* Better RAM usage, reduce FILM VRAM peak
* Add model folder placeholder
* Fix oom fallback frame loss
* Remove torch.compile for now
* Rename model input
* Shorter input type name
---------
* fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518)
On Windows with aimdo enabled, disable_weight_init.Linear uses lazy
initialization that sets weight and bias to None to avoid unnecessary
memory allocation. This caused a crash when copy_() was called on the
None weight attribute in Stable_Zero123.__init__.
Replace copy_() with direct torch.nn.Parameter assignment, which works
correctly on both Windows (aimdo enabled) and other platforms.
* Derive InterruptProcessingException from BaseException (#13523)
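The commit gives no rationale, but one effect of deriving from BaseException is standard Python behavior: a broad `except Exception` handler no longer catches the interrupt. A minimal sketch, where `InterruptSketch` and `run_node` are hypothetical stand-ins rather than ComfyUI code:

```python
class InterruptSketch(BaseException):
    """Stand-in for an interrupt exception derived from BaseException."""


def run_node(fn):
    """Run fn, converting ordinary errors into a message (hypothetical helper)."""
    try:
        return fn()
    except Exception as exc:  # does not catch BaseException-only subclasses
        return f"node error: {exc}"


def interrupt():
    raise InterruptSketch()
```

Here `run_node(lambda: 1 / 0)` returns an error string, while `run_node(interrupt)` lets `InterruptSketch` propagate to the caller.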
* bump manager version to 4.2.1 (#13516)
* ModelPatcherDynamic: force cast stray weights on comfy layers (#13487)
The mixed_precision ops can have input_scale parameters that are used
in tensor math but aren't a weight or bias, so they don't get proper VRAM
management. Treat these as force-castable parameters, as the non-comfy
weights, random params and buffers already are.
* Update logging level for invalid version format (#13526)
* [Partner Nodes] add SD2 real human support (#13509)
* feat(api-nodes): add SD2 real human support
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* fix: add validation before uploading Assets
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* Add asset_id and group_id displaying on the node
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* extend poll_op to use instead of custom async cycle
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* added the polling for the "Active" status after asset creation
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* updated tooltip for group_id
* allow usage of real human in the ByteDance2FirstLastFrame node
* add reference count limits
* corrected price in status when input assets contain video
Signed-off-by: bigcat88 <bigcat88@icloud.com>
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* feat: SAM (segment anything) 3.1 support (CORE-34) (#13408)
* [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519)
* fix(api-nodes): fixed price badges, add new resolutions
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* properly calculate the total run cost when "n > 1"
Signed-off-by: bigcat88 <bigcat88@icloud.com>
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* chore: update workflow templates to v0.9.61 (#13533)
* chore: update embedded docs to v0.4.4 (#13535)
* add 4K resolution to Kling nodes (#13536)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
* Fix LTXV Reference Audio node (#13531)
* comfy-aimdo 0.2.14: Hotfix async allocator estimations (#13534)
This was over-estimating the VRAM used by the async allocator when many
small tensors were in play.
Also change the versioning scheme to == so we can roll aimdo forward without
worrying about stable regressions downstream in ComfyUI core.
* Disable sageattention for SAM3 (#13529)
Causes NaNs.
* execution: Add anti-cycle validation (#13169)
Currently, if the graph contains a cycle, validation just recurses infinitely,
hits a catch-all, then throws a generic error against the output node
that seeded the validation. Instead, fail the offending cyclic node
chain and handle it as an error in its own right.
Co-authored-by: guill <jacob.e.segal@gmail.com>
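One way to detect a cycle and report the offending chain is a depth-first walk that tracks the nodes on the current path. The sketch below assumes a hypothetical graph shape (node id mapped to the ids of its input nodes). `find_cycle` is illustrative and is not the validation code from this change.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return the node chain forming a cycle reachable from start, or None."""
    path: List[str] = []   # nodes on the current DFS chain, in order
    on_path = set()        # same nodes, for O(1) membership checks
    done = set()           # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in on_path:
            # Reached a node already on the chain: report the loop itself.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        on_path.discard(node)
        path.pop()
        done.add(node)
        return None

    return visit(start)
```

Returning the chain rather than a boolean lets the caller attach the error to the nodes in the loop instead of to the output node that seeded validation.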
* chore: update workflow templates to v0.9.62 (#13539)
---------
Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Octopus <liyuan851277048@icloud.com>
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com>
Co-authored-by: AustinMroz <austin@comfy.org>
Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com>
Co-authored-by: Matt Miller <matt@miller-media.com>
Co-authored-by: blepping <157360029+blepping@users.noreply.github.com>
Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>
* Fix Hunyuan 3D 2.1 multi-GPU worksplit: use cond_or_uncond instead of hardcoded chunk(2)
Amp-Thread-ID: https://ampcode.com/threads/T-019da964-2cc8-77f9-9aae-23f65da233db
Co-authored-by: Amp <amp@ampcode.com>
* Add GPU device selection to all loader nodes
- Add get_gpu_device_options() and resolve_gpu_device_option() helpers
in model_management.py for vendor-agnostic GPU device selection
- Add device widget to CheckpointLoaderSimple, UNETLoader, VAELoader
- Expand device options in CLIPLoader, DualCLIPLoader, LTXAVTextEncoderLoader
from [default, cpu] to include gpu:0, gpu:1, etc. on multi-GPU systems
- Wire load_diffusion_model_state_dict and load_state_dict_guess_config
to respect model_options['load_device']
- Graceful fallback: unrecognized devices (e.g. gpu:1 on single-GPU)
silently fall back to default
Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a
Co-authored-by: Amp <amp@ampcode.com>
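The fallback behavior described above can be illustrated with a torch-free sketch. `resolve_device_option_sketch` and its `gpu_count` parameter are hypothetical. The real helper in model_management.py may have a different signature and return type.

```python
from typing import Optional


def resolve_device_option_sketch(option: str, gpu_count: int) -> Optional[str]:
    """Map a widget value to a device name, or None meaning 'use the default'."""
    if option == "cpu":
        return "cpu"
    if option.startswith("gpu:"):
        index_text = option[len("gpu:"):]
        # isdecimal() is False for "-1" or "", so malformed indices fall through.
        if index_text.isdecimal():
            index = int(index_text)
            if index < gpu_count:
                return f"gpu:{index}"
    # "default" and anything unrecognized (e.g. gpu:1 on a single-GPU
    # machine) silently fall back to the default device.
    return None
```

For example, `resolve_device_option_sketch("gpu:1", gpu_count=1)` returns `None`, so a workflow saved on a 2-GPU machine still runs on the default device.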
* Add VALIDATE_INPUTS to skip device combo validation for workflow portability
When a workflow saved on a 2-GPU machine (with device=gpu:1) is loaded
on a 1-GPU machine, the combo validation would reject the unknown value.
VALIDATE_INPUTS with the device parameter bypasses combo validation for
that input only, allowing resolve_gpu_device_option to handle the
graceful fallback at runtime.
Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a
Co-authored-by: Amp <amp@ampcode.com>
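A node using this mechanism might be shaped like the sketch below. `DeviceLoaderSketch` and its option list are hypothetical. Only the idea of naming `device` as a VALIDATE_INPUTS parameter comes from the description above.

```python
class DeviceLoaderSketch:
    """Hypothetical loader node with a device combo input."""

    RETURN_TYPES = ("STRING",)
    FUNCTION = "load"

    @classmethod
    def INPUT_TYPES(cls):
        # Assumption: the option list is built per machine, so a saved
        # workflow may carry a value that is absent here (e.g. "gpu:1").
        return {"required": {"device": (["default", "cpu", "gpu:0"],)}}

    @classmethod
    def VALIDATE_INPUTS(cls, device):
        # Taking `device` as a parameter opts this input out of the combo
        # check; any value is accepted and resolved later at runtime.
        return True

    def load(self, device):
        return (device,)
```

The unknown value then reaches `load`, where the runtime fallback decides which device is actually used.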
* Set CUDA device context in outer_sample to match model load_device
Custom CUDA kernels (comfy_kitchen fp8 quantization) use
torch.cuda.current_device() for DLPack tensor export. When a model is
loaded on a non-default GPU (e.g. cuda:1), the CUDA context must match
or the kernel fails with 'Can't export tensors on a different CUDA
device index'. Save and restore the previous device around sampling.
Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a
Co-authored-by: Amp <amp@ampcode.com>
* Fix code review bugs: negative index guard, CPU offload_device, checkpoint te_model_options
- resolve_gpu_device_option: reject negative indices (gpu:-1)
- UNETLoader: set offload_device when cpu is selected
- CheckpointLoaderSimple: pass te_model_options for CLIP device,
set offload_device for cpu, pass load_device to VAE
- load_diffusion_model_state_dict: respect offload_device from model_options
- load_state_dict_guess_config: respect offload_device, pass load_device to VAE
Amp-Thread-ID: https://ampcode.com/threads/T-019daa41-f394-731a-8955-4cff4f16283a
Co-authored-by: Amp <amp@ampcode.com>
* Fix CUDA device context for CLIP encoding and VAE encode/decode
Add torch.cuda.set_device() calls to match model's load device in:
- CLIP.encode_from_tokens: fixes 'Can't export tensors on a different
CUDA device index' when CLIP is loaded on a non-default GPU
- CLIP.encode_from_tokens_scheduled: same fix for the hooks code path
- CLIP.generate: same fix for text generation
- VAE.decode: fixes VAE decoding on non-default GPU
- VAE.encode: fixes VAE encoding on non-default GPU
Same pattern as the existing outer_sample fix in samplers.py - saves
and restores previous CUDA device in a try/finally block.
Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57
Co-authored-by: Amp <amp@ampcode.com>
* Extract cuda_device_context manager, fix tiled VAE methods
Add model_management.cuda_device_context() — a context manager that
saves/restores torch.cuda.current_device when operating on a non-default
GPU. Replaces 6 copies of the manual save/set/restore boilerplate.
Refactored call sites:
- CLIP.encode_from_tokens
- CLIP.encode_from_tokens_scheduled (hooks path)
- CLIP.generate
- VAE.decode
- VAE.encode
- samplers.outer_sample
Bug fixes (newly wrapped):
- VAE.decode_tiled: was missing device context entirely, would fail
on non-default GPU when called from 'VAE Decode (Tiled)' node
- VAE.encode_tiled: same issue for 'VAE Encode (Tiled)' node
Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57
Co-authored-by: Amp <amp@ampcode.com>
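The save/set/restore pattern that the context manager replaces can be sketched without torch. In the sketch below, `get_device` and `set_device` are stand-ins for torch.cuda.current_device() and torch.cuda.set_device(), and `device_context_sketch` is a hypothetical name, not the actual model_management.cuda_device_context implementation.

```python
from contextlib import contextmanager

_current_device = 0  # stand-in for the process-wide current CUDA device


def get_device() -> int:
    return _current_device


def set_device(index: int) -> None:
    global _current_device
    _current_device = index


@contextmanager
def device_context_sketch(target: int):
    """Temporarily make `target` the current device, restoring it on exit."""
    previous = get_device()
    if target == previous:
        # Already on the right device: nothing to save or restore.
        yield
        return
    set_device(target)
    try:
        yield
    finally:
        # Restore even if the wrapped work raises.
        set_device(previous)
```

Call sites then shrink to a single `with device_context_sketch(1): ...` block, and the restore cannot be forgotten on an error path.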
* Restore CheckpointLoaderSimple, add CheckpointLoaderDevice
Revert CheckpointLoaderSimple to its original form (no device input)
so it remains the simple default loader.
Add new CheckpointLoaderDevice node (advanced/loaders) with separate
model_device, clip_device, and vae_device inputs for per-component
GPU placement in multi-GPU setups.
Amp-Thread-ID: https://ampcode.com/threads/T-019dabdc-8feb-766f-b4dc-f46ef4d8ff57
Co-authored-by: Amp <amp@ampcode.com>
---------
Co-authored-by: Amp <amp@ampcode.com>