EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-06-14 11:59:21 +08:00

Author	SHA1	Message	Date
Jedrzej Kosinski	403ff49647	Restore nodes_kling.py removal of max_poll_attempts=280 lost in merge Master commit `cf758bd2` (PR #13663, "chore(api-nodes): increase default timeout for partner API node tasks") removed three explicit max_poll_attempts=280 overrides from nodes_kling.py so the new 480 default in util/client.py would take effect. The May 19 merge of master into worksplit-multigpu (`ff766e5c`) silently discarded those three deletions in the 3-way resolve - nodes_kling.py had no textual conflict but the resolution kept the pre-cf758bd2 lines. The other seven files `cf758bd2` touched were merged correctly; this restores nodes_kling.py to match master. Amp-Thread-ID: https://ampcode.com/threads/T-019e52b4-31ee-72cd-996b-64ecd9420e13 Co-authored-by: Amp <amp@ampcode.com>	2026-05-22 19:44:29 -07:00
Jedrzej Kosinski	2e5211e3e5	Merge branch 'master' into worksplit-multigpu	2026-05-22 18:24:27 -07:00
Matt Miller	187442cca4	openapi: add enum values + FeedbackRequest schema for cloud cutover (PR E) (#14070 ) * openapi: add enum values + FeedbackRequest schema for cloud cutover (PR E) Adds missing cloud-runtime enum values to vendor schemas that the cloud runtime emits but vendor declared as plain strings. Changes: - JobEntry.status: enum [pending, in_progress, completed, failed, cancelled] - JobDetailResponse.status: same enum - BillingStatus: enum [awaiting_payment_method, pending_payment, paid, payment_failed, inactive] - FeedbackRequest schema added (with type enum) - /api/feedback POST: requestBody now $refs FeedbackRequest All cloud-runtime-emitted; no impact on OSS-local semantics. Identified via Comfy-Org/cloud's TestCutoverSafe gate (BE-1106) as the remaining schema-level divergences after PRs A-D landed and got synced. * openapi: add type enum to Workspace schema (cutover follow-up) Cloud's Workspace runtime shape includes a 'type' field with enum [personal, team] that vendor's Workspace was missing. Cloud handlers reference the generated ingest.WorkspaceType Go enum. Same kind of surgical addition as JobEntry.status / BillingStatus / JobDetailResponse.status in this PR — adds cloud-runtime field to existing vendor schema.	2026-05-22 18:23:22 -07:00
Jedrzej Kosinski	e6c65fa7ab	Merge pull request #14068 from Comfy-Org/fix/single-gpu-non-cuda Fix single-GPU non-CUDA regressions on worksplit-multigpu (AMD/ROCm unload, DynamicVRAM crash)	2026-05-22 17:30:57 -07:00
Jedrzej Kosinski	5ffea26de7	Fix single-GPU non-CUDA regressions on worksplit-multigpu Two fixes for single-GPU users on non-NVIDIA backends; multi-GPU non-CUDA support is intentionally out of scope here (tracked separately). 1. get_all_torch_devices: add AMD/ROCm, MLU, and a generic fallback arm. Previously the function only enumerated NVIDIA, Intel XPU, and Ascend NPU when cpu_state==GPU; on AMD/ROCm (which exposes its GPU through torch.cuda.) and DirectML it fell through to an empty list. The biggest user-visible regression: unload_all_models() iterates this list, so it became a silent no-op on AMD/ROCm. /free, manager unloads, and shutdown stopped releasing VRAM. - is_amd() now shares the torch.cuda. arm with is_nvidia(), since ROCm reuses the CUDA API surface. - is_mlu() gets its own arm using torch.mlu.device_count(). - A final fallback appends get_torch_device() for any GPU backend the explicit arms miss (notably DirectML), so callers see at least the current device and unload_all_models works. MPS users are unaffected: cpu_state==MPS already routes to the else branch which appends get_torch_device() returning mps. 2. main.py DynamicVRAM init: guard the comfy_aimdo branch with an explicit is_nvidia() check. The outer condition allows entering the DynamicVRAM init block when the user passes --enable-dynamic-vram explicitly, bypassing the implicit is_nvidia() gate. On non-NVIDIA backends this then runs comfy_aimdo.control.init_devices(range(torch.cuda.device_count())), which is comfy-aimdo-only territory and may crash at startup. Add a leading is_nvidia() check that logs a clean warning and falls back to the legacy ModelPatcher path.	2026-05-22 17:13:55 -07:00
Jedrzej Kosinski	5dc4e38b89	Defer @pollockjj's tiled-VAE and UPSCALE_MODEL MultiGPU lanes (#14066 ) * Revert "Add tiled VAE lane to MultiGPU Work Units" This reverts commit `4d3d68e473`. The tiled VAE lane will land as part of a follow-up PR alongside the UPSCALE_MODEL lane, separated from the threaded-loader fix PR (#14052) to keep the upstream merge focused. * Revert "Add UPSCALE_MODEL lane to MultiGPU CFG Split" This reverts commit `74b0a826ea`. The UPSCALE_MODEL lane will land as part of a follow-up PR alongside the tiled VAE lane, separated from the threaded-loader fix PR (#14052) to keep the upstream merge focused. --------- Co-authored-by: John Pollock <pollockjj@gmail.com>	2026-05-22 16:44:29 -07:00
Jedrzej Kosinski	cb83c41db7	Merge pull request #14052 from rattus128/prs/worksplit-t-load-fix fixup threaded loader with worksplit multi-gpu	2026-05-22 16:36:33 -07:00
Matt Miller	c3c881f37b	openapi: rename cloud-side response schemas to match runtime (PR D) (#14065 ) * openapi: rename cloud-side response schemas to match runtime (PR D) Follow-up to the BE-1106 stack (#14060/61/63). Cloud's Go handlers reference response schemas by name (e.g., ingest.WorkflowResponse, ingest.SubscribeResponse), but vendor's matching operations were declaring those responses against differently-named vendor-side schemas (CloudWorkflow, BillingSubscription, etc.). After the stack landed, schemas like WorkflowResponse exist in vendor but weren't referenced by any path, so codegen pruned the unreferenced types. This PR: 1. Updates 34 operation $refs in cloud-runtime paths to point to the schema names cloud's handlers expect (e.g., CloudWorkflow → WorkflowResponse on /api/workflows/{workflow_id}). 2. Adds 12 cloud-only schemas that weren't in vendor yet but are referenced by these renames (e.g., SubscribeResponse, CancelSubscriptionResponse, BillingOpStatusResponse). Each copied verbatim from Comfy-Org/cloud's hand-written ingest spec and tagged x-runtime: [cloud] with a [cloud-only] description prefix. Schema renames span the same domains as the operationId renames in PR A: billing/subscriptions (7 schemas), workflows (5), userdata (3), jobs (2), hub (2), history (2), auth/workspace (4), and misc cloud endpoints (9). Convergent safety check after this lands (against cloud's TestCutoverSafe gate, BE-1106): Pre-PR D: 205 missing handler refs Post-PR D: 105 missing handler refs (-49%) Cumulative since the original 938-ref baseline: -89% The remaining 105 are a Phase 3 follow-up (response headers, text/plain responses, codegen-derived enum sub-types, and a small set of inline-response-schema operations that vendor declares inline where cloud has named-schema $refs). 
* openapi: drop PR-label comment from new schemas block PR-internal labels don't belong in committed code — future readers won't know what 'PR D' means and the marker stops being useful the moment this PR merges.	2026-05-22 16:34:52 -07:00
Matt Miller	7984a6a38e	openapi: rename 55 cloud-side operationIds to match runtime (PR A of 3) (#14060) * openapi: rename 55 cloud-side operationIds to match runtime handlers For the 55 operations below, vendor's operationId did not match the name cloud's runtime handlers expect. Generated types from vendor therefore had different names (e.g. CreateSubscription200JSONResponse) than what cloud handlers reference (Subscribe200JSONResponse), which blocks the post-cutover combined-spec codegen. All 55 renames target the cloud-runtime-authoritative name. Several of these endpoints are shared concepts (queue, settings, userdata, object_info) that OSS local also serves; the rename aligns vendor with the longstanding cloud handler-side convention to unblock the shared codegen. No request/response shape changes in this PR; only operationId labels.
Notable categories: - Billing/subscriptions: 7 renames (subscribe, getBillingPlans, ...) - Workspace + workflows: 13 renames (createWorkflow, ...) - Hub: 3 renames - Auth/users: 5 renames - Shared OSS surface (settings, queue, view, userdata): 12 renames - Misc cloud-only: 15 renames Identified via Comfy-Org/cloud's TestCutoverSafe build-safety gate (BE-1106), which compares handler type references against codegen output from the combined spec. * fix(openapi): resolve getHistory operationId collision Spectral flagged: both /api/history (OSS local) and /api/history_v2 (cloud) had operationId 'getHistory' after the rename. Rename vendor's /api/history to 'getPromptHistory' to disambiguate. Cloud's runtime denies /api/history at the overlay level so combined codegen is unaffected by this change. * openapi: add 41 cloud-runtime schemas to components.schemas (PR B of 3) (#14061) * openapi: add 41 cloud-runtime schemas to components.schemas (cutover prep) Adds schemas that exist in Comfy-Org/cloud's hand-written ingest spec but not yet in this vendored OSS spec. All tagged x-runtime: [cloud] per the field-drift convention and prefixed with [cloud-only] in the description. These schemas are referenced by cloud's Go handlers via the generated ingest.<Schema> Go type names. Codegen from the vendored spec didn't produce those types because the schemas weren't declared here. Adding them unblocks the post-cutover combined-spec codegen. 
Schemas added (alphabetical): AssetDownloadResponse, AssetMetadataResponse, BillingBalanceResponse, BillingPlansResponse, BillingStatusResponse, GetUserDataResponseFull, HistoryDetailEntry, HistoryDetailResponse, HistoryResponse, HubLabelInfo, HubProfileSummary, HubWorkflowListResponse, HubWorkflowStatus, HubWorkflowSummary, HubWorkflowTemplateEntry, JobStatusResponse, JobsListResponse, LabelRef, LogsResponse, Member, OAuthRegisterBadRequestResponse, PendingInvite, Plan, PlanAvailability, PlanAvailabilityReason, PlanSeatSummary, PreviewPlanInfo, PreviewSubscribeResponse, PublishedWorkflowDetail, SecretResponse, SubscriptionDuration, SubscriptionTier, UserDataResponseFull, ValidationError, ValidationResult, WorkflowForkedFrom, WorkflowResponse, WorkflowVersionContentResponse, WorkspaceAPIKeyInfo, WorkspaceSummary, WorkspaceWithRole Identified via Comfy-Org/cloud's TestCutoverSafe build-safety gate (BE-1106). Companion to PR #14060 (operationId renames). * fix(openapi): add BindingErrorResponse schema OAuthRegisterBadRequestResponse references BindingErrorResponse but that schema wasn't in the original add. Adding it now as a cloud-only schema matching the cloud runtime's binding-error shape (single 'message' string field). * openapi: add missing 4xx/5xx response bodies for cloud-emitting endpoints (#14063) Vendor declares shared endpoints (e.g. /api/queue, /api/settings, /api/assets/, /api/billing/) with success responses but is missing many of the 4xx/5xx error response bodies that Comfy-Org/cloud's runtime actually emits. Cloud's Go handlers reference the generated ingest.Op<StatusCode>JSONResponse types for these missing statuses, which currently fail to resolve when codegen runs against the vendored spec. This PR adds 237 response entries across 117 operations, restoring the documented error responses that cloud emits. 
Bodies are copied verbatim from Comfy-Org/cloud's hand-written ingest spec (services/ingest/openapi.yaml) and reference a new ErrorResponse schema also added in this PR (matches cloud's {code, message} runtime shape, tagged x-runtime: [cloud]). ErrorResponse is intentionally separate from the existing CloudError schema. CloudError's shape ({error}) describes one runtime; cloud emits a different shape ({code, message}). Existing CloudError refs in vendor are untouched; new cloud-emitting error references use ErrorResponse. Identified via Comfy-Org/cloud's TestCutoverSafe build-safety gate (BE-1106). Companion to PR #14060 (operationId renames) and PR #14061 (cloud-only schema additions).	2026-05-22 16:15:18 -07:00
comfyanonymous	e75b739c1d	Delete the source branch after doing the backport. (#14062)	2026-05-22 15:47:03 -07:00
Matt Miller	112fcd5f3b	openapi: align response declarations with implementation (5 endpoints) (#14058 ) * openapi: align response declarations with implementation (5 endpoints) - POST /api/assets/download: replace 200 with 202 + tracking-task body (endpoint runs asynchronously and returns task_id/status/message). - POST /api/assets/export: same 200 → 202 + tracking-task body. - POST /api/assets/from-workflow: change 201 → 200 (handler responds 200, not 201; no Location header emitted). - POST /api/feedback: change 200 → 201 (creates a feedback record). - /api/jobs and /api/jobs/{job_id}: change timestamp fields from type: number to type: integer + format: int64. Values are Unix milliseconds — number causes oapi-codegen to emit float64, losing precision and producing the wrong Go type. Affected fields: create_time, update_time, execution_start_time, execution_end_time. Verification: each change reflects what the endpoint observably returns; no handler changes required. Backwards-compatible for existing clients (integer is a subset of number; status code shifts within 2xx). * openapi: align asset download/export 202 status enum with runtime + sibling schemas CodeRabbit caught a vocabulary mismatch: the two new 202 response schemas declared `[pending, running, completed, failed]` while the rest of the same spec uses `[created, running, completed, failed]` for the identical task lifecycle (download/export progress WebSocket events, /api/tasks, TaskEntry, TaskResponse — 4 sites total). Cloud's runtime emits `created` on initial creation (AssetDownloadResponseStatusCreated; task.Status sourced from the DB enum whose initial value is Created). `pending` would have introduced a fifth, contradictory vocabulary for the same lifecycle and pushed the spec further from the implementation it is meant to align with. Followup tracked separately: extract a shared TaskStatus enum so all five sites move in lockstep instead of needing per-site edits.	2026-05-22 14:31:43 -07:00
John Pollock	4d3d68e473	Add tiled VAE lane to MultiGPU Work Units	2026-05-22 13:42:21 -05:00
John Pollock	74b0a826ea	Add UPSCALE_MODEL lane to MultiGPU CFG Split Introduce tiled_scale_multidim_multigpu in comfy/utils.py: a tile scheduler that dispatches per-device tile functions through the existing MultiGPUThreadPool and merges per-device CPU output buffers in deterministic key order. The worker only catches BaseException at the thread boundary to funnel errors to the main thread; bare torch.cuda.set_device and torch.cuda.synchronize calls inside the worker fail loud if the device is not CUDA, which is part of the primitive's contract. Add UPSCALE_MODEL input on the MultiGPU CFG Split node and an upscale-model descriptor deepclone helper in comfy/multigpu.py. Clones stay CPU-resident until execute time and are returned to CPU afterward. ImageUpscaleWithModel dispatches through tiled_scale_multidim_multigpu when a multigpu descriptor is attached; the single-device path runs unchanged when no clones are present.	2026-05-22 13:41:48 -05:00
Alexander Piskun	1579bbb52d	[Partner Nodes] add new Rodin2.5 nodes (#14051 ) * [Partner Nodes] add new Rodin2.5 nodes Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] fixed Quality Mesh Options Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] fix: remove non-supported "usdz" Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] fix: always pass seed to server Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] fix: set the default "material" value to "Shaded" Signed-off-by: bigcat88 <bigcat88@icloud.com> --------- Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-05-22 09:07:21 -07:00
Rattus	7a18f9affb	comfy-aimdo 0.4.4 Comfy-aimdo 0.4.4 contains a small bugfix to allow recovery of a hostbuf after full truncation. This pattern doesn't happen as a general rule, but does happen in the upcoming worksplit-multigpu branch.	2026-05-23 01:00:30 +10:00
Rattus	df17b560c5	memory_management: replace thread refusal with mutex This was an attempt to be a fast path by ensuring the file slice was created by the owning thread and refusing without needing to mutex, but worksplit-multigpu doesn't work that way. Go mutex. Shoot me for overthinking next time.	2026-05-23 01:00:30 +10:00
Alexis Rolland	93888ae8e3	Move logic nodes into utils category (#14033)	2026-05-22 13:32:08 +08:00
Pauan	38ebc19037	Adding in And, Or, and Not nodes. (#14004)	2026-05-22 11:01:12 +08:00
comfyanonymous	9650570378	Update Discord invite link in README.md (#14045)	2026-05-21 19:52:38 -07:00
rattus	f48c32871b	fe: Consolidate warnings (#13970)	2026-05-22 10:18:13 +08:00
comfyanonymous	8edff549e3	Update backport workflow to use commit SHA input (#14043)	2026-05-21 18:22:47 -07:00
comfyanonymous	8fecef0686	Add validation for source branch in backport workflow (#14042)	2026-05-21 16:39:19 -07:00
Jedrzej Kosinski	5d681a5420	Fix SIGPIPE false negative in backport release validation (#14041)	2026-05-21 16:29:08 -07:00
comfyanonymous	32e58393b8	Add backport release workflow. (#14038)	2026-05-21 14:49:55 -07:00
Jedrzej Kosinski	b649502c9c	Report all torch devices from /system_stats The /system_stats endpoint was returning a hardcoded single-element devices list built from get_torch_device(), which only reflects the primary CUDA device. On multi-GPU systems this hides the additional devices from frontends / tooling (the API surface that enables multigpu support discovery). Switch to iterating get_all_torch_devices(), with the primary device kept first so existing clients reading devices[0] keep working. (Worksplit-multigpu-only: get_all_torch_devices is the multigpu helper introduced on this branch; master's /system_stats remains unchanged.) Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 13:04:54 -07:00
Jedrzej Kosinski	2ed396c769	Mark non-NVIDIA multigpu gaps with TODOs in _handle_batch Two CodeRabbit findings from #7063 (#13 and #14) are deferred because worksplit-multigpu's initial release scope is NVIDIA-only QA. Leave a TODO at the unconditional torch.cuda.set_device call and at the post-aggregation point so the required guards/synchronize are easy to find when multigpu support is extended to XPU/NPU/MPS/CPU/DirectML. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 12:47:43 -07:00
Kosinkadink	d0b9dbb5a6	Merge remote-tracking branch 'origin/master' into worksplit-multigpu Brings in 18 commits from master so worksplit-multigpu does not regress fixes that landed on main since the last sync: - #13699 Hunyuan 3D 2.1 batch-size fixes (overlap with our own backport; conflict resolved in favor of the shape>=2 gate that binds swap_cfg_halves once and reuses it for the output swap-back) - #14031 ModelPatcherDynamic lora reshape / backup restore fix - #13802 Multi-threaded model load (memory_management / pinned_memory / model_management / aimdo plumbing) - #12679 lanczos single-channel tensor fix - #14010 Stable Audio 3 support - assorted partner-node, openapi, workflow-template, and tooling updates Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 12:17:59 -07:00
Kosinkadink	fd79f22bdf	Backport Hunyuan 3D 2.1 attention batch-size fixes from #13699 CrossAttention.kv.view and Attention.qkv_combined.view both hardcoded batch=1 in the reshape, crashing or silently mis-shaping whenever the actual batch dimension was greater than 1. These were fixed on master in #13699 as part of the same patch that gated the chunk(2) swap, but worksplit-multigpu only picked up the chunk(2) gate. Bring the two view() fixes over so we have parity with master. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 12:17:24 -07:00
Kosinkadink	019261ed96	Simplify Hunyuan 3D 2.1 swap_cfg_halves gate to a shape check The previous gate (len(cond_or_uncond) == 2 and set == {0, 1}) was intended to skip the cond/uncond swap when only one half was present under MultiGPU CFG Split, but it was too restrictive: it also skipped batch_size > 1 + CFG (cond_or_uncond like [0, 0, 1, 1] or [0,0,0,0, 1,1,1,1]), where chunk(2) still splits the batch cleanly into a cond half and an uncond half and the swap is still required. Switch to context.shape[0] >= 2, matching the parallel fix landed on master in #13699. The swap is a permutation-invariant no-op when the two halves don't form a CFG pair (since the output swap_cfg_halves block immediately undoes the permutation), so the only thing the gate actually needs to do is guard against chunk(2) on a batch of one. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 12:14:02 -07:00
Alexander Piskun	b293f8cefd	[Partner Nodes] add widget for automatic upscaling for the ByteDance2Reference node (#14032 ) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-05-21 11:58:03 -07:00
Daxiong (Lin)	2ca1480f91	chore: update workflow templates to v0.9.82 (#14034)	2026-05-21 11:48:20 -07:00
Kosinkadink	822a3ecf73	Note _calc_cond_batch and _calc_cond_batch_multigpu must stay in sync Per review feedback on #7063. The two functions share the conds-by-hooks accumulation, memory-fit batching, and per-chunk output aggregation; the multigpu variant adds per-device scheduling, .to(device) placement, per-device patcher/control lookup, and thread-pool dispatch around the inner loop. Documenting the relationship without extracting helpers -- extraction can land after the initial worksplit-multigpu release once both paths have settled. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 11:47:53 -07:00
Jedrzej Kosinski	1417b711ce	Fix CodeRabbit findings in worksplit-multigpu (#14017) Fix CodeRabbit findings in worksplit-multigpu	2026-05-21 11:42:08 -07:00
Kosinkadink	a18dd219d5	Pass per-device model to multigpu control clones in pre_run_control QwenFunControlNet.pre_run stashes model.diffusion_model into extra_args, which the control_model then uses for forward passes (img_in, txt_in, pe_embedder, time_text_embed). With multigpu, every per-device control clone was being pre_run with the base model on GPU0, so secondary devices would invoke those modules with parameters on GPU0 and inputs on their own device, raising 'Expected all tensors to be on the same device'. Build a device -> per-device BaseModel lookup from the patcher's additional multigpu models and pass each clone the model on its own device. Falls back to the base model when no per-device match is found (single-GPU path and the case where cnet.multigpu_clones lags the patcher's clone set). Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 11:40:49 -07:00
Alexander Piskun	6ecf5eca7a	[Partner Nodes] add OpenRouter LLM node (#14007 ) * [Partner Nodes] add reasoning widget to Anthropic node Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] add new OpenRouterLLM node Signed-off-by: bigcat88 <bigcat88@icloud.com> * [Partner Nodes] fix passing images to Grok LLM Signed-off-by: bigcat88 <bigcat88@icloud.com> --------- Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-05-21 11:36:11 -07:00
Kosinkadink	963621603c	Free QwenFunControlNet base_model reference in cleanup QwenFunControlNet.pre_run stashes the model's diffusion_model into self.extra_args['base_model'], but ControlBase.cleanup never clears extra_args. The diffusion_model reference therefore lingered between sampling runs, blocking ComfyUI's model offload/eviction logic from freeing the UNet and -- for multigpu -- holding one such reference per per-device control clone (defeating the max_gpus pruning added in this PR). Override cleanup to drop the entry; super().cleanup() already recurses into multigpu_clones so each per-device clone pops its own. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 11:35:54 -07:00
Kosinkadink	adde1239b1	Restore prepare_state backward-compatible signature Drop the new ignore_multigpu positional argument from prepare_state and from the ON_PREPARE_STATE callbacks; pass the flag via model_options instead. This restores the original 3-arg callback signature so existing custom-node ON_PREPARE_STATE handlers keep working unchanged, while still letting prepare_state's recursive call into multigpu_clones short-circuit. Amp-Thread-ID: https://ampcode.com/threads/T-019e4a00-fe3d-76bd-a2f2-a8c8c4040082 Co-authored-by: Amp <amp@ampcode.com>	2026-05-21 11:35:39 -07:00
rattus	03e511862e	Fix reshaping lora application (#14031) * ModelPatcherDynamic: purge stale vbar allocs on force cast * ModelPatcherDynamic: restore backups before load If doing a clean reload, mutative changes (lora application) could be applied on top of the already loaded weight. Restore from backup unconditionally so that the new load is clean.	2026-05-21 09:47:16 -07:00
Edoardo Carmignani	aab41a9ddb	fix(lanczos): correct dimension transposition for single-channel tensors (#12679)	2026-05-21 23:47:20 +08:00
Alexis Rolland	4259a0c7c3	Update MoGe nodes display names, search aliases and descriptions (#14030)	2026-05-21 16:50:09 +08:00
Alexis Rolland	af3d9b60af	chore: Dataset nodes clean-up (CORE-237) (#14002)	2026-05-21 15:14:16 +08:00
Alexis Rolland	7b7c5fed7c	Update MediaPipe nodes to standardize with existing code base (CORE-242) (#14025)	2026-05-21 14:39:30 +08:00
Matt Miller	1668aaf037	openapi: remove cloud-only job_ids query param from GET /api/assets (#14016) The job_ids query parameter on GET /api/assets is tagged x-runtime: [cloud] and only exists for cloud's variant of this endpoint. Cloud removed all consumers and the cloud-side handler/codegen/tests in Comfy-Org/cloud#3778. With cloud no longer accepting this parameter, the [cloud-only] documentation here is wrong; drop it so the daily sync to cloud/services/ingest/vendor/openapi.yaml propagates the removal.	2026-05-20 21:32:08 -07:00
Matt Miller	ea174d3f12	fix(openapi): correct POST /api/assets/import to importPublishedAssets (#14027 ) The operation at POST /api/assets/import was defined as `importAssets` with a URL-list body shape, but no runtime actually serves that operation at this path. The cloud runtime serves a different operation here — `importPublishedAssets` — which imports published-workflow assets into the caller's library by ID, not by URL. Cloud's URL-based asset ingestion lives at separate paths (POST /assets/download + GET /assets/remote-metadata) tracked elsewhere; nothing in this PR affects that work. Changes: - Replace the operation at POST /api/assets/import with `importPublishedAssets`, taking ImportPublishedAssetsRequest (published_asset_ids + optional share_id) and returning ImportPublishedAssetsResponse (list of AssetInfo). - Remove the unused AssetImportRequest component schema (no other references in the spec). - Operation and schemas tagged x-runtime: [cloud] with [cloud-only] description prefix, matching the existing convention for cloud-runtime-only operations elsewhere in the spec. Spectral lint passes (0 errors); the two hint-level findings on the spec are pre-existing and unrelated. No FE consumer references AssetImportRequest today; this is a pure spec correction to match what the cloud runtime actually serves.	2026-05-20 21:28:16 -07:00
Matt Miller	9f9b32ed97	feat: add OAuth 2.1 + RFC 7591 DCR endpoints to openapi.yaml (#14026 ) Add the OAuth 2.1 authorization flow and RFC 7591 Dynamic Client Registration endpoints to the shared spec, alongside the existing auth-tagged operations (/api/auth/session, /api/auth/token, /.well-known/jwks.json). All tagged x-runtime: [cloud] with a [cloud-only] description prefix, following the established convention for cloud-runtime-only operations. Endpoints: - GET /.well-known/oauth-authorization-server (RFC 8414 metadata) - GET /.well-known/oauth-protected-resource (RFC 9728 metadata) - GET /oauth/authorize (consent challenge) - POST /oauth/authorize (consent submission) - POST /oauth/token (RFC 6749 §3.2) - POST /oauth/register (RFC 7591 §3.1 DCR) Component schemas added: - OAuthAuthorizationServerMetadata - OAuthProtectedResourceMetadata - OAuthConsentChallenge, OAuthConsentChallengeWorkspace - OAuthAuthorizeRedirectResponse - OAuthTokenResponse, OAuthTokenError - OAuthRegisterRequest, OAuthRegisterResponse, OAuthRegisterError These endpoints are implemented in the cloud runtime today and are called by browser frontends rendering the consent UI and by MCP-spec-compliant clients (Claude Desktop, Cursor, etc.) doing auto-discovery + self-registration. Documenting them in the shared spec lets the cloud frontend generate types directly from this spec instead of maintaining a parallel definition. Spectral lints clean (0 errors). The hint-level findings on OAuthTokenError / OAuthRegisterError ("standard error schema") match the same hint on CloudError — these are protocol-specific RFC-shaped errors, not generic application errors.	2026-05-20 21:22:12 -07:00
Jedrzej Kosinski	4d9106dced	Document --cuda-device comma format and MultiGPU Options relative_speed gap Two doc-only changes addressing minor CodeRabbit findings on PR #7063: * cli_args.py: clarify --cuda-device help text to document the required comma-separated format ('0' or '0,1'), matching how the value is consumed by CUDA_VISIBLE_DEVICES in main.py. * nodes_multigpu.py: add a docstring NOTE on the (currently unregistered) MultiGPUOptionsNode explaining that its relative_speed input is plumbed through to model_options['multigpu_options'] but is not yet consulted by the cond scheduler, which still uses uniform round-robin via next_available_device(). Wire relative_speed into the scheduler before re-enabling the node. Amp-Thread-ID: https://ampcode.com/threads/T-019e43b8-8258-70fd-ab3a-53e4c97f85d5 Co-authored-by: Amp <amp@ampcode.com>	2026-05-20 20:48:59 -07:00
Jedrzej Kosinski	ac0a90c323	Use cond_shapes in multigpu memory-fit check (parity with single-GPU path) The multigpu cond-batching loop called model.memory_required(input_shape) without conditioning shapes, while the single-GPU path at line 279 passes cond_shapes. Large conditioning tensors (e.g. video prompts, control inputs) were therefore under-counted, risking OOM at runtime when the chosen batch size was too large. Match the single-GPU pattern by building cond_shapes from each batched cond's conditioning dict and passing it to memory_required. Amp-Thread-ID: https://ampcode.com/threads/T-019e43b8-8258-70fd-ab3a-53e4c97f85d5 Co-authored-by: Amp <amp@ampcode.com>	2026-05-20 19:52:03 -07:00
comfyanonymous	95fdc6cf91	Repo security stuff. (#14019 )	2026-05-20 17:17:55 -07:00
rattus	5aa5ccc9e0	Multi-threaded load of models from disk (big load time speedups & Offload to disk) (CORE-43,CORE-152,CORE-164,CORE-165,CORE-117) (#13802 ) * model_management: disable non-dynamic smart memory Disable smart memory outright for non-dynamic models. This is a minor step towards deprecation of --disable-dynamic-vram and the legacy ModelPatcher. This is needed for estimate-free model development, where new models can opt out of supplying a memory estimate and not have to worry about hard VRAM allocations due to legacy non-dynamic model patchers. This is also a general stability increase for a lot of stray use cases where estimates may still be off, and going forward we are not going to accurately maintain such estimates. * pinned_memory: implement with aimdo growable buffer Use a single growable buffer so we can do threaded pre-warming on pinned memory. * mm: use aimdo to do transfer from disk to pin Aimdo implements a faster threaded loader. * Add stream host pin buffer for AIMDO casts Introduce per-offload-stream HostBuffer reuse for pinned staging, include it in cast buffer reset synchronization. Defer actual casts that go via this pin path to a separate pass such that the buffer can be allocated monolithically (to avoid cudaHostRegister thrash). * remove old pin path * Implement JIT pinned memory pressure Replace the predictive pin pressure mechanism with JIT PIN memory pressure. * LowVRAMPatch: change to two-phase visit * lora: re-implement as inplace swiss-army-knife operation * prepare for multiple pin sets * implement pinned loras * requirements: comfy-aimdo 0.4.0 * ops: remove unused arg This was defeatured in aimdo iteration. * ops: sync the CPU with only the offload stream activity This was syncing with the offload stream which itself is synced with the compute stream, so this was syncing CPU with compute transitively. Define the event to sync it more gently. * pins: implement freeing intermediate for pinned memory Pinning is more important than inactive intermediates and the stream pin buffer is more important than even active intermediates. * execution: implement pin eviction on RAM pressure Add back proper pin freeing on RAM pressure. * implement pin registration swaps Uncap the Windows pins from 50% by extending the pool and have a pressure mechanism to move the pin reservations on demand. This unfortunately implies a GPU sync to do the freeing, so significant hysteresis needs to be added to consolidate these pressure events. * cli_args/execution: Implement lower background cache-ram threshold Limit the amount of RAM background intermediates can use, so that switching workflows doesn't degrade performance too much. * make default * bump aimdo * model-patcher: force-cast tiny weights Flux 2 gets crazy stalls due to a mix of tiny and giant weights creating lopsided stream buffer rotations. * ops: refactor in prep for chunking * mm: delegate pin-on-the-way to aimdo Aimdo is able to chunk and slice this on the way for better CPU->GPU overlap. The main advantage is the ability to shorten the bus contention window between previous weight transfer and the next weights vbar fault. * bump aimdo * pinning updates * specify hostbuf max allocation size There are signs of virtual memory exhaustion on some Linux systems when throwing 128GB for every little piece. Pass the actual size to save aimdo from over-estimates. * tests: update execution tests for caching The default caching changed to ram-cache, so update these tests accordingly. Remove the LRU 0 test as this also falls through to RAM cache.	2026-05-20 17:03:58 -07:00
Jedrzej Kosinski	dd85851efe	Prune inherited multigpu clones when max_gpus is lowered create_multigpu_deepclones cloned the existing 'multigpu' additional_models list verbatim and never pruned entries beyond limit_extra_devices. If a workflow was previously prepared for more GPUs, reducing max_gpus would leave stale clones attached and eligible for later scheduling. Replace the TODO block with a real prune that keeps only clones whose load_device is either the model's load_device or in limit_extra_devices, and re-match clones if anything was removed. Amp-Thread-ID: https://ampcode.com/threads/T-019e43b8-8258-70fd-ab3a-53e4c97f85d5 Co-authored-by: Amp <amp@ampcode.com>	2026-05-20 16:46:45 -07:00
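
The prune rule described in dd85851efe can be sketched roughly as below. This is a minimal illustration and not the code from the commit: the `Clone` class, the `prune_clones` helper, and the string device names are hypothetical stand-ins, and only the filtering rule (keep a clone if its load_device is the model's load_device or is in limit_extra_devices) comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    """Hypothetical stand-in for a per-device model clone."""
    load_device: str


def prune_clones(clones, primary_device, limit_extra_devices):
    """Keep only clones on the primary device or on an allowed extra device.

    Returns (kept, removed_any) so a caller could decide whether the
    remaining clones need to be re-matched.
    """
    allowed = {primary_device, *limit_extra_devices}
    kept = [c for c in clones if c.load_device in allowed]
    return kept, len(kept) != len(clones)


# A workflow previously prepared for three extra GPUs, now limited to one.
clones = [Clone("cuda:1"), Clone("cuda:2"), Clone("cuda:3")]
kept, removed_any = prune_clones(clones, "cuda:0", ["cuda:1"])
```

Returning a flag alongside the filtered list mirrors the "re-match clones if anything was removed" step in the message; how the real code signals that is not shown in this log.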


5452 Commits