EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-05-25 00:17:23 +08:00

Author	SHA1	Message	Date
Jedrzej Kosinski	b319c8088b	SelectXDevice: address code-review follow-ups True reset semantics for "default": - On first selector application, cache the loader's original load_device / offload_device on the underlying model object (which is shared across patcher clones) and restore those base values when the user picks "default". Previously "default" meant "passthrough" so SelectXDevice(gpu:1) -> SelectXDevice(default) silently kept the gpu:1 routing. CPU + dynamic VRAM: - When SelectModelDevice / SelectCLIPDevice resolves to CPU on a ModelPatcherDynamic, also call clone(disable_dynamic=True) so the result is a plain ModelPatcher, matching ModelPatcherDynamic.__new__'s intent that CPU loads never run through the dynamic path. Fallback to the regular dynamic clone if disable_dynamic is unsupported on that patcher. MultiGPU collision pruning: - After SelectModelDevice retargets the primary patcher, drop any multigpu clone (from a prior MultiGPU CFG Split) whose load_device now matches the primary; otherwise two patchers would be bound to the same device. Logs the prune at info level. SelectVAEDevice: reject CPU at runtime: - The UI uses get_gpu_device_options_no_cpu(), but a workflow opened from another machine could still pass "cpu" through validate_inputs. Detect that case explicitly, log a "CPU is not a supported choice" passthrough message, and leave the VAE unchanged. Cosmetic: - Update VAE node docstring to accurately reflect the runtime CPU rejection rather than the older "intentionally not offered" claim. - Demote the fallback warnings inside resolve_gpu_device_option to no log at all; the Select*Device nodes now own a single context-rich info-level message per failed lookup, so there is no double logging. Amp-Thread-ID: https://ampcode.com/threads/T-019e52b4-31ee-72cd-996b-64ecd9420e13 Co-authored-by: Amp <amp@ampcode.com>	2026-05-22 22:29:45 -07:00
Jedrzej Kosinski	9ee1540882	SelectXDevice: use lowercase validate_inputs for V3 combo bypass V3 io.ComfyNode subclasses use the lowercase `validate_inputs` hook for opting out of strict combo validation (execution.py line 862); the uppercase `VALIDATE_INPUTS` is the V1 spelling and is ignored on V3 nodes. The strict combo check at execution.py line 1025 is gated on `if x not in validate_function_inputs`, so renaming to `validate_inputs(cls, device='default')` lets unknown `gpu:N` values pass validation and fall through to the runtime fallback. Amp-Thread-ID: https://ampcode.com/threads/T-019e52b4-31ee-72cd-996b-64ecd9420e13 Co-authored-by: Amp <amp@ampcode.com>	2026-05-22 21:50:29 -07:00
Jedrzej Kosinski	4e650055d0	SelectXDevice nodes: register new load_device with ModelPatcherDynamic When --enable-dynamic-vram is on, every ModelPatcher is a ModelPatcherDynamic whose underlying model has a per-device dynamic_pins dict, initialized in __init__ for self.load_device only. If a cloned patcher's load_device is later reassigned (as the Select{Model,CLIP,VAE} Device nodes do), the new device key is missing and partially_unload_ram raises KeyError: device(type='cuda', index=N). Fix: - Extract the per-device dynamic_pins init in ModelPatcherDynamic.__init__ into a new helper method register_load_device(device) which is now also called from __init__. - Each Select*Device node calls clone.patcher.register_load_device(resolved) after retargeting load_device, guarded by hasattr so non-dynamic patchers (plain ModelPatcher in non-dynamic-vram installs) skip it. Caught by happy-path test where SelectCLIPDevice retargeted CLIP from cuda:0 to cuda:1 and CLIPTextEncode then crashed in partially_unload_ram -> dynamic_pins[cuda:1]. Amp-Thread-ID: https://ampcode.com/threads/T-019e52b4-31ee-72cd-996b-64ecd9420e13 Co-authored-by: Amp <amp@ampcode.com>	2026-05-22 21:46:07 -07:00
Jedrzej Kosinski	d7706091ae	Add Select Model/CLIP/VAE Device passthrough nodes Replace the per-loader device widgets removed in the previous commit with three small passthrough selector nodes registered under advanced/multigpu: - Select Model Device (MODEL in/out) - options: default / cpu / gpu:N - Select CLIP Device (CLIP in/out) - options: default / cpu / gpu:N - Select VAE Device (VAE in/out) - options: default / gpu:N (no cpu) Each node clones the inbound patcher (model.clone() / clip.clone() / copy.copy(vae)+vae.patcher.clone()) and retargets load_device (and offload_device for cpu / vae_offload_device for VAE). Portability across machines with different GPU counts: - VALIDATE_INPUTS returns True so an unknown gpu:N value (e.g. a workflow saved on a 2-GPU machine opened on a 1-GPU machine) does not error at validation time. - At runtime, resolve_gpu_device_option(...) returns None for unknown options (with a warning), and each selector then logs a per-node info message and passes through unchanged, matching the no-op style used by MultiGPU CFG Split's "No extra torch devices need initialization..." log. Also adds comfy.model_management.get_gpu_device_options_no_cpu() which the VAE selector uses; on a single-GPU box this collapses to just ["default"], which is fine. Amp-Thread-ID: https://ampcode.com/threads/T-019e52b4-31ee-72cd-996b-64ecd9420e13 Co-authored-by: Amp <amp@ampcode.com>	2026-05-22 21:39:18 -07:00
Jedrzej Kosinski	5dc4e38b89	Defer @pollockjj's tiled-VAE and UPSCALE_MODEL MultiGPU lanes (#14066) * Revert "Add tiled VAE lane to MultiGPU Work Units" This reverts commit `4d3d68e473`. The tiled VAE lane will land as part of a follow-up PR alongside the UPSCALE_MODEL lane, separated from the threaded-loader fix PR (#14052) to keep the upstream merge focused. * Revert "Add UPSCALE_MODEL lane to MultiGPU CFG Split" This reverts commit `74b0a826ea`. The UPSCALE_MODEL lane will land as part of a follow-up PR alongside the tiled VAE lane, separated from the threaded-loader fix PR (#14052) to keep the upstream merge focused. --------- Co-authored-by: John Pollock <pollockjj@gmail.com>	2026-05-22 16:44:29 -07:00
John Pollock	4d3d68e473	Add tiled VAE lane to MultiGPU Work Units	2026-05-22 13:42:21 -05:00
John Pollock	74b0a826ea	Add UPSCALE_MODEL lane to MultiGPU CFG Split Introduce tiled_scale_multidim_multigpu in comfy/utils.py: a tile scheduler that dispatches per-device tile functions through the existing MultiGPUThreadPool and merges per-device CPU output buffers in deterministic key order. The worker only catches BaseException at the thread boundary to funnel errors to the main thread; bare torch.cuda.set_device and torch.cuda.synchronize calls inside the worker fail loud if the device is not CUDA, which is part of the primitive's contract. Add UPSCALE_MODEL input on the MultiGPU CFG Split node and an upscale-model descriptor deepclone helper in comfy/multigpu.py. Clones stay CPU-resident until execute time and are returned to CPU afterward. ImageUpscaleWithModel dispatches through tiled_scale_multidim_multigpu when a multigpu descriptor is attached; the single-device path runs unchanged when no clones are present.	2026-05-22 13:41:48 -05:00
Jedrzej Kosinski	4d9106dced	Document --cuda-device comma format and MultiGPU Options relative_speed gap Two doc-only changes addressing minor CodeRabbit findings on PR #7063: * cli_args.py: clarify --cuda-device help text to document the required comma-separated format ('0' or '0,1'), matching how the value is consumed by CUDA_VISIBLE_DEVICES in main.py. * nodes_multigpu.py: add a docstring NOTE on the (currently unregistered) MultiGPUOptionsNode explaining that its relative_speed input is plumbed through to model_options['multigpu_options'] but is not yet consulted by the cond scheduler, which still uses uniform round-robin via next_available_device(). Wire relative_speed into the scheduler before re-enabling the node. Amp-Thread-ID: https://ampcode.com/threads/T-019e43b8-8258-70fd-ab3a-53e4c97f85d5 Co-authored-by: Amp <amp@ampcode.com>	2026-05-20 20:48:59 -07:00
Jedrzej Kosinski	50d1dd6273	Fix MultiGPU Options node discarding cloned GPUOptionsGroup GPUOptionsGroup.clone() returns a new instance, but the return value was discarded, causing the node to mutate the upstream caller's group in-place. When multiple MultiGPU Options nodes share an input group, each node's additions would leak into earlier siblings. Assign the clone result back to gpu_options so each node owns its own copy. Amp-Thread-ID: https://ampcode.com/threads/T-019e43b8-8258-70fd-ab3a-53e4c97f85d5 Co-authored-by: Amp <amp@ampcode.com>	2026-05-20 16:46:23 -07:00
Jedrzej Kosinski	1d8e379f41	Rename MultiGPU Work Units to MultiGPU CFG Split Amp-Thread-ID: https://ampcode.com/threads/T-019d3ee9-19d5-767a-9d7a-e50cbbef815b Co-authored-by: Amp <amp@ampcode.com>	2026-03-30 08:00:20 -07:00
Jedrzej Kosinski	5f4fcd19e7	Simplify multigpu nodes: default max_gpus=2, remove gpu_options input, disable Options node Amp-Thread-ID: https://ampcode.com/threads/T-019d3ee9-19d5-767a-9d7a-e50cbbef815b Co-authored-by: Amp <amp@ampcode.com>	2026-03-30 07:30:32 -07:00
Jedrzej Kosinski	d52dcbc88f	Rewrite multigpu nodes to V3 format Amp-Thread-ID: https://ampcode.com/threads/T-019d3ee9-19d5-767a-9d7a-e50cbbef815b Co-authored-by: Amp <amp@ampcode.com>	2026-03-30 07:23:13 -07:00
Jedrzej Kosinski	6dca17bd2d	Satisfy ruff linting	2025-03-03 23:08:29 -06:00
Jedrzej Kosinski	093914a247	Made MultiGPU Work Units node more robust by forcing ModelPatcher clones to match at sample time, reuse loaded MultiGPU clones, finalize MultiGPU Work Units node ID and name, small refactors/cleanup of logging and multigpu-related code	2025-03-03 22:56:13 -06:00
Jedrzej Kosinski	eda866bf51	Extracted multigpu core code into multigpu.py, added load_balance_devices to get subdivision of work based on available devices and splittable work item count, added MultiGPU Options nodes to set relative_speed of specific devices; does not change behavior yet	2025-01-27 06:25:48 -06:00
Jedrzej Kosinski	e3298b84de	Create proper MultiGPU Initialize node, create gpu_options to create scaffolding for asymmetrical GPU support	2025-01-26 09:34:20 -06:00
Jedrzej Kosinski	02a4d0ad7d	Added unload_model_and_clones to model_management.py to allow unloading only relevant models	2025-01-23 01:20:00 -06:00
Jedrzej Kosinski	328d4f16a9	Make WeightHooks compatible with MultiGPU, clean up some code	2025-01-20 04:34:26 -06:00
Jedrzej Kosinski	bfce723311	Initial work on multigpu_clone function, which will account for additional_models getting cloned	2025-01-17 03:31:28 -06:00
Jedrzej Kosinski	25818dc848	Added a 'max_gpus' input	2025-01-14 13:45:14 -06:00
Jedrzej Kosinski	d5088072fb	Make test node for multigpu instead of storing it in just a local __init__.py	2025-01-13 20:20:25 -06:00

21 Commits
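
Note on commit 50d1dd6273: the discarded-clone bug it describes can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions. `OptionsGroup`, `register`, `add_option_buggy`, and `add_option_fixed` are hypothetical stand-ins, not ComfyUI's actual `GPUOptionsGroup` or node code. The only behavior taken from the commit message is that `clone()` returns a new instance and that the fix assigns the result back.

```python
import copy


class OptionsGroup:
    """Hypothetical stand-in for GPUOptionsGroup (not the ComfyUI class)."""

    def __init__(self):
        self.options = {}

    def clone(self):
        # Returns a NEW instance; the receiver is left untouched.
        new = OptionsGroup()
        new.options = copy.deepcopy(self.options)
        return new

    def register(self, device_index, relative_speed):
        self.options[device_index] = relative_speed


def add_option_buggy(group, device_index, relative_speed):
    group.clone()  # result discarded, so `group` is still the caller's object
    group.register(device_index, relative_speed)
    return group


def add_option_fixed(group, device_index, relative_speed):
    group = group.clone()  # rebind to the copy; the caller's group stays clean
    group.register(device_index, relative_speed)
    return group


# Buggy path: the upstream group is mutated in place.
upstream_buggy = OptionsGroup()
add_option_buggy(upstream_buggy, 0, 1.0)
leaked = dict(upstream_buggy.options)

# Fixed path: the upstream group is unchanged and the node owns its own copy.
upstream_fixed = OptionsGroup()
owned = add_option_fixed(upstream_fixed, 0, 1.0)
```

With two nodes sharing one upstream group, the buggy variant lets each node's additions show up in its siblings, which matches the leak described in the commit message. Rebinding the local name to the clone is enough to isolate them.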