8 Commits

Author SHA1 Message Date
Emiliooooo
e860732dba fix(directml): correct VRAM detection and make torchaudio imports optional
## VRAM Detection (model_management.py)

The DirectML code path had two hardcoded `1024 * 1024 * 1024 #TODO` values
in `get_total_memory()` and `get_free_memory()`, causing ComfyUI to report
only 1 GB of VRAM on any AMD/Intel GPU using the DirectML backend, regardless
of actual hardware. Every NORMAL_VRAM or LOW_VRAM calculation was therefore
based on a wildly wrong figure.

Fix for `get_total_memory`:
- On Windows, reads `HardwareInformation.qwMemorySize` from the GPU driver
  registry key via `winreg`. This is a 64-bit value that stays accurate above
  4 GB, unlike `Win32_VideoController.AdapterRAM`, a 32-bit field that
  overflows at 4 GB.
- Allows override via `COMFYUI_DIRECTML_VRAM_MB` env var.
- Falls back to 6 GB if registry query fails (safe default for modern dGPUs).
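
The lookup order above (env override, then registry, then 6 GB fallback) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the registry path under the display-adapter class GUID, the choice to take the largest value across adapters, and the helper names `read_registry_vram_bytes` and `directml_total_memory_bytes` are all assumptions made for the example.

```python
import os

try:
    import winreg  # standard library, but only present on Windows
except ImportError:
    winreg = None

FALLBACK_BYTES = 6 * 1024 ** 3  # 6 GB fallback named in the commit message

# Assumption: display adapters live under this device-class key, one
# numbered subkey ("0000", "0001", ...) per adapter.
DISPLAY_CLASS_KEY = (
    r"SYSTEM\CurrentControlSet\Control\Class"
    r"\{4d36e968-e325-11ce-bfc1-08002be10318}"
)


def read_registry_vram_bytes():
    """Return the largest qwMemorySize found, or None if unavailable."""
    if winreg is None:
        return None
    best = None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, DISPLAY_CLASS_KEY) as class_key:
            index = 0
            while True:
                try:
                    sub_name = winreg.EnumKey(class_key, index)
                except OSError:
                    break  # no more subkeys
                index += 1
                try:
                    with winreg.OpenKey(class_key, sub_name) as adapter_key:
                        value, _ = winreg.QueryValueEx(
                            adapter_key, "HardwareInformation.qwMemorySize"
                        )
                except OSError:
                    continue  # subkey unreadable or value missing
                if isinstance(value, bytes):
                    # Some drivers store the value as REG_BINARY
                    value = int.from_bytes(value, "little")
                if isinstance(value, int) and value > 0:
                    best = max(best or 0, value)
    except OSError:
        return None
    return best


def directml_total_memory_bytes(env=None):
    """Hypothetical helper: env override, then registry, then fallback."""
    env = os.environ if env is None else env
    override = env.get("COMFYUI_DIRECTML_VRAM_MB")
    if override:
        try:
            megabytes = int(override)
            if megabytes > 0:
                return megabytes * 1024 * 1024
        except ValueError:
            pass  # ignore a malformed override and keep going
    detected = read_registry_vram_bytes()
    return detected if detected else FALLBACK_BYTES
```

On a non-Windows machine `winreg` is missing, so the sketch goes straight to the fallback unless `COMFYUI_DIRECTML_VRAM_MB` is set.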

Fix for `get_free_memory`:
- Uses `torch_directml.gpu_memory(0)` to get per-tile usage fractions and
  derives free memory as `total * (1 - max_usage_fraction)`.
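
The arithmetic behind that estimate is small enough to show in isolation. The sketch below takes the usage fractions as a plain list instead of calling `torch_directml`, so it runs anywhere; the function name `free_memory_bytes`, the clamping to the range 0 to 1, and the empty-list behaviour are assumptions for illustration and are not taken from the commit.

```python
def free_memory_bytes(total_bytes, tile_usage_fractions):
    """Estimate free VRAM as total * (1 - max_usage_fraction).

    tile_usage_fractions stands in for the per-tile values that the
    commit reads from torch_directml.gpu_memory(0).
    """
    if not tile_usage_fractions:
        # Assumption: with no usage data, treat the device as fully free.
        return total_bytes
    # Assumption: clamp so a noisy reading cannot yield a negative result.
    peak = min(max(max(tile_usage_fractions), 0.0), 1.0)
    return int(total_bytes * (1.0 - peak))
```

For example, a 6 GB card whose busiest tile reports 0.5 would be treated as having 3 GB free. Using the maximum across tiles is the conservative choice: it underestimates free memory instead of overestimating it.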

## torchaudio: optional import on AMD/DirectML

torchaudio has a DLL incompatibility with torch-directml (which ships its own
torch runtime). The following files had a bare module-level `import torchaudio`,
which crashed ComfyUI at startup whenever torchaudio was absent or failed to
load:

- comfy/ldm/lightricks/vae/audio_vae.py
- comfy/audio_encoders/whisper.py
- comfy/audio_encoders/audio_encoders.py
- comfy_extras/nodes_audio.py
- comfy_extras/nodes_lt.py
- comfy_extras/nodes_wandancer.py

Each import is wrapped in `try/except (ImportError, OSError): torchaudio = None`,
matching the pattern already used in comfy/ldm/mmaudio/vae/autoencoder.py and
comfy/ldm/ace/vae/music_dcae_pipeline.py. Audio nodes will degrade gracefully
rather than preventing ComfyUI from starting.
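
The guarded import described above looks like this in isolation. The `try/except` lines follow the pattern quoted in the message; the `require_torchaudio` helper is hypothetical and only illustrates one way a node could fail with a clear error at call time instead of at import time.

```python
try:
    import torchaudio
except (ImportError, OSError):
    # ImportError: package not installed.
    # OSError: package present but its native DLLs fail to load.
    torchaudio = None


def require_torchaudio():
    """Hypothetical helper: return the module or raise a readable error."""
    if torchaudio is None:
        raise RuntimeError(
            "torchaudio is not available; audio features are disabled."
        )
    return torchaudio
```

Code that needs audio support would call `require_torchaudio()` inside the function that uses it, so modules stay importable and only the audio path fails.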

Tested on: AMD Radeon RX 5600 XT (6 GB VRAM, gfx1010, Windows 10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 12:10:31 -04:00
rattus
f466b06601
Fix fp16 audio encoder models (#12811)
* mp: respect model_defined_dtypes in default caster

This is needed for parametrizations when the dtype changes between sd
and model.

* audio_encoders: archive model dtypes

Archive model dtypes to stop the state dict load from overriding the dtypes
defined by the core for compute etc.
2026-03-06 18:20:07 -05:00
rattus
f8acd9c402
Reduce RAM usage, fix VRAM OOMs, and fix Windows shared memory spilling with adaptive model loading (#11845) 2026-02-01 01:01:11 -05:00
comfyanonymous
9288c78fc5
Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
comfyanonymous
a39ac59c3e
Add encoder part of whisper large v3 as an audio encoder model. (#9894)
Not useful yet but some models use it.
2025-09-16 01:19:50 -04:00
comfyanonymous
29bf807b0e
Cleanup. (#9838) 2025-09-12 21:57:04 -04:00
Jukka Seppänen
2559dee492
Support wav2vec base models (#9637)
* Support wav2vec base models

* trim trailing whitespace

* Do interpolation after
2025-09-12 21:52:58 -04:00
comfyanonymous
914c2a2973
Implement wav2vec2 as an audio encoder model. (#9549)
This is useless on its own but there are multiple models that use it.
2025-08-25 23:26:47 -04:00