From 8d8e0c68b04b48bf0574ae0466501eec95aa0563 Mon Sep 17 00:00:00 2001
From: liminfei-amd
Date: Wed, 17 Jun 2026 22:08:49 +0800
Subject: [PATCH] Gate static pin_memory on the host-RAM budget (fixes #13730)

On AMD/ROCm a clean launch stalls at "Requested to load LTXAV" while system
RAM fills and spills to swap, even though VRAM sits at ~65%. It is host-side
pinned-memory exhaustion, not VRAM pressure.

partially_load() pins every offloaded weight via the static path pin_memory,
which ignores ensure_pin_registerable()'s result and unconditionally
cudaHostRegisters up to MAX_PINNED_MEMORY (0.90*RAM on Linux). Those pins are
only reclaimable from is_dynamic() models by free_registrations, and dynamic
VRAM is off by default on AMD, so they are never reclaimable. Page-locked RAM
is not swappable, so the loader exhausts RAM and thrashes.

The dynamic-VRAM pin path (comfy/pinned_memory.py) already guards this with
ensure_pin_budget/ensure_pin_registerable; only the static path was missing
it. Gate pin_memory the same way and skip pinning when the budget cannot be
met (the weight stays in pageable RAM, still correct, just not page-locked).
Behavior is unchanged when RAM is ample and under --high-ram.
---
 comfy/model_management.py | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/comfy/model_management.py b/comfy/model_management.py
index b15d08ba1..611fcd5d1 100644
--- a/comfy/model_management.py
+++ b/comfy/model_management.py
@@ -1534,7 +1534,14 @@ def pin_memory(tensor):
     size = tensor.nbytes
     comfy.memory_management.extra_ram_release(comfy.memory_management.RAM_CACHE_HEADROOM)
 
-    ensure_pin_registerable(size)
+    # Respect the host-RAM budget like the dynamic-VRAM pin path (comfy/pinned_memory.py)
+    # already does. Without this gate the static load path keeps cudaHostRegister-ing
+    # offloaded weights toward MAX_PINNED_MEMORY (90% of RAM on Linux) regardless of how
+    # little RAM is actually free, so unswappable pages fill RAM+swap and large model
+    # loads stall (issue #13730). When the budget cannot be met, skip pinning and leave
+    # the weight in pageable RAM (still correct, just not pinned).
+    if not ensure_pin_budget(size) or not ensure_pin_registerable(size):
+        return False
 
     ptr = tensor.data_ptr()
     if ptr == 0:
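
Note for reviewers: the gate above combines two independent checks, and a request is pinned only when both pass. The standalone sketch below illustrates that policy under stated assumptions. `can_pin`, its parameters, and the 2 GiB headroom are hypothetical names and values chosen for illustration. They are not ComfyUI's `ensure_pin_budget` or `ensure_pin_registerable`, whose internals are not shown in this patch. Only the 0.90 cap comes from the commit message.

```python
GIB = 1024 ** 3


def can_pin(size, pinned, total_ram, free_ram,
            max_fraction=0.90, headroom=2 * GIB):
    """Decide whether `size` more bytes may be page-locked.

    Hypothetical policy, not ComfyUI's implementation:
      * static cap: total pinned bytes stay under max_fraction of RAM
      * budget:     pinning must leave `headroom` bytes of free RAM
    """
    under_cap = pinned + size <= total_ram * max_fraction
    leaves_headroom = free_ram - size >= headroom
    return under_cap and leaves_headroom


def pin_or_skip(size, state):
    """Pin if allowed, otherwise skip and leave the data pageable.

    `state` is a plain dict standing in for the loader's bookkeeping.
    Returns True when the bytes were counted as pinned, False when skipped.
    """
    if not can_pin(size, state["pinned"], state["total_ram"], state["free_ram"]):
        return False  # weight stays in pageable RAM: slower copies, same result
    state["pinned"] += size
    return True


if __name__ == "__main__":
    state = {"pinned": 0, "total_ram": 32 * GIB, "free_ram": 4 * GIB}
    # Well under the 90% cap, but only 4 GiB is free, so pinning is skipped.
    print(pin_or_skip(3 * GIB, state))
```

The cap alone is not sufficient because it is computed from total RAM and says nothing about how much of that RAM other allocations already occupy, which is the situation the commit message describes.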