From 8d8e0c68b04b48bf0574ae0466501eec95aa0563 Mon Sep 17 00:00:00 2001
From: liminfei-amd <liminfei-amd@users.noreply.github.com>
Date: Wed, 17 Jun 2026 22:08:49 +0800
Subject: [PATCH] Gate static pin_memory on the host-RAM budget (fixes #13730)

On AMD/ROCm a clean launch stalls at "Requested to load LTXAV" while system RAM
fills and spills to swap, even though VRAM sits at ~65%. The cause is host-side
pinned-memory exhaustion, not VRAM pressure.

partially_load() pins every offloaded weight through the static-path
pin_memory(), which ignores the result of ensure_pin_registerable() and calls
cudaHostRegister unconditionally, up to MAX_PINNED_MEMORY (0.90*RAM on Linux).
free_registrations can reclaim pins only from is_dynamic() models, and dynamic
VRAM is off by default on AMD, so these pins are never reclaimed. Page-locked
RAM cannot be swapped out, so the loader exhausts RAM and thrashes.

The dynamic-VRAM pin path (comfy/pinned_memory.py) already guards this with
ensure_pin_budget/ensure_pin_registerable; only the static path was missing it.
Gate pin_memory() the same way and skip pinning when the budget cannot be met:
the weight stays in pageable RAM, which is still correct, just not page-locked.
Behavior is unchanged when RAM is ample, and also under --high-ram.
---
 comfy/model_management.py | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
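
Reviewer note (not part of the commit): the gate described in the message can
be pictured with the minimal sketch below. It is a toy model under stated
assumptions, not ComfyUI code. HostPinBudget, pin_weight, and load_weights are
hypothetical names, the 8 GiB limit is an arbitrary illustrative value, and the
real page-locking call is replaced by a simple byte counter.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class HostPinBudget:
    """Hypothetical pin budget; a stand-in, not ComfyUI's real accounting."""
    limit_bytes: int
    pinned_bytes: int = 0

    def ensure(self, size: int) -> bool:
        # True only if `size` more bytes still fit under the limit.
        return self.pinned_bytes + size <= self.limit_bytes


def pin_weight(budget: HostPinBudget, size: int) -> bool:
    """Pin one weight if the budget allows it, otherwise leave it pageable."""
    if not budget.ensure(size):
        return False  # skipped: the weight stays in pageable RAM
    budget.pinned_bytes += size  # stand-in for the real page-locking call
    return True


def load_weights(budget: HostPinBudget, sizes: list[int]):
    """Try to pin each weight; return (pinned, pageable) lists of sizes."""
    pinned, pageable = [], []
    for size in sizes:
        (pinned if pin_weight(budget, size) else pageable).append(size)
    return pinned, pageable


budget = HostPinBudget(limit_bytes=8 * GIB)
pinned, pageable = load_weights(budget, [3 * GIB] * 4)
print(len(pinned), len(pageable))  # 2 2
```

With the gate in place, the pinned total stops at 6 GiB and the remaining two
weights are simply left pageable. Dropping the ensure() check would pin all
12 GiB regardless of the limit, which is the failure mode the patch targets.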

diff --git a/comfy/model_management.py b/comfy/model_management.py
index b15d08ba1..611fcd5d1 100644
--- a/comfy/model_management.py
+++ b/comfy/model_management.py
@@ -1534,7 +1534,14 @@ def pin_memory(tensor):
 
     size = tensor.nbytes
     comfy.memory_management.extra_ram_release(comfy.memory_management.RAM_CACHE_HEADROOM)
-    ensure_pin_registerable(size)
+    # Respect the host-RAM budget like the dynamic-VRAM pin path (comfy/pinned_memory.py)
+    # already does. Without this gate the static load path keeps cudaHostRegister-ing
+    # offloaded weights toward MAX_PINNED_MEMORY (90% of RAM on Linux) regardless of how
+    # little RAM is actually free, so unswappable pages fill RAM+swap and large model
+    # loads stall (issue #13730). When the budget cannot be met, skip pinning and leave
+    # the weight in pageable RAM (still correct, just not pinned).
+    if not ensure_pin_budget(size) or not ensure_pin_registerable(size):
+        return False
 
     ptr = tensor.data_ptr()
     if ptr == 0: