pixeldit: sanitize and warn on NaN PiD output on AMD fp16/bf16

On AMD ROCm (gfx11xx APUs/GPUs), the PiD PidNet forward in fp16/bf16 can return an all-NaN tensor through the ROCm/AOTriton attention path, which then corrupts the decoded image. The default (non-AOTriton) path stays clean, so the bad values come from an AOTriton attention miscompilation on gfx11xx rather than the PiD math itself (see ROCm/triton#909 and ROCm/aotriton#179). Guard the PidNet output on AMD: when it is fp16/bf16 and actually contains NaN/Inf, log a one-time warning that points at --use-split-cross-attention and clamp the values with nan_to_num before decode (the same pattern already used for flux/lumina). Non-AMD devices and fp32 paths are unaffected; finite outputs only pay a single isfinite() check. Fixes #14249 Signed-off-by: liminfei-amd <91481003+liminfei-amd@users.noreply.github.com>
2026-06-26 09:49:26 +08:00 · 2026-06-07 12:19:19 +08:00 · 2026-06-07 12:19:19 +08:00 · 70724ddf43
commit 70724ddf43
parent 2cdaaf4a25
1 changed files with 21 additions and 1 deletions
--- a/comfy/ldm/pixeldit/pid.py
+++ b/comfy/ldm/pixeldit/pid.py
@ -3,16 +3,24 @@ directly to a 4x-upscaled image in 4 distilled flow-matching steps. PixDiT_T2I
 body + LQ projection branch injected before each MMDiT patch block.
 """

+import logging
 from typing import List

 import torch
 import torch.nn as nn
 import torch.nn.functional as F

+import comfy.model_management
+
 from .model import PixDiT_T2I
 from .modules import precompute_freqs_cis_2d


+# Warn at most once per process when the ROCm/AOTriton attention path returns
+# non-finite values that the guard below sanitizes (see ComfyUI #14249).
+_PID_AMD_NONFINITE_WARNED = False
+
+
 class SigmaAwareGatePerTokenPerDim(nn.Module):
    """gate = sigmoid(content_proj(cat[x, lq]) - exp(log_alpha) * sigma); out = x + gate * lq.

@ -217,7 +225,7 @@ class PidNet(PixDiT_T2I):

        lq_features = self.lq_proj(lq_latent=lq_latent.to(x), target_pH=Hs, target_pW=Ws)

-        return super()._forward(
+        out = super()._forward(
            x, timesteps,
            context=context, attention_mask=attention_mask,
            transformer_options=transformer_options,
@ -225,3 +233,15 @@ class PidNet(PixDiT_T2I):
            pid_degrade_sigma=degrade_sigma,
            **kwargs,
        )
+        if comfy.model_management.is_amd() and out.is_floating_point() and out.dtype in (torch.float16, torch.bfloat16) and not torch.isfinite(out).all():
+            global _PID_AMD_NONFINITE_WARNED
+            if not _PID_AMD_NONFINITE_WARNED:
+                logging.warning(
+                    "PiD produced non-finite output on AMD; sanitizing NaN/Inf so the "
+                    "decoded image stays usable. This is a known ROCm/AOTriton attention "
+                    "miscompilation on gfx11xx (ComfyUI #14249); for an unaffected run, "
+                    "launch with --use-split-cross-attention."
+                )
+                _PID_AMD_NONFINITE_WARNED = True
+            out = torch.nan_to_num(out, nan=0.0, posinf=65504.0, neginf=-65504.0)
+        return out