fix: disable SageAttention for Hunyuan3D v2.1 DiT

SageAttention's quantized kernels produce NaN in the Hunyuan3D v2.1
diffusion transformer, causing the downstream VoxelToMesh to generate
zero vertices and crash in save_glb.

Add low_precision_attention=False to both optimized_attention calls in
the v2.1 DiT (CrossAttention and Attention classes), following the same
pattern used by ACE (ace_step15.py). With the flag set, ComfyUI falls
back to PyTorch attention for Hunyuan3D only, while all other models
keep the SageAttention speedup.

Root cause: the 3D occupancy/SDF prediction requires higher numerical
precision at voxel boundaries than SageAttention's quantized kernels
provide. Image and video diffusion tolerate this precision loss.
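The fallback dispatch can be sketched roughly as follows. This is a simplified, hypothetical version of the `optimized_attention` entry point for illustration; the signature, tensor layout, and dispatch logic are assumptions, not ComfyUI's actual implementation.

```python
import torch
import torch.nn.functional as F

try:
    # SageAttention's quantized kernel, if installed (assumed import path)
    from sageattention import sageattn
except ImportError:
    sageattn = None

def optimized_attention(q, k, v, heads, low_precision_attention=True):
    # q, k, v: (batch, seq, heads * dim_head); self-attention case only
    b, s, _ = q.shape
    dim_head = q.shape[-1] // heads
    # (b, s, heads * dim_head) -> (b, heads, s, dim_head)
    q, k, v = (t.view(b, s, heads, dim_head).transpose(1, 2) for t in (q, k, v))
    if low_precision_attention and sageattn is not None:
        # quantized kernel: faster, but precision loss is what NaNs out
        # the Hunyuan3D occupancy/SDF prediction
        out = sageattn(q, k, v)
    else:
        # full-precision PyTorch attention: the fallback this commit forces
        out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2).reshape(b, s, heads * dim_head)
```

Passing `low_precision_attention=False` at the call site, as the diff below does, pins the model to the full-precision branch regardless of whether SageAttention is installed.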

Fixes: comfyanonymous/ComfyUI#10943

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Paulo Muggler
Date: 2026-03-05 00:30:45 +01:00
Parent: 8811db52db
Commit: 124300b732

@@ -343,6 +343,7 @@ class CrossAttention(nn.Module):
             k.reshape(b, s2, self.num_heads * self.head_dim),
             v,
             heads=self.num_heads,
+            low_precision_attention=False,
         )
         out = self.out_proj(x)
@@ -412,6 +413,7 @@ class Attention(nn.Module):
             key.reshape(B, N, self.num_heads * self.head_dim),
             value,
             heads=self.num_heads,
+            low_precision_attention=False,
         )
         x = self.out_proj(x)