fix: disable SageAttention for Hunyuan3D v2.1 DiT

SageAttention's quantized kernels produce NaN in the Hunyuan3D v2.1
diffusion transformer, causing the downstream VoxelToMesh to generate
zero vertices and crash in save_glb.

Add low_precision_attention=False to both optimized_attention calls in
the v2.1 DiT (CrossAttention and Attention classes), following the same
pattern used by ACE (ace_step15.py). With the flag set, ComfyUI falls
back to PyTorch attention for Hunyuan3D only, while all other models
keep the SageAttention speedup.

Root cause: the 3D occupancy/SDF prediction requires higher numerical
precision at voxel boundaries than SageAttention's quantized kernels
provide. Image and video diffusion tolerate this precision loss.
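The fallback dispatch can be sketched roughly as follows. This is a simplified, hypothetical version of the `optimized_attention` entry point for illustration; the signature, tensor layout, and dispatch logic are assumptions, not ComfyUI's actual implementation.

```python
import torch
import torch.nn.functional as F

try:
    # SageAttention's quantized kernel, if installed (assumed import path)
    from sageattention import sageattn
except ImportError:
    sageattn = None

def optimized_attention(q, k, v, heads, low_precision_attention=True):
    # q, k, v: (batch, seq, heads * dim_head); self-attention case only
    b, s, _ = q.shape
    dim_head = q.shape[-1] // heads
    # (b, s, heads * dim_head) -> (b, heads, s, dim_head)
    q, k, v = (t.view(b, s, heads, dim_head).transpose(1, 2) for t in (q, k, v))
    if low_precision_attention and sageattn is not None:
        # quantized kernel: faster, but precision loss is what NaNs out
        # the Hunyuan3D occupancy/SDF prediction
        out = sageattn(q, k, v)
    else:
        # full-precision PyTorch attention: the fallback this commit forces
        out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2).reshape(b, s, heads * dim_head)
```

Passing `low_precision_attention=False` at the call site, as the diff below does, pins the model to the full-precision branch regardless of whether SageAttention is installed.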

Fixes: comfyanonymous/ComfyUI#10943

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Paulo Muggler
Date: 2026-03-05 00:30:45 +01:00
Parent: 8811db52db
Commit: 124300b732

@@ -343,6 +343,7 @@ class CrossAttention(nn.Module):
             k.reshape(b, s2, self.num_heads * self.head_dim),
             v,
             heads=self.num_heads,
+            low_precision_attention=False,
         )
         out = self.out_proj(x)
@@ -412,6 +413,7 @@ class Attention(nn.Module):
             key.reshape(B, N, self.num_heads * self.head_dim),
             value,
             heads=self.num_heads,
+            low_precision_attention=False,
         )
         x = self.out_proj(x)