fix: reshape dora_scale before broadcasting in weight_decompose

In weight_decompose(), the 1D dora_scale tensor [N] divided by the
multi-dimensional weight_norm [N, 1, ...] would incorrectly broadcast
to [N, N, ...] (outer-product shape) instead of element-wise [N, 1, ...].
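
The mis-broadcast is easy to reproduce in isolation (shapes here are
illustrative, not taken from any specific model):

    import torch

    N = 4
    dora_scale = torch.ones(N)         # [N]
    weight_norm = torch.ones(N, 1)     # [N, 1]
    # trailing-dim alignment treats dora_scale as [1, N], so the division
    # yields an outer-product-shaped [N, N] instead of element-wise [N, 1]
    print((dora_scale / weight_norm).shape)  # torch.Size([4, 4])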

This caused runtime shape mismatches when applying DoRA to non-square
weight matrices (e.g. MLP layers where d_ff != d_model), and silently
produced incorrect values for square weights (most attention Q/K/V/O
projections), where the mis-broadcast [N, N] scale still multiplies
without raising an error.
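
A sketch of both failure modes, with hypothetical sizes (N=4 rows, M=8
columns):

    import torch

    N, M = 4, 8
    dora_scale = torch.arange(1.0, N + 1)              # [N]
    weight_norm = torch.arange(1.0, N + 1).view(N, 1)  # [N, 1]
    scale = dora_scale / weight_norm  # [N, N]: scale[i, j] = dora_scale[j] / weight_norm[i]

    weight_calc = torch.ones(N, M)    # non-square, e.g. an MLP projection
    # weight_calc *= scale            # raises RuntimeError (8 vs 4 at dim 1)

    square = torch.ones(N, N)         # square, e.g. a Q/K/V/O projection
    square *= scale                   # runs without error, but row i is scaled
                                      # column-wise by dora_scale[j] instead of
                                      # uniformly by dora_scale[i]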

Fix: explicitly reshape dora_scale to match weight_norm's dimensionality
before the division.
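
With the reshape in place, the division stays element-wise (a standalone
sketch of the output-axis case; names mirror the diff below):

    import torch

    N, M = 4, 8
    weight = torch.randn(N, M)
    dora_scale = torch.ones(N)                      # [N]
    weight_norm = weight.norm(dim=1, keepdim=True)  # [N, 1]

    dora_scale = dora_scale.reshape(weight.shape[0], *[1] * (weight.dim() - 1))  # [N, 1]
    scale = dora_scale / weight_norm                # [N, 1], element-wise
    scaled = weight * scale                         # broadcasts across columns as intended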

Fixes #12938

Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
commit e5f6c1ff68 (parent 8cc746a864)
Author: easonysliu
Date:   2026-03-17 11:17:57 +08:00

@@ -298,6 +298,16 @@ def weight_decompose(
     )
     weight_norm = weight_norm + torch.finfo(weight.dtype).eps
+    # Reshape dora_scale to match weight_norm dimensionality to avoid
+    # incorrect broadcasting. Without this, a 1D dora_scale [N] divided by
+    # a multi-dim weight_norm [N, 1] would broadcast to [N, N] instead of
+    # the intended element-wise [N, 1]. This caused shape mismatches for
+    # non-square weights (e.g. MLP layers where d_ff != d_model).
+    if wd_on_output_axis:
+        dora_scale = dora_scale.reshape(weight.shape[0], *[1] * (weight.dim() - 1))
+    else:
+        dora_scale = dora_scale.reshape(*[1] * (weight.dim() - 1), weight.shape[-1])
     weight_calc *= (dora_scale / weight_norm).type(weight.dtype)
     if strength != 1.0:
         weight_calc -= weight
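
A minimal shape check of the new branch logic (hypothetical standalone
helper mirroring the reshape added above):

    import torch

    def reshaped(dora_scale, weight, wd_on_output_axis):
        # same reshape as in weight_decompose, extracted for testing
        if wd_on_output_axis:
            return dora_scale.reshape(weight.shape[0], *[1] * (weight.dim() - 1))
        return dora_scale.reshape(*[1] * (weight.dim() - 1), weight.shape[-1])

    w = torch.randn(4, 8)                    # non-square linear weight
    assert reshaped(torch.ones(4), w, True).shape == (4, 1)
    assert reshaped(torch.ones(8), w, False).shape == (1, 8)

    conv = torch.randn(4, 3, 3, 3)           # 4D conv kernel
    assert reshaped(torch.ones(4), conv, True).shape == (4, 1, 1, 1)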