Mirror of https://github.com/comfyanonymous/ComfyUI.git, synced 2026-03-28 20:43:32 +08:00
fix: reshape dora_scale before broadcasting in weight_decompose
In weight_decompose(), the 1D dora_scale tensor [N] divided by the multi-dimensional weight_norm [N, 1, ...] would incorrectly broadcast to [N, N, ...] (an outer-product shape) instead of the element-wise [N, 1, ...]. This caused shape mismatches when applying DoRA to non-square weight matrices (e.g. MLP layers where d_ff != d_model), while for square weights (most attention Q/K/V/O projections) it ran without error but silently scaled along the wrong axis. Fix: explicitly reshape dora_scale to match weight_norm's dimensionality before the division.

Fixes #12938

Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
parent 8cc746a864
commit e5f6c1ff68
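To illustrate the bug this commit fixes, here is a minimal reproduction of the mis-broadcast. NumPy is used for a self-contained sketch (PyTorch follows the same broadcasting rules); the shapes and variable names are illustrative, not taken from the codebase:

```python
import numpy as np

# Illustrative MLP-style weight where d_ff != d_model (non-square).
N, M = 4, 3
weight = np.ones((N, M))
weight_norm = np.linalg.norm(weight, axis=1, keepdims=True)  # shape [N, 1]
dora_scale = np.ones(N)                                      # 1D, shape [N]

# Buggy: [N] against [N, 1] aligns on the trailing axis, so the division
# broadcasts to an outer-product-shaped [N, N] instead of [N, 1].
bad = dora_scale / weight_norm
print(bad.shape)  # (4, 4)

# Fixed (mirrors the patch): reshape dora_scale to [N, 1, ...] first.
good = dora_scale.reshape(N, *[1] * (weight.ndim - 1)) / weight_norm
print(good.shape)  # (4, 1)

# With a non-square weight, multiplying by the buggy ratio fails loudly:
try:
    weight * bad
except ValueError as e:
    print("shape mismatch:", e)
```

This is why the bug surfaced only on non-square layers: the [N, N] result cannot be multiplied into an [N, M] weight when M != N.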
@@ -298,6 +298,16 @@ def weight_decompose(
         )
     )
     weight_norm = weight_norm + torch.finfo(weight.dtype).eps
+
+    # Reshape dora_scale to match weight_norm dimensionality to avoid
+    # incorrect broadcasting. Without this, a 1D dora_scale [N] divided by
+    # a multi-dim weight_norm [N, 1] would broadcast to [N, N] instead of
+    # the intended element-wise [N, 1]. This caused shape mismatches for
+    # non-square weights (e.g. MLP layers where d_ff != d_model).
+    if wd_on_output_axis:
+        dora_scale = dora_scale.reshape(weight.shape[0], *[1] * (weight.dim() - 1))
+    else:
+        dora_scale = dora_scale.reshape(*[1] * (weight.dim() - 1), weight.shape[-1])
     weight_calc *= (dora_scale / weight_norm).type(weight.dtype)
     if strength != 1.0:
         weight_calc -= weight
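For square weights the buggy division raises no error, but it scales by the wrong axis: element [i, j] ends up multiplied by dora_scale[j] / norm[i] rather than the intended dora_scale[i] / norm[i]. A short NumPy sketch of this silent failure mode, with illustrative shapes and names:

```python
import numpy as np

N = 3
rng = np.random.default_rng(0)
weight = rng.standard_normal((N, N))   # square, e.g. an attention projection
weight_norm = np.linalg.norm(weight, axis=1, keepdims=True)  # [N, 1]
dora_scale = rng.standard_normal(N)    # 1D [N]

# Mis-broadcast produces [N, N]; for a square weight this still multiplies
# without error, so nothing crashes.
buggy = weight * (dora_scale / weight_norm)

# Intended per-row scaling after the reshape from the patch.
fixed = weight * (dora_scale.reshape(N, 1) / weight_norm)

print(buggy.shape == fixed.shape)  # True: shapes agree, so the bug is silent
print(np.allclose(buggy, fixed))   # False: the values differ
```

This is the case the commit message describes as silent: shape checks pass, and only the numerical output reveals the wrong-axis scaling.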