Update quant doc so it's not completely wrong. (#13381)

There is still more that needs to be fixed.
comfyanonymous 2026-04-12 20:27:38 -07:00 committed by GitHub
parent 31283d2892
commit 971932346a
GPG Key ID: B5690EEEBB952194 (no known key found for this signature in database)


@@ -139,9 +139,9 @@ Example:
   "_quantization_metadata": {
     "format_version": "1.0",
     "layers": {
-      "model.layers.0.mlp.up_proj": "float8_e4m3fn",
-      "model.layers.0.mlp.down_proj": "float8_e4m3fn",
-      "model.layers.1.mlp.up_proj": "float8_e4m3fn"
+      "model.layers.0.mlp.up_proj": {"format": "float8_e4m3fn"},
+      "model.layers.0.mlp.down_proj": {"format": "float8_e4m3fn"},
+      "model.layers.1.mlp.up_proj": {"format": "float8_e4m3fn"}
     }
   }
 }
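The hunk above changes each per-layer entry from a bare format string to a dict with a `"format"` key. A minimal sketch of a reader that accepts both schemas (the `layer_format` helper is illustrative, not ComfyUI's API):

```python
import json

# Example metadata in the updated schema from the diff above:
# each layer maps to a dict such as {"format": "float8_e4m3fn"}.
metadata = json.loads("""
{
  "_quantization_metadata": {
    "format_version": "1.0",
    "layers": {
      "model.layers.0.mlp.up_proj": {"format": "float8_e4m3fn"},
      "model.layers.0.mlp.down_proj": {"format": "float8_e4m3fn"},
      "model.layers.1.mlp.up_proj": {"format": "float8_e4m3fn"}
    }
  }
}
""")

def layer_format(meta, layer):
    """Look up a layer's quantization format, tolerating both the old
    schema (bare string) and the new one (dict with a "format" key)."""
    entry = meta["_quantization_metadata"]["layers"].get(layer)
    if entry is None:
        return None
    return entry if isinstance(entry, str) else entry.get("format")
```

Accepting both shapes keeps checkpoints written against the old doc loadable while new ones adopt the dict form, which leaves room for extra per-layer keys later.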
@@ -165,4 +165,4 @@ Activation quantization (e.g., for FP8 Tensor Core operations) requires `input_s
 3. **Compute scales**: Derive `input_scale` from collected statistics
 4. **Store in checkpoint**: Save `input_scale` parameters alongside weights
 The calibration dataset should be representative of your target use case. For diffusion models, this typically means a diverse set of prompts and generation parameters.