When a full whisper checkpoint (encoder + decoder) was loaded via
AudioEncoderLoader, two classes of spurious warnings were emitted:
1. 'unexpected audio encoder' for every decoder.* key - the decoder is
not part of WhisperLargeV3, so these keys are always present in full
whisper checkpoints and should be silently discarded.
2. 'missing audio encoder' for feature_extractor.mel_spectrogram buffers
(window and mel_scale.fb) - these are torchaudio buffers computed
deterministically from config at init time; they are never stored in
standard whisper checkpoints but are always correctly initialised.
Fix: strip decoder keys from the state-dict before loading, and suppress
warnings for the two known torchaudio-computed buffer keys.
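
A minimal sketch of the filtering described above, not the actual patch.
The helper names (strip_decoder_keys, is_computed_buffer,
warn_on_load_result) are hypothetical, and the exact buffer key paths are
an assumption, so the buffers are matched by prefix and suffix rather than
by full key:

```python
import logging

DECODER_PREFIX = "decoder."
# Assumed location of the torchaudio-computed buffers in the model.
COMPUTED_PREFIX = "feature_extractor.mel_spectrogram."
COMPUTED_SUFFIXES = ("window", "mel_scale.fb")


def strip_decoder_keys(sd):
    """Return a copy of the state dict without any decoder.* entries."""
    return {k: v for k, v in sd.items() if not k.startswith(DECODER_PREFIX)}


def is_computed_buffer(key):
    """True for buffers recomputed from config at init, never checkpointed."""
    return key.startswith(COMPUTED_PREFIX) and key.endswith(COMPUTED_SUFFIXES)


def warn_on_load_result(missing, unexpected, log=logging.getLogger(__name__)):
    """Warn only about keys that indicate a real mismatch."""
    missing = [k for k in missing if not is_computed_buffer(k)]
    unexpected = list(unexpected)
    if missing:
        log.warning("missing audio encoder: %s", missing)
    if unexpected:
        log.warning("unexpected audio encoder: %s", unexpected)
    return missing, unexpected
```

In this sketch the decoder keys are dropped before the load, so they never
reach the 'unexpected' list, while the computed buffers are filtered only
from the 'missing' report after the load.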
Fixes #13276
* mp: respect model_defined_dtypes in default caster
This is needed for parametrizations when the dtype differs between the
state dict and the model.
* audio_encoders: archive model dtypes
Archive model dtypes to stop the state dict load from overriding the
dtypes defined by the core for compute etc.