* fix attention OOM in xformers
* allow passing attention mask in flux attention
* allow an attn_mask in flux
* attn masks can be done using replace patches instead of a separate dict
* fix return types
* fix return order
* enumerate
* patch the right keys
* arg names
* fix a silly bug
* fix xformers masks
* replace match with if, elif, else
* mask with image_ref_size
* remove unused import
* remove unused import 2
* fix pytorch/xformers attention
This corrects a weird inconsistency with skip_reshape.
It also allows masks of various shapes to be passed, which will be
automtically expanded (in a memory-efficient way) to a size that is
compatible with xformers or pytorch sdpa respectively.
* fix mask shapes
- Experimental support for sage attention on Linux
- Diffusers loader now supports model indices
- Transformers model management now aligns with updates to ComfyUI
- Flux layers correctly use unbind
- Add float8 support for model loading in more places
- Experimental quantization approaches from Quanto and torchao
- Model upscaling interacts with memory management better
This update also disables ROCm testing because it isn't reliable enough
on consumer hardware. ROCm is not really supported by the 7600.