Here's an optimized version of your Python function. The primary changes are to minimize the creation of intermediate lists and to use dictionary comprehensions for more efficient data manipulation.
### Changes and Optimizations
1. **Avoid Unneeded List Creation:**
- Instead of mapping and filtering the keys in a separate step (`map` and `filter`), both are now done directly in a list comprehension.
2. **Dictionary Comprehension:**
- By directly assigning `out` to `{}` or `state_dict`, it forgoes unnecessary intermediate steps in the conditional initialization.
3. **In-Loop Item Assignment:**
- Keys to be replaced and corresponding operations are now handled directly within loops, reducing intermediate variable assignments.
This rewritten function should perform better, especially with large dictionaries, due to reduced overhead from list operations and more efficient key manipulation.
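The function under discussion is not reproduced here, so the following is only a minimal sketch of the kind of rewrite described above. It assumes a helper that renames key prefixes in a dictionary; the name `prefix_replace` and its parameters `replace_prefix` and `filter_keys` are hypothetical.

```python
def prefix_replace(state_dict, replace_prefix, filter_keys=False):
    """Rename keys that start with a given prefix (hypothetical helper)."""
    # Start from an empty dict when filtering, otherwise modify in place.
    out = {} if filter_keys else state_dict
    for old, new in replace_prefix.items():
        # Snapshot the matching keys first, because the dict is mutated below.
        matches = [k for k in state_dict if k.startswith(old)]
        for k in matches:
            # Move the value to the renamed key without an intermediate variable.
            out[new + k[len(old):]] = state_dict.pop(k)
    return out
```

With `filter_keys=True` only the renamed entries are returned; otherwise the input dictionary is updated and returned.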
This commit fixes the temporal tile size calculation, and removes
a redundant tile at the end of the range when its elements are
completely covered by the previous tile.
Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>
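For illustration, the redundant-tile case can be sketched as below. This is not the code from the commit: the function name `tile_ranges` and the parameters `length`, `tile_size` and `overlap` are assumptions chosen for the example.

```python
def tile_ranges(length, tile_size, overlap):
    """Return (start, end) index pairs covering range(length) with overlapping tiles."""
    if tile_size <= overlap:
        raise ValueError("tile_size must be larger than overlap")
    stride = tile_size - overlap
    ranges = []
    start = 0
    while start < length:
        end = min(start + tile_size, length)
        ranges.append((start, end))
        if end == length:
            # This tile already reaches the end of the range. Another tile
            # would only revisit elements covered here, so stop.
            break
        start += stride
    return ranges
```

A naive loop over `range(0, length, stride)` with `length=10`, `tile_size=4`, `overlap=2` would also emit a final tile `(8, 10)`, which is fully contained in `(6, 10)`; the early `break` avoids it.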
* fix attention OOM in xformers
* allow passing attention mask in flux attention
* allow an attn_mask in flux
* attn masks can be done using replace patches instead of a separate dict
* fix return types
* fix return order
* enumerate
* patch the right keys
* arg names
* fix a silly bug
* fix xformers masks
* replace match with if, elif, else
* mask with image_ref_size
* remove unused import
* remove unused import 2
* fix pytorch/xformers attention
This corrects a weird inconsistency with skip_reshape.
It also allows masks of various shapes to be passed, which will be
automatically expanded (in a memory-efficient way) to a size that is
compatible with xformers or pytorch sdpa respectively.
* fix mask shapes
To use:
"Load CLIP" node with t5xxl + type mochi
"Load Diffusion Model" node with the mochi dit file.
"Load VAE" with the mochi vae file.
EmptyMochiLatentVideo node for the latent.
euler + linear_quadratic in the KSampler node.
Previously, when a list of 3 images [0, 1, 2] was used for a 6 frame video,
they were concatenated like this:
[0, 1, 2, 0, 1, 2]
Now they are concatenated like this:
[0, 0, 1, 1, 2, 2]
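The difference between the two orderings can be sketched with plain Python lists. The helper names `repeat_tiled` and `repeat_interleaved` are hypothetical, and the sketch assumes the frame count is a multiple of the number of images.

```python
def repeat_tiled(frames, total):
    """Old behaviour: repeat the whole list back to back."""
    reps = total // len(frames)
    return frames * reps


def repeat_interleaved(frames, total):
    """New behaviour: repeat each element in place before moving on."""
    reps = total // len(frames)
    return [f for f in frames for _ in range(reps)]


print(repeat_tiled([0, 1, 2], 6))        # [0, 1, 2, 0, 1, 2]
print(repeat_interleaved([0, 1, 2], 6))  # [0, 0, 1, 1, 2, 2]
```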