Commit Graph

5 Commits

Author SHA1 Message Date
Rattus
2d96b2fdf1 MPDynamic: Add support for model defined dtype
If the model defines a dtype that is different to what is in the state
dict, respect that at load time. This is done as part of the casting
process.
2026-01-22 00:05:25 +10:00
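
The load-time behavior described in this commit could be sketched roughly as below. This is an illustration only, not the MPDynamic code: `FakeTensor` and `cast_state_dict` are hypothetical names, and the stand-in class exists so the sketch runs without PyTorch. A real `torch.Tensor` has the same `.dtype` attribute and `.to(dtype)` method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: only a dtype and a .to() method."""
    dtype: str

    def to(self, dtype):
        # Return a new tensor-like object carrying the requested dtype.
        return FakeTensor(dtype)


def cast_state_dict(state_dict, model_dtype=None):
    """Cast entries to model_dtype when the model defines one.

    If model_dtype is None, the dtypes found in the state dict are kept.
    """
    if model_dtype is None:
        return dict(state_dict)
    return {
        name: t if t.dtype == model_dtype else t.to(model_dtype)
        for name, t in state_dict.items()
    }


sd = {"weight": FakeTensor("float16"), "bias": FakeTensor("float32")}
loaded = cast_state_dict(sd, model_dtype="float32")
```

Entries that already match the model's dtype are passed through untouched, so the cast only costs anything where the two dtypes disagree.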
Rattus
b0d6f2a9fc main: Rework aimdo into process
Be more tolerant of unsupported platforms and fall back properly.
Fixes a crash when CUDA is not installed at all.
2026-01-21 14:34:23 +10:00
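
A generic way to tolerate a missing GPU stack is to treat the backend import as optional. The commit does not show how aimdo is actually loaded, so this is only a sketch under that assumption: `try_load_backend` and the module name `hypothetical_gpu_backend` are made up for illustration.

```python
import importlib


def try_load_backend(module_name):
    """Return the imported module, or None if it cannot be loaded.

    ImportError covers a missing package; OSError covers a native
    library (for example a CUDA runtime) that fails to load.
    """
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError):
        return None


# Hypothetical module name: on a machine without it, we fall back cleanly.
backend = try_load_backend("hypothetical_gpu_backend")
use_backend = backend is not None
```

Callers then branch on `use_backend` instead of crashing at import time.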
Rattus
307d25e747 ruff 2026-01-21 14:33:01 +10:00
Rattus
bacd916833 execution: add aimdo primary pytorch cache integration
We need to do general pytorch cache defragmentation at an appropriate level
for aimdo. Do it here on a per-node basis, which has a reasonable chance of
purging stale shapes out of the pytorch caching allocator and saving VRAM
without costing too much garbage collector thrash.

This looks like a lot of GC, but aimdo never fails from pytorch and saves
the pytorch allocator from ever needing to defrag on demand. It still needs
an oil change every now and then, so we have to do it. Doing it here also
means the pytorch temps are cleared from task manager VRAM usage, so user
anxiety can go down a little when they see their VRAM drop back at the end
of workflows in line with inference usage (rather than assuming full VRAM
leaks).
2026-01-21 14:32:12 +10:00
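
The per-node cleanup described above might look something like the following. This is a minimal sketch, not the code from this commit: `purge_allocator_cache` and `run_nodes` are hypothetical names, and nodes are modeled as plain callables. It assumes the cleanup amounts to a garbage collection pass followed by `torch.cuda.empty_cache()`, which releases unused cached blocks held by the PyTorch caching allocator.

```python
import gc


def purge_allocator_cache():
    """Collect garbage, then release unused cached CUDA memory if possible.

    Returns True if the PyTorch CUDA cache was emptied, False otherwise.
    """
    # Drop unreachable Python objects first so their tensors can be freed.
    gc.collect()
    try:
        import torch
    except (ImportError, OSError):
        return False  # PyTorch not usable here, nothing more to do
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()
    return True


def run_nodes(nodes):
    """Run each node (a zero-argument callable), cleaning up after each one."""
    results = []
    for node in nodes:
        results.append(node())
        purge_allocator_cache()
    return results


outputs = run_nodes([lambda: 1, lambda: 2])
```

Cleaning up once per node rather than once per tensor keeps the number of collection passes bounded by the size of the workflow.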
Rattus
f9a225b590 mm: Implement cast buffer allocations 2026-01-21 14:32:12 +10:00