Commit Graph

83 Commits

Author SHA1 Message Date
comfyanonymous ef16077917 Add CMP 30HX card to the nvidia_16_series list. 2023-08-04 12:08:45 -04:00
comfyanonymous 28401d83c5 Only shift text encoder to vram when CPU cores are under 8. 2023-07-31 00:08:54 -04:00
comfyanonymous 2ee42215be Lower CPU thread check for running the text encoder on the CPU vs GPU. 2023-07-30 17:18:24 -04:00
comfyanonymous aa8fde7d6b Try to fix memory issue with lora. 2023-07-22 21:38:56 -04:00
comfyanonymous b2879e0168 Merge branch 'fix-AttributeError-module-'torch'-has-no-attribute-'mps'' of https://github.com/KarryCharon/ComfyUI 2023-07-20 00:34:54 -04:00
comfyanonymous 3aad28d483 Add MX450 and MX550 to list of cards with broken fp16. 2023-07-19 03:08:30 -04:00
comfyanonymous 22abe3af9f Fix device print on old torch version. 2023-07-17 15:18:58 -04:00
comfyanonymous 5ddb2ca26f Add a command line argument to enable backend:cudaMallocAsync 2023-07-17 11:00:14 -04:00
comfyanonymous ba6e888eb9 Lower lora ram usage when in normal vram mode. 2023-07-16 02:59:04 -04:00
comfyanonymous 73c2afbe44 Speed up lora loading a bit. 2023-07-15 13:25:22 -04:00
KarryCharon 3ee78c064b fix mps miss import 2023-07-12 10:06:34 +08:00
comfyanonymous 42805fd416 Empty cache after model unloading for normal vram and lower. 2023-07-09 09:56:03 -04:00
comfyanonymous 9caaa09c71 Add arguments to run the VAE in fp16 or bf16 for testing. 2023-07-06 23:23:46 -04:00
comfyanonymous fa8010f038 Disable autocast in unet for increased speed. 2023-07-05 21:58:29 -04:00
comfyanonymous 06ce99e525 Fix issue with OSX. 2023-07-04 02:09:02 -04:00
comfyanonymous fd93e324e8 Improvements for OSX. 2023-07-03 00:08:30 -04:00
comfyanonymous 280a4e3544 Switch to fp16 on some cards when the model is too big. 2023-07-02 10:00:57 -04:00
comfyanonymous dd4abf1345 Add a --force-fp16 argument to force fp16 for testing. 2023-07-01 22:42:35 -04:00
comfyanonymous 1e24a78d85 --gpu-only now keeps the VAE on the device. 2023-07-01 15:22:40 -04:00
comfyanonymous 2ee0aa317c Leave text_encoder on the CPU when it can handle it. 2023-07-01 14:38:51 -04:00
comfyanonymous d5a7abe10d Try to keep text encoders loaded and patched to increase speed. load_model_gpu() is now used with the text encoder models instead of just the unet. 2023-07-01 13:28:07 -04:00
comfyanonymous e946dca0e1 Make highvram and normalvram shift the text encoders to vram and back. This is faster on big text encoder models than running it on the CPU. 2023-07-01 12:37:23 -04:00
comfyanonymous 790073a21d Move unet to device right after loading on highvram mode. 2023-06-29 20:43:06 -04:00
comfyanonymous 7b13cacfea Use pytorch attention by default on nvidia when xformers isn't present. Add a new argument --use-quad-cross-attention 2023-06-26 13:03:44 -04:00
comfyanonymous 282638b813 Add a --gpu-only argument to keep and run everything on the GPU. Make the CLIP model work on the GPU. 2023-06-15 15:38:52 -04:00
comfyanonymous 442430dcef Some comments to say what the vram state options mean. 2023-06-04 17:51:04 -04:00
comfyanonymous e136e86a13 Cleanups and fixes for model_management.py Hopefully fix regression on MPS and CPU. 2023-06-03 11:05:37 -04:00
comfyanonymous 6b80950a41 Refactor and improve model_management code related to free memory. 2023-06-02 15:21:33 -04:00
space-nuko 7cb90ba509 More accurate total 2023-06-02 00:14:41 -05:00
space-nuko 22b707f1cf System stats endpoint 2023-06-01 23:26:23 -05:00
comfyanonymous f929d8df00 Tweak lowvram model memory so it's closer to what it was before. 2023-06-01 04:04:35 -04:00
comfyanonymous f6f0a25226 Empty cache on mps. 2023-06-01 03:52:51 -04:00
comfyanonymous 4af4fe017b Auto load model in lowvram if not enough memory. 2023-05-30 12:36:41 -04:00
comfyanonymous 8c539fa5bc Print the torch device that is used on startup. 2023-05-13 17:11:27 -04:00
comfyanonymous f7e427c557 Make maximum_batch_area take into account python2.0 attention function. More conservative xformers maximum_batch_area. 2023-05-06 19:58:54 -04:00
comfyanonymous 57f35b3d16 maximum_batch_area for xformers. Remove useless code. 2023-05-06 19:28:46 -04:00
comfyanonymous b877fefbb3 Lowvram mode for gligen and fix some lowvram issues. 2023-05-05 18:11:41 -04:00
comfyanonymous 93fc8c1ea9 Fix import. 2023-05-05 00:19:35 -04:00
comfyanonymous 2edaaba3c2 Fix imports. 2023-05-04 18:10:29 -04:00
comfyanonymous 806786ed1d Don't try to get vram from xpu or cuda when directml is enabled. 2023-04-29 00:28:48 -04:00
comfyanonymous e7ae3bc44c You can now select the device index with: --directml id Like this for example: --directml 1 2023-04-28 16:51:35 -04:00
comfyanonymous c2afcad2a5 Basic torch_directml support. Use --directml to use it. 2023-04-28 14:28:57 -04:00
comfyanonymous e6771d0986 Implement Linear hypernetworks. Add a HypernetworkLoader node to use hypernetworks. 2023-04-23 12:35:25 -04:00
comfyanonymous 6c156642e4 Add support for GLIGEN textbox model. 2023-04-19 11:06:32 -04:00
comfyanonymous 3b9a2f504d Move code to empty gpu cache to model_management.py 2023-04-15 11:19:07 -04:00
comfyanonymous 4861dbb2e2 Print xformers version and warning about 0.0.18 2023-04-09 01:31:47 -04:00
comfyanonymous d35efcbcb2 Add a --force-fp32 argument to force fp32 for debugging. 2023-04-07 00:27:54 -04:00
comfyanonymous 55a48f27db Small refactor. 2023-04-06 23:53:54 -04:00
藍+85CD 3adedd52d3 Merge branch 'master' into ipex 2023-04-07 09:11:30 +08:00
藍+85CD 01c0951a73 Fix auto lowvram detection on CUDA 2023-04-06 15:44:05 +08:00