Commit Graph

156 Commits

Author SHA1 Message Date
doctorpangloss
341c9f2e90 Improvements to node loading, node API, folder paths and progress
 - Improved node loading order. It now occurs "as late as possible".
   Configuration should be exposed as per the README.
 - Added methods to specify custom folders and models used in examples
   more robustly for custom nodes.
 - Downloading models can now be gracefully interrupted.
 - Progress notifications are now sent over the network for distributed
   ComfyUI operations.
 - Python objects have been moved around to reduce transitive
   package importing issues.
2024-03-13 16:14:18 -07:00
doctorpangloss
93cdef65a4 Merge upstream 2024-03-12 09:49:47 -07:00
comfyanonymous
0ed72befe1 Change log levels.
Logging level now defaults to info. --verbose sets it to debug.
2024-03-11 13:54:56 -04:00
doctorpangloss
00728eb20f Merge upstream 2024-03-11 09:32:57 -07:00
comfyanonymous
65397ce601 Replace prints with logging and add --verbose argument. 2024-03-10 12:14:23 -04:00
doctorpangloss
c0d9bc0129 Merge with upstream 2024-03-08 15:17:20 -08:00
comfyanonymous
dce3555339 Add some Tesla Pascal GPUs to the fp16 working but slower list. 2024-03-02 17:16:31 -05:00
doctorpangloss
7520691021 Merge with master 2024-02-19 10:55:22 -08:00
comfyanonymous
88f300401c Enable fp16 by default on mps. 2024-02-19 12:00:48 -05:00
comfyanonymous
929e266f3e Manual cast for bf16 on older GPUs. 2024-02-17 09:01:17 -05:00
comfyanonymous
0b3c50480c Make --force-fp32 disable loading models in bf16. 2024-02-16 23:01:54 -05:00
comfyanonymous
f83109f09b Stable Cascade Stage C. 2024-02-16 10:55:08 -05:00
comfyanonymous
aeaeca10bd Small refactor of is_device_* functions. 2024-02-15 21:10:10 -05:00
doctorpangloss
3367362cec Fix directml again now that I understand what the command line is doing 2024-02-08 10:17:49 -08:00
Benjamin Berman
8508a5a853 Fix args.directml is not None error 2024-02-08 08:40:13 -08:00
doctorpangloss
d9b4607c36 Add locks to model_management to prevent multiple copies of the models from being loaded at the same time 2024-02-07 15:18:13 -08:00
doctorpangloss
8e9052c843 Merge with upstream 2024-02-07 14:27:50 -08:00
comfyanonymous
66e28ef45c Don't use is_bf16_supported to check for fp16 support. 2024-02-04 20:53:35 -05:00
comfyanonymous
24129d78e6 Speed up SDXL on 16xx series with fp16 weights and manual cast. 2024-02-04 13:23:43 -05:00
comfyanonymous
4b0239066d Always use fp16 for the text encoders. 2024-02-02 10:02:49 -05:00
doctorpangloss
82edb2ff0e Merge with latest upstream. 2024-01-29 15:06:31 -08:00
comfyanonymous
f9e55d8463 Only auto enable bf16 VAE on nvidia GPUs that actually support it. 2024-01-15 03:10:22 -05:00
doctorpangloss
369aeb598f Merge upstream, fix 3.12 compatibility, fix nightlies issue, fix broken node 2024-01-03 16:00:36 -08:00
comfyanonymous
1b103e0cb2 Add argument to run the VAE on the CPU. 2023-12-30 05:49:07 -05:00
comfyanonymous
e1e322cf69 Load weights that can't be lowvramed to target device. 2023-12-28 21:41:10 -05:00
comfyanonymous
a252963f95 --disable-smart-memory now unloads everything like it did originally. 2023-12-23 04:25:06 -05:00
comfyanonymous
36a7953142 Greatly improve lowvram sampling speed by getting rid of accelerate.
Let me know if this breaks anything.
2023-12-22 14:38:45 -05:00
comfyanonymous
2f9d6a97ec Add --deterministic option to make pytorch use deterministic algorithms. 2023-12-17 16:59:21 -05:00
comfyanonymous
b0aab1e4ea Add an option --fp16-unet to force using fp16 for the unet. 2023-12-11 18:36:29 -05:00
comfyanonymous
ba07cb748e Use faster manual cast for fp8 in unet. 2023-12-11 18:24:44 -05:00
comfyanonymous
57926635e8 Switch text encoder to manual cast.
Use fp16 text encoder weights for CPU inference to lower memory usage.
2023-12-10 23:00:54 -05:00
comfyanonymous
340177e6e8 Disable non blocking on mps. 2023-12-10 01:30:35 -05:00
comfyanonymous
9ac0b487ac Make --gpu-only put intermediate values in GPU memory instead of cpu. 2023-12-08 02:35:45 -05:00
comfyanonymous
2db86b4676 Slightly faster lora applying. 2023-12-06 05:13:14 -05:00
comfyanonymous
ca82ade765 Use .itemsize to get dtype size for fp8. 2023-12-04 11:52:06 -05:00
comfyanonymous
31b0f6f3d8 UNET weights can now be stored in fp8.
--fp8_e4m3fn-unet and --fp8_e5m2-unet are the two different formats
supported by pytorch.
2023-12-04 11:10:00 -05:00
Benjamin Berman
01312a55a4 merge upstream 2023-12-03 20:41:13 -08:00
comfyanonymous
0cf4e86939 Add some command line arguments to store text encoder weights in fp8.
Pytorch supports two variants of fp8:
--fp8_e4m3fn-text-enc (the one that seems to give better results)
--fp8_e5m2-text-enc
2023-11-17 02:56:59 -05:00
comfyanonymous
7339479b10 Disable xformers when it can't load properly. 2023-11-13 12:31:10 -05:00
comfyanonymous
dd4ba68b6e Allow different models to estimate memory usage differently. 2023-11-12 04:03:52 -05:00
comfyanonymous
8594c8be4d Empty the cache when torch cache is more than 25% free mem. 2023-10-22 13:58:12 -04:00
Benjamin Berman
d21655b5a2 merge upstream 2023-10-17 14:47:59 -07:00
comfyanonymous
c8013f73e5 Add some Quadro cards to the list of cards with broken fp16. 2023-10-16 16:48:46 -04:00
comfyanonymous
fd4c5f07e7 Add a --bf16-unet to test running the unet in bf16. 2023-10-13 14:51:10 -04:00
comfyanonymous
9a55dadb4c Refactor code so model can be a dtype other than fp32 or fp16. 2023-10-13 14:41:17 -04:00
comfyanonymous
88733c997f pytorch_attention_enabled can now return True when xformers is enabled. 2023-10-11 21:30:57 -04:00
comfyanonymous
20d3852aa1 Pull some small changes from the other repo. 2023-10-11 20:38:48 -04:00
doctorpangloss
e8b60dfc6e merge upstream 2023-10-06 15:02:31 -07:00
Simon Lui
eec449ca8e Allow Intel GPUs to LoRA cast on GPU since they support BF16 natively. 2023-09-22 21:11:27 -07:00
comfyanonymous
1cdfb3dba4 Only do the cast on the device if the device supports it. 2023-09-20 17:52:41 -04:00
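
Note on the logging commits (65397ce601, 0ed72befe1): the messages say that prints were replaced with logging, that the level defaults to info, and that --verbose sets it to debug. A minimal sketch of that behaviour using only the Python standard library follows. The function name configure_logging and the log format are assumptions made for illustration; this is not the code from those commits.

```python
import argparse
import logging


def configure_logging(argv=None):
    """Set the root log level from a --verbose flag (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # --verbose is the flag named in the commit message; the help text is ours.
    parser.add_argument("--verbose", action="store_true",
                        help="log at debug level instead of info")
    args = parser.parse_args(argv)

    # Default to INFO; --verbose lowers the threshold to DEBUG.
    level = logging.DEBUG if args.verbose else logging.INFO

    # force=True replaces any handlers that were already installed,
    # so calling this more than once re-applies the chosen level.
    logging.basicConfig(level=level, format="%(levelname)s: %(message)s",
                        force=True)
    return level
```

Calling configure_logging([]) leaves the root logger at info, so logging.debug(...) calls are suppressed; configure_logging(["--verbose"]) makes them visible.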