doctorpangloss
341c9f2e90
Improvements to node loading, node API, folder paths and progress
...
- Improve node loading order. It now occurs "as late as possible".
Configuration should be exposed as per the README.
- Added methods to specify custom folders and models used in examples
more robustly for custom nodes.
- Downloading models can now be gracefully interrupted.
- Progress notifications are now sent over the network for distributed
ComfyUI operations.
- Python objects have been moved around to prevent less transitive
package importing issues.
2024-03-13 16:14:18 -07:00
doctorpangloss
93cdef65a4
Merge upstream
2024-03-12 09:49:47 -07:00
comfyanonymous
0ed72befe1
Change log levels.
...
Logging level now defaults to info. --verbose sets it to debug.
2024-03-11 13:54:56 -04:00
doctorpangloss
00728eb20f
Merge upstream
2024-03-11 09:32:57 -07:00
comfyanonymous
65397ce601
Replace prints with logging and add --verbose argument.
2024-03-10 12:14:23 -04:00
doctorpangloss
c0d9bc0129
Merge with upstream
2024-03-08 15:17:20 -08:00
comfyanonymous
dce3555339
Add some tesla pascal GPUs to the fp16 working but slower list.
2024-03-02 17:16:31 -05:00
doctorpangloss
7520691021
Merge with master
2024-02-19 10:55:22 -08:00
comfyanonymous
88f300401c
Enable fp16 by default on mps.
2024-02-19 12:00:48 -05:00
comfyanonymous
929e266f3e
Manual cast for bf16 on older GPUs.
2024-02-17 09:01:17 -05:00
comfyanonymous
0b3c50480c
Make --force-fp32 disable loading models in bf16.
2024-02-16 23:01:54 -05:00
comfyanonymous
f83109f09b
Stable Cascade Stage C.
2024-02-16 10:55:08 -05:00
comfyanonymous
aeaeca10bd
Small refactor of is_device_* functions.
2024-02-15 21:10:10 -05:00
doctorpangloss
3367362cec
Fix directml again now that I understand what the command line is doing
2024-02-08 10:17:49 -08:00
Benjamin Berman
8508a5a853
Fix args.directml is not None error
2024-02-08 08:40:13 -08:00
doctorpangloss
d9b4607c36
Add locks to model_management to prevent multiple copies of the models from being loaded at the same time
2024-02-07 15:18:13 -08:00
doctorpangloss
8e9052c843
Merge with upstream
2024-02-07 14:27:50 -08:00
comfyanonymous
66e28ef45c
Don't use is_bf16_supported to check for fp16 support.
2024-02-04 20:53:35 -05:00
comfyanonymous
24129d78e6
Speed up SDXL on 16xx series with fp16 weights and manual cast.
2024-02-04 13:23:43 -05:00
comfyanonymous
4b0239066d
Always use fp16 for the text encoders.
2024-02-02 10:02:49 -05:00
doctorpangloss
82edb2ff0e
Merge with latest upstream.
2024-01-29 15:06:31 -08:00
comfyanonymous
f9e55d8463
Only auto enable bf16 VAE on nvidia GPUs that actually support it.
2024-01-15 03:10:22 -05:00
doctorpangloss
369aeb598f
Merge upstream, fix 3.12 compatibility, fix nightlies issue, fix broken node
2024-01-03 16:00:36 -08:00
comfyanonymous
1b103e0cb2
Add argument to run the VAE on the CPU.
2023-12-30 05:49:07 -05:00
comfyanonymous
e1e322cf69
Load weights that can't be lowvramed to target device.
2023-12-28 21:41:10 -05:00
comfyanonymous
a252963f95
--disable-smart-memory now unloads everything like it did originally.
2023-12-23 04:25:06 -05:00
comfyanonymous
36a7953142
Greatly improve lowvram sampling speed by getting rid of accelerate.
...
Let me know if this breaks anything.
2023-12-22 14:38:45 -05:00
comfyanonymous
2f9d6a97ec
Add --deterministic option to make pytorch use deterministic algorithms.
2023-12-17 16:59:21 -05:00
comfyanonymous
b0aab1e4ea
Add an option --fp16-unet to force using fp16 for the unet.
2023-12-11 18:36:29 -05:00
comfyanonymous
ba07cb748e
Use faster manual cast for fp8 in unet.
2023-12-11 18:24:44 -05:00
comfyanonymous
57926635e8
Switch text encoder to manual cast.
...
Use fp16 text encoder weights for CPU inference to lower memory usage.
2023-12-10 23:00:54 -05:00
comfyanonymous
340177e6e8
Disable non blocking on mps.
2023-12-10 01:30:35 -05:00
comfyanonymous
9ac0b487ac
Make --gpu-only put intermediate values in GPU memory instead of cpu.
2023-12-08 02:35:45 -05:00
comfyanonymous
2db86b4676
Slightly faster lora applying.
2023-12-06 05:13:14 -05:00
comfyanonymous
ca82ade765
Use .itemsize to get dtype size for fp8.
2023-12-04 11:52:06 -05:00
comfyanonymous
31b0f6f3d8
UNET weights can now be stored in fp8.
...
--fp8_e4m3fn-unet and --fp8_e5m2-unet are the two different formats
supported by pytorch.
2023-12-04 11:10:00 -05:00
Benjamin Berman
01312a55a4
merge upstream
2023-12-03 20:41:13 -08:00
comfyanonymous
0cf4e86939
Add some command line arguments to store text encoder weights in fp8.
...
Pytorch supports two variants of fp8:
--fp8_e4m3fn-text-enc (the one that seems to give better results)
--fp8_e5m2-text-enc
2023-11-17 02:56:59 -05:00
comfyanonymous
7339479b10
Disable xformers when it can't load properly.
2023-11-13 12:31:10 -05:00
comfyanonymous
dd4ba68b6e
Allow different models to estimate memory usage differently.
2023-11-12 04:03:52 -05:00
comfyanonymous
8594c8be4d
Empty the cache when torch cache is more than 25% free mem.
2023-10-22 13:58:12 -04:00
Benjamin Berman
d21655b5a2
merge upstream
2023-10-17 14:47:59 -07:00
comfyanonymous
c8013f73e5
Add some Quadro cards to the list of cards with broken fp16.
2023-10-16 16:48:46 -04:00
comfyanonymous
fd4c5f07e7
Add a --bf16-unet to test running the unet in bf16.
2023-10-13 14:51:10 -04:00
comfyanonymous
9a55dadb4c
Refactor code so model can be a dtype other than fp32 or fp16.
2023-10-13 14:41:17 -04:00
comfyanonymous
88733c997f
pytorch_attention_enabled can now return True when xformers is enabled.
2023-10-11 21:30:57 -04:00
comfyanonymous
20d3852aa1
Pull some small changes from the other repo.
2023-10-11 20:38:48 -04:00
doctorpangloss
e8b60dfc6e
merge upstream
2023-10-06 15:02:31 -07:00
Simon Lui
eec449ca8e
Allow Intel GPUs to LoRA cast on GPU since it supports BF16 natively.
2023-09-22 21:11:27 -07:00
comfyanonymous
1cdfb3dba4
Only do the cast on the device if the device supports it.
2023-09-20 17:52:41 -04:00