doctorpangloss | 93cdef65a4 | Merge upstream | 2024-03-12 09:49:47 -07:00
comfyanonymous | 0ed72befe1 | Change log levels. Logging level now defaults to info. --verbose sets it to debug. | 2024-03-11 13:54:56 -04:00
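A minimal sketch of the setup this commit describes, assuming the `--verbose` flag named above is parsed with argparse; the rest of the plumbing is illustrative, not the repository's actual code:

```python
import argparse
import logging

parser = argparse.ArgumentParser()
# --verbose is the flag named in the commit; everything else here is illustrative.
parser.add_argument("--verbose", action="store_true", help="enable debug logging")
args = parser.parse_args()

# Default to INFO; --verbose raises verbosity to DEBUG.
logging.basicConfig(level=logging.DEBUG if args.verbose else logging.INFO)
logging.debug("only shown with --verbose")
logging.info("shown by default")
```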
doctorpangloss | 00728eb20f | Merge upstream | 2024-03-11 09:32:57 -07:00
comfyanonymous | 65397ce601 | Replace prints with logging and add --verbose argument. | 2024-03-10 12:14:23 -04:00
doctorpangloss | c0d9bc0129 | Merge with upstream | 2024-03-08 15:17:20 -08:00
comfyanonymous | dce3555339 | Add some tesla pascal GPUs to the fp16 working but slower list. | 2024-03-02 17:16:31 -05:00
doctorpangloss | 7520691021 | Merge with master | 2024-02-19 10:55:22 -08:00
comfyanonymous | 88f300401c | Enable fp16 by default on mps. | 2024-02-19 12:00:48 -05:00
comfyanonymous | 929e266f3e | Manual cast for bf16 on older GPUs. | 2024-02-17 09:01:17 -05:00
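"Manual cast" here means keeping weights stored in one dtype and casting them to a dtype the device can actually compute in at execution time. A minimal sketch of that idea; the module and dtype choices are illustrative, not the repository's implementation:

```python
import torch

class ManualCastLinear(torch.nn.Linear):
    """Store weights in one dtype, cast to a compute dtype at forward time
    (a sketch of what 'manual cast' means here; not the repo's actual code)."""

    compute_dtype = torch.float32  # e.g. fp16/fp32 on a GPU without native bf16 kernels

    def forward(self, x):
        weight = self.weight.to(self.compute_dtype)
        bias = None if self.bias is None else self.bias.to(self.compute_dtype)
        return torch.nn.functional.linear(x.to(self.compute_dtype), weight, bias)

layer = ManualCastLinear(8, 8).to(torch.bfloat16)  # weights live in bf16
out = layer(torch.randn(1, 8))                     # matmul runs in compute_dtype
```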
comfyanonymous | 0b3c50480c | Make --force-fp32 disable loading models in bf16. | 2024-02-16 23:01:54 -05:00
comfyanonymous | f83109f09b | Stable Cascade Stage C. | 2024-02-16 10:55:08 -05:00
comfyanonymous | aeaeca10bd | Small refactor of is_device_* functions. | 2024-02-15 21:10:10 -05:00
doctorpangloss | 3367362cec | Fix directml again now that I understand what the command line is doing | 2024-02-08 10:17:49 -08:00
Benjamin Berman | 8508a5a853 | Fix args.directml is not None error | 2024-02-08 08:40:13 -08:00
doctorpangloss | d9b4607c36 | Add locks to model_management to prevent multiple copies of the models from being loaded at the same time | 2024-02-07 15:18:13 -08:00
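A rough sketch of the locking pattern that commit describes, guarding model loading so concurrent requests don't each load their own copy; the names are hypothetical, not the actual model_management API:

```python
import threading

_load_lock = threading.Lock()
_loaded_models = {}  # hypothetical cache keyed by checkpoint path

def get_model(path, load_fn):
    """Return a cached model, loading it at most once even under concurrency."""
    with _load_lock:
        if path not in _loaded_models:
            _loaded_models[path] = load_fn(path)
        return _loaded_models[path]
```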
doctorpangloss | 8e9052c843 | Merge with upstream | 2024-02-07 14:27:50 -08:00
comfyanonymous | 66e28ef45c | Don't use is_bf16_supported to check for fp16 support. | 2024-02-04 20:53:35 -05:00
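The point of that change is that `torch.cuda.is_bf16_supported()` answers a different question than "is fp16 usable here". A sketch of checking the two separately via compute capability; the thresholds below are the commonly cited ones, not necessarily the exact rules the repository uses:

```python
import torch

def fp16_ok(device=0):
    # Gate fp16 on compute capability instead of torch.cuda.is_bf16_supported().
    major, minor = torch.cuda.get_device_capability(device)
    return (major, minor) >= (6, 0)  # assumption: Pascal and newer

def bf16_ok(device=0):
    # bf16 generally needs Ampere (sm_80) or newer.
    major, _minor = torch.cuda.get_device_capability(device)
    return major >= 8
```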
comfyanonymous | 24129d78e6 | Speed up SDXL on 16xx series with fp16 weights and manual cast. | 2024-02-04 13:23:43 -05:00
comfyanonymous | 4b0239066d | Always use fp16 for the text encoders. | 2024-02-02 10:02:49 -05:00
doctorpangloss | 82edb2ff0e | Merge with latest upstream. | 2024-01-29 15:06:31 -08:00
comfyanonymous | f9e55d8463 | Only auto enable bf16 VAE on nvidia GPUs that actually support it. | 2024-01-15 03:10:22 -05:00
doctorpangloss | 369aeb598f | Merge upstream, fix 3.12 compatibility, fix nightlies issue, fix broken node | 2024-01-03 16:00:36 -08:00
comfyanonymous | 1b103e0cb2 | Add argument to run the VAE on the CPU. | 2023-12-30 05:49:07 -05:00
comfyanonymous | e1e322cf69 | Load weights that can't be lowvramed to target device. | 2023-12-28 21:41:10 -05:00
comfyanonymous | a252963f95 | --disable-smart-memory now unloads everything like it did originally. | 2023-12-23 04:25:06 -05:00
comfyanonymous | 36a7953142 | Greatly improve lowvram sampling speed by getting rid of accelerate. Let me know if this breaks anything. | 2023-12-22 14:38:45 -05:00
comfyanonymous | 2f9d6a97ec | Add --deterministic option to make pytorch use deterministic algorithms. | 2023-12-17 16:59:21 -05:00
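A minimal sketch of what such an option typically wires up, using PyTorch's deterministic-algorithms switch; the flag plumbing is illustrative:

```python
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--deterministic", action="store_true",
                    help="ask pytorch to use deterministic algorithms where available")
args = parser.parse_args()

if args.deterministic:
    # Makes PyTorch prefer deterministic kernels; some ops may error or run slower.
    torch.use_deterministic_algorithms(True)
```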
comfyanonymous | b0aab1e4ea | Add an option --fp16-unet to force using fp16 for the unet. | 2023-12-11 18:36:29 -05:00
comfyanonymous | ba07cb748e | Use faster manual cast for fp8 in unet. | 2023-12-11 18:24:44 -05:00
comfyanonymous | 57926635e8 | Switch text encoder to manual cast. Use fp16 text encoder weights for CPU inference to lower memory usage. | 2023-12-10 23:00:54 -05:00
comfyanonymous | 340177e6e8 | Disable non blocking on mps. | 2023-12-10 01:30:35 -05:00
comfyanonymous | 9ac0b487ac | Make --gpu-only put intermediate values in GPU memory instead of cpu. | 2023-12-08 02:35:45 -05:00
comfyanonymous | 2db86b4676 | Slightly faster lora applying. | 2023-12-06 05:13:14 -05:00
comfyanonymous | ca82ade765 | Use .itemsize to get dtype size for fp8. | 2023-12-04 11:52:06 -05:00
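Presumably this is about computing per-element size generically from the dtype instead of hard-coding bytes, since `torch.dtype.itemsize` covers the fp8 dtypes as well. A tiny illustration; the estimator function is hypothetical:

```python
import torch

def weight_bytes(numel, dtype):
    # dtype.itemsize works for fp8 dtypes (1 byte) as well as fp16/fp32.
    return numel * dtype.itemsize

print(torch.float8_e4m3fn.itemsize)            # 1
print(weight_bytes(1_000_000, torch.float16))  # 2000000
```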
comfyanonymous | 31b0f6f3d8 | UNET weights can now be stored in fp8. --fp8_e4m3fn-unet and --fp8_e5m2-unet are the two different formats supported by pytorch. | 2023-12-04 11:10:00 -05:00
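PyTorch (2.1+) exposes both formats as dtypes, `torch.float8_e4m3fn` and `torch.float8_e5m2`. A minimal sketch of storing weights in fp8 while computing in higher precision, since most kernels can't consume fp8 tensors directly; the flag-to-dtype mapping and helper are illustrative:

```python
import torch

# Hypothetical mapping from the flags named in the commit to storage dtypes.
FP8_FLAGS = {
    "fp8_e4m3fn": torch.float8_e4m3fn,
    "fp8_e5m2": torch.float8_e5m2,
}

def store_in_fp8(module, storage_dtype=torch.float8_e4m3fn):
    """Cast floating-point parameters to an fp8 storage dtype (half the size of fp16).
    At inference the weights would be manually cast back up before each op."""
    for param in module.parameters():
        if param.dtype.is_floating_point:
            param.data = param.data.to(storage_dtype)
    return module

unet = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.Linear(16, 16))
store_in_fp8(unet)
print(next(unet.parameters()).dtype)  # torch.float8_e4m3fn
```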
Benjamin Berman | 01312a55a4 | merge upstream | 2023-12-03 20:41:13 -08:00
comfyanonymous | 0cf4e86939 | Add some command line arguments to store text encoder weights in fp8. Pytorch supports two variants of fp8: --fp8_e4m3fn-text-enc (the one that seems to give better results) and --fp8_e5m2-text-enc. | 2023-11-17 02:56:59 -05:00
comfyanonymous | 7339479b10 | Disable xformers when it can't load properly. | 2023-11-13 12:31:10 -05:00
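The usual pattern for that kind of guard is to treat any failure at import time as "xformers unavailable" and fall back, rather than crashing later. A sketch, with illustrative variable names rather than the repository's exact code:

```python
# Illustrative guard; not the repository's exact code.
try:
    import xformers
    import xformers.ops
    XFORMERS_AVAILABLE = True
except Exception:
    # Covers both a missing package and a broken install that raises on import.
    XFORMERS_AVAILABLE = False
```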
comfyanonymous | dd4ba68b6e | Allow different models to estimate memory usage differently. | 2023-11-12 04:03:52 -05:00
comfyanonymous | 8594c8be4d | Empty the cache when torch cache is more than 25% free mem. | 2023-10-22 13:58:12 -04:00
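A sketch of the heuristic described, assuming "torch cache" means the caching allocator's reserved-but-unallocated memory and "free mem" comes from `torch.cuda.mem_get_info()`; the exact bookkeeping in the repository may differ:

```python
import torch

def maybe_empty_cache(device=None):
    # Reserved-but-unused memory held by PyTorch's caching allocator.
    cached = torch.cuda.memory_reserved(device) - torch.cuda.memory_allocated(device)
    free, _total = torch.cuda.mem_get_info(device)
    # Release the cache back to the driver once it exceeds 25% of free memory.
    if cached > 0.25 * free:
        torch.cuda.empty_cache()
```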
Benjamin Berman | d21655b5a2 | merge upstream | 2023-10-17 14:47:59 -07:00
comfyanonymous | c8013f73e5 | Add some Quadro cards to the list of cards with broken fp16. | 2023-10-16 16:48:46 -04:00
comfyanonymous | fd4c5f07e7 | Add a --bf16-unet to test running the unet in bf16. | 2023-10-13 14:51:10 -04:00
comfyanonymous | 9a55dadb4c | Refactor code so model can be a dtype other than fp32 or fp16. | 2023-10-13 14:41:17 -04:00
comfyanonymous | 88733c997f | pytorch_attention_enabled can now return True when xformers is enabled. | 2023-10-11 21:30:57 -04:00
comfyanonymous | 20d3852aa1 | Pull some small changes from the other repo. | 2023-10-11 20:38:48 -04:00
doctorpangloss | e8b60dfc6e | merge upstream | 2023-10-06 15:02:31 -07:00
Simon Lui | eec449ca8e | Allow Intel GPUs to LoRA cast on GPU since it supports BF16 natively. | 2023-09-22 21:11:27 -07:00
comfyanonymous | 1cdfb3dba4 | Only do the cast on the device if the device supports it. | 2023-09-20 17:52:41 -04:00
comfyanonymous | 321c5fa295 | Enable pytorch attention by default on xpu. | 2023-09-17 04:09:19 -04:00