comfyanonymous
8ce2a1052c
Optimizations to --fast and scaled fp8.
2024-10-22 02:12:28 -04:00
comfyanonymous
f82314fcfc
Fix duplicate sigmas on beta scheduler.
2024-10-21 20:19:45 -04:00
comfyanonymous
0075c6d096
Mixed precision diffusion models with scaled fp8.
...
This change adds support for diffusion models where all the linears are
scaled fp8 while the other weights stay in their original precision.
2024-10-21 18:12:51 -04:00
comfyanonymous
83ca891118
Support scaled fp8 t5xxl model.
2024-10-20 22:27:00 -04:00
comfyanonymous
f9f9faface
Fixed model merging issue with scaled fp8.
2024-10-20 06:24:31 -04:00
comfyanonymous
471cd3eace
fp8 casting is fast on GPUs that support fp8 compute.
2024-10-20 00:54:47 -04:00
comfyanonymous
a68bbafddb
Support diffusion models with scaled fp8 weights.
2024-10-19 23:47:42 -04:00
comfyanonymous
73e3a9e676
Clamp output when rounding weight to prevent NaN.
2024-10-19 19:07:10 -04:00
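The clamp described in the commit above can be illustrated with a minimal pure-Python sketch. It assumes the target format is float8_e4m3fn, whose largest finite magnitude is 448 and which has no infinity encoding, so an out-of-range value can become NaN when cast. The helper name `clamp_for_fp8` is hypothetical and is not taken from the ComfyUI code.

```python
# Largest finite magnitude representable in float8_e4m3fn.
FP8_E4M3FN_MAX = 448.0


def clamp_for_fp8(values, limit=FP8_E4M3FN_MAX):
    """Clamp each value into [-limit, limit] before a (hypothetical) fp8 cast."""
    return [max(-limit, min(limit, v)) for v in values]


# Pretend these values came out of a rounding step that overshot the range.
rounded = [447.9, 460.0, -512.0, 0.5]
print(clamp_for_fp8(rounded))
```

Clamping after rounding, rather than before, covers the case where a value just under the limit is rounded upward past it.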
comfyanonymous
67158994a4
Use the lowvram cast_to function for everything.
2024-10-17 17:25:56 -04:00
comfyanonymous
0bedfb26af
Revert "Fix Transformers FutureWarning ( #5140 )"
...
This reverts commit 95b7cf9bbe.
2024-10-16 12:36:19 -04:00
doctorpangloss
a83b561ea7
Follow symlinks for statics so that packages can correctly serve files when installed with uv. Update version.
2024-10-15 11:01:46 -07:00
doctorpangloss
5412451def
Handle custom_nodes returning None responses more gracefully
2024-10-15 11:01:21 -07:00
doctorpangloss
995807b4be
Improve custom node compatibility by including this stub symbol
2024-10-15 10:13:28 -07:00
doctorpangloss
40902acc28
Use the HuggingFace file for dreamshaper
2024-10-15 10:13:13 -07:00
Benjamin Berman
e5fc19a25b
Improve vanilla node importing and fix CUDA on CPU devices bug
2024-10-15 00:02:06 -07:00
Benjamin Berman
9c9df424b4
Fix CUDA package with no drivers
2024-10-14 22:56:21 -07:00
comfyanonymous
f584758271
Cleanup some useless lines.
2024-10-14 21:02:39 -04:00
svdc
95b7cf9bbe
Fix Transformers FutureWarning ( #5140 )
...
* Update sd1_clip.py
Fix Transformers FutureWarning
* Update sd1_clip.py
Fix comment
2024-10-14 20:12:20 -04:00
doctorpangloss
b0d606a282
Improve installation instructions with non-deprecated messaging. 0.2.3 is now directly written as the server version.
2024-10-14 15:54:21 -07:00
doctorpangloss
8512f361fe
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-10-14 15:26:27 -07:00
comfyanonymous
3c60ecd7a8
Fix fp8 ops staying enabled.
2024-10-12 14:10:13 -04:00
comfyanonymous
7ae6626723
Remove useless argument.
2024-10-12 07:16:21 -04:00
comfyanonymous
6632365e16
model_options consistency between functions.
...
weight_dtype -> dtype
2024-10-11 20:51:19 -04:00
Kadir Nar
ad07796777
🐛 Add device to variable c ( #5210 )
2024-10-11 20:37:50 -04:00
doctorpangloss
c0d1c9f96d
Improve OpenAPI spec
2024-10-11 14:46:26 -07:00
doctorpangloss
ed078c2f1f
Update web content
2024-10-11 14:00:16 -07:00
doctorpangloss
b5df6c64fa
Update OpenAPI spec to be more accurate
2024-10-11 13:59:57 -07:00
doctorpangloss
79b465faf2
Increase server response timeouts
2024-10-11 13:52:17 -07:00
doctorpangloss
caa6a37936
Fix pylint error
2024-10-11 13:51:13 -07:00
doctorpangloss
1cc637cb4f
Fix SDXL clip issue, fix website header issue
2024-10-10 22:46:52 -07:00
doctorpangloss
f3da381869
Fix inference mode execution issues
2024-10-10 21:00:09 -07:00
doctorpangloss
a38968f098
Improvements to execution
...
- Validation errors that occur early in the lifecycle of prompt
execution now get propagated to their callers in the
EmbeddedComfyClient. This includes error messages about missing node
classes.
- The execution context now includes the node_id and the prompt_id
- Latent previews are now sent with a node_id. This is not backwards
compatible with old frontends.
- Dependency execution errors are now modeled correctly.
- Distributed progress encodes image previews with node and prompt IDs.
- Typing for models
- The frontend was updated to use node IDs with previews
- Improvements to torch.compile experiments
- Some controlnet_aux nodes were upstreamed
2024-10-10 19:30:18 -07:00
doctorpangloss
69e523b89d
Experimental quantization support. Only Linux is meaningfully supported
2024-10-10 13:43:06 -07:00
comfyanonymous
1b80895285
Make clip loader nodes support loading sd3 t5xxl in lower precision.
...
Add attention mask support in the SD3 text encoder code.
2024-10-10 15:06:15 -04:00
doctorpangloss
5f26b76f59
Gracefully handle running with cuda torch on CPU only devices
2024-10-10 10:42:22 -07:00
Dr.Lt.Data
5f9d5a244b
Hotfix for the div zero occurrence when memory_used_encode is 0 ( #5121 )
...
https://github.com/comfyanonymous/ComfyUI/issues/5069#issuecomment-2382656368
2024-10-09 23:34:34 -04:00
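A guard of the kind the hotfix above describes might look like the following sketch. The function name `batch_count` and its arguments are hypothetical; only the idea of avoiding a zero divisor when the memory estimate is 0 comes from the commit subject.

```python
def batch_count(free_memory, memory_used):
    """Estimate how many items fit in free memory without dividing by zero.

    memory_used stands in for an estimate such as memory_used_encode,
    which the commit says can be 0.
    """
    per_item = max(1, memory_used)  # a zero estimate is treated as 1
    return max(1, int(free_memory / per_item))


print(batch_count(1000, 0))    # zero estimate no longer raises ZeroDivisionError
print(batch_count(1000, 300))
```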
Jonathan Avila
4b2f0d9413
Increase maximum macOS version to 15.0.1 when forcing upcast attention ( #5191 )
2024-10-09 22:21:41 -04:00
comfyanonymous
e38c94228b
Add a weight_dtype fp8_e4m3fn_fast to the Diffusion Model Loader node.
...
This is used to load weights in fp8 and use fp8 matrix multiplication.
2024-10-09 19:43:17 -04:00
doctorpangloss
c34403b574
Fix invalid device here
2024-10-09 11:21:19 -07:00
comfyanonymous
7ea7b2e77f
Slightly improve the fast previews for flux by adding a bias.
2024-10-09 09:48:18 -07:00
comfyanonymous
9786ea4a17
Use torch.nn.functional.linear in RGB preview code.
...
Add an optional bias to the latent RGB preview code.
2024-10-09 09:48:17 -07:00
comfyanonymous
91f458061c
Fix flux doras with diffusers keys.
2024-10-09 09:48:16 -07:00
City
7d1c420d19
Flux torch.compile fix ( #5082 )
2024-10-09 09:47:46 -07:00
doctorpangloss
99f0fa8b50
Enable sage attention autodetection
2024-10-09 09:27:05 -07:00
doctorpangloss
388dad67d5
Fix pylint errors in attention
2024-10-09 09:26:02 -07:00
doctorpangloss
bbe2ed330c
Memory management and compilation improvements
...
- Experimental support for sage attention on Linux
- Diffusers loader now supports model indices
- Transformers model management now aligns with updates to ComfyUI
- Flux layers correctly use unbind
- Add float8 support for model loading in more places
- Experimental quantization approaches from Quanto and torchao
- Model upscaling interacts with memory management better
This update also disables ROCm testing because it isn't reliable enough
on consumer hardware. ROCm is not really supported by the 7600.
2024-10-09 09:13:47 -07:00
comfyanonymous
203942c8b2
Fix flux doras with diffusers keys.
2024-10-08 19:03:40 -04:00
comfyanonymous
8dfa0cc552
Make SD3 fast previews a little better.
2024-10-07 09:19:59 -04:00
comfyanonymous
e5ecdfdd2d
Make fast previews for SDXL a little better by adding a bias.
2024-10-06 19:27:04 -04:00
comfyanonymous
7d29fbf74b
Slightly improve the fast previews for flux by adding a bias.
2024-10-06 17:55:46 -04:00
comfyanonymous
7d2467e830
Some minor cleanups.
2024-10-05 13:22:39 -04:00
Benjamin Berman
0a25b67ff8
Fix pylint errors
2024-10-04 21:12:37 -07:00
Benjamin Berman
afbb8aa154
Fix #23
2024-10-04 21:10:19 -07:00
doctorpangloss
de45dd50c5
Improve vanilla node importing for execution nodes
2024-10-04 10:56:43 -07:00
comfyanonymous
6f021d8aa0
Let --verbose have an argument for the log level.
2024-10-04 10:05:34 -04:00
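One common way to give a flag an optional argument, as the commit above does for --verbose, is argparse's `nargs="?"` with a `const` value. The sketch below is an assumption about how this could be wired and is not copied from ComfyUI; the default of INFO and the list of choices are illustrative.

```python
import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument(
    "--verbose",
    nargs="?",          # the level after the flag is optional
    const="DEBUG",      # used when --verbose is given with no value
    default="INFO",     # used when --verbose is absent
    choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    help="Set the logging level; a bare --verbose means DEBUG.",
)


def parse_level(argv):
    """Return the numeric logging level selected by the arguments."""
    args = parser.parse_args(argv)
    return getattr(logging, args.verbose)


logging.basicConfig(level=parse_level(["--verbose", "WARNING"]))
```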
comfyanonymous
d854ed0bcf
Allow using SD3 type te output on flux model.
2024-10-03 09:44:54 -04:00
comfyanonymous
abcd006b8c
Allow more permutations of clip/t5 in dual clip loader.
2024-10-03 09:26:11 -04:00
comfyanonymous
d985d1d7dc
CLIP Loader node now supports clip_l and clip_g only for SD3.
2024-10-02 04:25:17 -04:00
comfyanonymous
d1cdf51e1b
Refactor some of the TE detection code.
2024-10-01 07:08:41 -04:00
doctorpangloss
144fe6c421
Fix aiohttp bugs
2024-09-30 13:12:53 -07:00
comfyanonymous
b4626ab93e
Add simpletuner lycoris format for SD unet.
2024-09-30 06:03:27 -04:00
comfyanonymous
a9e459c2a4
Use torch.nn.functional.linear in RGB preview code.
...
Add an optional bias to the latent RGB preview code.
2024-09-29 11:27:49 -04:00
comfyanonymous
3bb4dec720
Fix issue with loras, lowvram and --fast fp8.
2024-09-28 14:42:32 -04:00
City
8733191563
Flux torch.compile fix ( #5082 )
2024-09-27 22:07:51 -04:00
doctorpangloss
6ef2d534b6
Fix polling for history too quickly. This will need an alternative approach so that readiness is immediate
2024-09-27 12:46:28 -07:00
doctorpangloss
d25394d386
API now supports fire-and-forget, checking on queue status; prefetch_count now expressly set to 1 for workers
2024-09-27 12:07:54 -07:00
doctorpangloss
a664a1fbc9
Add Flux inpainting model
2024-09-27 12:06:58 -07:00
doctorpangloss
667b77149e
Improve scaling and fit for diffusion
2024-09-26 18:08:34 -07:00
doctorpangloss
dbc8ee92a5
Add method to make this congruent with aio client
2024-09-26 18:08:15 -07:00
doctorpangloss
ab1a1de7a4
Fix missing arg to add_model_folder_path
2024-09-26 13:26:52 -07:00
doctorpangloss
a78f20178d
Fix linking error
2024-09-25 10:16:56 -07:00
doctorpangloss
8f58242c91
Fix frozenset v set issue in folder_paths
2024-09-24 20:36:50 -07:00
comfyanonymous
bdd4a22a2e
Fix flux TE not loading t5 embeddings.
2024-09-24 22:57:22 -04:00
chaObserv
479a427a48
Add dpmpp_2m_cfg_pp ( #4992 )
2024-09-24 02:42:56 -04:00
doctorpangloss
fa3176f96f
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-09-23 12:50:31 -07:00
doctorpangloss
4bdc208f29
Add noise to specific channels in a latent
2024-09-23 08:51:48 -07:00
comfyanonymous
3a0eeee320
Make --listen listen on both ipv4 and ipv6 at the same time by default.
2024-09-23 04:38:19 -04:00
comfyanonymous
9c41bc8d10
Remove useless line.
2024-09-23 02:32:29 -04:00
comfyanonymous
7a415f47a9
Add an optional VAE input to the ControlNetApplyAdvanced node.
...
Deprecate the other controlnet nodes.
2024-09-22 01:24:52 -04:00
comfyanonymous
dc96a1ae19
Load controlnet in fp8 if weights are in fp8.
2024-09-21 04:50:12 -04:00
comfyanonymous
2d810b081e
Add load_controlnet_state_dict function.
2024-09-21 01:51:51 -04:00
comfyanonymous
9f7e9f0547
Add an error message when a controlnet needs a VAE but none is given.
2024-09-21 01:33:18 -04:00
comfyanonymous
70a708d726
Fix model merging issue.
2024-09-20 02:31:44 -04:00
yoinked
e7d4782736
add laplace scheduler [2407.03297] ( #4990 )
...
* add laplace scheduler [2407.03297]
* should be here instead lol
* better settings
2024-09-19 23:23:09 -04:00
comfyanonymous
ad66f7c7d8
Add model_options to load_controlnet function.
2024-09-19 08:23:35 -04:00
Simon Lui
de8e8e3b0d
Fix xpu Pytorch nightly build from calling optimize which doesn't exist. ( #4978 )
2024-09-19 05:11:42 -04:00
doctorpangloss
e820a5de20
Revert "Reduce repeated calls of get_immediate_node_signature for ancestors in cache ( #4871 )"
...
This reverts commit f6b7194f64.
2024-09-17 16:54:55 -07:00
doctorpangloss
d30f15ed09
Fix caching issues with text nodes when working with the UI
2024-09-17 16:09:47 -07:00
pharmapsychotic
0b7dfa986d
Improve tiling calculations to reduce number of tiles that need to be processed. ( #4944 )
2024-09-17 03:51:10 -04:00
comfyanonymous
d514bb38ee
Add some options to model_options for the text encoder.
...
load_device, offload_device and the initial_device can now be set.
2024-09-17 03:49:54 -04:00
comfyanonymous
0849c80e2a
get_key_patches now works without unloading the model.
2024-09-17 01:57:59 -04:00
comfyanonymous
e813abbb2c
Long CLIP L support for SDXL, SD3 and Flux.
...
Use the *CLIPLoader nodes.
2024-09-15 07:59:38 -04:00
comfyanonymous
f48e390032
Support AliMama SD3 and Flux inpaint controlnets.
...
Use the ControlNetInpaintingAliMamaApply node.
2024-09-14 09:05:16 -04:00
doctorpangloss
83b2f0174c
Fix tests, improve distributed worker health check, add torch compile options
2024-09-13 18:10:11 -07:00
doctorpangloss
ffb4ed9cf2
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-09-13 12:45:23 -07:00
comfyanonymous
cf80d28689
Support loading controlnets with different input.
2024-09-13 09:54:37 -04:00
Robin Huang
b962db9952
Add cli arg to override user directory ( #4856 )
...
* Override user directory.
* Use overridden user directory.
* Remove prints.
* Remove references to global user_files.
* Remove unused replace_folder function.
* Remove newline.
* Remove global during get_user_directory.
* Add validation.
2024-09-12 08:10:27 -04:00
comfyanonymous
9d720187f1
types -> comfy_types to fix import issue.
2024-09-12 03:57:46 -04:00
comfyanonymous
9f4daca9d9
Doesn't really make sense for cfg_pp sampler to call regular one.
2024-09-11 02:51:36 -04:00
yoinked
b5d0f2a908
Add CFG++ to DPM++ 2S Ancestral ( #3871 )
...
* Update sampling.py
* Update samplers.py
* my bad
* "fix" the sampler
* Update samplers.py
* i named it wrong
* minor sampling improvements
mainly using a dynamic rho value (hey this sounds a lot like smea!!!)
* revert rho change
rho? r? its just 1/2
2024-09-11 02:49:44 -04:00
comfyanonymous
9c5fca75f4
Fix lora issue.
2024-09-08 10:10:47 -04:00
comfyanonymous
32a60a7bac
Support onetrainer text encoder Flux lora.
2024-09-08 09:31:41 -04:00
Jim Winkens
bb52934ba4
Fix import issue ( #4815 )
2024-09-07 05:28:32 -04:00
comfyanonymous
ea77750759
Support a generic Comfy format for text encoder loras.
...
This is a format with keys like:
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.lora_up.weight
Instead of waiting for me to add support for specific lora formats you can
convert your text encoder loras to this format instead.
If you want to see an example save a text encoder lora with the SaveLora
node with the commit right after this one.
2024-09-07 02:20:39 -04:00
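The key layout quoted in the commit above can be reproduced with a small helper. This is a sketch: the function name `comfy_te_lora_key` is hypothetical, and the `lora_down` counterpart is an assumption based on common LoRA naming, since the commit text only shows a `lora_up` key.

```python
def comfy_te_lora_key(encoder, module_path, part):
    """Build a key in the generic text encoder LoRA layout.

    encoder:     text encoder name, e.g. "clip_l"
    module_path: dotted path of the module inside that encoder
    part:        "lora_up" (from the commit text) or "lora_down" (assumed)
    """
    return f"text_encoders.{encoder}.{module_path}.{part}.weight"


key = comfy_te_lora_key(
    "clip_l",
    "transformer.text_model.encoder.layers.9.self_attn.v_proj",
    "lora_up",
)
print(key)
```

A converter for another LoRA format would then only need to map its own module names onto `module_path` and rename the keys.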
doctorpangloss
25e636fb65
Qwen2
2024-09-06 17:44:08 -07:00
doctorpangloss
e8eab4dbc6
Fix tensor types
2024-09-06 11:04:32 -07:00
comfyanonymous
c27ebeb1c2
Fix onnx export not working on flux.
2024-09-06 03:21:52 -04:00
doctorpangloss
a4fb34a0b8
Improve language and compositing nodes
2024-09-05 21:56:04 -07:00
doctorpangloss
7e1201e777
Merge branch 'master' of github.com:hiddenswitch/ComfyUI
2024-09-05 09:30:45 -07:00
doctorpangloss
0ba08f273a
Move comfy_extras nodes, fix pylint errors
2024-09-05 09:29:26 -07:00
doctorpangloss
db423f8013
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-09-05 09:23:00 -07:00
Benjamin Berman
fc02cd8373
Add fine tuned CLIP checkpoint
2024-09-05 01:24:17 -07:00
comfyanonymous
5cbaa9e07c
Mistoline flux controlnet support.
2024-09-05 00:05:17 -04:00
doctorpangloss
ed33ab1e7d
Support ProcessPoolExecutor to improve memory management
2024-09-04 17:03:22 -07:00
comfyanonymous
c7427375ee
Prioritize freeing partially offloaded models first.
2024-09-04 19:47:32 -04:00
Jedrzej Kosinski
f04229b84d
Add emb_patch support to UNetModel forward ( #4779 )
2024-09-04 14:35:15 -04:00
doctorpangloss
c75b9964ab
Fix Never on python 3.10
2024-09-04 09:35:10 -07:00
Silver
f067ad15d1
Make live preview size a configurable launch argument ( #4649 )
...
* Make live preview size a configurable launch argument
* Remove import from testing phase
* Update cli_args.py
2024-09-03 19:16:38 -04:00
doctorpangloss
38bcd9fcbd
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-09-03 15:28:52 -07:00
comfyanonymous
483004dd1d
Support newer glora format.
2024-09-03 17:02:19 -04:00
comfyanonymous
00a5d08103
Lower fp8 lora memory usage.
2024-09-03 01:25:05 -04:00
comfyanonymous
d043997d30
Flux onetrainer lora.
2024-09-02 08:22:15 -04:00
comfyanonymous
8d31a6632f
Speed up inference on nvidia 10 series on Linux.
2024-09-01 17:29:31 -04:00
comfyanonymous
b643eae08b
Make minimum_inference_memory() depend on --reserve-vram
2024-09-01 01:18:34 -04:00
comfyanonymous
935ae153e1
Cleanup.
2024-08-30 12:53:59 -04:00
Chenlei Hu
e91662e784
Get logs endpoint & system_stats additions ( #4690 )
...
* Add route for getting output logs
* Include ComfyUI version
* Move to own function
* Changed to memory logger
* Unify logger setup logic
* Fix get version git fallback
---------
Co-authored-by: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
2024-08-30 12:46:37 -04:00
comfyanonymous
63fafaef45
Fix potential issue with hydit controlnets.
2024-08-30 04:58:41 -04:00
doctorpangloss
3f88282b6a
Fix absolute imports
2024-08-29 18:38:58 -07:00
doctorpangloss
52230c24f2
Fix runwayml removing their huggingface repositories
2024-08-29 18:14:24 -07:00
doctorpangloss
1bc96a7a1b
Fix #20: base path can now be set before folder paths are initialized, although all of this really has to be reworked
2024-08-29 18:02:36 -07:00
doctorpangloss
fd503d8a96
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-08-29 16:37:30 -07:00
comfyanonymous
6eb5d64522
Fix glora lowvram issue.
2024-08-29 19:07:23 -04:00
comfyanonymous
10a79e9898
Implement model part of flux union controlnet.
2024-08-29 18:41:22 -04:00
comfyanonymous
ea3f39bd69
InstantX depth flux controlnet.
2024-08-29 02:14:19 -04:00
comfyanonymous
b33cd61070
InstantX canny controlnet.
2024-08-28 19:02:50 -04:00
doctorpangloss
ccdbd957ef
Fix pylint issues
2024-08-28 15:48:47 -07:00
doctorpangloss
9e8bb0b297
Add image tracing to SVG support using vtrace, python skia. The Skia library can be used for additional drawing tasks
2024-08-28 14:49:19 -07:00
doctorpangloss
46ffaa2f0d
Fix Flux controlnets
2024-08-28 14:48:42 -07:00
comfyanonymous
d31e226650
Unify RMSNorm code.
2024-08-28 16:56:38 -04:00
comfyanonymous
38c22e631a
Fix case where model was not properly unloaded in merging workflows.
2024-08-27 19:03:51 -04:00
doctorpangloss
54740d99d6
Upstream the chat templates
2024-08-27 12:58:40 -07:00
Chenlei Hu
6bbdcd28ae
Support weight padding on diff weight patch ( #4576 )
2024-08-27 13:55:37 -04:00
comfyanonymous
ab130001a8
Do RMSNorm in native type.
2024-08-27 02:41:56 -04:00
doctorpangloss
8615c86722
Merge branch 'master' of github.com:comfyanonymous/ComfyUI
2024-08-26 16:59:38 -07:00
doctorpangloss
27f4d70904
Fix pylint
2024-08-26 16:56:27 -07:00
doctorpangloss
f49bcd4f3c
Upstream InstantX Union ControlNet support for Flux
2024-08-26 16:54:29 -07:00
comfyanonymous
2ca8f6e23d
Make the stochastic fp8 rounding reproducible.
2024-08-26 15:12:06 -04:00
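Stochastic rounding becomes reproducible when the random draws come from a generator seeded explicitly rather than from global state. The sketch below rounds to integers as a stand-in for the fp8 grid and uses Python's `random.Random`. Both choices are assumptions made for illustration and do not reflect how ComfyUI implements it.

```python
import math
import random


def stochastic_round(values, seed):
    """Round each value up or down at random, with P(up) equal to its fractional part."""
    rng = random.Random(seed)  # private generator: output depends only on the seed
    out = []
    for v in values:
        low = math.floor(v)
        frac = v - low
        out.append(low + (1 if rng.random() < frac else 0))
    return out


weights = [0.25, 1.5, 2.0, -0.75]
# Same seed, same draws, same rounded result on every call.
print(stochastic_round(weights, seed=0) == stochastic_round(weights, seed=0))
```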
comfyanonymous
7985ff88b9
Use less memory in float8 lora patching by doing calculations in fp16.
2024-08-26 14:45:58 -04:00
comfyanonymous
c6812947e9
Fix potential memory leak.
2024-08-26 02:07:32 -04:00
doctorpangloss
48ca1a4910
Include Kijai fp8 nodes. LoRAs are not supported by nf4
2024-08-25 22:41:10 -07:00
doctorpangloss
69e6d52301
Fix tests
2024-08-25 19:55:18 -07:00
doctorpangloss
c4fe16252b
Fix imports
2024-08-25 18:56:47 -07:00
doctorpangloss
7100603016
Register moves
2024-08-25 18:53:50 -07:00
doctorpangloss
5155a3e248
Merge WIP
2024-08-25 18:52:29 -07:00
doctorpangloss
d7b65c9f55
Add flux controlnet to known controlnets
2024-08-25 15:24:46 -07:00
Benjamin Berman
ad9c4a7237
Upstream nf4 nodes
2024-08-25 15:23:14 -07:00
comfyanonymous
9230f65823
Fix some controlnets OOMing when loading.
2024-08-25 05:54:29 -04:00
comfyanonymous
8ae23d8e80
Fix onnx export.
2024-08-23 17:52:47 -04:00
comfyanonymous
7df42b9a23
Fix dora.
2024-08-23 04:58:59 -04:00
comfyanonymous
5d8bbb7281
Cleanup.
2024-08-23 04:06:27 -04:00
comfyanonymous
2c1d2375d6
Fix.
2024-08-23 04:04:55 -04:00
Simon Lui
64ccb3c7e3
Rework IPEX check for future inclusion of XPU into Pytorch upstream and do a bit more optimization of ipex.optimize(). ( #4562 )
2024-08-23 03:59:57 -04:00
Scorpinaus
9465b23432
Added SD15_Inpaint_Diffusers model support for unet_config_from_diffusers_unet function ( #4565 )
2024-08-23 03:57:08 -04:00
comfyanonymous
c0b0da264b
Missing imports.
2024-08-22 17:20:51 -04:00
comfyanonymous
c26ca27207
Move calculate function to comfy.lora
2024-08-22 17:12:00 -04:00
comfyanonymous
7c6bb84016
Code cleanups.
2024-08-22 17:05:12 -04:00
comfyanonymous
c54d3ed5e6
Fix issue with models staying loaded in memory.
2024-08-22 15:58:20 -04:00
comfyanonymous
c7ee4b37a1
Try to fix some lora issues.
2024-08-22 15:32:18 -04:00
David
7b70b266d8
Generalize MacOS version check for force-upcast-attention ( #4548 )
...
This code automatically forces upcasting attention for MacOS versions 14.5 and 14.6. My computer returns the string "14.6.1" for `platform.mac_ver()[0]`, so this generalizes the comparison to catch more versions.
I am running MacOS Sonoma 14.6.1 (latest version) and was seeing black image generation on previously functional workflows after recent software updates. This PR solved the issue for me.
See comfyanonymous/ComfyUI#3521
2024-08-22 13:24:21 -04:00
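The generalization described above amounts to comparing parsed version tuples instead of matching exact strings, so that "14.6.1" falls in the same range as "14.6". A minimal sketch follows; the helper names and the default bounds are hypothetical, and only the use of `platform.mac_ver()[0]` comes from the commit text.

```python
import platform


def parse_version(text):
    """Turn '14.6.1' into (14, 6, 1); empty or non-numeric parts are skipped."""
    return tuple(int(p) for p in text.split(".") if p.isdigit())


def needs_upcast(mac_version, low=(14, 5), high=(14, 7)):
    """True when low <= version < high, whatever the patch level is."""
    v = parse_version(mac_version)
    return bool(v) and low <= v < high


# platform.mac_ver()[0] is an empty string on non-macOS systems,
# in which case needs_upcast returns False.
print(needs_upcast(platform.mac_ver()[0]))
```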
comfyanonymous
8f60d093ba
Fix issue.
2024-08-22 10:38:24 -04:00
comfyanonymous
843a7ff70c
fp16 is actually faster than fp32 on a GTX 1080.
2024-08-21 23:23:50 -04:00
comfyanonymous
a60620dcea
Fix slow performance on 10 series Nvidia GPUs.
2024-08-21 16:39:02 -04:00
comfyanonymous
015f73dc49
Try a different type of flux fp16 fix.
2024-08-21 16:17:15 -04:00
comfyanonymous
904bf58e7d
Make --fast work on pytorch nightly.
2024-08-21 14:01:41 -04:00
Svein Ove Aas
5f50263088
Replace use of .view with .reshape ( #4522 )
...
When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view.
As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere.
Note that --fast is only marginally faster than the default.
2024-08-21 11:21:48 -04:00
comfyanonymous
76369e991c
Indentation.
2024-08-20 23:02:45 -07:00
Xrvk
bd18041d25
Add Flux model support for InstantX style controlnet residuals ( #4444 )
...
* Add Flux model support for InstantX style controlnet residuals
* Refactor Flux controlnet residual step to a separate method
* Rollback minor change
* New format for applying controlnet residuals: input->double_blocks, output->single_blocks
* Adjust XLabs Flux controlnet to fit new syntax of applying Flux controlnet residuals
* Remove unnecessary import and minor style change
2024-08-20 23:02:45 -07:00
doctorpangloss
3e54f9da36
Fix torch_dtype issues, missing DualCLIPLoader known model support
2024-08-20 23:00:12 -07:00
comfyanonymous
03ec517afb
Remove useless line, adjust windows default reserved vram.
2024-08-21 00:47:19 -04:00
doctorpangloss
540c43fae7
Typings
2024-08-20 21:25:16 -07:00
comfyanonymous
510f3438c1
Speed up fp8 matrix mult by using better code.
2024-08-20 22:53:26 -04:00
comfyanonymous
ea63b1c092
Simpletuner lycoris format.
2024-08-20 12:05:13 -04:00
comfyanonymous
9953f22fce
Add --fast argument to enable experimental optimizations.
...
Optimizations that might break things/lower quality will be put behind
this flag first and might be enabled by default in the future.
Currently the only optimization is float8_e4m3fn matrix multiplication on
4000/ADA series Nvidia cards or later. If you have one of these cards you
will see a speed boost when using fp8_e4m3fn flux for example.
2024-08-20 11:55:51 -04:00
comfyanonymous
d1a6bd6845
Support loading long clipl model with the CLIP loader node.
2024-08-20 10:46:36 -04:00
comfyanonymous
83dbac28eb
Properly set if clip text pooled projection instead of using hack.
2024-08-20 10:46:36 -04:00
comfyanonymous
538cb068bc
Make cast_to a nop if weight is already good.
2024-08-20 10:46:36 -04:00
comfyanonymous
1b3eee672c
Fix potential issue with multi devices.
2024-08-20 10:46:36 -04:00
comfyanonymous
9eee470244
New load_text_encoder_state_dicts function.
...
Now you can load text encoders straight from a list of state dicts.
2024-08-19 17:36:35 -04:00
comfyanonymous
045377ea89
Add a --reserve-vram argument if you don't want comfy to use all of it.
...
--reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free.
This can also be useful if workflows are failing because of OOM errors but
in that case please report it if --reserve-vram improves your situation.
2024-08-19 17:16:18 -04:00
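The flag above takes a value in gigabytes, which has to be converted to bytes before it can be compared against free memory. The sketch below shows one way to do that with argparse. The helper `reserved_bytes`, the default of 0, and the use of 1024**3 bytes per GB are assumptions made for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--reserve-vram",
    type=float,
    default=None,
    help="Amount of VRAM in GB to try to keep free.",
)


def reserved_bytes(argv, default_bytes=0):
    """Return the requested reservation in bytes, or default_bytes if the flag is absent."""
    args = parser.parse_args(argv)
    if args.reserve_vram is None:
        return default_bytes
    return int(args.reserve_vram * 1024 ** 3)


print(reserved_bytes(["--reserve-vram", "1.0"]))
```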
comfyanonymous
4d341b78e8
Bug fixes.
2024-08-19 16:28:55 -04:00
comfyanonymous
6138f92084
Use better dtype for the lowvram lora system.
2024-08-19 15:35:25 -04:00
comfyanonymous
be0726c1ed
Remove duplication.
2024-08-19 15:26:50 -04:00
comfyanonymous
4506ddc86a
Better subnormal fp8 stochastic rounding. Thanks Ashen.
2024-08-19 13:38:03 -04:00
comfyanonymous
20ace7c853
Code cleanup.
2024-08-19 12:48:59 -04:00
comfyanonymous
22ec02afc0
Handle subnormal numbers in float8 rounding.
2024-08-19 05:51:08 -04:00
comfyanonymous
39f114c44b
Less broken non blocking?
2024-08-18 16:53:17 -04:00
comfyanonymous
6730f3e1a3
Disable non blocking.
...
It fixed some perf issues but caused other issues that need to be debugged.
2024-08-18 14:38:09 -04:00
comfyanonymous
73332160c8
Enable non blocking transfers in lowvram mode.
2024-08-18 10:29:33 -04:00
comfyanonymous
2622c55aff
Automatically use RF variant of dpmpp_2s_ancestral if RF model.
2024-08-18 00:47:25 -04:00
Ashen
1beb348ee2
dpmpp_2s_ancestral_RF for rectified flow (Flux, SD3 and Auraflow).
2024-08-18 00:33:30 -04:00