EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-06-28 02:39:26 +08:00

Author	SHA1	Message	Date
chenchaonan	7073a93653	sync upstream (#19 ) * [Partner Nodes] feat: add Krea 2 Medium Turbo model (#14280) * [Partner Nodes] feat: add seed input to Flux Erase node (#14283) Signed-off-by: bigcat88 <bigcat88@icloud.com> * chore: update workflow templates to v0.9.98 (#14284) * Bump comfyui-frontend-package to 1.45.15 (#14265) * Fix ideogram if model dtype gets set to fp8. (#14291) * Consolidate audio nodes into SaveAudioAdvanced node (CORE-202) (#13871) * Enable cfg1 optimization for DualModelGuider with CFGGuider (#14290) * Enable cfg1 optimization for DualModelGuider * Fix CFG Override tooltip * Fix interoperation with external source of pinned memory pressure (#14252) * mm: split off registration helper to doer and headroom calc * pinned_memory: implement registration comfy side Move away from Aimdo buffer registrations which seem fraught with danger and do it comfy side. Just start with the basic move. * pinned_memory: do registrations as portable memory * pinned_memory: discard async errors on registration fail Like the good ol days. * pinned_memory: implement abs shortfall retry If pinned registration happens to fail despite the previous budget ensures, consider the allocation shortfall, ensure it again, and try again. This allows comfy pins to interoperate with other software that might be doing substantive pinning. * aimdo 049 (#14300) * [Partner Nodes] feat: add new Gemini text node (#14299) * [Partner Nodes] feat: add temperature and top_p to NanoBanan node (#14305) * feat: add PreviewGaussianSplat + PreviewPointCloud nodes (#14194) * Update AMD portable readme. (#14303) * BE-1172 fix(3d): save Preview3DAdvanced / PreviewGaussianSplat / PreviewPointCloud to temp/, rename viewport input (#14294) * feat(3d): reorder Preview3DAdvanced / PreviewGaussianSplat / PreviewPointCloud inputs and outputs (#14308) * Update line endings check to ignore .ci files. (#14319) * Use windows line endings for windows portable readmes. 
(#14334) * Add SeedVR2 support (CORE-6) (#14110) * chore: update embedded docs to v0.5.3 (#14350) * Add Color primitive (#14260) * Improve ResolutionSelector (#14309) * feat(assets): extract image dimensions at ingest and emit on asset responses (#13991) * feat(assets): extract image dimensions at ingest and emit on asset responses Image assets now carry width/height under the existing `metadata` field on asset responses, shaped as `{"kind": "image", "width": W, "height": H}`. This lets consumers get original dimensions (e.g. for clients that render server-side thumbnails and can't recover them from naturalWidth/Height) without an extra round-trip. Dimensions are written to AssetReference.system_metadata across three ingest paths: - Direct file ingest (upload, in-place registration): Pillow reads the image header right after hashing, while the file is still in OS page cache. Non-image MIME types are skipped without touching the file. - From-hash registration: this path never reads the file bytes, so dimensions are best-effort copied from any prior sibling reference of the same asset that already carries kind=image metadata. Missing siblings, non-image siblings, or absent dimension keys leave the new reference's metadata unchanged. - Scanner enrichment: extends the existing system_metadata write in enrich_asset so scanner-registered images get the same treatment as uploaded ones. Existing system_metadata keys (e.g. safetensors fields written by the enricher, download provenance) are preserved through merge. Existing assets ingested before this change retain their current metadata — no automatic backfill in this PR. Tests cover image emission, non-image no-op, merge preservation, and the from-hash sibling back-fill (including the no-sibling and non-image-sibling cases). 
* fix(assets): validate sibling dimensions before backfilling Per CodeRabbit review on #13991: the previous loop accepted any sibling with `kind == "image"` and copied whichever dimension keys happened to be present, then returned. A partial sibling (kind set but missing or invalid width/height) could persist incomplete metadata onto the new reference even when a later sibling had valid dimensions. Now we validate that the sibling has both width and height as positive integers before adopting its dimensions, and continue scanning to the next sibling otherwise. * fix(assets): reject booleans in sibling dimension validation (use type-is) Per CodeRabbit follow-up on #13991: bool is a subclass of int in Python, so isinstance(True, int) is True. The previous strict-int gate would have accepted width=True (truthy + > 0) as a valid dimension. Realistic occurrence is low (extract_image_dimensions returns proper ints, JSON doesn't serialize bools as numbers), but the validation gate exists for defense-in-depth so it should be actually strict. --------- Co-authored-by: guill <jacob.e.segal@gmail.com> * Revert "Add SeedVR2 support (CORE-6) (#14110)" (#14359) This reverts commit `7863cf0e53`. * chore(openapi): sync shared API contract from cloud@5273c30 (#14266) * fix: Add back apply_rotary_emb for Qwen Image (#14364) * Allow custom templates with Ideogram4 TE (#14374) * main/server: Add --debug-hang (#14371) Add an option to debug a hang with ctrl-C, dumping the backtraces to see where it's stuck or slow. * Add LoRA key mapping for LTXV/LTXAV models (#14349) * feat: Add model support for SCAIL-2 (#14373) * initial SCAIL2 support * Move bg_removal_model input socket to first position for nicer display (#14353) * mm: don't reset cast buffers in cleanup_models_gc() (#14372) cleanup_models_gc can be called once per load_models_gpu via free_memory, which in turn can de-activate an active model via this reset_cast_buffers. 
cleanup_models_gc() could also come via obscure garbage collector paths so limit reset_cast_buffers to the post-node callsite instead. * Ensure conditions are not trainable to avoid bugs (#14368) * feat: Add Bernini-R model support (Wan video) (CORE-279) (#14216) * Depth anything 3 (Core-135) (#13853) Co-authored-by: Alexis Rolland <alexisrolland@hotmail.com> * Always enable cuda malloc on cu130 and higher. (#14381) * chore(openapi): sync shared API contract from cloud@ca12913 (#14367) * [Trainer/bug] Ensure model is not inference mode (CORE-72) (#13400) * Ensure model is not inference mode * force clone inside training mode to avoid inference tensor * Allow force deepcopy for model patcher * chore(assets): drop vestigial tags.tag_type column (#14248) tag_type was always "user" in practice — no code path ever set it to anything else (no system/seeded classification was wired up) and nothing queried it. The column, its ix_tags_tag_type index, and the TagUsage.type API field were dead weight, so they're removed. Adds alembic migration 0004 to drop the column and index. Verified: asset-seeder tests pass; migration applies cleanly on a fresh SQLite (tags retains only name; tag_type column + index dropped). Co-authored-by: guill <jacob.e.segal@gmail.com> * feat(assets): cursor-based pagination on GET /api/assets (#14014) * spec(assets): add cursor pagination params to GET /api/assets Add 'after' query param and 'next_cursor' response field for keyset pagination. Matches the cloud Go implementation (BE-893) so frontend sees a unified contract across runtimes. Offset/limit remain as a deprecated fallback. * feat(assets): add cursor encode/decode helpers for keyset pagination Port of cloud common/pagination/cursor.go. Wire format is base64url of {"s", "v", "id"} JSON; times are Unix microseconds UTC to match PostgreSQL timestamp precision. Includes a byte-identity fixture pinned against the cloud Go wire format so cross-runtime FE pagination can't silently drift. 
* feat(assets): thread cursor through schemas, service, and query layer list_assets_page accepts an opaque 'after' cursor and returns next_cursor when more pages are available. The query applies a keyset WHERE clause and a secondary ORDER BY id for deterministic tiebreak. Cursor sort field is validated against the request sort, and a last_access_time sort (OSS-only) falls back to offset/limit. Offset is ignored whenever a cursor is supplied. * feat(assets): wire cursor pagination through GET /api/assets handler Adds integration tests for: full cursor walk, invalid-cursor 400, sort/cursor mismatch 400, cursor-wins-over-offset, absent next_cursor when no more results, and pagination stability across deletes. * fix(assets): address cursor-review verified findings - Mint next_cursor on every cursor-supported sort, not only when 'after' was supplied. A first request (no 'after') previously returned next_cursor=None, leaving cursor mode unreachable from a clean start. - Over-fetch limit+1 so an exactly-full terminal page doesn't mint a spurious cursor pointing at a phantom next page. - Map crafted out-of-range microsecond cursors (OverflowError / OSError in datetime construction) to 400 INVALID_CURSOR instead of leaking 500. - Bump MAX_CURSOR_VALUE_LENGTH 256 -> 512 to match the AssetReference name column max; without this, a long-named asset minted a cursor the same server then refused on the next request. Cross-runtime byte identity with cloud is unaffected because no cloud cursor ever carries a value > 256 (cloud schema doesn't permit it). - Return None from _encode_next_cursor when the boundary row carries a NULL sort value (e.g. an Asset without size_bytes backfilled), instead of silently encoding 0 and mis-positioning the keyset. - Fix schemas_in.py comment so it matches actual handler behavior (last_access_time + 'after' raises 400, does not fall back). 
- Add AssetsApiError schema + 400 response to GET /api/assets in openapi.yaml so generated clients know the INVALID_CURSOR envelope. - Extend integration coverage: first-page mint, exact-multiple terminal page, cursor walks for created_at/updated_at/size sorts, datetime overflow surfaces as 400 not 500. - Add unit coverage for datetime overflow and 512-char round-trip. * feat(assets): bind cursor to sort order + Go-compat JSON escaping Address three needs-judgment items from the cursor-review judge synthesis: 1. Cursor wire format now includes an "o" key carrying the sort direction ("asc" / "desc") it was minted under. A request that replays the cursor with a flipped `order` parameter is rejected with 400 INVALID_CURSOR instead of silently walking the wrong direction. Legacy cursors without "o" still decode (the binding is best-effort until cloud mirrors the field — follow-up filed separately). 2. JSON serialization now escapes `<`, `>`, `&`, U+2028, U+2029 to mirror Go's default `json.Marshal` behavior. Without this, an asset name containing those characters produced different bytes on Python vs cloud Go. The escaped form is what both runtimes emit. 3. Add direct query-layer tests for the keyset tiebreaker — the secondary ORDER BY id branch was previously unexercised. Two scenarios: all rows share a primary sort value, and mixed ties straddle page boundaries. Both assert no row is dropped or duplicated across the walk. Wire-format note: Python cursors now differ from current cloud cursors by exactly the "o" key. Cloud follow-up will bring the two back into byte alignment. * fix(assets): address bot review comments - Soften offset param prose: it's not deprecated, just not preferred for sequential walks. Random-access UIs (jump-to-page, item count displays) legitimately still want offset, so dropping the 'deprecated' framing rather than promoting it to a machine-readable deprecated:true flag. 
- Add explicit HTTP status assertions before every json() / next_cursor read in test_list_cursor.py so a failing request surfaces as an HTTP error instead of a confusing KeyError on a 4xx/5xx body. * feat(assets): require cursor o field, drop legacy permissive path Cursor pagination hasn't shipped on either runtime yet — this PR is still draft and cloud's mirror is just behind it — so there are no legacy no-o cursors in the wild. Make o mandatory from day one rather than landing permissive and tightening later. decode_cursor now rejects any payload without o (or with a non-string o) as malformed. CursorPayload.order becomes a required str. Tests that constructed CursorPayload directly now pass order="desc"; test_legacy_cursor_without_order_accepted flips to test_cursor_without_order_rejected. * chore(assets): drop cross-repo prose from cursor comments Strip prose references to sibling Go implementations and external ticket IDs from cursor.py, the cursor tests, the keyset integration tests, asset_management's sort-field comment, and the legacy prompt_id alias comment. Pure docstring/comment scrub — no behavior or wire-format changes. x-runtime: [cloud] field annotations in openapi.yaml are unchanged; those are the spec's structural cross-runtime convention, not internal references. * test(assets): include 'o' in microsecond-boundary cursor payload The boundary test was building a cursor without the required `o` key, so decode failed on the missing-order branch before reaching the µs-overflow path the test is asserting. Both paths return 400 INVALID_CURSOR so the assertion passed for the wrong reason. Add `o` to the payload and matching `order=` to the request so the decode reaches the intended branch. 
* fix(assets): address ultrareview findings on cursor pagination Six fact-checked findings from the multi-model review pass: - Encoder/decoder length asymmetry: encode_cursor now rejects empty id, oversized id (>128), oversized value (>512), and invalid order tokens symmetrically with decode_cursor. Prevents the same server from minting a cursor it then 400s on the next request (e.g. a filesystem-scanned asset name >512 chars). The bad-order path now raises InvalidCursorError (still subclasses ValueError) so route-layer handling stays uniform. - Raw U+2028/U+2029 in cursor.py source: ripgrep treated those lines as line-terminators, confirming the bytes were the actual separators. Any editor save / autoformat / git tooling that normalizes invisibles would silently break the encoder. Replaced with explicit `\u2028` / `\u2029` Python escape sequences. - set(seen) == set(names) hid ordering regressions: a cursor walk that dropped a row at a page boundary or returned duplicates could pass. Reworked the assertion to (1) reject duplicates, (2) require full coverage, and (3) assert strict positional order for size sort, the only field with a clock-independent ordering. - Flaky time.sleep(0.05) between inserts: Windows CI clock resolution is ~15ms, so back-to-back inserts under load could collide and exercise the tiebreaker instead of the documented path. Removed the sleep and let the strengthened assertion above carry coverage / no-duplicates, with size sort carrying strict order. - Cursor error envelope diverged from the rest of routes.py: cursor 400s emitted {error: {code, message}} while every other 400 in the file emits {error: {code, message, details}} via _build_error_response. Switched to _build_error_response and added the details field to the AssetsApiError schema in openapi.yaml. - "Byte-identity fixtures" only checked substring containment, defeating the test class's stated purpose of pinning the wire format. 
Switched to exact-bytes equality against an inline expected payload string per fixture, so any whitespace / key-order / escape drift fails loudly. Also dropped Go / json.Marshal references from docstrings: the byte format is the contract, not the runtime that mints it. * fix(assets): cap cursors by encoded wire size, not just char count Char-count guards on value/id can still let multibyte or escape-heavy inputs blow past MAX_ENCODED_CURSOR_LENGTH once UTF-8 + escape expansion + base64url runs. A 512-character name of 'é' (2 bytes UTF-8) or '<' (serializes to the 6-byte '\u003c' escape) passes the char check, mints a ~1500-byte cursor, then 400s when handed back on the next request. Compute the final encoded form and reject it before returning if it exceeds the wire cap. Adds regression tests for both inflation paths. * refactor(assets): extract cursor JSON escaping helper; size wire cap above per-field caps Addresses review feedback on cursor.py: - Extract the inline escape chain into _apply_wire_compatible_json_escapes() with a comment pinning it to the wire format's escape set, so the parity intent is explicit rather than reading as an ad-hoc transform. - Raise MAX_ENCODED_CURSOR_LENGTH to 8192 (comfortably above the ~5.2KB worst-case the per-field caps can produce) and drop the mint-time length guard. Encoder/decoder symmetry now holds by construction: the encoder can't produce a cursor the decode path rejects, so there is no confusing user-visible 'cursor too long' failure at mint time. - Rewrite the two over-wire-cap tests to assert worst-case multibyte and escape-heavy values mint and round-trip, instead of being rejected. * refactor(assets): drop cross-runtime cursor escaping; cursors are opaque The custom JSON escaping of <, >, &, U+2028, and U+2029 existed only to keep the encoded cursor byte-identical with the Cloud implementation of the same payload format. 
Cursors are opaque tokens, so byte-level compatibility across implementations is not needed — plain json.dumps output is sufficient. Remove the escaping helper and the byte-identity test fixtures that pinned the wire format; keep round-trip coverage for the affected characters. --------- Co-authored-by: guill <jacob.e.segal@gmail.com> * fix(assets): remove unused delete_content param from deleteAsset (#14241) * fix(assets): remove unused delete_content param from deleteAsset The delete_content query param on DELETE /api/assets/{id} was introduced in #12125 and had its default flipped to false in #12621. In practice no client sends it: the frontend issues a bare DELETE /assets/{id}, so every real caller already gets the default soft-delete (the reference is hidden, content preserved). The only thing that set delete_content=true was this repo's own test teardown. Remove the param from the route and the OpenAPI spec so the contract matches what clients actually use (and lines up with the cloud surface). The route now always soft-deletes. The underlying delete_asset_reference helper keeps its delete_content_if_orphan option, so orphan reclamation remains available internally for a future GC path — it's just no longer exposed on the public endpoint. Tests that used delete_content=true for hard cleanup now soft-delete; test_delete_upon_reference_count asserts content preservation instead of orphan removal. * test/docs: address review on deleteAsset delete_content removal - Rename test_delete_upon_reference_count -> test_soft_delete_preserves_asset_identity_across_references; the old name implied last-ref cleanup, but it now verifies the opposite (soft delete preserves identity across references). - Strengthen the re-association assertion: also check asset_hash == src_hash so it proves content reuse rather than relying on the now-tautological created_new is False. 
- Document delete_asset_reference: the orphan-reclamation branch is intentionally internal-only; the public endpoint always soft-deletes. - Normalize the soft-delete comment phrasing. * test(assets): make seed content unique per test for isolation Removing the delete_content param means delete is always a soft delete, so content created by one test now survives into the next. The suite had been relying on hard-delete teardown for isolation, so shared fixed-content fixtures started colliding: seeded_asset (b"A" * 4096) and make_asset_bytes (deterministic on name) produced the same hash every test, so the second seed deduped to the surviving asset and returned 200 instead of 201, cascading into ~14 failures/errors. Salt both fixtures with a per-test uuid so each test creates fresh content (created_new True, 201), while keeping content deterministic within a test (same name/size -> same bytes) and preserving exact byte length so size-based list/sort assertions are unaffected. * main: force cudnn.benchmark to false (#14390) Some custom nodes try to set this true globally. It messes with dynamic VRAM with one-off spikes that can OOM but this is also very high risk for windows where such allocations might get serviced by shared memory fallback. Trump it. * feat(assets): add job_ids filter to GET /api/assets (#13998) * feat(assets): add job_ids filter to GET /api/assets Mirrors the existing cloud `job_ids` query param on the local Python server: clients can pass a comma-separated list (or repeated query params) of UUIDs to filter assets by their associated job. The `AssetReference.job_id` column already exists, so no migration is needed; this just plumbs the filter through schema → service → query. Marks the parameter as available in both runtimes by dropping the `[cloud-only]` description prefix and the `x-runtime: [cloud]` tag from the OpenAPI spec, per the OSS field-drift convention (absent runtime tag = populated by both local and cloud). 
* fix(assets): tighten job_ids — array schema, max_length, narrow except From cursor-reviews on the parent commit: - OpenAPI: declare job_ids as `type: array, items: string format: uuid` with `style: form, explode: true` so it matches the documented contract (and matches sibling include_tags/exclude_tags shape). Description now states both accepted shapes explicitly. - Schema: cap `job_ids` at 500 entries (max_length on the Pydantic field) so a client can't splice an unbounded list into the IN clauses. - Schema: drop `AttributeError` from the except — `raw` only contains `str` items by construction, so `uuid.UUID(<str>)` raises `ValueError` exclusively; the second clause was dead code. * fix(assets): tighten job_ids validator + add schema-level tests Aligns with the parallel hardening from draft PR #13848 (now closed as a duplicate). The validator now: - Raises ValueError on non-string list items (was: silently dropped). - Raises ValueError on non-string / non-list top-level values like dict or int (was: silently passed through to Pydantic's downstream coercion). Adds tests-unit/assets_test/queries/test_list_assets_query.py covering the validator end-to-end: CSV canonicalization, dedup order, default empty, invalid UUID, non-string list item, non-string non-list value, and the max_length=500 boundary. * feat(prompt): enforce canonical UUID prompt_id at job creation POST /prompt previously accepted any client-supplied prompt_id verbatim, str()-coercing even non-strings, and minting the literal job id "None" for an explicit JSON null. The new GET /api/assets job_ids filter matches stored job ids as canonical UUIDs exactly, so a non-UUID id minted a job whose assets could never be filtered. 
- validate_job_id (comfy_execution/jobs.py): requires a string in the canonical lowercase hyphenated UUID form; raises ValueError otherwise, including parseable-but-non-canonical spellings (uppercase, braced, URN, bare hex), which would otherwise be silently rewritten and then miss every exact-match lookup downstream (history keys, websocket correlation, /interrupt, the assets job_ids filter). - POST /prompt: absent or null prompt_id means the server mints uuid4; invalid means 400 invalid_prompt_id on the standard error envelope. - openapi.yaml: document the request-side prompt_id (format uuid, nullable) on PromptRequest. - tests: unit matrix for validate_job_id; integration tests against the booted server covering rejection, acceptance, and null handling. --------- Co-authored-by: guill <jacob.e.segal@gmail.com> * feat(assets): include asset id in executed WebSocket message (#13862) * feat(assets): enrich executed WS message with asset metadata When --enable-assets is set, each file-type output entry in the `executed` WebSocket message now includes id, name, asset_hash, size, and mime_type — matching the shape already returned by /upload/image. The enrichment lives in comfy_execution/asset_enrichment.py (no torch dependency) and is called from both send sites in execution.py: freshly executed nodes register the file inline via register_file_in_place; cached node re-sends look up the existing AssetReference by file path to avoid re-hashing. Errors are caught per-entry so a failure never blocks the WS message from sending. * fix(assets): inject only id in executed WS message per Asset Identity RFC Per the Asset Identity RFC, the executed WebSocket payload should carry id alone — hash is already encoded in the filename, and name/preview_url/ size belong behind GET /api/assets/{id} rather than being pushed eagerly. Simplifies the DB lookup path: we only need ref.id, so the asset.hash null-check is no longer required as a fallback trigger. 
* fix(assets): reject path traversal when resolving output abs_path Subfolder/filename were joined and absolutized without containment check, so '..' segments or an absolute filename could escape the type's base directory and register an unrelated on-disk file as an asset. Add commonpath-based containment check; skip enrichment (warn, leave entry unchanged) when the resolved path escapes base. Catches ValueError from cross-drive paths on Windows. * docs(assets): drop Asset Identity RFC reference from docstring * docs(assets): trim docstring to what enrichment does, not what it doesn't * test(assets): use real platform paths so containment check works on Windows The previous test setup patched os.path.abspath to identity and used a POSIX-style '/output' base, which collided with Windows path separators in os.path.commonpath. Drop the abspath/join patches and use a real tempdir-rooted base so the containment check runs against actual platform paths. * refactor(assets): enrich at output-processing time, not in the WS send path Per review: enrichment lived inside the client_id-guarded send sites, so a headless run (no websocket client) never registered assets at all, and ui_outputs/history stored the un-enriched entries. Now output_ui is enriched once, right after the node produces it and before it is stored in ui_outputs — so registration happens regardless of connected clients, and the asset id flows into history and the execution cache for free. _send_cached_ui re-sends the stored (already-enriched) dict verbatim, which lets the DB-lookup-by-path fallback be deleted: every enrichment is now a fresh output, and register_file_in_place re-hashes on upsert so an overwritten path can never carry a stale id. * revert(assets): drop job_ids filter from GET /api/assets (#14408) The job_ids query filter added in #13998 has no live consumer: the frontend Generated tab kept sourcing from GET /jobs, and the cloud side removed its equivalent filter from the shared asset spec. 
Carrying it on the local server only re-introduces Core<->Cloud drift on the shared contract, so remove it to match. Removed: the job_ids field + validator on ListAssetsQuery, the IN(...) clauses in list_references_page, the service/route passthrough, and the filter-only tests. Kept: the canonical-UUID prompt_id enforcement at job creation (also landed in #13998). It stands on its own -- job ids are matched verbatim by history keys, websocket correlation, and /interrupt -- and cloud inherits it by running core for execution, so no divergence is created. * chore(openapi): sync shared API contract from cloud@e3c52ad (#14406) * I don't think this actually works anymore. (#14403) * ops: tolerate already force casted dynamic weight (#14410) Some custom nodes .to weights completely out of load context which can wreak havoc if it's for a model that is not active. Detect this condition and just let it fall-through to the non-dynamic loader straight up. --------- Signed-off-by: bigcat88 <bigcat88@icloud.com> Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com> Co-authored-by: Daxiong (Lin) <contact@comfyui-wiki.com> Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com> Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com> Co-authored-by: Alexis Rolland <alexisrolland@hotmail.com> Co-authored-by: Jukka Seppänen <40791699+kijai@users.noreply.github.com> Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com> Co-authored-by: Terry Jia <terryjia88@gmail.com> Co-authored-by: John Pollock <pollockjj@users.noreply.github.com> Co-authored-by: Silver <65376327+silveroxides@users.noreply.github.com> Co-authored-by: Matt Miller <mattmiller@comfy.org> Co-authored-by: guill <jacob.e.segal@gmail.com> Co-authored-by: kelseyee <971704395@qq.com> Co-authored-by: Kohaku-Blueleaf <59680068+KohakuBlueleaf@users.noreply.github.com> Co-authored-by: Talmaj <Talmaj@users.noreply.github.com>	2026-06-11 12:04:28 +08:00
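The cursor commits in the entry above describe an opaque keyset cursor: base64url of a JSON object with the keys `s` (sort field), `v` (sort value), `id`, and a required `o` (sort order), with per-field length caps checked on both the encode and decode side. The following is a minimal sketch of that idea, not the repository's code. The names `encode_cursor`, `decode_cursor`, and `InvalidCursorError` and the 512/128 caps come from the commit message; the function bodies, the `_check` helper, and the restriction to string values are assumptions made for illustration.

```python
import base64
import json

MAX_VALUE_LENGTH = 512  # value cap quoted in the commit message
MAX_ID_LENGTH = 128     # id cap quoted in the commit message
VALID_ORDERS = ("asc", "desc")


class InvalidCursorError(ValueError):
    """Malformed cursor; a route handler would map this to HTTP 400."""


def _check(sort, value, row_id, order):
    # Hypothetical shared validator, so encode and decode apply the same rules.
    if not isinstance(sort, str) or not sort:
        raise InvalidCursorError("sort field must be a non-empty string")
    if not isinstance(value, str) or len(value) > MAX_VALUE_LENGTH:
        raise InvalidCursorError("value must be a string within the length cap")
    if not isinstance(row_id, str) or not row_id or len(row_id) > MAX_ID_LENGTH:
        raise InvalidCursorError("id must be a non-empty string within the cap")
    if order not in VALID_ORDERS:
        raise InvalidCursorError("order must be 'asc' or 'desc'")


def encode_cursor(sort, value, row_id, order):
    """Return an opaque token for the last row of a page."""
    _check(sort, value, row_id, order)
    raw = json.dumps({"s": sort, "v": value, "id": row_id, "o": order})
    token = base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")
    return token.rstrip("=")  # padding is restored on decode


def decode_cursor(token):
    """Decode and validate a token; reject anything encode would not emit."""
    try:
        padded = token + "=" * (-len(token) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded.encode("ascii")))
    except (ValueError, TypeError, AttributeError) as exc:
        raise InvalidCursorError("cursor is not base64url-encoded JSON") from exc
    if not isinstance(payload, dict):
        raise InvalidCursorError("cursor payload must be a JSON object")
    _check(payload.get("s"), payload.get("v"), payload.get("id"), payload.get("o"))
    return payload
```

Routing both directions through one validator is what gives the encoder/decoder symmetry the commit message asks for: the encoder cannot mint a token that the decoder would later reject.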
comfyanonymous	da49b7d0b6	Remove useless annotations imports. (#14105 )	2026-05-25 19:23:29 -07:00
rattus	5aa5ccc9e0	Multi-threaded load of models from disk (big load time speedups & Offload to disk) (CORE-43,CORE-152,CORE-164,CORE-165,CORE-117) (#13802 ) * model_management: disable non-dynamic smart memory Disable smart memory outright for non dynamic models. This is a minor step towards deprecation of --disable-dynamic-vram and the legacy ModelPatcher. This is needed for estimate-free model development, where new models can opt-out of supplying a memory estimate and not have to worry about hard VRAM allocations due to legacy non-dynamic model patchers This is also a general stability increase for a lot of stray use cases where estimates may still be off and going forward we are not going to accurately maintain such estimates. * pinned_memory: implement with aimdo growable buffer Use a single growable buffer so we can do threaded pre-warming on pinned memory. * mm: use aimdo to do transfer from disk to pin Aimdo implements a faster threaded loader. * Add stream host pin buffer for AIMDO casts Introduce per-offload-stream HostBuffer reuse for pinned staging, include it in cast buffer reset synchronization. Defer actual casts that go via this pin path to a separate pass such that the buffer can be allocated monolithically (to avoid cudaHostRegister thrash). * remove old pin path * Implement JIT pinned memory pressure Replace the predictive pin pressure mechanism with JIT PIN memory pressure. * LowVRAMPatch: change to two-phase visit * lora: re-implement as inplace swiss-army-knife operation * prepare for multiple pin sets * implement pinned loras * requirements: comfy-aimdo 0.4.0 * ops: remove unused arg This was defeatured in aimdo iteration * ops: sync the CPU with only the offload stream activity This was syncing with the offload stream which itself is synced with the compute stream, so this was syncing CPU with compute transitively. Define the event to sync it more gently. 
* pins: implement freeing intermediate for pinned memory Pinning is more important than inactive intermediates and the stream pin buffer is more important than even active intermediates. * execution: implement pin eviction on RAM pressure Add back proper pin freeing on RAM pressure * implement pin registration swaps Uncap the windows pins from 50% by extending the pool and have a pressure mechanism to move the pin reservations on demand. This unfortunately implies a GPU sync to do the freeing so significant hysteresis needs to be added to consolidate these pressure events. * cli_args/execution: Implement lower background cache-ram threshold Limit the amount of RAM background intermediates can use, so that switching workflows doesn't degrade performance too much. * make default * bump aimdo * model-patcher: force-cast tiny weights Flux 2 gets crazy stalls due to a mix of tiny and giant weights creating lopsided stream buffer rotations which creates stalls. * ops: refactor in prep for chunking * mm: delegate pin-on-the-way to aimdo Aimdo is able to chunk and slice this on the way for better CPU->GPU overlap. The main advantage is the ability to shorten the bus contention window between previous weight transfer and the next weights vbar fault. * bump aimdo * pinning updates * specify hostbuf max allocation size There are signs of virtual memory exhaustion on some linux systems when throwing 128GB for every little piece. Pass the actual to save aimdo from over-estimates * tests: update execution tests for caching The default caching changed to ram-cache so update these tests accordingly. Remove the LRU 0 test as this also falls through to RAM cache.	2026-05-20 17:03:58 -07:00
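The entry above moves disk-to-host transfers onto a threaded loader that fills a single preallocated buffer. The commit message does not show that loader's API, so the sketch below uses only the Python standard library to illustrate the general pattern: split a file into chunks and let a thread pool read each chunk directly into its slice of one buffer. The function name `load_file_threaded`, the chunk size, and the worker count are all hypothetical, and a plain `bytearray` stands in for pinned host memory.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def load_file_threaded(path, chunk_size=1 << 20, workers=4):
    """Read a whole file into one preallocated buffer using several threads."""
    size = os.path.getsize(path)
    buf = bytearray(size)   # stand-in for a preallocated (pinned) host buffer
    view = memoryview(buf)  # lets each task write into its own slice, copy-free

    def read_chunk(offset):
        end = min(offset + chunk_size, size)
        # Each task opens its own handle so concurrent seeks cannot race.
        with open(path, "rb") as f:
            f.seek(offset)
            got = f.readinto(view[offset:end])
        if got != end - offset:
            raise IOError(f"short read at offset {offset}")
        return got

    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_chunk, range(0, size, chunk_size)))
    if total != size:
        raise IOError("file size changed during load")
    return buf
```

Because every task writes to a disjoint slice of the same buffer, no locking is needed and the result ends up contiguous, which is the property a later single host-to-device copy would rely on.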
comfyanonymous	0a7d2ffd68	Support anima TE lora kohya format. (#13847)	2026-05-11 20:01:52 -07:00
rattus	783782d5d7	Implement block prefetch + Lora Async load + and adopt in LTX (Speedup!) (CORE-111) (#13618) * mm: Use Aimdo raw allocator for cast buffers PyTorch manages allocation of growing buffers on streams poorly. PyTorch has no Windows support for the expandable segments allocator (which is the right tool for this job), while also segmenting the memory by stream such that it can be generally re-used. So kick the problem to aimdo which can just grow a virtual region that's freed per stream. * plan * ops: move cpu handler up to the caller * ops: split up prefetch from weight prep block prefetching API Split up the casting and weight formatting/lora stuff in prep for arbitrary prefetch support. * ops: implement block prefetching API allow a model to construct a prefetch list and operate it for increased async offload. * ltxv2: Implement block prefetching * Implement lora async offload Implement async offload of loras.	2026-05-02 19:23:24 -04:00
Rainer	e9c311b245	OneTainer ERNIE LoRA support (#13640)	2026-04-30 19:33:41 -04:00
Jukka Seppänen	06f85e2c79	Fix text encoder lora loading for wrapped models (#12852)	2026-03-09 16:08:51 -04:00
fappaz	b233dbe0bc	feat(ace-step): add ACE-Step 1.5 lycoris key alias mapping for LoKR #12638 (#12665)	2026-02-26 18:19:19 -05:00
rattus	c0370044cd	MPDynamic: force load flux img_in weight (Fixes flux1 canny+depth lora crash) (#12446) * lora: add weight shape calculations. This lets the loader know if a lora will change the shape of a weight so it can take appropriate action. * MPDynamic: force load flux img_in weight This weight is a bit special, in that the lora changes its geometry. This is rather unique, not handled by existing estimate and doesn't work for either offloading or dynamic_vram. Fix for dynamic_vram as a special case. Ideally we can fully precalculate these lora geometry changes at load time, but just get these models working first.	2026-02-15 20:30:09 -05:00
comfyanonymous	ab1050bec3	Support ace step 1.5 base model loras. (#12252)	2026-02-03 13:54:23 -05:00
ComfyUI Wiki	e89b22993a	Support ModelScope-Trainer/DiffSynth LoRA format for Flux.2 Klein models (#12042)	2026-01-23 15:27:49 -05:00
kelseyee	a3b5d4996a	Support ModelScope-Trainer DiffSynth lora for Z Image. (#11805)	2026-01-12 15:38:46 -05:00
dxqb	8e889c535d	Support "transformer." LoRA prefix for Z-Image (#11135)	2025-12-08 15:17:26 -05:00
Jukka Seppänen	fd109325db	Kandinsky5 model support (#10988) * Add Kandinsky5 model support lite and pro T2V tested to work * Update kandinsky5.py * Fix fp8 * Fix fp8_scaled text encoder * Add transformer_options for attention * Code cleanup, optimizations, use fp32 for all layers originally at fp32 * ImageToVideo -node * Fix I2V, add necessary latent post process nodes * Support text to image model * Support block replace patches (SLG mostly) * Support official LoRAs * Don't scale RoPE for lite model as that just doesn't work... * Update supported_models.py * Revert RoPE scaling to simpler one * Fix typo * Handle latent dim difference for image model in the VAE instead * Add node to use different prompts for clip_l and qwen25_7b * Reduce peak VRAM usage a bit * Further reduce peak VRAM consumption by chunking ffn * Update chunking * Update memory_usage_factor * Code cleanup, don't force the fp32 layers as it has minimal effect * Allow for stronger changes with first frames normalization Default values are too weak for any meaningful changes, these should probably be exposed as advanced node options when that's available. * Add image model's own chat template, remove unused image2video template * Remove hard error in ReplaceVideoLatentFrames -node * Update kandinsky5.py * Update supported_models.py * Fix typos in prompt template They were now fixed in the original repository as well * Update ReplaceVideoLatentFrames Add tooltips Make source optional Better handle negative index * Rename NormalizeVideoLatentFrames -node For a bit better clarity on what it does * Fix NormalizeVideoLatentStart node out on non-op	2025-12-05 22:20:22 -05:00
comfyanonymous	5151cff293	Add some missing z image lora layers. (#10980)	2025-11-28 23:55:00 -05:00
comfyanonymous	52a32e2b32	Support some z image lora formats. (#10978)	2025-11-28 21:12:42 -05:00
comfyanonymous	47a9cde5d3	Support the omnigen2 umo lora. (#9886)	2025-09-15 18:10:55 -04:00
comfyanonymous	27e067ce50	Implement the USO subject identity lora. (#9674) Use the lora with FluxContextMultiReferenceLatentMethod node set to "uso" and a ReferenceLatent node with the reference image.	2025-09-01 18:54:02 -04:00
PsychoLogicAu	2208aa616d	Support SimpleTuner lycoris lora for Qwen-Image (#9280)	2025-08-11 16:56:16 -04:00
flybirdxx	4c3e57b0ae	Fixed an issue where qwenLora could not be loaded properly. (#9208)	2025-08-06 13:23:11 -04:00
comfyanonymous	1c1687ab1c	Support HiDream SimpleTuner loras. (#8318)	2025-05-28 18:47:15 -04:00
comfyanonymous	481732a0ed	Support official ACE Step loras. (#8094)	2025-05-13 07:32:16 -04:00
comfyanonymous	08ff5fa08a	Cleanup chroma PR.	2025-04-30 20:57:30 -04:00
Silver	4ca3d84277	Support for Chroma - Flux1 Schnell distilled with CFG (#7355) * Upload files for Chroma Implementation * Remove trailing whitespace * trim more trailing whitespace..oops * remove unused imports * Add supported_inference_dtypes * Set min_length to 0 and remove attention_mask=True * Set min_length to 1 * get_mdulations added from blepping and minor changes * Add lora conversion if statement in lora.py * Update supported_models.py * update model_base.py * add upstream commits * set modelType.FLOW, will cause beta scheduler to work properly * Adjust memory usage factor and remove unnecessary code * fix mistake * reduce code duplication * remove unused imports * refactor for upstream sync * sync chroma-support with upstream via syncbranch patch * Update sd.py * Add Chroma as option for the OptimalStepsScheduler node	2025-04-30 20:57:00 -04:00
comfyanonymous	f935d42d8e	Support SimpleTuner lycoris lora format for HiDream.	2025-04-25 03:11:14 -04:00
Kohaku-Blueleaf	1f3fba2af5	Unified Weight Adapter system for better maintainability and future feature of Lora system (#7540)	2025-04-21 20:15:32 -04:00
comfyanonymous	9e1d301129	Only use stable cascade lora format with cascade model.	2025-02-01 06:35:22 -05:00
City	bddb02660c	Add PixArt model support (#6055) * PixArt initial version * PixArt Diffusers convert logic * pos_emb and interpolation logic * Reduce duplicate code * Formatting * Use optimized attention * Edit empty token logic * Basic PixArt LoRA support * Fix aspect ratio logic * PixArtAlpha text encode with conds * Use same detection key logic for PixArt diffusers	2024-12-20 15:25:00 -05:00
comfyanonymous	ff2ff02168	Support old diffusion-pipe hunyuan video loras.	2024-12-18 06:23:54 -05:00
comfyanonymous	e4e1bff605	Support diffusion-pipe hunyuan video lora format.	2024-12-17 07:14:21 -05:00
Jedrzej Kosinski	0ee322ec5f	ModelPatcher Overhaul and Hook Support (#5583 ) * Added hook_patches to ModelPatcher for weights (model) * Initial changes to calc_cond_batch to eventually support hook_patches * Added current_patcher property to BaseModel * Consolidated add_hook_patches_as_diffs into add_hook_patches func, fixed fp8 support for model-as-lora feature * Added call to initialize_timesteps on hooks in process_conds func, and added call prepare current keyframe on hooks in calc_cond_batch * Added default_conds support in calc_cond_batch func * Added initial set of hook-related nodes, added code to register hooks for loras/model-as-loras, small renaming/refactoring * Made CLIP work with hook patches * Added initial hook scheduling nodes, small renaming/refactoring * Fixed MaxSpeed and default conds implementations * Added support for adding weight hooks that aren't registered on the ModelPatcher at sampling time * Made Set Clip Hooks node work with hooks from Create Hook nodes, began work on better Create Hook Model As LoRA node * Initial work on adding 'model_as_lora' lora type to calculate_weight * Continued work on simpler Create Hook Model As LoRA node, started to implement ModelPatcher callbacks, attachments, and additional_models * Fix incorrect ref to create_hook_patches_clone after moving function * Added injections support to ModelPatcher + necessary bookkeeping, added additional_models support in ModelPatcher, conds, and hooks * Added wrappers to ModelPatcher to facilitate standardized function wrapping * Started scaffolding for other hook types, refactored get_hooks_from_cond to organize hooks by type * Fix skip_until_exit logic bug breaking injection after first run of model * Updated clone_has_same_weights function to account for new ModelPatcher properties, improved AutoPatcherEjector usage in partially_load * Added WrapperExecutor for non-classbound functions, added calc_cond_batch wrappers * Refactored callbacks+wrappers to allow storing lists 
by id * Added forward_timestep_embed_patch type, added helper functions on ModelPatcher for emb_patch and forward_timestep_embed_patch, added helper functions for removing callbacks/wrappers/additional_models by key, added custom_should_register prop to hooks * Added get_attachment func on ModelPatcher * Implement basic MemoryCounter system for determining whether cached weights due to hooks should be offloaded in hooks_backup * Modified ControlNet/T2IAdapter get_control function to receive transformer_options as additional parameter, made the model_options stored in extra_args in inner_sample be a clone of the original model_options instead of same ref * Added create_model_options_clone func, modified type annotations to use __future__ so that I can use the better type annotations * Refactored WrapperExecutor code to remove need for WrapperClassExecutor (now gone), added sampler.sample wrapper (pending review, will likely keep but will see what hacks this could currently let me get rid of in ACN/ADE) * Added Combine versions of Cond/Cond Pair Set Props nodes, renamed Pair Cond to Cond Pair, fixed default conds never applying hooks (due to hooks key typo) * Renamed Create Hook Model As LoRA nodes to make the test node the main one (more changes pending) * Added uuid to conds in CFGGuider and uuids to transformer_options to allow uniquely identifying conds in batches during sampling * Fixed models not being unloaded properly due to current_patcher reference; the current ComfyUI model cleanup code requires that nothing else has a reference to the ModelPatcher instances * Fixed default conds not respecting hook keyframes, made keyframes not reset cache when strength is unchanged, fixed Cond Set Default Combine throwing error, fixed model-as-lora throwing error during calculate_weight after a recent ComfyUI update, small refactoring/scaffolding changes for hooks * Changed CreateHookModelAsLoraTest to be the new CreateHookModelAsLora, rename old ones as 'direct' and will be 
removed prior to merge * Added initial support within CLIP Text Encode (Prompt) node for scheduling weight hook CLIP strength via clip_start_percent/clip_end_percent on conds, added schedule_clip toggle to Set CLIP Hooks node, small cleanup/fixes * Fix range check in get_hooks_for_clip_schedule so that proper keyframes get assigned to corresponding ranges * Optimized CLIP hook scheduling to treat same strength as same keyframe * Less fragile memory management. * Make encode_from_tokens_scheduled call cleaner, rollback change in model_patcher.py for hook_patches_backup dict * Fix issue. * Remove useless function. * Prevent and detect some types of memory leaks. * Run garbage collector when switching workflow if needed. * Moved WrappersMP/CallbacksMP/WrapperExecutor to patcher_extension.py * Refactored code to store wrappers and callbacks in transformer_options, added apply_model and diffusion_model.forward wrappers * Fix issue. * Refactored hooks in calc_cond_batch to be part of get_area_and_mult tuple, added extra_hooks to ControlBase to allow custom controlnets w/ hooks, small cleanup and renaming * Fixed inconsistency of results when schedule_clip is set to False, small renaming/typo fixing, added initial support for ControlNet extra_hooks to work in tandem with normal cond hooks, initial work on calc_cond_batch merging all subdicts in returned transformer_options * Modified callbacks and wrappers so that unregistered types can be used, allowing custom_nodes to have their own unique callbacks/wrappers if desired * Updated different hook types to reflect actual progress of implementation, initial scaffolding for working WrapperHook functionality * Fixed existing weight hook_patches (pre-registered) not working properly for CLIP * Removed Register/Direct hook nodes since they were present only for testing, removed diff-related weight hook calculation as improved_memory removes unload_model_clones and using sample time registered hooks is less hacky * Added clip 
scheduling support to all other native ComfyUI text encoding nodes (sdxl, flux, hunyuan, sd3) * Made WrapperHook functional, added another wrapper/callback getter, added ON_DETACH callback to ModelPatcher * Made opt_hooks append by default instead of replace, renamed comfy.hooks set functions to be more accurate * Added apply_to_conds to Set CLIP Hooks, modified relevant code to allow text encoding to automatically apply hooks to output conds when apply_to_conds is set to True * Fix cached_hook_patches not respecting target_device/memory_counter results * Fixed issue with setting weights from hooks instead of copying them, added additional memory_counter check when caching hook patches * Remove unnecessary torch.no_grad calls for hook patches * Increased MemoryCounter minimum memory to leave free by 2 until a better way to get inference memory estimate of currently loaded models exists For encode_from_tokens_scheduled, allow start_percent and end_percent in add_dict to limit which scheduled conds get encoded for optimization purposes * Removed a .to call on results of calculate_weight in patch_hook_weight_to_device that was screwing up the intermediate results for fp8 prior to being passed into stochastic_rounding call * Made encode_from_tokens_scheduled work when no hooks are set on patcher * Small cleanup of comments * Turn off hook patch caching when only 1 hook present in sampling, replace some current_hook = None with calls to self.patch_hooks(None) instead to avoid a potential edge case * On Cond/Cond Pair nodes, removed opt_ prefix from optional inputs * Allow both FLOATS and FLOAT for floats_strength input * Revert change, does not work * Made patch_hook_weight_to_device respect set_func and convert_func * Make discard_model_sampling True by default * Add changes manually from 'master' so merge conflict resolution goes more smoothly * Cleaned up text encode nodes with just a single clip.encode_from_tokens_scheduled call * Make sure 
encode_from_tokens_scheduled will respect use_clip_schedule on clip * Made nodes in nodes_hooks be marked as experimental (beta) * Add get_nested_additional_models for cases where additional_models could have their own additional_models, and add robustness for circular additional_models references * Made finalize_default_conds area math consistent with other sampling code * Changed 'opt_hooks' input of Cond/Cond Pair Set Default Combine nodes to 'hooks' * Remove a couple old TODO's and a no longer necessary workaround	2024-12-02 14:51:02 -05:00
comfyanonymous	15c39ea757	Support for the official mochi lora format.	2024-11-26 03:34:36 -05:00
comfyanonymous	41444b5236	Add some new weight patching functionality. Add a way to reshape lora weights. Allow weight patches to all weight not just .weight and .bias Add a way for a lora to set a weight to a specific value.	2024-11-21 07:19:17 -05:00
PsychoLogicAu	af8cf79a2d	support SimpleTuner lycoris lora for SD3 (#5340)	2024-10-24 01:18:32 -04:00
comfyanonymous	f9f9faface	Fixed model merging issue with scaled fp8.	2024-10-20 06:24:31 -04:00
comfyanonymous	203942c8b2	Fix flux doras with diffusers keys.	2024-10-08 19:03:40 -04:00
comfyanonymous	b4626ab93e	Add simpletuner lycoris format for SD unet.	2024-09-30 06:03:27 -04:00
comfyanonymous	70a708d726	Fix model merging issue.	2024-09-20 02:31:44 -04:00
comfyanonymous	9c5fca75f4	Fix lora issue.	2024-09-08 10:10:47 -04:00
comfyanonymous	32a60a7bac	Support onetrainer text encoder Flux lora.	2024-09-08 09:31:41 -04:00
comfyanonymous	ea77750759	Support a generic Comfy format for text encoder loras. This is a format with keys like: text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.lora_up.weight Instead of waiting for me to add support for specific lora formats you can convert your text encoder loras to this format instead. If you want to see an example save a text encoder lora with the SaveLora node with the commit right after this one.	2024-09-07 02:20:39 -04:00
comfyanonymous	483004dd1d	Support newer glora format.	2024-09-03 17:02:19 -04:00
comfyanonymous	d043997d30	Flux onetrainer lora.	2024-09-02 08:22:15 -04:00
comfyanonymous	6eb5d64522	Fix glora lowvram issue.	2024-08-29 19:07:23 -04:00
Chenlei Hu	6bbdcd28ae	Support weight padding on diff weight patch (#4576)	2024-08-27 13:55:37 -04:00
comfyanonymous	7df42b9a23	Fix dora.	2024-08-23 04:58:59 -04:00
comfyanonymous	c0b0da264b	Missing imports.	2024-08-22 17:20:51 -04:00
comfyanonymous	c26ca27207	Move calculate function to comfy.lora	2024-08-22 17:12:00 -04:00
comfyanonymous	ea63b1c092	Simpletrainer lycoris format.	2024-08-20 12:05:13 -04:00
comfyanonymous	a9f04edc58	Implement text encoder part of HunyuanDiT loras.	2024-08-09 03:21:10 -04:00

70 Commits