ComfyUI/app/assets/database/queries/common.py
Luke Mino-Altherr 2bd4d82b4f
feat(assets): align local API with cloud spec (#12863)
* feat(assets): align local API with cloud spec

Unify response models, add missing fields, and align input schemas with
the cloud OpenAPI spec at cloud.comfy.org/openapi.

- Replace AssetSummary/AssetDetail/AssetUpdated with single Asset model
- Add is_immutable, metadata (system_metadata), prompt_id fields
- Support mime_type and preview_id in update endpoint
- Make CreateFromHashBody.name optional, add mime_type, require >=1 tag
- Add id/mime_type/preview_id to upload, relax tags to optional
- Rename total_tags → tags in tag add/remove responses
- Add GET /api/assets/tags/refine histogram endpoint
- Add DB migration for system_metadata and prompt_id columns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix review issues: tags validation, size nullability, type annotation, hash mismatch check, and add tag histogram tests

- Remove contradictory min_length=1 from CreateFromHashBody.tags default
- Restore size field to int|None=None for proper null semantics
- Add Union type annotation to _build_asset_response result param
- Add hash mismatch validation on idempotent upload path (409 HASH_MISMATCH)
- Add unit tests for list_tag_histogram service function

Amp-Thread-ID: https://ampcode.com/threads/T-019cd993-f43c-704e-b3d7-6cfc3d4d4a80
Co-authored-by: Amp <amp@ampcode.com>

* Add preview_url to /assets API response using /api/view endpoint

For input and output assets, generate a preview_url pointing to the
existing /api/view endpoint using the asset's filename and tag-derived
type (input/output). Handles subdirectories via subfolder param and
URL-encodes filenames with spaces, unicode, and special characters.

This aligns the OSS backend response with the frontend AssetCard
expectation for thumbnail rendering.

Amp-Thread-ID: https://ampcode.com/threads/T-019cda3f-5c2c-751a-a906-ac6c9153ac5c
Co-authored-by: Amp <amp@ampcode.com>

* chore: remove unused imports from asset_reference queries

Amp-Thread-ID: https://ampcode.com/threads/T-019cda7d-cb21-77b4-a51b-b965af60208c
Co-authored-by: Amp <amp@ampcode.com>

* feat: resolve blake3 hashes in /view endpoint via asset database

Amp-Thread-ID: https://ampcode.com/threads/T-019cda7d-cb21-77b4-a51b-b965af60208c
Co-authored-by: Amp <amp@ampcode.com>

* Register uploaded images in asset database when --enable-assets is set

Add register_file_in_place() service function to ingest module for
registering already-saved files without moving them. Call it from the
/upload/image endpoint to return asset metadata in the response.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce023-3384-7560-bacf-de40b0de0dd2
Co-authored-by: Amp <amp@ampcode.com>

* Exclude None fields from asset API JSON responses

Add exclude_none=True to model_dump() calls across asset routes to
keep response payloads clean by omitting unset optional fields.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce023-3384-7560-bacf-de40b0de0dd2
Co-authored-by: Amp <amp@ampcode.com>

* Add comment explaining why /view resolves blake3 hashes

Amp-Thread-ID: https://ampcode.com/threads/T-019ce023-3384-7560-bacf-de40b0de0dd2
Co-authored-by: Amp <amp@ampcode.com>

* Move blake3 hash resolution to asset_management service

Extract resolve_hash_to_path() into asset_management.py and remove
_resolve_blake3_to_path from server.py. Also revert loopback origin
check to original logic.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce023-3384-7560-bacf-de40b0de0dd2
Co-authored-by: Amp <amp@ampcode.com>

* Require at least one tag in UploadAssetSpec

Enforce non-empty tags at the Pydantic validation layer so uploads
with no tags are rejected with a 400 before reaching ingest. Adds
test_upload_empty_tags_rejected to cover this case.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce377-8bde-7048-bc28-a9df063409f9
Co-authored-by: Amp <amp@ampcode.com>

* Add owner_id check to resolve_hash_to_path

Filter asset references by owner visibility so the /view endpoint
only resolves hashes for assets the requesting user can access.
Adds table-driven tests for owner visibility cases.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce377-8bde-7048-bc28-a9df063409f9
Co-authored-by: Amp <amp@ampcode.com>

* Make ReferenceData.created_at and updated_at required

Remove None defaults and type: ignore comments. Move fields before
optional fields to satisfy dataclass ordering.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce377-8bde-7048-bc28-a9df063409f9
Co-authored-by: Amp <amp@ampcode.com>

* Fix double commit in create_from_hash

Move mime_type update into _register_existing_asset so it shares a
single transaction with reference creation. Log a warning when the
hash is not found instead of silently returning None.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce377-8bde-7048-bc28-a9df063409f9
Co-authored-by: Amp <amp@ampcode.com>

* Add exclude_none=True to create/upload responses

Align with get/update/list endpoints for consistent JSON output.

Amp-Thread-ID: https://ampcode.com/threads/T-019ce377-8bde-7048-bc28-a9df063409f9
Co-authored-by: Amp <amp@ampcode.com>

* Change preview_id to reference asset by reference ID, not content ID

Clients receive preview_id in API responses but could not dereference it
through public routes (which use reference IDs). Now preview_id is a
self-referential FK to asset_references.id so the value is directly
usable in the public API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Filter soft-deleted and missing refs from visibility queries

list_references_by_asset_id and list_tags_with_usage were not filtering
out deleted_at/is_missing refs, allowing /view?filename=blake3:... to
serve files through hidden references and inflating tag usage counts.
Add list_all_file_paths_by_asset_id for orphan cleanup which
intentionally needs unfiltered access.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Pass preview_id and mime_type through all asset creation fast paths

The duplicate-content upload path and hash-based creation paths were
silently dropping preview_id and mime_type. This wires both fields
through _register_existing_asset, create_from_hash, and all route
call sites so behavior is consistent regardless of whether the asset
content already exists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove unimplemented client-provided ID from upload API

The `id` field on UploadAssetSpec was advertised for idempotent creation
but never actually honored when creating new references. Remove it
rather than implementing the feature.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make asset mime_type immutable after first ingest

Prevents cross-tenant metadata mutation when multiple references share
the same content-addressed Asset row. mime_type can now only be set when
NULL (first ingest); subsequent attempts to change it are silently ignored.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Use resolved content_type from asset lookup in /view endpoint

The /view endpoint was discarding the content_type computed by
resolve_hash_to_path() and re-guessing from the filename, which
produced wrong results for extensionless files or mismatched extensions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Merge system+user metadata into filter projection

Extract rebuild_metadata_projection() to build AssetReferenceMeta rows
from {**system_metadata, **user_metadata}, so system-generated metadata
is queryable via metadata_filter and user keys override system keys.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Standardize tag ordering to alphabetical across all endpoints

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Derive subfolder tags from path in register_file_in_place

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Reject client-provided id, fix preview URLs, rename tags→total_tags

- Reject 'id' field in multipart upload with 400 UNSUPPORTED_FIELD
  instead of silently ignoring it
- Build preview URL from the preview asset's own metadata rather than
  the parent asset's
- Rename 'tags' to 'total_tags' in TagsAdd/TagsRemove response schemas
  for clarity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: SQLite migration 0003 FK drop fails on file-backed DBs (MB-2)

Add naming_convention to Base.metadata so Alembic batch-mode reflection
can match unnamed FK constraints created by migration 0002. Pass
naming_convention and render_as_batch=True through env.py online config.

Add migration roundtrip tests (upgrade/downgrade/cycle from baseline).

Amp-Thread-ID: https://ampcode.com/threads/T-019ce466-1683-7471-b6e1-bb078223cda0
Co-authored-by: Amp <amp@ampcode.com>

* Fix missing tag count for is_missing references and update test for total_tags field

- Allow is_missing=True references to be counted in list_tags_with_usage
  when the tag is 'missing', so the missing tag count reflects all
  references that have been tagged as missing
- Add update_is_missing_by_asset_id query helper for bulk updates by asset
- Update test_add_and_remove_tags to use 'total_tags' matching the API schema

Amp-Thread-ID: https://ampcode.com/threads/T-019ce482-05e7-7324-a1b0-a56a929cc7ef
Co-authored-by: Amp <amp@ampcode.com>

* Remove unused imports in scanner.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Rename prompt_id to job_id on asset_references

Rename the column in the DB model, migration, and service schemas.
The API response emits both job_id and prompt_id (deprecated alias)
for backward compatibility with the cloud API.

Amp-Thread-ID: https://ampcode.com/threads/T-019cef41-60b0-752a-aa3c-ed7f20fda2f7
Co-authored-by: Amp <amp@ampcode.com>

* Add index on asset_references.preview_id for FK cascade performance

Amp-Thread-ID: https://ampcode.com/threads/T-019cef45-a4d2-7548-86d2-d46bcd3db419
Co-authored-by: Amp <amp@ampcode.com>

* Add clarifying comments for Asset/AssetReference naming and preview_id

Amp-Thread-ID: https://ampcode.com/threads/T-019cef49-f94e-7348-bf23-9a19ebf65e0d
Co-authored-by: Amp <amp@ampcode.com>

* Disallow all-null meta rows: add CHECK constraint, skip null values on write

- convert_metadata_to_rows returns [] for None values instead of an all-null row
- Remove dead None branch from _scalar_to_row
- Simplify null filter in common.py to just check for row absence
- Add CHECK constraint ck_asset_reference_meta_has_value to model and migration 0003

Amp-Thread-ID: https://ampcode.com/threads/T-019cef4e-5240-7749-bb25-1f17fcf9c09c
Co-authored-by: Amp <amp@ampcode.com>

* Remove dead None guards on result.asset in upload handler

register_file_in_place guarantees a non-None asset, so the
'if result.asset else None' checks were unreachable.

Amp-Thread-ID: https://ampcode.com/threads/T-019cef5b-4cf8-723c-8a98-8fb8f333c133
Co-authored-by: Amp <amp@ampcode.com>

* Remove mime_type from asset update API

Clients can no longer modify mime_type after asset creation via the
PUT /api/assets/{id} endpoint. This reduces the risk of mime_type
spoofing. The internal update_asset_hash_and_mime function remains
available for server-side use (e.g., enrichment).

Amp-Thread-ID: https://ampcode.com/threads/T-019cef5d-8d61-75cc-a1c6-2841ac395648
Co-authored-by: Amp <amp@ampcode.com>

* Fix migration constraint naming double-prefix and NULL in mixed metadata lists

- Use fully-rendered constraint names in migration 0003 to avoid the
  naming convention doubling the ck_ prefix on batch operations.
- Add table_args to downgrade so SQLite batch mode can find the CHECK
  constraint (not exposed by SQLite reflection).
- Fix model CheckConstraint name to use bare 'has_value' (convention
  auto-prefixes).
- Skip None items when converting metadata lists to rows, preventing
  all-NULL rows that violate the has_value check constraint.

Amp-Thread-ID: https://ampcode.com/threads/T-019cef87-94f9-7172-a6af-c6282290ce4f
Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Amp <amp@ampcode.com>
2026-03-16 12:34:04 -07:00

128 lines
4.2 KiB
Python

"""Shared utilities for database query modules."""
import os
from decimal import Decimal
from typing import Iterable, Sequence
import sqlalchemy as sa
from sqlalchemy import exists
from app.assets.database.models import AssetReference, AssetReferenceMeta, AssetReferenceTag
from app.assets.helpers import escape_sql_like_string, normalize_tags
MAX_BIND_PARAMS = 800
def calculate_rows_per_statement(cols: int) -> int:
"""Calculate how many rows can fit in one statement given column count."""
return max(1, MAX_BIND_PARAMS // max(1, cols))
def iter_chunks(seq, n: int):
"""Yield successive n-sized chunks from seq."""
for i in range(0, len(seq), n):
yield seq[i : i + n]
def iter_row_chunks(rows: list[dict], cols_per_row: int) -> Iterable[list[dict]]:
"""Yield chunks of rows sized to fit within bind param limits."""
if not rows:
return
yield from iter_chunks(rows, calculate_rows_per_statement(cols_per_row))
def build_visible_owner_clause(owner_id: str) -> sa.sql.ClauseElement:
"""Build owner visibility predicate for reads.
Owner-less rows are visible to everyone.
"""
owner_id = (owner_id or "").strip()
if owner_id == "":
return AssetReference.owner_id == ""
return AssetReference.owner_id.in_(["", owner_id])
def build_prefix_like_conditions(
prefixes: list[str],
) -> list[sa.sql.ColumnElement]:
"""Build LIKE conditions for matching file paths under directory prefixes."""
conds = []
for p in prefixes:
base = os.path.abspath(p)
if not base.endswith(os.sep):
base += os.sep
escaped, esc = escape_sql_like_string(base)
conds.append(AssetReference.file_path.like(escaped + "%", escape=esc))
return conds
def apply_tag_filters(
stmt: sa.sql.Select,
include_tags: Sequence[str] | None = None,
exclude_tags: Sequence[str] | None = None,
) -> sa.sql.Select:
"""include_tags: every tag must be present; exclude_tags: none may be present."""
include_tags = normalize_tags(include_tags)
exclude_tags = normalize_tags(exclude_tags)
if include_tags:
for tag_name in include_tags:
stmt = stmt.where(
exists().where(
(AssetReferenceTag.asset_reference_id == AssetReference.id)
& (AssetReferenceTag.tag_name == tag_name)
)
)
if exclude_tags:
stmt = stmt.where(
~exists().where(
(AssetReferenceTag.asset_reference_id == AssetReference.id)
& (AssetReferenceTag.tag_name.in_(exclude_tags))
)
)
return stmt
def apply_metadata_filter(
stmt: sa.sql.Select,
metadata_filter: dict | None = None,
) -> sa.sql.Select:
"""Apply filters using asset_reference_meta projection table."""
if not metadata_filter:
return stmt
def _exists_for_pred(key: str, *preds) -> sa.sql.ClauseElement:
return sa.exists().where(
AssetReferenceMeta.asset_reference_id == AssetReference.id,
AssetReferenceMeta.key == key,
*preds,
)
def _exists_clause_for_value(key: str, value) -> sa.sql.ClauseElement:
if value is None:
return sa.not_(
sa.exists().where(
AssetReferenceMeta.asset_reference_id == AssetReference.id,
AssetReferenceMeta.key == key,
)
)
if isinstance(value, bool):
return _exists_for_pred(key, AssetReferenceMeta.val_bool == bool(value))
if isinstance(value, (int, float, Decimal)):
num = value if isinstance(value, Decimal) else Decimal(str(value))
return _exists_for_pred(key, AssetReferenceMeta.val_num == num)
if isinstance(value, str):
return _exists_for_pred(key, AssetReferenceMeta.val_str == value)
return _exists_for_pred(key, AssetReferenceMeta.val_json == value)
for k, v in metadata_filter.items():
if isinstance(v, list):
ors = [_exists_clause_for_value(k, elem) for elem in v]
if ors:
stmt = stmt.where(sa.or_(*ors))
else:
stmt = stmt.where(_exists_clause_for_value(k, v))
return stmt