ComfyUI/alembic_db/versions/0003_add_enrichment_level.py
Luke Mino-Altherr c7368205e3 feat: implement two-phase scanning architecture (fast + enrich)
Phase 1 (FAST): Creates stub records with filesystem metadata only
- path, size, mtime - no file content reading
- Populates asset database quickly on startup

Phase 2 (ENRICH): Extracts metadata and computes hashes
- Safetensors header parsing, MIME types
- Optional blake3 hash computation
- Updates existing stub records
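
The two phases above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: `StubRecord`, `fast_phase`, and `enrich_phase` are hypothetical names, MIME detection uses the standard-library `mimetypes` module, and `hashlib.blake2b` stands in for blake3 so that the example needs no third-party package.

```python
import hashlib
import mimetypes
import os
from dataclasses import dataclass
from typing import Optional

# Enrichment levels as described in the commit message.
LEVEL_STUB, LEVEL_METADATA, LEVEL_HASHED = 0, 1, 2


@dataclass
class StubRecord:  # hypothetical stand-in for an AssetCacheState row
    path: str
    size: int
    mtime_ns: int
    enrichment_level: int = LEVEL_STUB
    mime_type: Optional[str] = None
    content_hash: Optional[str] = None


def fast_phase(root: str) -> list:
    """Phase 1: stat() each file and record path, size, mtime; never open it."""
    records = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            records.append(StubRecord(full, st.st_size, st.st_mtime_ns))
    return records


def enrich_phase(records: list, compute_hash: bool = False) -> None:
    """Phase 2: update the existing stub records in place."""
    for rec in records:
        if rec.enrichment_level < LEVEL_METADATA:
            # Guess the MIME type from the file name (may be None).
            rec.mime_type = mimetypes.guess_type(rec.path)[0]
            rec.enrichment_level = LEVEL_METADATA
        if compute_hash and rec.enrichment_level < LEVEL_HASHED:
            with open(rec.path, "rb") as fh:
                # Assumption: blake2b (stdlib) substitutes for blake3 here.
                rec.content_hash = hashlib.blake2b(fh.read()).hexdigest()
            rec.enrichment_level = LEVEL_HASHED
```

Calling `fast_phase(root)` at startup and `enrich_phase(records, compute_hash=True)` later mirrors the split between populating the database quickly and doing the expensive file reads afterwards.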

Changes:
- Add ScanPhase enum (FAST, ENRICH, FULL)
- Add enrichment_level column to AssetCacheState (0=stub, 1=metadata, 2=hashed)
- Add build_stub_specs() for fast scanning without metadata extraction
- Add get_unenriched_cache_states(), enrich_asset(), enrich_assets_batch()
- Add start_fast(), start_enrich() convenience methods to AssetSeeder
- Update start() to accept phase parameter (defaults to FULL)
- Split _run_scan() into _run_fast_phase() and _run_enrich_phase()
- Add migration 0003_add_enrichment_level.py
- Update tests for new architecture

Amp-Thread-ID: https://ampcode.com/threads/T-019c4eef-1568-778f-aede-38254728f848
Co-authored-by: Amp <amp@ampcode.com>
2026-02-24 11:34:44 -08:00


"""
Add enrichment_level column to asset_cache_state for phased scanning

Level 0: Stub record (path, size, mtime only)
Level 1: Metadata extracted (safetensors header, mime type)
Level 2: Hash computed (blake3)

Revision ID: 0003_add_enrichment_level
Revises: 0002_add_is_missing
Create Date: 2025-02-10 00:00:00
"""

from alembic import op
import sqlalchemy as sa

revision = "0003_add_enrichment_level"
down_revision = "0002_add_is_missing"
branch_labels = None
depends_on = None


def upgrade() -> None:
    # New rows default to level 0 (stub) via the server-side default.
    op.add_column(
        "asset_cache_state",
        sa.Column(
            "enrichment_level",
            sa.Integer(),
            nullable=False,
            server_default=sa.text("0"),
        ),
    )
    op.create_index(
        "ix_asset_cache_state_enrichment_level",
        "asset_cache_state",
        ["enrichment_level"],
    )
    # Mark existing records as level 1 (metadata extracted): they were
    # created by the old scanner, which extracted metadata.
    op.execute("UPDATE asset_cache_state SET enrichment_level = 1")


def downgrade() -> None:
    # Drop the index before the column it covers.
    op.drop_index("ix_asset_cache_state_enrichment_level", table_name="asset_cache_state")
    op.drop_column("asset_cache_state", "enrichment_level")
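
The effect of `upgrade()` on existing and new rows can be checked with a small standalone sketch using the standard-library `sqlite3` module. The statements below are hand-written approximations of what the migration does, not Alembic's generated SQL. The two-column table and its `file_path` column are assumptions for illustration, since the real `asset_cache_state` schema is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, minimal pre-migration table; the real one has more columns.
conn.execute(
    "CREATE TABLE asset_cache_state (id INTEGER PRIMARY KEY, file_path TEXT)"
)
conn.execute("INSERT INTO asset_cache_state (file_path) VALUES ('old.safetensors')")

# Approximation of upgrade(): add the column, index it, backfill to level 1.
conn.execute(
    "ALTER TABLE asset_cache_state "
    "ADD COLUMN enrichment_level INTEGER NOT NULL DEFAULT 0"
)
conn.execute(
    "CREATE INDEX ix_asset_cache_state_enrichment_level "
    "ON asset_cache_state (enrichment_level)"
)
conn.execute("UPDATE asset_cache_state SET enrichment_level = 1")

# A row inserted after the migration picks up the default of 0 (stub).
conn.execute("INSERT INTO asset_cache_state (file_path) VALUES ('new.safetensors')")

levels = dict(
    conn.execute("SELECT file_path, enrichment_level FROM asset_cache_state")
)

# Rows still waiting for the enrich phase are those below level 1.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT file_path FROM asset_cache_state "
        "WHERE enrichment_level < 1 ORDER BY id"
    )
]
```

Here the pre-existing row ends up at level 1 and only the newly inserted row is reported as pending, which is the kind of lookup the index on `enrichment_level` is meant to support.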