- Add is_missing column to AssetCacheState for soft-delete
- Replace hard-delete pruning with mark_cache_states_missing_outside_prefixes
- Auto-restore missing cache states when files are re-scanned
- Filter out missing cache states from queries by default
- Rename functions for clarity:
- mark_cache_states_missing_outside_prefixes (was delete_cache_states_outside_prefixes)
- get_unreferenced_unhashed_asset_ids (was get_orphaned_seed_asset_ids)
- mark_assets_missing_outside_prefixes (was prune_orphaned_assets)
- mark_missing_outside_prefixes_safely (was prune_orphans_safely)
- Add restore_cache_states_by_paths for explicit restoration
- Add cleanup_unreferenced_assets for explicit hard-delete when needed
- Update API endpoint /api/assets/prune to use new soft-delete behavior
This preserves user metadata (tags, etc.) when base directories change,
allowing assets to be restored when the original paths become available again.
Amp-Thread-ID: https://ampcode.com/threads/T-019c3114-bf28-73a9-a4d2-85b208fd5462
Co-authored-by: Amp <amp@ampcode.com>
- Create file_utils.py with shared file utilities:
- get_mtime_ns() - extract mtime in nanoseconds from stat
- get_size_and_mtime_ns() - get both size and mtime
- verify_file_unchanged() - check file matches DB mtime/size
- list_files_recursively() - recursive directory listing
- Create bulk_ingest.py for bulk operations:
- BulkInsertResult dataclass
- batch_insert_seed_assets() - batch insert with conflict handling
- prune_orphaned_assets() - clean up orphaned assets
- Update scanner.py to use new service modules instead of
calling database queries directly
- Update ingest.py to use shared get_size_and_mtime_ns()
- Export new functions from services/__init__.py
Amp-Thread-ID: https://ampcode.com/threads/T-019c2ae7-f701-716a-a0dd-1feb988732fb
Co-authored-by: Amp <amp@ampcode.com>