Commit Graph

4 Commits

Author SHA1 Message Date
Luke Mino-Altherr
c268507ea7 Fix asset API security and correctness issues
- Content-Disposition: drop raw filename= parameter, use only RFC 5987
  filename*=UTF-8'' to prevent header injection via ; and special chars
- delete_asset: default delete_content to False (non-destructive) when
  query parameter is omitted
- create_asset_from_hash: return 400 MISSING_INPUT instead of 404 when
  hash not found and no file uploaded (client input error, not missing resource)
- seeder: clear _progress when returning to IDLE so get_status() does not
  return stale progress after scan completion
- hashing: handle non-seekable streams in _hash_file_obj by checking
  seekable() before attempting tell/seek
- bulk_ingest: filter lost_paths to only include paths tied to actually
  inserted asset IDs, preventing inflated counts from ON CONFLICT drops

Amp-Thread-ID: https://ampcode.com/threads/T-019cb67a-9822-7438-ab05-d09991a9f7f3
Co-authored-by: Amp <amp@ampcode.com>
2026-03-03 18:47:48 -08:00
Luke Mino-Altherr
32a6fcf7a8 fix: address code review feedback - round 2
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Build package / Build Test (3.10) (push) Waiting to run
Build package / Build Test (3.11) (push) Waiting to run
Build package / Build Test (3.12) (push) Waiting to run
Build package / Build Test (3.13) (push) Waiting to run
Build package / Build Test (3.14) (push) Waiting to run
- Reject path separators (/, \, os.sep) in tag components for defense-in-depth
- Add comment explaining double-relpath normalization trick
- Add _require_assets_feature_enabled decorator returning 503 when disabled
- Call asset_seeder.disable() when --enable-assets is not passed
- Add iter_chunks to bulk_update_needs_verify, bulk_update_is_missing,
  and delete_references_by_ids to respect SQLite bind param limits
- Fix CacheStateRow.size_bytes NULL coercion (0 -> None) to avoid
  false needs_verify flags on assets with unknown size
- Add PermissionError catch in delete_asset_tags route (403 vs 500)
- Add hash-is-None guard in delete_orphaned_seed_asset
- Validate from_asset_id in reassign_asset_references
- Initialize _prune_first in __init__, remove getattr workaround
- Cap error accumulation in _add_error to 200
- Remove confirmed dead code: seed_assets, compute_filename_for_asset,
  ALLOWED_ROOTS, AssetNotFoundError, SetTagsResult, update_enrichment_level,
  Asset.to_dict, AssetReference.to_dict, _AssetSeeder.enable

Amp-Thread-ID: https://ampcode.com/threads/T-019cb610-1b55-74b6-8dbb-381d73c387c0
Co-authored-by: Amp <amp@ampcode.com>
2026-03-03 16:35:55 -08:00
Luke Mino-Altherr
4d4c2cedd3 fix: address code review feedback
- Fix missing import for compute_filename_for_reference in ingest.py
- Apply code review fixes across routes, queries, scanner, seeder,
  hashing, ingest, path_utils, main, and server
- Update and add tests for sync references and seeder

Amp-Thread-ID: https://ampcode.com/threads/T-019cb61a-ed54-738c-a05f-9b5242e513f3
Co-authored-by: Amp <amp@ampcode.com>
2026-03-03 15:51:52 -08:00
Luke Mino-Altherr
709a721591 feat(assets): async two-phase scanner and background seeder
- Rewrite scanner.py with two-phase scanning architecture (fast scan + enrich)
- Add AssetSeeder for non-blocking background startup scanning
- Implement pause/resume/stop/restart controls and disable/enable for --disable-assets-autoscan
- Add non-destructive asset pruning with is_missing flag
- Wire seeder into main.py and server.py lifecycle
- Skip hidden files/directories, populate mime_type, optional blake3 hashing
- Add comprehensive seeder tests

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019c9209-37af-757a-b6e4-af59b4267362
2026-03-03 15:51:52 -08:00