Path-derived tags for nested model layouts (e.g.
`models/checkpoints/flux/foo.safetensors`) were emitted only in the
slash-joined shape `["models", "checkpoints/flux"]`. This broke the
frontend combo-widget set-membership filter
`include_tags=models,checkpoints`, because the literal `checkpoints`
token was no longer present in the asset's tag set.
Add `expand_bucket_prefixes` at the tag-write layer. When a tag's first
slash segment is a registered model category (or input/output/temp
root), the bucket is inserted as a standalone token immediately after
the slash-joined form. This keeps tag[1] as the slash-joined form that
cloud emits (the positional contract) while restoring the standalone
token the frontend's set-membership filter requires.
The expansion is bounded to known buckets so free-form user labels
with slashes (`my-org/team-a`) pass through unchanged. The helper is
applied uniformly in `set_reference_tags`, `add_tags_to_reference`,
and `batch_insert_seed_assets` so HTTP uploads, user-tag mutations,
and path-scanning ingest all converge on the same canonical shape.
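A minimal sketch of the expansion described above, not the code in this
PR. `KNOWN_BUCKETS` and the `_sketch` function name are hypothetical:
the actual helper presumably consults the registered model categories
rather than a literal set, and skipping buckets that are already present
is an assumption.

```python
from typing import Iterable

# Hypothetical bucket set, for illustration only. The real helper is
# described as using registered model categories plus input/output/temp.
KNOWN_BUCKETS = frozenset(
    {"checkpoints", "loras", "diffusers", "input", "output", "temp"}
)


def expand_bucket_prefixes_sketch(
    tags: Iterable[str], known_buckets: frozenset = KNOWN_BUCKETS
) -> list:
    """Insert a standalone bucket token right after each slash-joined tag
    whose first segment is a known bucket."""
    tags = list(tags)
    present = set(tags)
    out = []
    for tag in tags:
        out.append(tag)
        head, sep, _rest = tag.partition("/")
        # Expand only slash-joined tags, only for known buckets, and only
        # when the bucket token is not already in the tag list (assumed).
        if sep and head in known_buckets and head not in present:
            out.append(head)
            present.add(head)
    return out


print(expand_bucket_prefixes_sketch(["models", "checkpoints/flux"]))
# ['models', 'checkpoints/flux', 'checkpoints']
print(expand_bucket_prefixes_sketch(["my-org/team-a"]))
# ['my-org/team-a']
```

Because the sketch skips buckets that are already present, running it a
second time on its own output changes nothing.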
Also align the upload-route category validator with
`resolve_destination_from_tags` by extracting the first slash segment
of tag[1], so HTTP uploads matching cloud's slash-joined emission
shape are no longer rejected as `unknown models category`.
An end-to-end smoke test of the read+write round-trip surfaced three
bugs, all within this PR's scope.
1. FK violation on uppercase paths
get_name_and_tags_from_asset_path was preserving case on the
subpath (e.g. "diffusers/Kolors/text_encoder"). ensure_tags_exist
lowercases via normalize_tags before inserting into the tags
table, so the asset_reference_tags.tag_name FK to tags.name
failed for any path containing uppercase letters — including
the diffusers case the PR was designed to support.
Fix: lowercase the slash-joined subpath in
get_name_and_tags_from_asset_path to match the canonicalization
ensure_tags_exist applies. Providers keyed on original-case
subpaths need to normalize their lookup key to lowercase.
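A simplified sketch of the fix in item 1. It assumes the path is already
relative to its root directory. Both `_sketch` functions are
hypothetical stand-ins, and the strip-and-lowercase behaviour of
`normalize_tags` is assumed from the description above.

```python
from pathlib import PurePosixPath


def normalize_tags_sketch(tags):
    # Stand-in for normalize_tags: assumed to strip and lowercase.
    return [t.strip().lower() for t in tags if t and t.strip()]


def name_and_tags_sketch(root, rel_path):
    """Return (file name, [root, lowercased slash-joined subpath])."""
    p = PurePosixPath(rel_path)
    # Lowercase the subpath so it equals the value that ends up in
    # tags.name; the file name itself keeps its original case.
    subpath = "/".join(p.parent.parts).lower()
    tags = [root] + ([subpath] if subpath else [])
    return p.name, tags


name, tags = name_and_tags_sketch(
    "models", "diffusers/Kolors/text_encoder/model.safetensors"
)
print(name, tags)
# model.safetensors ['models', 'diffusers/kolors/text_encoder']

# The emitted tags already match their normalized form, so the FK target
# row exists under the same spelling.
assert tags == normalize_tags_sketch(tags)
```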
2. resolve_destination_from_tags rejected the new tag shape
The inverse function only accepted the legacy one-tag-per-dir
shape (["models", "diffusers", "Kolors", "text_encoder"]).
An upload using the slash-joined shape returned by /api/assets
raised "unknown model category" or "invalid path component".
Fix: pre-split every entry after tags[0] on "/" so both shapes
resolve identically. For models, the first expanded segment is
the category and the rest are subdirs; for input/output the
full expansion becomes the subdirs.
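The pre-split in item 2 can be sketched roughly as follows. The function
name, return shape, and `model_categories` parameter are hypothetical,
and the sketch covers only the two tag shapes named above.

```python
def resolve_destination_sketch(tags, model_categories):
    """Return (root, category or None, subdirs) for either tag shape."""
    if not tags:
        raise ValueError("at least one tag is required")
    root = tags[0]
    # Pre-split every entry after tags[0] on "/", so the legacy
    # one-tag-per-dir shape and the slash-joined shape expand identically.
    parts = [seg for tag in tags[1:] for seg in tag.split("/")]
    for seg in parts:
        if seg in ("", ".", ".."):
            raise ValueError("invalid path component")
    if root == "models":
        if not parts:
            raise ValueError("models uploads need a category tag")
        category, subdirs = parts[0], parts[1:]
        if category not in model_categories:
            raise ValueError(f"unknown model category '{category}'")
        return root, category, subdirs
    if root in ("input", "output"):
        # For input/output the full expansion becomes the subdirs.
        return root, None, parts
    raise ValueError(f"unknown root tag '{root}'")


cats = {"diffusers", "checkpoints"}
legacy = resolve_destination_sketch(
    ["models", "diffusers", "Kolors", "text_encoder"], cats
)
joined = resolve_destination_sketch(
    ["models", "diffusers/Kolors/text_encoder"], cats
)
assert legacy == joined == ("models", "diffusers", ["Kolors", "text_encoder"])
```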
3. Within-batch tag order was lost
bulk_ingest wrote every tag in a single batch with the same
added_at = current_time. The retrieval ORDER BY added_at, tag_name
then fell back to the tag_name tiebreaker, sorting the path-derived
pair alphabetically — putting "checkpoints/..." ahead of "models"
since "c" < "m". The tags[0] = root contract was lost on bulk-
ingested rows.
Fix: stagger added_at by microseconds per tag index within a
reference so the retrieval order matches the input list order.
Path-derived tags now consistently land in position-0 = root,
position-1 = subpath.
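A small illustration of the ordering problem in item 3 and the
staggering fix. It uses plain dicts and an in-memory sort in place of
the real rows and query; `build_tag_rows_sketch`, `read_back`, and the
`reference_id` key are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def build_tag_rows_sketch(reference_id, tags, now):
    # Offset added_at by one microsecond per tag index so that ordering
    # by added_at alone reproduces the input list order.
    return [
        {
            "reference_id": reference_id,
            "tag_name": tag,
            "added_at": now + timedelta(microseconds=i),
        }
        for i, tag in enumerate(tags)
    ]


def read_back(rows):
    # Mirrors ORDER BY added_at, tag_name.
    ordered = sorted(rows, key=lambda r: (r["added_at"], r["tag_name"]))
    return [r["tag_name"] for r in ordered]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
tags = ["models", "checkpoints/flux"]

# Old behaviour: identical timestamps, so the tag_name tiebreaker wins.
same_ts = [
    {"reference_id": "ref-1", "tag_name": t, "added_at": now} for t in tags
]
print(read_back(same_ts))
# ['checkpoints/flux', 'models']

# Staggered timestamps keep the root tag in position 0.
print(read_back(build_tag_rows_sketch("ref-1", tags, now)))
# ['models', 'checkpoints/flux']
```

The approach relies on the `added_at` column retaining microsecond
precision; a column that truncates to whole seconds would collapse the
offsets again.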
Tests
- TestGetNameAndTagsFromAssetPath updated: subpath is now lowercase.
- New TestResolveDestinationFromTags covers both tag shapes, the
unknown-category case for slash-joined input, traversal rejection,
and input/output paths.
- Full suite: 333 passed, 10 pre-existing skipped.