diff --git a/docs/server-side-model-downloads-handover.html b/docs/server-side-model-downloads-handover.html new file mode 100644 index 000000000..663d6076a --- /dev/null +++ b/docs/server-side-model-downloads-handover.html @@ -0,0 +1,1273 @@ + + +
+ +feat/server-side-model-downloads
+ HF_CLIENT_ID in app/model_downloader/hf_auth/oauth.py is a
+ placeholder string ("REPLACE_ME_WITH_COMFY_ORG_HF_OAUTH_CLIENT_ID").
+ HuggingFace will reject the authorize redirect until a real app is registered under a
+ Comfy-Org-controlled HF account and the constant is replaced.
+
+ Detailed walkthrough is in
+ §11: HuggingFace OAuth app setup at the bottom of this doc;
+ it lists each field and which boxes to tick. Until the placeholder is replaced, the
+ backend is otherwise fully functional (state polling, public downloads, gated detection
+ all work); only the login flow itself fails.
+
+ ComfyUI workflows declare model dependencies inline via properties.models
+ entries on loader nodes — each one carries a filename, a directory (e.g. loras,
+ checkpoints), and a URL to fetch the file from. Until this feature, when a
+ workflow loaded with a missing model, the frontend offered the user a download button that
+ triggered a plain browser download via a synthesized <a download> click.
+ Files landed in the user's Downloads folder; users then had to manually move them
+ into ComfyUI/models/<directory>/. Gated HuggingFace models couldn't be
+ downloaded at all without manual huggingface-cli login + hf_hub_download
+ out-of-band.
+
+This change moves the fetch to the server, lands files in the correct on-disk location, and adds
+authenticated HuggingFace support so gated models can be downloaded after a one-click OAuth flow.
+ +auth_check) with appropriate UI states.
+
+Key idea: every concern lives in app/model_downloader/ as a self-contained
+subsystem. Wiring into the rest of ComfyUI is two lines in server.py
+(register_routes(self.app)) and one feature-flag entry in
+comfy_api/feature_flags.py.
+ DOWNLOAD_SERVER (in app/model_downloader/download_server.py) is the
+ process-wide registry of in-flight downloads. It exists so that:
+
+ - Only one download per model_id can run at a time, preventing two
+   concurrent writers to the same destination path.
+
+from dataclasses import dataclass
+from typing import Optional
+
+@dataclass
+class DownloadSession:
+    model_id: str                  # e.g. "loras/my_lora.safetensors"
+    url: str                       # the URL we're fetching from
+    progress: Optional[float]      # fraction in [0,1]; None until total known
+    bytes_downloaded: int
+    total_bytes: Optional[int]
+    epoch: int                     # see "atomicity" below
+
+
+ The registry is a plain dict[str, DownloadSession] guarded by a
+ threading.Lock (callable from both the asyncio event-loop thread and
+ the download-worker tasks).
+
Three independent atomicity guarantees, each addressing a different race:
+ 1. Downloads are written to <final>.tmp and
+    use os.replace for the promotion. A crashed or cancelled download leaves only
+    the .tmp, never a partial file under the final name that a loader would happily
+    load, silently producing garbage outputs.
+ 2. try_register holds the lock and inserts
+    iff no entry exists. A second concurrent request for the same model_id returns
+    None; the route then 409s and rolls back any sessions it had already registered
+    in this batch.
+ 3. Each DownloadSession carries an
+    epoch counter assigned on registration. If a user cancels and then
+    immediately re-triggers the download (same model_id), a new session
+    with a new epoch is registered. The old worker, still running on the cancelled session,
+    observes is_active(session) as False (epoch mismatch), rolls back its own
+    .tmp, and exits without affecting the new session. This prevents the old worker's
+    late finish() from accidentally evicting the new session.
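The registration and epoch guarantees above can be illustrated with a small in-memory registry. This is a minimal sketch, not the code in download_server.py: the `Registry` and `Session` classes and the `cancel` method are hypothetical stand-ins, and only `try_register`, `is_active`, `finish`, and the epoch idea come from this document.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    model_id: str
    epoch: int


class Registry:
    """Hypothetical stand-in for DOWNLOAD_SERVER; names are illustrative."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sessions: Dict[str, Session] = {}
        self._next_epoch = 0

    def try_register(self, model_id: str) -> Optional[Session]:
        # Insert only if no session exists for this model_id.
        with self._lock:
            if model_id in self._sessions:
                return None
            self._next_epoch += 1
            session = Session(model_id, self._next_epoch)
            self._sessions[model_id] = session
            return session

    def is_active(self, session: Session) -> bool:
        # Active only while the registry still holds this exact epoch.
        with self._lock:
            current = self._sessions.get(session.model_id)
            return current is not None and current.epoch == session.epoch

    def finish(self, session: Session) -> None:
        # Evict only our own epoch, so a stale worker cannot remove a newer session.
        with self._lock:
            current = self._sessions.get(session.model_id)
            if current is not None and current.epoch == session.epoch:
                del self._sessions[session.model_id]

    def cancel(self, model_id: str) -> bool:
        with self._lock:
            return self._sessions.pop(model_id, None) is not None


registry = Registry()
old = registry.try_register("loras/foo.safetensors")
duplicate = registry.try_register("loras/foo.safetensors")  # rejected: already in flight
registry.cancel("loras/foo.safetensors")
new = registry.try_register("loras/foo.safetensors")
registry.finish(old)  # stale worker finishing late; must not evict `new`
```

The key design point is that `finish` compares epochs rather than just keys, which is what keeps a cancelled worker from clobbering its replacement.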
+ When the server restarts mid-download, any .tmp file is by definition orphaned.
+ DOWNLOAD_SERVER.sweep_orphan_tmp_files() walks every registered model folder
+ and removes *.tmp files. Idempotent; runs on the first download request rather
+ than module import to keep the import path I/O-free.
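A sweep of this kind can be sketched in a few lines. The function below is an illustration only: it assumes the caller passes the model folders explicitly and that only files directly inside each folder are inspected; whether the real sweep_orphan_tmp_files recurses into subfolders is not stated here.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def sweep_orphan_tmp_files(model_dirs: Iterable[Path]) -> List[Path]:
    """Delete every *.tmp directly inside the given folders; return what was removed."""
    removed: List[Path] = []
    for folder in model_dirs:
        if not folder.is_dir():
            continue  # a configured folder may not exist yet
        for tmp in folder.glob("*.tmp"):
            try:
                tmp.unlink()
                removed.append(tmp)
            except FileNotFoundError:
                pass  # already gone; keeps the sweep idempotent
    return removed


with tempfile.TemporaryDirectory() as root:
    loras = Path(root) / "loras"
    loras.mkdir()
    (loras / "a.safetensors.tmp").write_bytes(b"partial")
    (loras / "b.safetensors").write_bytes(b"complete")
    first = sweep_orphan_tmp_files([loras, Path(root) / "missing"])
    second = sweep_orphan_tmp_files([loras])  # nothing left to remove
    survivors = sorted(p.name for p in loras.iterdir())
```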
+
+ When session.url is on huggingface.co and a token is stored,
+ stream_to_disk attaches Authorization: Bearer <access_token>
+ to the GET. Non-HF URLs receive no auth header (avoids token leakage to other hosts).
+ This is HF's documented way to access gated repos with a personal access token — no
+ reliance on huggingface_hub's download API.
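The host check that decides whether the header is attached might look like the sketch below. The helper name `auth_headers_for`, the exact-host comparison, and the https requirement are assumptions made for illustration; the check inside stream_to_disk may differ.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

HF_HOST = "huggingface.co"  # assumed exact-host match; the real check may differ


def auth_headers_for(url: str, token: Optional[str]) -> Dict[str, str]:
    """Return a bearer header only for https URLs whose host is exactly huggingface.co."""
    parts = urlsplit(url)
    if token and parts.scheme == "https" and parts.hostname == HF_HOST:
        return {"Authorization": f"Bearer {token}"}
    return {}  # never send the token to any other host


hf = auth_headers_for(
    "https://huggingface.co/org/repo/resolve/main/foo.safetensors", "hf_oauth_x"
)
lookalike = auth_headers_for(
    "https://huggingface.co.evil.example/foo.safetensors", "hf_oauth_x"
)
anonymous = auth_headers_for(
    "https://huggingface.co/org/repo/resolve/main/foo.safetensors", None
)
```

Comparing the parsed hostname, rather than testing a string prefix of the URL, is what rejects lookalike hosts such as the second example.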
+
+ Some HF repos are gated: the user has to accept a license or be approved before they can
+ download. The bearer token from a logged-in HF account passes that gate. Rather than
+ asking the user to paste a personal access token (poor UX and poor security hygiene), we run a proper
+ OAuth 2.0 Authorization Code flow with PKCE, using a loopback redirect as
+ RFC 8252 recommends for native apps.
+
| Layer | Storage | Notes |
|---|---|---|
| In memory | HF_AUTH_STORE singleton (auth_store.py) | Lazily loaded from disk on first access. Mutations are also flushed to disk. |
| On disk | <user_dir>/hf_auth_token.json | Atomic write via .tmp + os.replace, chmod 0600 so only the OS user can read it. |
+
Token shape (mirrors what HF returns from the token endpoint):
+{
+ "access_token": "hf_oauth_…",
+ "refresh_token": "…", // null if not granted
+ "expires_at": 1739895432.0, // absolute epoch seconds
+ "scope": "openid profile gated-repos"
+}
+
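The on-disk write described in the table (write to .tmp, restrict to 0600, promote with os.replace) can be sketched as follows. The `save_token` helper and the temporary directory standing in for <user_dir> are hypothetical; the real logic lives in auth_store.py and may order these steps differently.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def save_token(path: Path, token: dict) -> None:
    """Write JSON to <path>.tmp with mode 0600, then atomically promote it."""
    tmp = path.with_name(path.name + ".tmp")
    # Creating the file with 0o600 avoids a window where it is readable by others.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(token, fh)
    os.chmod(tmp, 0o600)  # enforce the mode even if a stale .tmp already existed
    os.replace(tmp, path)  # readers see either the old file or the new one


with tempfile.TemporaryDirectory() as user_dir:
    target = Path(user_dir) / "hf_auth_token.json"
    save_token(
        target,
        {
            "access_token": "hf_oauth_example",
            "refresh_token": None,
            "expires_at": 1739895432.0,
            "scope": "openid profile",
        },
    )
    loaded = json.loads(target.read_text(encoding="utf-8"))
    mode = stat.S_IMODE(target.stat().st_mode)
    leftovers = [p.name for p in Path(user_dir).iterdir()]
```

On Windows the permission bits are largely ignored, so the 0600 guarantee in this sketch only holds on POSIX systems.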
+login-start while
+ already logged in (or with a pending login flow) will either lock-conflict (409) or
+ overwrite the existing token on success. This is intentional given the
+ single-tenant scope — see §6.
+ Standard OAuth 2.0 PKCE (RFC 7636) with the SHA-256 method:
+ - The verifier never leaves the server process.
+ - Only the derived code_challenge is sent in the
+   authorize URL.
+ - The state is validated on callback; mismatches return 400 and
+   the token exchange is skipped (CSRF defence).
+
+All routes live under /api/, use kebab-case paths, and POST for input-bearing
+operations even when they're "read-only". This keeps semantics uniform and avoids URL-length
+limits when payloads grow.
/api/models-availability-status
+ 1 Hz poll
+ One-stop status endpoint. Returns per-model state (available / missing / downloading) plus
+ metadata (file size, HF downloadability) plus the current HF auth snapshot, all in one shot.
+ +{
+ "models": {
+ "loras/foo.safetensors": "https://huggingface.co/org/repo/resolve/main/foo.safetensors",
+ "checkpoints/bar.safetensors": "https://huggingface.co/.../bar.safetensors"
+ }
+}
+
+ {
+ "models": {
+ "loras/foo.safetensors": {
+ "state": "downloading", // "available" | "missing" | "downloading"
+ "progress": {
+ "bytes_downloaded": 1024000,
+ "total_bytes": 29145431166,
+ "progress": 0.000035 // null until total known
+ },
+ "file_size": 29145431166, // bytes; null if not probed
+ "is_hf_downloadable": true // null for non-HF / probe failure
+ },
+ "checkpoints/bar.safetensors": {
+ "state": "missing",
+ "progress": null,
+ "file_size": 1234567890,
+ "is_hf_downloadable": false // gated, no access
+ }
+ },
+ "hf_auth": {
+ "token_available": true,
+ "eligible": true
+ }
+}
+
+
+ Called every 1 second by useServerSideDownloadsStore.refresh()
+ while the missing-models card is mounted. Timer auto-stops when no row is downloading
+ and every remaining missing row is gated (no further state changes possible without
+ a user action).
+
+ The polling timer re-arms on user actions: clicking Download, clicking HF login,
+ or a workflow change that re-registers the model list.
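The auto-stop rule can be written as a predicate over the "models" map of the polling response. The real logic lives in the TypeScript store; this Python sketch only restates the rule, and it assumes that a gated row is one whose is_hf_downloadable is false (rows with null are treated as still able to change).

```python
from typing import Dict


def should_keep_polling(models: Dict[str, dict]) -> bool:
    """True while some row can still change state without a user action."""
    for row in models.values():
        if row["state"] == "downloading":
            return True
        # Assumption: is_hf_downloadable == False marks a gated row.
        if row["state"] == "missing" and row.get("is_hf_downloadable") is not False:
            return True
    return False


in_progress = {
    "loras/foo.safetensors": {"state": "downloading", "is_hf_downloadable": True},
}
only_gated_left = {
    "loras/foo.safetensors": {"state": "available", "is_hf_downloadable": True},
    "checkpoints/bar.safetensors": {"state": "missing", "is_hf_downloadable": False},
}
keep_a = should_keep_polling(in_progress)
keep_b = should_keep_polling(only_gated_left)
```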
+/api/download-models
+ 202 on accept
+ Trigger one or more downloads. Atomic: either every model passes
+ every precondition (valid id, allowed URL, not on disk, not in flight, not gated-to-us)
+ and all are scheduled, or none are, in which case the request returns an error and the registry is
+ left unchanged.
+ +{
+ "models": {
+ "loras/foo.safetensors": "https://huggingface.co/.../foo.safetensors"
+ }
+}
+
+ HTTP 202 Accepted
+{
+ "accepted": true,
+ "scheduled": ["loras/foo.safetensors"]
+}
+
+ HTTP 400 / 409
+{
+ "error": {
+ "code": "MODEL_NOT_DOWNLOADABLE", // INVALID_MODEL_ID / URL_NOT_ALLOWED /
+ // ALREADY_AVAILABLE / ALREADY_DOWNLOADING /
+ // MODEL_NOT_DOWNLOADABLE / EMPTY_REQUEST
+ "message": "…human-readable…",
+ "details": { "model_id": "loras/foo.safetensors", "url": "https://…" }
+ }
+}
+
+ Triggered by clicking Download on a row or Download All Available in
+ the card header. On 202, the store immediately calls refresh() so the
+ progress bar appears in the same render tick; the regular 1 Hz polling takes over from there.
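The all-or-nothing behaviour of this endpoint can be sketched with a plain set standing in for the registry. Everything here is illustrative: `schedule_batch` and `is_allowed` are hypothetical names, only two of the documented preconditions are modelled, and the mapping of each failure to 400 or 409 is an assumption beyond the "409 on concurrent request" behaviour described earlier.

```python
from typing import Callable, Dict, List, Set, Tuple


def schedule_batch(
    requested: Dict[str, str],
    in_flight: Set[str],
    is_allowed: Callable[[str, str], bool],
) -> Tuple[int, List[str]]:
    """Register every model or none. Returns (http_status, scheduled_ids)."""
    if not requested:
        return 400, []  # assumed status for an empty request
    registered: List[str] = []
    for model_id, url in requested.items():
        if not is_allowed(model_id, url):
            status = 400
        elif model_id in in_flight:
            status = 409
        else:
            in_flight.add(model_id)
            registered.append(model_id)
            continue
        # A precondition failed: undo everything registered earlier in this batch.
        for done in registered:
            in_flight.discard(done)
        return status, []
    return 202, registered


def allow_hf(model_id: str, url: str) -> bool:
    return url.startswith("https://huggingface.co/")


in_flight = {"loras/busy.safetensors"}
ok = schedule_batch(
    {"loras/foo.safetensors": "https://huggingface.co/org/repo/resolve/main/foo.safetensors"},
    in_flight,
    allow_hf,
)
conflict = schedule_batch(
    {
        "loras/new.safetensors": "https://huggingface.co/org/repo/resolve/main/new.safetensors",
        "loras/busy.safetensors": "https://huggingface.co/org/repo/resolve/main/busy.safetensors",
    },
    in_flight,
    allow_hf,
)
```

In the second call the first model registers successfully and is then rolled back when the second one conflicts, so the registry ends up exactly as it was before the request.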
/api/cancel-model-download-session
+ { "model_id": "loras/foo.safetensors" }
+ { "cancelled": true } // or HTTP 404 with NOT_DOWNLOADING if no active session
+ The X button on a downloading row. The store re-polls availability immediately
+ so the UI flips back to "missing" without waiting for the next tick.
+Cancellation is cooperative — the worker checks is_active between chunks
+ (typically <1s latency) and rolls back its own .tmp on the way out.
/api/hf-auth-token-status
+ {
+ "token_available": true,
+ "username": "ogluzman" // resolved via HfApi.whoami(); null if token invalid
+}
+ Used by the HuggingFace settings panel on open and after any login/logout
+ action. The general polling path doesn't need this endpoint — the same boolean is
+ embedded in /api/models-availability-status under hf_auth.token_available.
+ Kept separate so the settings panel doesn't have to query the unrelated models endpoint.
/api/hf-auth-login-start
+ Empty body.
+{
+ "authorize_url": "https://huggingface.co/oauth/authorize?client_id=…&state=…&code_challenge=…"
+}
+ - HF_AUTH_NOT_ELIGIBLE: deployment fails the loopback / multi-user gate. See §6.
+ - HF_AUTH_IN_PROGRESS: another login attempt holds the callback port.
+
+Spins up the OAuth callback server on 127.0.0.1:41954 for up to 5 minutes.
+ See §4 for the full lifecycle.
Triggered from the login banner in the missing-models card, or the Log in with HuggingFace
+ button in the Settings → HuggingFace panel. On 200, the frontend opens the
+ authorize_url in a new tab via window.open(url, "_blank").
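The code_challenge and state carried by the authorize_url follow RFC 7636 and standard OAuth 2.0. The sketch below shows how such values can be derived and checked using only the standard library; the function names and the placeholder client id are hypothetical, and the query parameters beyond client_id, state, and code_challenge are the standard OAuth ones rather than something confirmed from oauth.py.

```python
import base64
import hashlib
import hmac
import secrets
from typing import Tuple
from urllib.parse import parse_qs, urlencode, urlsplit

AUTHORIZE_ENDPOINT = "https://huggingface.co/oauth/authorize"


def s256_challenge(verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636 section 4.2."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def new_login_attempt() -> Tuple[str, str, str]:
    """Return (verifier, challenge, state); the verifier stays server-side."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, inside the 43..128 limit
    state = secrets.token_urlsafe(32)
    return verifier, s256_challenge(verifier), state


def build_authorize_url(
    client_id: str, redirect_uri: str, scope: str, state: str, challenge: str
) -> str:
    query = urlencode(
        {
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "response_type": "code",
            "scope": scope,
            "state": state,
            "code_challenge": challenge,
            "code_challenge_method": "S256",
        }
    )
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def state_matches(expected: str, returned: str) -> bool:
    # Constant-time comparison; a mismatch means the callback must be rejected.
    return hmac.compare_digest(expected.encode(), returned.encode())


verifier, challenge, state = new_login_attempt()
url = build_authorize_url(
    client_id="hypothetical-client-id",
    redirect_uri="http://127.0.0.1:41954/api/auth/huggingface/callback",
    scope="openid profile gated-repos",
    state=state,
    challenge=challenge,
)
params = parse_qs(urlsplit(url).query)
```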
/api/hf-auth-logout
+ { "logged_out": true }
+ Settings → HuggingFace → Log out button. Idempotent: succeeds even if no
+ token was held. Note this does not revoke the token on HF's side; the user can do that
+ at huggingface.co/settings/tokens if they want full revocation.
+# app/model_downloader/hf_auth/eligibility.py
+
+def is_hf_auth_eligible() -> bool:
+    return _is_loopback(args.listen) and not args.multi_user
+
+HF auth surfaces (both the login flow and the settings panel) appear iff this returns True.
+
+ Core ComfyUI has no authentication. Any HF token the server holds is implicitly shared
+ by anyone who can reach the server. In a single-user local install that's fine: the OS
+ user is the boundary, and the loopback bind keeps remote actors out. In any other deployment
+ it would be a credential leak by misconfiguration:
+
+ - Non-loopback --listen: anyone who can reach the server over the network
+   would implicitly use the operator's HF token.
+ - --multi-user mode: multiple declared users (via the
+   unauthenticated comfy-user header) would all share one HF token implicitly;
+   Alice's prompts would silently fetch gated content as Bob.
+
+ Both cases are real credential leakage that the operator probably didn't realize
+ they were enabling. The gate disables the feature instead of shipping a footgun.
+
| Surface | How the gate is applied |
|---|---|
| Server feature flag hf_auth_eligible | Computed once at startup, returned by /api/features. Frontend reads it on init to decide whether to render any HF UI at all. |
| Login start endpoint | Returns 403 HF_AUTH_NOT_ELIGIBLE if called when ineligible. Defence in depth: even if a frontend bug rendered the button, the endpoint refuses. |
| Settings panel (HfAuthSettingsPanel.vue) | Registered in useSettingUI.ts only when api.serverFeatureFlags['hf_auth_eligible'] is true. |
| Card login banner | Conditional render: only shown when eligible and there's at least one gated row and no token yet. |
| Per-row gated UI text | Three variants based on (eligible, logged-in) state; see §8. |
+
+ We had to inline a copy of is_loopback in eligibility.py
+ (rather than importing from server.py) because
+ comfy_api/feature_flags.py evaluates its registry at module-import time —
+ earlier than server.py defines the helper. The inlined version is
+ ~20 lines, mirrors server.is_loopback exactly, and is the kind of thing
+ worth flagging if anyone ever does a "shared util" cleanup pass.
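For orientation, a loopback check of this general shape can be written with the standard ipaddress module. This is not a copy of server.is_loopback: it does no DNS resolution, it treats a comma-separated --listen value as eligible only if every entry is loopback (an assumption), and it takes listen and multi_user as parameters instead of reading args.

```python
import ipaddress
from typing import Optional


def is_loopback(host: Optional[str]) -> bool:
    """True for 'localhost' or a literal loopback IP; hostnames are not resolved."""
    if not host:
        return False
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal; this sketch does not do DNS lookups


def is_hf_auth_eligible(listen: str, multi_user: bool) -> bool:
    # Assumption: every address in a comma-separated listen value must be loopback.
    hosts = [h.strip() for h in listen.split(",")]
    return all(is_loopback(h) for h in hosts) and not multi_user


local_single = is_hf_auth_eligible("127.0.0.1", multi_user=False)
exposed = is_hf_auth_eligible("0.0.0.0", multi_user=False)
local_multi = is_hf_auth_eligible("127.0.0.1", multi_user=True)
ipv6_local = is_hf_auth_eligible("::1", multi_user=False)
```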
+
The polling endpoint runs probe_url(url) for every model on every tick. To
+keep that cheap (HuggingFace round-trip per probe is >100ms), the probe layer caches what's
+safe to cache and recomputes what isn't:
| Field | Cached? | Why |
|---|---|---|
| is_gated (intrinsic: "is this repo gated on HF") | ✅ Forever, per URL | Property of the model, doesn't depend on the user. Determined by a single auth_check(repo_id, token=None) on first probe. |
| file_size | ✅ Forever, per URL (but only after a successful probe) | File size doesn't change. We only attempt the HEAD when is_hf_downloadable is True, which avoids caching None from a 401-because-gated that would otherwise survive a later successful login. |
| is_hf_downloadable | ❌ Recomputed every call | Depends on the current token state. Has to update within one poll cycle after login / logout / license acceptance. Recomputed via auth_check(repo_id, token=current_token), but skipped entirely for URLs known to be non-gated (those are trivially True). |
| On-disk file existence (state) | ❌ Per call | os.path.isfile is a microsecond syscall; not worth caching, and we need it fresh so the row flips to "available" the instant a download completes. |
+
+ Single-flight protection: a per-URL asyncio.Lock dedupes concurrent probes
+ for the same URL — when many polls land in the same tick, exactly one of them runs the HF
+ call and the others await the same result. Failures aren't cached (they're transient by
+ nature; retry next call).
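The single-flight idea can be demonstrated in isolation. In this sketch the `SingleFlightCache` class and the fake probe are hypothetical; it shows a per-URL asyncio.Lock with a re-check after acquiring the lock, and it caches only successful results, matching the behaviour described above.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class SingleFlightCache:
    """Per-key lock so concurrent callers share one fetch; failures are not cached."""

    def __init__(self, fetch: Callable[[str], Awaitable[int]]) -> None:
        self._fetch = fetch
        self._locks: Dict[str, asyncio.Lock] = {}
        self._values: Dict[str, int] = {}

    async def get(self, url: str) -> int:
        if url in self._values:
            return self._values[url]
        lock = self._locks.setdefault(url, asyncio.Lock())
        async with lock:
            if url in self._values:  # filled by another caller while we waited
                return self._values[url]
            value = await self._fetch(url)  # an exception propagates; nothing is stored
            self._values[url] = value
            return value


calls = 0


async def fake_size_probe(url: str) -> int:
    """Stand-in for a remote probe; counts how often it actually runs."""
    global calls
    calls += 1
    await asyncio.sleep(0.01)
    return 1234


async def main() -> List[int]:
    cache = SingleFlightCache(fake_size_probe)
    url = "https://huggingface.co/org/repo/resolve/main/foo.safetensors"
    return await asyncio.gather(*(cache.get(url) for _ in range(5)))


results = asyncio.run(main())
```

Five concurrent callers receive the same value while the probe itself runs once; the second check inside the lock is what makes the waiting callers skip the fetch.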
+
auth_check with their token now succeeds → is_hf_downloadable
+ flips to true → the size HEAD fires on that same call → the row transitions from
+ gated UI to a Download button with the correct size, all within a second of returning.
+ No frontend cache invalidation, no focus hooks, no manual refresh.
| | Backend | Frontend |
|---|---|---|
| Repo | comfyanonymous/ComfyUI (this repo) | Comfy-Org/ComfyUI_frontend (separate repo) |
| Language / stack | Python 3.13, aiohttp, pydantic, pytest | Vue 3, TypeScript, Pinia, Vite, PrimeVue, Tailwind |
| Release artefact | Source-distributed; users pip-install the package | Built bundle published as the comfyui-frontend-package pip package; ComfyUI imports the static files. |
| This feature's files | app/model_downloader/**, two-line edit to server.py, one-line edit to comfy_api/feature_flags.py, additions to openapi.yaml, two test files under tests-unit/app_test/ | src/platform/missingModel/serverDownloads/** (new directory), a few-line edit to MissingModelCard.vue for the feature-flag switch, and a registration edit in src/platform/settings/composables/useSettingUI.ts |
+
# Backend (one terminal)
+cd ComfyUI
+python main.py --listen 127.0.0.1 --port 8189 --cpu
+
+# Frontend (another terminal)
+cd ComfyUI_frontend
+DEV_SERVER_COMFYUI_URL=http://127.0.0.1:8189 pnpm dev
+# Vite serves at http://localhost:5173 and proxies /api/* to the backend
+
+Open http://localhost:5173 in a browser — you get the Vite dev server with HMR,
+talking to your local backend.
MissingModelCard.vue renders the new
+ MissingModelCardServerSide.vue when isServerSideDownloadsAvailable()
+ returns true (the server_side_model_downloads server feature flag). Old servers
+ silently fall through to the legacy in-browser download path.
+ hf_auth_eligible is true. Read once at startup.
+ useServerSideDownloadsStore (Pinia)
+ holds the entire view of the polling response. Components read; only the store mutates.
+ models/<dir>/ manually."
+
+~70 unit tests in two files under tests-unit/app_test/:
+ - model_downloader_test.py: allowlist, path validation, registry
+   lifecycle (including epoch race semantics), orphan .tmp cleanup,
+   precondition gating on all three model routes, atomic batch behavior.
+ - hf_auth_test.py: token store (save / load / chmod / corruption /
+   refresh), eligibility under the (listen, multi_user) matrix, URL parsing,
+   probe caching (intrinsic + size + skip-when-not-downloadable), all three HF auth
+   routes, PKCE primitives + authorize URL shape.
+
+$ pytest tests-unit/app_test/model_downloader_test.py tests-unit/app_test/hf_auth_test.py -q
+71 passed in 0.23s
+
+
+ All six routes are documented in openapi.yaml with request/response schemas.
+ The spec is hand-maintained — there's no codegen between handler signatures and the YAML.
+ §10 flags this as a long-term tech-debt item.
+
+ Lint is enforced in CI via Spectral
+ (.github/workflows/openapi-lint.yml); local run:
+
npx -y @stoplight/spectral-cli@6 lint openapi.yaml --ruleset .spectral.yaml --fail-severity=error
+
+
+
+HF_CLIENT_ID in app/model_downloader/hf_auth/oauth.py is a
+ placeholder string and must be replaced with a real registered HuggingFace OAuth app's
+ client_id before the login flow can succeed. The "Action required" callout at the top of this
+ document summarises the problem; the step-by-step instructions are in §11. Until that's done, calling
+ POST /api/hf-auth-login-start succeeds locally but the resulting
+ authorize_url will return an error from huggingface.co.
+is_hf_downloadable: false for those repos with a clear log line:
+ [hf_auth] auth_check forbids …/… (HTTP 403) — treating as gated.
+ The user has to authorize their token via the org's SSO setup at
+ https://huggingface.co/organizations/<org>/sso. Not a code bug — a
+ property of the org's policy.
+--tls-keyfile / --tls-certfile but
+ doesn't enable it by default. Browsers treat http://localhost as a
+ secure context, so Secure cookies / HF auth still work without TLS on
+ loopback. Non-loopback deployments without TLS are correctly excluded by the eligibility
+ gate, so the lack of default TLS isn't a hole for this feature.
+ - properties.models[*].hash entries carry a SHA. We don't verify; trust the
+   source. Easy to add if needed (one method on stream_to_disk).
+
+ - Route paths are kebab-case (/api/models-availability-status, etc.). Older endpoints use snake_case;
+   newer assets endpoints use kebab; we picked kebab to match the newer direction.
+ - Errors use the envelope {"error": {"code": "MACHINE_READABLE", "message": "human", "details": {...}}}.
+   Matches the pattern in app/assets/api/routes.py.
+ - Request schemas live in schemas_in.py, response schemas in schemas_out.py,
+   validated via Schema.model_validate(payload) in handlers.
+ - Log lines use [model_downloader] or
+   [hf_auth] prefixes for grep-ability.
+
+# Find every backend file touched by this feature
+ls app/model_downloader app/model_downloader/api app/model_downloader/hf_auth
+
+# Find every place is_loopback is consulted (3 callers)
+grep -rn "is_loopback" --include="*.py" app/ server.py
+
+# Confirm the HF OAuth callback port and redirect URI
+grep -n "CALLBACK_PORT\|REDIRECT_URI" app/model_downloader/hf_auth/oauth.py
+
+# Run the test suite for just this feature
+.venv/bin/python -m pytest tests-unit/app_test/model_downloader_test.py \
+ tests-unit/app_test/hf_auth_test.py -q
+
+
+
+
+ Step-by-step walkthrough for creating the OAuth app whose client_id goes into
+ HF_CLIENT_ID. Reflects what the HuggingFace settings UI looked like at the
+ time this feature was developed; HF occasionally moves things around but the fields
+ themselves are stable.
+
| Field | Value | Notes |
|---|---|---|
| Application Name | e.g. ComfyUI | Shown on the user's consent screen and in their Connected Apps list. Keep it recognisable. |
| Homepage URL | Optional. Leave blank or use https://www.comfy.org. | Cosmetic. |
| Logo | Optional. | Cosmetic. |
| Token Expiration | Default (8 hours) is fine. | Our code transparently refreshes via the OAuth refresh-token flow; a shorter expiry just means refresh happens more often. Don't pick an extremely short one, since that would put needless load on HF's token endpoint. |
| Default Scopes | See §11.3 below. | Critical: this controls what consent the user sees and what the token can do. |
| Redirect URLs | http://127.0.0.1:41954/api/auth/huggingface/callback | Must match exactly. If you change CALLBACK_PORT in oauth.py, change this in lockstep. Multiple redirect URLs can be registered (one per line) if you need both dev and prod variants later. |
+
+
+ HF groups scopes into sections. The bare minimum for this feature is three
+ checkboxes total. Leave everything else off.
+
| Section | Scope to check | Why |
|---|---|---|
| User Info | openid | Required by HF when the app uses OpenID Connect at all (which our PKCE flow does; it's part of the OAuth2 + OIDC handshake). |
| User Info | profile | Lets HfApi.whoami(token=...) return a username. The Settings panel shows that username next to the "Logged in" indicator. Strictly cosmetic but expected by the UI. |
| Repository Access | gated-repos ("Read public gated repos only") | The key scope. Grants the token enough to (a) call auth_check against gated repos the user has accepted the license for, and (b) download files from those repos. Public-only: no private-repo access included, no write permissions. |
+
+
read-repos would also work for the feature (it includes
+ gated-repos plus private-repo read access), but picking it makes the
+ user's consent screen on huggingface.co look scarier ("this app wants to read your
+ private repositories"). Users may bail. Stick to gated-repos.
+
+ After creation, HF will label the app a Public app and explicitly note:
+ "No client secret. Use PKCE or device code flow for authentication." This is
+ expected and correct: we use PKCE (see §4). Do not
+ click Add client secret; we don't need it and having one without using it would
+ be a future footgun.
+
+ +The Credentials section of the new app shows a Client ID in the form of a UUID
+(e.g. a8189e14-9246-4f19-bd6a-a307bdcb9276). Copy that value and paste it
+verbatim into:
# app/model_downloader/hf_auth/oauth.py (around line 49)
+HF_CLIENT_ID = "paste-the-uuid-here"
+
+That's the only code change required. Restart ComfyUI; POST /api/hf-auth-login-start
+should now produce an authorize_url that huggingface.co accepts.
+ 1. Start the backend:
+    python main.py --listen 127.0.0.1 --port 8189
+ 2. Confirm the eligibility flag:
+    curl -s http://127.0.0.1:8189/api/features | grep hf_auth_eligible
+    # expect: "hf_auth_eligible": true
+ 3. Start a login:
+    curl -s -X POST http://127.0.0.1:8189/api/hf-auth-login-start | python3 -m json.tool
+    # expect: {"authorize_url": "https://huggingface.co/oauth/authorize?client_id=<your-uuid>&..."}
+ 4. Open the authorize_url in a browser. The consent screen should display the
+    Application Name you chose and list the three scopes (openid, profile,
+    gated-repos). Click Authorize.
+ 5. HF redirects the browser to http://127.0.0.1:41954/api/auth/huggingface/callback?code=...&state=....
+    Our local callback server completes the token exchange and renders a "Login complete" page.
+ 6. Confirm the token is stored:
+    curl -s http://127.0.0.1:8189/api/hf-auth-token-status | python3 -m json.tool
+    # expect: {"token_available": true, "username": "your-hf-username"}
+
+ Once that round-trip works, the missing-models card will use the token automatically for
+ every subsequent gated probe and download.
+
+The port 41954 is arbitrary, chosen to be high and unlikely to collide.
+If you ever need to change it, three things must move together:
+ 1. CALLBACK_PORT in app/model_downloader/hf_auth/oauth.py.
+ 2. The Redirect URLs field of the registered HF OAuth app.
+ 3. The port referenced in the tests (tests-unit/app_test/hf_auth_test.py).
+
+If they drift out of sync, HF will reject the redirect with a
+redirect_uri_mismatch error and the callback never lands.
+ Generated as a feature handover. Living document: keep it updated as the feature evolves,
+ or replace with a proper docs site entry once one exists.
+ + +