feat/server-side-model-downloads
HF_CLIENT_ID in app/model_downloader/hf_auth/oauth.py is a
placeholder string ("REPLACE_ME_WITH_COMFY_ORG_HF_OAUTH_CLIENT_ID").
HuggingFace will reject the authorize redirect until a real app is registered under a
Comfy-Org-controlled HF account and the constant is replaced.
Detailed walkthrough is in §11 — HuggingFace OAuth app setup at the bottom of this doc; it lists each field and which boxes to tick. Until the placeholder is replaced, the backend is otherwise fully functional (state polling, public downloads, gated detection all work) — only the login flow itself fails.
ComfyUI workflows declare model dependencies inline via properties.models
entries on loader nodes — each one carries a filename, a directory (e.g. loras,
checkpoints), and a URL to fetch the file from. Until this feature, when a
workflow loaded with a missing model, the frontend offered the user a download button that
triggered a plain browser download via a synthesized <a download> click.
Files landed in the user's Downloads folder; users then had to manually move them
into ComfyUI/models/<directory>/. Gated HuggingFace models couldn't be
downloaded at all without manual huggingface-cli login + hf_hub_download
out-of-band.
This change moves the fetch to the server, lands files in the correct on-disk location, and adds authenticated HuggingFace support so gated models can be downloaded after a one-click OAuth flow.
auth_check) with appropriate UI states.
Key idea: every concern lives in app/model_downloader/ as a self-contained
subsystem. Wiring into the rest of ComfyUI is two lines in server.py
(register_routes(self.app)) and one feature-flag entry in
comfy_api/feature_flags.py.
DOWNLOAD_SERVER (in app/model_downloader/download_server.py) is the
process-wide registry of in-flight downloads. It exists so that:
- Only one download per model_id can run at a time, preventing two
  concurrent writers to the same destination path.

@dataclass
class DownloadSession:
    model_id: str                  # e.g. "loras/my_lora.safetensors"
    url: str                       # the URL we're fetching from
    progress: Optional[float]      # fraction in [0,1]; None until total known
    bytes_downloaded: int
    total_bytes: Optional[int]
    epoch: int                     # see "atomicity" below
The registry is a plain dict[str, DownloadSession] guarded by a
threading.Lock (callable from both the asyncio event-loop thread and
the download-worker tasks).
Three independent atomicity guarantees, each addressing a different race:
- Downloads stream to <final>.tmp and use os.replace for the promotion. A
  crashed or cancelled download leaves only the .tmp, never a partial file
  under the final name that a loader would happily load and silently turn
  into garbage outputs.
- try_register holds the lock and inserts iff no entry exists. A second
  concurrent request for the same model_id returns None; the route then
  responds 409 and rolls back any sessions it had already registered in
  this batch.
- Each DownloadSession carries an epoch counter assigned on registration.
  If a user cancels and then immediately re-triggers the download (same
  model_id), a new session with a new epoch is registered. The old worker,
  still running on the cancelled session, observes is_active(session) as
  False (epoch mismatch), rolls back its own .tmp, and exits without
  affecting the new session. This prevents the old worker's late finish()
  from accidentally evicting the new session.
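The registration and epoch guarantees above can be sketched as follows. This is a minimal illustration, not the actual DOWNLOAD_SERVER code: the `Registry` and `Session` classes and the boolean return value of `finish` are assumptions made for the example; only the names try_register, is_active, finish and epoch come from the text.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    model_id: str
    epoch: int


class Registry:
    """Hypothetical stand-in for the process-wide download registry."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sessions: Dict[str, Session] = {}
        self._next_epoch = 0

    def try_register(self, model_id: str) -> Optional[Session]:
        # Check and insert happen under one lock hold, so two concurrent
        # requests for the same model_id cannot both succeed.
        with self._lock:
            if model_id in self._sessions:
                return None
            self._next_epoch += 1
            session = Session(model_id, self._next_epoch)
            self._sessions[model_id] = session
            return session

    def is_active(self, session: Session) -> bool:
        # A worker is active only while the registered entry carries its epoch.
        with self._lock:
            current = self._sessions.get(session.model_id)
            return current is not None and current.epoch == session.epoch

    def finish(self, session: Session) -> bool:
        # Evict only our own entry; a stale worker must not remove a newer one.
        with self._lock:
            current = self._sessions.get(session.model_id)
            if current is None or current.epoch != session.epoch:
                return False
            del self._sessions[session.model_id]
            return True
```

In this sketch a cancel is modelled as `finish(old)`; a stale worker that later calls `finish(old)` again gets False and leaves the replacement session in place.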
When the server restarts mid-download, any .tmp file is by definition orphaned.
DOWNLOAD_SERVER.sweep_orphan_tmp_files() walks every registered model folder
and removes *.tmp files. Idempotent; runs on the first download request rather
than module import to keep the import path I/O-free.
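A minimal sketch of such a sweep, assuming the model folders are passed in as paths and that the walk is recursive (the text does not say whether subdirectories are included). The function name mirrors the one above, but the signature and return value are illustrative only.

```python
from pathlib import Path
from typing import Iterable, List


def sweep_orphan_tmp_files(model_dirs: Iterable[Path]) -> List[Path]:
    """Delete leftover *.tmp files and return the paths that were removed."""
    removed: List[Path] = []
    for root in model_dirs:
        if not root.is_dir():
            continue
        # Materialise the listing first so deletion does not race the walk.
        for tmp in list(root.rglob("*.tmp")):
            if tmp.is_file():
                tmp.unlink(missing_ok=True)
                removed.append(tmp)
    return removed
```

Running it a second time finds nothing to delete, which is the idempotence property mentioned above.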
When session.url is on huggingface.co and a token is stored,
stream_to_disk attaches Authorization: Bearer <access_token>
to the GET. Non-HF URLs receive no auth header (avoids token leakage to other hosts).
This is HF's documented way to access gated repos with a personal access token — no
reliance on huggingface_hub's download API.
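The host check can be illustrated with a small helper. This is a sketch under assumptions: the function name `auth_headers` is hypothetical, and requiring https plus an exact hostname match (no subdomains) is a choice made for the example rather than something the text specifies.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def auth_headers(url: str, access_token: Optional[str]) -> Dict[str, str]:
    """Return the bearer header only for huggingface.co URLs."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Compare the parsed hostname, not a substring of the URL, so that
    # look-alike hosts such as huggingface.co.evil.example get no token.
    if parts.scheme == "https" and host == "huggingface.co" and access_token:
        return {"Authorization": f"Bearer {access_token}"}
    return {}
```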
Some HF repos are gated — the user has to accept a license / be approved before they can
download. The bearer token from a logged-in HF account passes that gate. Rather than
asking the user to paste a personal access token (security-awful UX), we run a proper
OAuth 2.0 Authorization Code flow with PKCE, identical pattern to what the
huggingface-cli login command does internally.
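For readers unfamiliar with PKCE, the verifier/challenge derivation defined by RFC 7636 (S256 method) can be sketched as below. The function names are hypothetical and the verifier length is an arbitrary choice within the RFC's 43 to 128 character range; this is not a copy of the code in oauth.py.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def challenge_for(verifier: str) -> str:
    """S256: base64url(SHA-256(verifier)) with the '=' padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    # 64 random bytes encode to 86 URL-safe characters.
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)
```

Only the challenge goes into the authorize URL; the verifier is kept server-side and sent later with the token exchange.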
| Layer | Storage | Notes |
|---|---|---|
| In memory | HF_AUTH_STORE singleton (auth_store.py) | Lazily loaded from disk on first access. Mutations also flushed to disk. |
| On disk | <user_dir>/hf_auth_token.json | Atomic write via .tmp + os.replace, chmod 0600 so only the OS user can read it. |
Token shape (mirrors what HF returns from the token endpoint):
{
  "access_token": "hf_oauth_…",
  "refresh_token": "…",          // null if not granted
  "expires_at": 1739895432.0,    // absolute epoch seconds
  "scope": "openid profile read-repos"
}
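The on-disk write described in the storage table (.tmp + os.replace, mode 0600) might look roughly like this. The helper names `save_token` and `load_token` are hypothetical, and treating a missing or corrupt file as "no token" is an assumption for the example.

```python
import json
import os
from pathlib import Path
from typing import Optional


def save_token(path: Path, token: dict) -> None:
    """Write the token next to its final path, then promote it atomically."""
    tmp = path.with_name(path.name + ".tmp")
    # Create the file with owner-only permissions from the start.
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(token, fh)
    os.chmod(tmp, 0o600)  # also covers a pre-existing .tmp with wider mode
    os.replace(tmp, path)


def load_token(path: Path) -> Optional[dict]:
    """Return the stored token, or None if the file is missing or corrupt."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```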
Calling login-start while already logged in (or with a pending login flow)
will either lock-conflict (409) or overwrite the existing token on success.
This is intentional given the single-tenant scope; see §6.
Standard OAuth 2.0 PKCE (RFC 7636) with the SHA-256 method:
- The verifier never leaves the server process.
- Only the code_challenge is sent in the authorize URL.
- state is validated on callback; mismatches return 400 and the token
  exchange is skipped (CSRF defence).
All routes live under /api/, use kebab-case paths, and POST for input-bearing
operations even when they're "read-only". This keeps semantics uniform and
avoids URL-length limits when payloads grow.
/api/models-availability-status
1 Hz poll
One-stop status endpoint. Returns per-model state (available / missing / downloading) plus metadata (file size, HF downloadability) plus current HF auth snapshot, all in one shot.
{
  "models": {
    "loras/foo.safetensors": "https://huggingface.co/org/repo/resolve/main/foo.safetensors",
    "checkpoints/bar.safetensors": "https://huggingface.co/.../bar.safetensors"
  }
}

{
  "models": {
    "loras/foo.safetensors": {
      "state": "downloading",        // "available" | "missing" | "downloading"
      "progress": {
        "bytes_downloaded": 1024000,
        "total_bytes": 29145431166,
        "progress": 0.000035         // null until total known
      },
      "file_size": 29145431166,      // bytes; null if not probed
      "is_hf_downloadable": true     // null for non-HF / probe failure
    },
    "checkpoints/bar.safetensors": {
      "state": "missing",
      "progress": null,
      "file_size": 1234567890,
      "is_hf_downloadable": false    // gated, no access
    }
  },
  "hf_auth": {
    "token_available": true,
    "eligible": true
  }
}
Called every 1 second by useServerSideDownloadsStore.refresh()
while the missing-models card is mounted. Timer auto-stops when no row is downloading
and every remaining missing row is gated (no further state changes possible without
a user action).
The polling timer re-arms on user actions: clicking Download, clicking HF login, or a workflow change that re-registers the model list.
/api/download-models
202 on accept
Trigger one or more downloads. Atomic: either every model passes every precondition (valid id, allowed URL, not on disk, not in flight, not gated-to-us) and all are scheduled, or none are — the request returns an error and the registry is left unchanged.
{
  "models": {
    "loras/foo.safetensors": "https://huggingface.co/.../foo.safetensors"
  }
}
HTTP 202 Accepted
{
  "accepted": true,
  "scheduled": ["loras/foo.safetensors"]
}
HTTP 400 / 409
{
  "error": {
    "code": "MODEL_NOT_DOWNLOADABLE",  // INVALID_MODEL_ID / URL_NOT_ALLOWED /
                                       // ALREADY_AVAILABLE / ALREADY_DOWNLOADING /
                                       // MODEL_NOT_DOWNLOADABLE / EMPTY_REQUEST
    "message": "…human-readable…",
    "details": { "model_id": "loras/foo.safetensors", "url": "https://…" }
  }
}
Triggered by clicking Download on a row or Download All Available in
the card header. On 202, the store immediately calls refresh() so the
progress bar appears in the same render tick; the regular 1 Hz polling takes over from there.
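The all-or-nothing behaviour of this endpoint can be sketched as a register-then-roll-back loop. This is a simplified illustration: a plain set stands in for the registry, and `register_batch` and `BatchConflict` are hypothetical names, not the route handler's actual code.

```python
from typing import List, Set


class BatchConflict(Exception):
    """Raised when one model in the batch is already in flight."""

    def __init__(self, model_id: str) -> None:
        super().__init__(model_id)
        self.model_id = model_id


def register_batch(in_flight: Set[str], model_ids: List[str]) -> List[str]:
    """Register every id, or leave in_flight exactly as it was."""
    registered: List[str] = []
    for model_id in model_ids:
        if model_id in in_flight:
            # Undo only what this batch added, then report the conflict.
            for done in registered:
                in_flight.discard(done)
            raise BatchConflict(model_id)
        in_flight.add(model_id)
        registered.append(model_id)
    return registered
```

A handler built this way would map `BatchConflict` to the 409 error envelope shown above and only schedule workers once the whole batch has registered.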
/api/cancel-model-download-session
{ "model_id": "loras/foo.safetensors" }
{ "cancelled": true } // or HTTP 404 with NOT_DOWNLOADING if no active session
The X button on a downloading row. The store re-polls availability immediately so the UI flips back to "missing" without waiting for the next tick.
Cancellation is cooperative — the worker checks is_active between chunks
(typically <1s latency) and rolls back its own .tmp on the way out.
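A minimal sketch of cooperative cancellation combined with the .tmp promotion, assuming the body arrives as an iterable of byte chunks and that activity is reported by a zero-argument callable. The name `stream_chunks_to_disk` and its signature are illustrative and do not describe the real stream_to_disk.

```python
import os
from pathlib import Path
from typing import Callable, Iterable


def stream_chunks_to_disk(
    chunks: Iterable[bytes],
    final_path: Path,
    is_active: Callable[[], bool],
) -> bool:
    """Write chunks to <final>.tmp; promote on success, roll back otherwise."""
    tmp_path = final_path.with_name(final_path.name + ".tmp")
    completed = False
    try:
        with open(tmp_path, "wb") as fh:
            for chunk in chunks:
                # Cancellation is only observed between chunks.
                if not is_active():
                    return False
                fh.write(chunk)
        if not is_active():
            return False
        os.replace(tmp_path, final_path)
        completed = True
        return True
    finally:
        # Covers cancellation and exceptions alike: never leave a .tmp behind.
        if not completed:
            tmp_path.unlink(missing_ok=True)
```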
/api/hf-auth-token-status
{
"token_available": true,
"username": "ogluzman" // resolved via HfApi.whoami(); null if token invalid
}
Used by the HuggingFace settings panel on open and after any login/logout
action. The general polling path doesn't need this endpoint — the same boolean is
embedded in /api/models-availability-status under hf_auth.token_available.
Kept separate so the settings panel doesn't have to query the unrelated models endpoint.
/api/hf-auth-login-start
Empty body.
{
"authorize_url": "https://huggingface.co/oauth/authorize?client_id=…&state=…&code_challenge=…"
}
- HF_AUTH_NOT_ELIGIBLE: deployment fails the loopback / multi-user gate. See §6.
- HF_AUTH_IN_PROGRESS: another login attempt holds the callback port.
Spins up the OAuth callback server on 127.0.0.1:41954 for up to 5 minutes.
See §4 for the full lifecycle.
Triggered from the login banner in the missing-models card, or the Log in with HuggingFace
button in the Settings → HuggingFace panel. On 200, the frontend opens the
authorize_url in a new tab via window.open(url, "_blank").
/api/hf-auth-logout
{ "logged_out": true }
Settings → HuggingFace → Log out button. Idempotent — succeeds even if no token was held. Note this does not revoke the token on HF's side; the user can do that at huggingface.co/settings/tokens if they want full revocation.
# app/model_downloader/hf_auth/eligibility.py
def is_hf_auth_eligible() -> bool:
    return _is_loopback(args.listen) and not args.multi_user
HF auth surfaces — both the login flow and the settings panel — appear iff this returns True.
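To make the gate concrete, here is a standalone sketch of the same decision. The loopback test below uses the standard ipaddress module and is an assumption for illustration; it is not the inlined copy of server.is_loopback, which may resolve hostnames differently. The function names are hypothetical.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Assumed loopback test: 'localhost' or a literal loopback IP address."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat unknown hostnames as non-loopback.
        return False


def is_eligible(listen: str, multi_user: bool) -> bool:
    # Both conditions must hold: loopback bind and single-user mode.
    return is_loopback_host(listen) and not multi_user
```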
Core ComfyUI has no authentication. Any HF token the server holds is implicitly shared by anyone who can reach the server. In a single-user local install that's fine: the OS user is the boundary, and the loopback bind keeps remote actors out. In any other deployment it would be a credential leak by misconfiguration:
- Non-loopback --listen: anyone who can reach the server over the network
  could use the stored HF token.
- --multi-user mode: multiple declared users (via the unauthenticated
  comfy-user header) would all share one HF token implicitly, so Alice's
  prompts would silently fetch gated content as Bob.
Both cases are real credential leakage that the operator probably didn't realize they were enabling. The gate disables the feature instead of shipping a footgun.
| Surface | How the gate is applied |
|---|---|
| Server feature flag hf_auth_eligible | Computed once at startup, returned by /api/features. Frontend reads it on init to decide whether to render any HF UI at all. |
| Login start endpoint | Returns 403 HF_AUTH_NOT_ELIGIBLE if called when ineligible. Defence in depth: even if a frontend bug rendered the button, the endpoint refuses. |
| Settings panel (HfAuthSettingsPanel.vue) | Registered in useSettingUI.ts only when api.serverFeatureFlags['hf_auth_eligible'] is true. |
| Card login banner | Conditional render: only shown when eligible and there's at least one gated row and no token yet. |
| Per-row gated UI text | Three variants based on (eligible, logged-in) state; see §8. |
We had to inline a copy of is_loopback in eligibility.py
(rather than importing from server.py) because
comfy_api/feature_flags.py evaluates its registry at module-import time —
earlier than server.py defines the helper. The inlined version is
~20 lines, mirrors server.is_loopback exactly, and is the kind of thing
worth flagging if anyone ever does a "shared util" cleanup pass.
The polling endpoint runs probe_url(url) for every model on every tick. To
keep that cheap (HuggingFace round-trip per probe is >100ms), the probe layer caches what's
safe to cache and recomputes what isn't:
| Field | Cached? | Why |
|---|---|---|
| is_gated (intrinsic: "is this repo gated on HF") | ✅ Forever, per URL | Property of the model, doesn't depend on the user. Determined by a single auth_check(repo_id, token=None) on first probe. |
| file_size | ✅ Forever, per URL (but only after a successful probe) | File size doesn't change. We only attempt the HEAD when is_hf_downloadable is True, which avoids caching None from a 401-because-gated that would otherwise survive a later successful login. |
| is_hf_downloadable | ❌ Recomputed every call | Depends on the current token state. Has to update within one poll cycle after login / logout / license acceptance. Recomputed via auth_check(repo_id, token=current_token), but skipped entirely for URLs known to be non-gated (those are trivially True). |
| On-disk file existence (state) | ❌ Per call | os.path.isfile is a microsecond syscall; not worth caching, and we need it fresh so the row flips to "available" the instant a download completes. |
Single-flight protection: a per-URL asyncio.Lock dedupes concurrent probes
for the same URL — when many polls land in the same tick, exactly one of them runs the HF
call and the others await the same result. Failures aren't cached (they're transient by
nature; retry next call).
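The single-flight pattern described above can be sketched with a per-URL asyncio.Lock and a double-checked cache. `SingleFlightCache` and its `fetch` callback are hypothetical names used for illustration; the real probe layer caches several fields with different rules.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class SingleFlightCache:
    """Cache successful results per URL; concurrent callers share one fetch."""

    def __init__(self, fetch: Callable[[str], Awaitable[int]]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, int] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    async def get(self, url: str) -> int:
        if url in self._cache:
            return self._cache[url]
        lock = self._locks.setdefault(url, asyncio.Lock())
        async with lock:
            # Re-check: another caller may have filled the cache while we waited.
            if url in self._cache:
                return self._cache[url]
            # If fetch raises, nothing is stored, so the next call retries.
            value = await self._fetch(url)
            self._cache[url] = value
            return value
```

With five concurrent `get` calls for the same URL, the first holder of the lock performs the fetch and the other four return the cached value once the lock is released.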
After a login or license acceptance, the next poll's auth_check with the
user's token succeeds → is_hf_downloadable flips to true → the size HEAD
fires on that same call → the row transitions from gated UI to a Download
button with the correct size, all within a second of returning.
No frontend cache invalidation, no focus hooks, no manual refresh.
| | Backend | Frontend |
|---|---|---|
| Repo | comfyanonymous/ComfyUI (this repo) | Comfy-Org/ComfyUI_frontend (separate repo) |
| Language / stack | Python 3.13, aiohttp, pydantic, pytest | Vue 3, TypeScript, Pinia, Vite, PrimeVue, Tailwind |
| Release artefact | Source-distributed; users pip-install the package | Built bundle published as the comfyui-frontend-package pip package; ComfyUI imports the static files. |
| This feature's files | app/model_downloader/**, two-line edit to server.py, one-line edit to comfy_api/feature_flags.py, additions to openapi.yaml, two test files under tests-unit/app_test/ | src/platform/missingModel/serverDownloads/** (new directory), a few-line edit to MissingModelCard.vue for the feature-flag switch, and a registration edit in src/platform/settings/composables/useSettingUI.ts |
# Backend (one terminal)
cd ComfyUI
python main.py --listen 127.0.0.1 --port 8189 --cpu
# Frontend (another terminal)
cd ComfyUI_frontend
DEV_SERVER_COMFYUI_URL=http://127.0.0.1:8189 pnpm dev
# Vite serves at http://localhost:5173 and proxies /api/* to the backend
Open http://localhost:5173 in a browser — you get the Vite dev server with HMR,
talking to your local backend.
MissingModelCard.vue renders the new
MissingModelCardServerSide.vue when isServerSideDownloadsAvailable()
returns true (the server_side_model_downloads server feature flag). Old servers
silently fall through to the legacy in-browser download path.
hf_auth_eligible is true. Read once at startup.
useServerSideDownloadsStore (Pinia)
holds the entire view of the polling response. Components read; only the store mutates.
models/<dir>/ manually."
~70 unit tests in two files under tests-unit/app_test/:
- model_downloader_test.py: allowlist, path validation, registry
  lifecycle (including epoch race semantics), orphan .tmp cleanup,
  precondition gating on all four model routes, atomic batch behavior.
- hf_auth_test.py: token store (save / load / chmod / corruption /
  refresh), eligibility under the (listen, multi_user) matrix, URL parsing,
  probe caching (intrinsic + size + skip-when-not-downloadable), all three
  HF auth routes, PKCE primitives + authorize URL shape.
$ pytest tests-unit/app_test/model_downloader_test.py tests-unit/app_test/hf_auth_test.py -q
71 passed in 0.23s
All six routes are documented in openapi.yaml with request/response schemas.
The spec is hand-maintained — there's no codegen between handler signatures and the YAML.
§10 flags this as a long-term tech-debt item.
Lint is enforced in CI via Spectral
(.github/workflows/openapi-lint.yml); local run:
npx -y @stoplight/spectral-cli@6 lint openapi.yaml --ruleset .spectral.yaml --fail-severity=error
HF_CLIENT_ID in app/model_downloader/hf_auth/oauth.py is a
placeholder string and must be replaced with a real registered HuggingFace OAuth app's
client_id before the login flow can succeed. Full instructions are at the top of this
document (the yellow "Action required" callout). Until that's done, calling
POST /api/hf-auth-login-start succeeds locally but the resulting
authorize_url will return an error from huggingface.co.
is_hf_downloadable: false for those repos with a clear log line:
[hf_auth] auth_check forbids …/… (HTTP 403) — treating as gated.
The user has to authorize their token via the org's SSO setup at
https://huggingface.co/organizations/<org>/sso. Not a code bug — a
property of the org's policy.
ComfyUI supports TLS via --tls-keyfile / --tls-certfile but
doesn't enable it by default. Browsers treat http://localhost as a
secure context, so Secure cookies / HF auth still work without TLS on
loopback. Non-loopback deployments without TLS are correctly excluded by the eligibility
gate, so the lack of default TLS isn't a hole for this feature.
properties.models[*].hash entries carry a SHA. We don't verify; trust the
source. Easy to add if needed (one method on stream_to_disk).
- Route paths are kebab-case (/api/models-availability-status, etc.). Older
  endpoints use snake_case; newer assets endpoints use kebab; we picked
  kebab to match the newer direction.
- Errors use the envelope
  {"error": {"code": "MACHINE_READABLE", "message": "human", "details": {...}}}.
  Matches the pattern in app/assets/api/routes.py.
- Request schemas live in schemas_in.py, response schemas in schemas_out.py,
  validated via Schema.model_validate(payload) in handlers.
- Log lines use [model_downloader] or [hf_auth] prefixes for grep-ability.
# Find every backend file touched by this feature
ls app/model_downloader app/model_downloader/api app/model_downloader/hf_auth
# Find every place is_loopback is consulted (3 callers)
grep -rn "is_loopback" --include="*.py" app/ server.py
# Confirm the HF OAuth callback port and redirect URI
grep -n "CALLBACK_PORT\|REDIRECT_URI" app/model_downloader/hf_auth/oauth.py
# Run the test suite for just this feature
.venv/bin/python -m pytest tests-unit/app_test/model_downloader_test.py \
tests-unit/app_test/hf_auth_test.py -q
Step-by-step walkthrough for creating the OAuth app whose client_id goes into
HF_CLIENT_ID. Reflects what the HuggingFace settings UI looked like at the
time this feature was developed; HF occasionally moves things around but the fields
themselves are stable.
| Field | Value | Notes |
|---|---|---|
| Application Name | e.g. ComfyUI | Shown on the user's consent screen and in their Connected Apps list. Keep it recognisable. |
| Homepage URL | Optional. Leave blank or use https://www.comfy.org. | Cosmetic. |
| Logo | Optional. | Cosmetic. |
| Token Expiration | Default (8 hours) is fine. | Our code transparently refreshes via the OAuth refresh-token flow; a shorter expiry just means refresh happens more often. Don't pick an extremely short one, since that would put needless load on HF's token endpoint. |
| Default Scopes | See §11.3 below. | Critical: this controls what consent the user sees and what the token can do. |
| Redirect URLs | http://127.0.0.1:41954/api/auth/huggingface/callback | Must match exactly. If you change CALLBACK_PORT in oauth.py, change this in lockstep. Multiple redirect URLs can be registered (one per line) if you need both dev and prod variants later. |
HF groups scopes into sections. The bare minimum for this feature is three checkboxes total. Leave everything else off.
| Section | Scope to check | Why |
|---|---|---|
| User Info | openid | Required by HF when the app uses OpenID Connect at all (which our PKCE flow does; it's part of the OAuth2 + OIDC handshake). |
| User Info | profile | Lets HfApi.whoami(token=...) return a username. The Settings panel shows that username next to the "Logged in" indicator. Strictly cosmetic but expected by the UI. |
| Repository Access | gated-repos ("Read public gated repos only") | The key scope. Grants the token enough to (a) call auth_check against gated repos the user has accepted the license for, and (b) download files from those repos. Public-only: no private-repo access included, no write permissions. |
read-repos would also work for the feature (it includes
gated-repos plus private-repo read access), but picking it makes the
user's consent screen on huggingface.co look scarier ("this app wants to read your
private repositories"). Users may bail. Stick to gated-repos.
After creation, HF will label the app a Public app and explicitly note: "No client secret. Use PKCE or device code flow for authentication." This is expected and correct — we use PKCE (see §4). Do not click Add client secret; we don't need it and having one without using it would be a future footgun.
The Credentials section of the new app shows a Client ID in the form of a UUID
(e.g. a8189e14-9246-4f19-bd6a-a307bdcb9276). Copy that value and paste it
verbatim into:
# app/model_downloader/hf_auth/oauth.py (around line 49)
HF_CLIENT_ID = "paste-the-uuid-here"
That's the only code change required. Restart ComfyUI; POST /api/hf-auth-login-start
should now produce an authorize_url that huggingface.co accepts.
1. Start ComfyUI:
   python main.py --listen 127.0.0.1 --port 8189
2. Confirm the eligibility flag:
   curl -s http://127.0.0.1:8189/api/features | grep hf_auth_eligible
   # expect: "hf_auth_eligible": true
3. Start a login:
   curl -s -X POST http://127.0.0.1:8189/api/hf-auth-login-start | python3 -m json.tool
   # expect: {"authorize_url": "https://huggingface.co/oauth/authorize?client_id=<your-uuid>&..."}
4. Open the authorize_url in a browser. The consent screen should display the
   Application Name you chose and list the three scopes (openid, profile,
   gated-repos). Click Authorize.
5. HF redirects to http://127.0.0.1:41954/api/auth/huggingface/callback?code=...&state=....
   Our local callback server completes the token exchange and renders a "Login complete" page.
6. Check the token status:
   curl -s http://127.0.0.1:8189/api/hf-auth-token-status | python3 -m json.tool
   # expect: {"token_available": true, "username": "your-hf-username"}
Once that round-trip works, the missing-models card will use the token automatically for every subsequent gated probe and download.
The port 41954 is arbitrary — chosen to be high and unlikely to collide.
If you ever need to change it, three things must move together:
- CALLBACK_PORT in app/model_downloader/hf_auth/oauth.py.
- The Redirect URL registered on the HF OAuth app (see the field table above).
- The tests that reference it (tests-unit/app_test/hf_auth_test.py).
If they drift out of sync, HF will reject the redirect with a
redirect_uri_mismatch error and the callback never lands.
Generated as a feature handover. Living document — keep it updated as the feature evolves, or replace with a proper docs site entry once one exists.