* Add jobs-namespace cancel endpoints
Add two cancel endpoints under the jobs namespace so a job can be
cancelled by id without the caller needing to know whether the job is
running or pending, or branching between /interrupt and /queue.
- POST /api/jobs/{job_id}/cancel cancels one job by id. Idempotent: an
already-finished or unknown id returns 200 {"cancelled": false} rather
than an error.
- POST /api/jobs/cancel takes {"job_ids": [...]} and cancels a batch.
Fail-fast: if any id is unknown the request returns 404 listing the
unknown ids and cancels nothing (no partial side effects).
Both are state-agnostic and map onto the existing queue mechanics: a
running job is interrupted (same path as /interrupt), a pending job is
dequeued (same path as /queue {"delete": [...]}). The cancel logic lives
in comfy_execution.jobs as pure, unit-tested helpers; the server handlers
are thin wrappers. openapi.yaml documents both routes.
* fix: resolve review feedback on cancel endpoints
- Guard cancel_job() against TOCTOU: when dequeue() returns False the
pending job left the queue between snapshot and delete; return
CANCEL_UNKNOWN so callers never report cancelled=True for a remove
that did not happen.
- Validate each job_ids element in the batch cancel endpoint before
any queue access; unhashable or non-UUID values now return 400
instead of raising TypeError (500).
- Update batch HTTP tests to use canonical UUID ids (required now that
the endpoint validates id format) and add tests for the new guards.
* fix: make job cancel atomic and best-effort
Addresses two cancel races/edges raised in review.
Targeted, atomic interrupt. cancel_job's interrupt callback now takes the
prompt id and returns whether it fired; the single-cancel route backs it
with the new PromptQueue.interrupt_if_running, which checks the running set
and signals the interrupt under the queue mutex. This closes the TOCTOU
where a pending job that starts executing between the snapshot and dequeue
(or a running job that finishes between the snapshot and interrupt) could be
missed or, worse, cause an unrelated prompt to be interrupted. The per-prompt
interrupt-flag reset in execute_async keeps a finished job from leaking the
interrupt onto its successor.
Best-effort batch cancel. POST /api/jobs/cancel no longer fails the whole
batch with 404 when one id is unknown/finished; such ids are treated as
no-ops, so "cancel all" still cancels the in-progress jobs even if some
finished between the client's snapshot and the request. Malformed ids are
still rejected with 400.
The job_ids query filter added in #13998 has no live consumer: the
frontend Generated tab kept sourcing from GET /jobs, and the cloud side
removed its equivalent filter from the shared asset spec. Carrying it on
the local server only re-introduces Core<->Cloud drift on the shared
contract, so remove it to match.
Removed: the job_ids field + validator on ListAssetsQuery, the IN(...)
clauses in list_references_page, the service/route passthrough, and the
filter-only tests.
Kept: the canonical-UUID prompt_id enforcement at job creation (also
landed in #13998). It stands on its own -- job ids are matched verbatim
by history keys, websocket correlation, and /interrupt -- and cloud
inherits it by running core for execution, so no divergence is created.
* feat(assets): add job_ids filter to GET /api/assets
Mirrors the existing cloud `job_ids` query param on the local Python server:
clients can pass a comma-separated list (or repeated query params) of UUIDs
to filter assets by their associated job.
The `AssetReference.job_id` column already exists, so no migration is
needed — this just plumbs the filter through schema → service → query.
Marks the parameter as available in both runtimes by dropping the
`[cloud-only]` description prefix and the `x-runtime: [cloud]` tag from
the OpenAPI spec, per the OSS field-drift convention (absent runtime tag
= populated by both local and cloud).
* fix(assets): tighten job_ids — array schema, max_length, narrow except
From cursor-reviews on the parent commit:
- OpenAPI: declare job_ids as `type: array, items: string format: uuid`
with `style: form, explode: true` so it matches the documented
contract (and matches sibling include_tags/exclude_tags shape).
Description now states both accepted shapes explicitly.
- Schema: cap `job_ids` at 500 entries (max_length on the Pydantic
field) so a client can't splice an unbounded list into the IN clauses.
- Schema: drop `AttributeError` from the except — `raw` only contains
`str` items by construction, so `uuid.UUID(<str>)` raises `ValueError`
exclusively; the second clause was dead code.
* fix(assets): tighten job_ids validator + add schema-level tests
Aligns with the parallel hardening from draft PR #13848 (now closed as
a duplicate). The validator now:
- Raises ValueError on non-string list items (was: silently dropped).
- Raises ValueError on non-string / non-list top-level values like dict
or int (was: silently passed through to Pydantic's downstream coercion).
Adds tests-unit/assets_test/queries/test_list_assets_query.py covering
the validator end-to-end: CSV canonicalization, dedup order, default
empty, invalid UUID, non-string list item, non-string non-list value,
and the max_length=500 boundary.
* feat(prompt): enforce canonical UUID prompt_id at job creation
POST /prompt previously accepted any client-supplied prompt_id verbatim,
str()-coercing even non-strings, and minting the literal job id "None"
for an explicit JSON null. The new GET /api/assets job_ids filter matches
stored job ids as canonical UUIDs exactly, so a non-UUID id minted a job
whose assets could never be filtered.
- validate_job_id (comfy_execution/jobs.py): requires a string in the
canonical lowercase hyphenated UUID form; raises ValueError otherwise,
including parseable-but-non-canonical spellings (uppercase, braced, URN,
bare hex), which would otherwise be silently rewritten and then miss
every exact-match lookup downstream (history keys, websocket
correlation, /interrupt, the assets job_ids filter).
- POST /prompt: absent or null prompt_id means the server mints uuid4;
invalid means 400 invalid_prompt_id on the standard error envelope.
- openapi.yaml: document the request-side prompt_id (format uuid,
nullable) on PromptRequest.
- tests: unit matrix for validate_job_id; integration tests against the
booted server covering rejection, acceptance, and null handling.
---------
Co-authored-by: guill <jacob.e.segal@gmail.com>
Move count increment before isinstance(item, dict) check so that
non-dict output items (like text strings from PreviewAny node)
are included in outputs_count.
This aligns OSS Python with Cloud's Go implementation which uses
len(itemsArray) to count ALL items regardless of type.
Amp-Thread-ID: https://ampcode.com/threads/T-019c0bb5-14e0-744f-8808-1e57653f3ae3
Co-authored-by: Amp <amp@ampcode.com>
* feat: create a /jobs api to return queue and history jobs
* update unused vars
* include priority
* create jobs helper file
* fix ruff
* update how we set error message
* include execution error in both responses
* rename error -> failed, fix output shape
* re-use queue and history functions
* set workflow id
* allow srot by exec duration
* fix tests
* send priority and remove error msg
* use ws messages to get start and end times
* revert main.py fully
* refactor: move all /jobs business logic to jobs.py
* fix failing test
* remove some tests
* fix non dict nodes
* address comments
* filter by workflow id and remove null fields
* add clearer typing - remove get("..") or ..
* refactor query params to top get_job(s) doc, add remove_sensitive_from_queue
* add brief comment explaining why we skip animated
* comment that format field is for frontend backward compatibility
* fix whitespace
---------
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>