Commit Graph

4 Commits

Author SHA1 Message Date
John Pollock
9250191c65 feat(isolation): DynamicVRAM compatibility for process isolation
DynamicVRAM's on-demand model loading/offloading conflicted with  process isolation in three ways: RPC tensor transport stalls from mid-call GPU offload, race conditions between model lifecycle and active RPC operations, and false positive memory leak detection from changed finalizer patterns.

- Marshal CUDA tensors to CPU before RPC transport for dynamic models
- Add operation state tracking + quiescence waits at workflow boundaries
- Distinguish proxy reference release from actual leaks in cleanup_models_gc
- Fix init order: DynamicVRAM must initialize before isolation proxies
- Add RPC timeouts to prevent indefinite hangs on model unavailability
- Prevent proxy-of-proxy chains from DynamicVRAM model reload cycles
- Add torch.device/torch.dtype serializers for new DynamicVRAM RPC paths
- Guard isolation overhead so non-isolated workflows are unaffected
- Migrate env var to PYISOLATE_CHILD
2026-03-04 23:48:02 -06:00
guill
00fff6019e
feat(jobs): add 3d to PREVIEWABLE_MEDIA_TYPES for first-class 3D output support (#12381)
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-02-10 14:37:14 -08:00
ric-yu
04c49a29b4
feat: add cancelled filter to /jobs (#11680) 2026-01-08 21:57:36 -08:00
ric-yu
1ca89b810e
Add unified jobs API with /api/jobs endpoints (#11054)
Some checks are pending
Python Linting / Run Ruff (push) Waiting to run
Python Linting / Run Pylint (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.10, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.11, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-stable (12.1, , linux, 3.12, [self-hosted Linux], stable) (push) Waiting to run
Full Comfy CI Workflow Runs / test-unix-nightly (12.1, , linux, 3.11, [self-hosted Linux], nightly) (push) Waiting to run
Execution Tests / test (macos-latest) (push) Waiting to run
Execution Tests / test (ubuntu-latest) (push) Waiting to run
Execution Tests / test (windows-latest) (push) Waiting to run
Test server launches without errors / test (push) Waiting to run
Unit Tests / test (macos-latest) (push) Waiting to run
Unit Tests / test (ubuntu-latest) (push) Waiting to run
Unit Tests / test (windows-2022) (push) Waiting to run
* feat: create a /jobs api to return queue and history jobs

* update unused vars

* include priority

* create jobs helper file

* fix ruff

* update how we set error message

* include execution error in both responses

* rename error -> failed, fix output shape

* re-use queue and history functions

* set workflow id

* allow srot by exec duration

* fix tests

* send priority and remove error msg

* use ws messages to get start and end times

* revert main.py fully

* refactor: move all /jobs business logic to jobs.py

* fix failing test

* remove some tests

* fix non dict nodes

* address comments

* filter by workflow id and remove null fields

* add clearer typing - remove get("..") or ..

* refactor query params to top get_job(s) doc, add remove_sensitive_from_queue

* add brief comment explaining why we skip animated

* comment that format field is for frontend backward compatibility

* fix whitespace

---------

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
Co-authored-by: guill <jacob.e.segal@gmail.com>
2025-12-17 21:44:31 -08:00