SageAttention research confirms it remains useful (2-5x speedup over
FlashAttention, active development through SageAttention 3, broad community
adoption for video and high-resolution workloads).
Restore it as an opt-in startup-compiled feature (FORCE_SAGE_ATTENTION=1).
Entrypoint is cleaned up vs the original: simplified GPU probe (drops
per-arch flag exports, keeps what's needed for strategy selection),
cleaner build logic with merged clone/update paths, removed dead code.
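A minimal sketch of what such an opt-in gate could look like in an entrypoint. This is hypothetical, not the actual entrypoint code: the `FORCE_SAGE_ATTENTION` variable and the import test follow the description above, while `build_launch_args` and the `--listen` default are illustrative placeholders.

```shell
#!/bin/sh
# Hypothetical sketch of the opt-in gate described above: SageAttention is
# only activated when FORCE_SAGE_ATTENTION=1 AND the compiled module imports.
build_launch_args() {
    args="--listen 0.0.0.0"
    if [ "${FORCE_SAGE_ATTENTION:-0}" = "1" ]; then
        # Probe that the module is actually importable before enabling it.
        if python -c "import sageattention" 2>/dev/null; then
            args="$args --use-sage-attention"
        else
            echo "FORCE_SAGE_ATTENTION=1 set but import failed; skipping" >&2
        fi
    fi
    printf '%s\n' "$args"
}
```

The import probe keeps the gate fail-safe: if the startup compile produced a broken module, the container still launches, just without the flag.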
- Dockerfile: restore SAGE_ATTENTION_AVAILABLE=0 default env var
- entrypoint.sh: simplified but functionally equivalent SageAttention
support; removes ~150 lines of redundant per-arch flag tracking
- README: fix docker-compose gpus syntax to 'gpus: all'; restore
SageAttention docs with accurate description of behavior
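The compose fix above refers to the short-form GPU request supported by recent Compose versions. A minimal illustration (the service and image names are placeholders, not from this repo):

```shell
# Write a minimal docker-compose.yml using the short-form GPU syntax the
# README fix refers to; service and image names are placeholders.
cat > /tmp/docker-compose.example.yml <<'EOF'
services:
  comfyui:
    image: example/comfyui:latest   # placeholder image name
    gpus: all                       # short-form GPU request (recent Compose)
    ports:
      - "8188:8188"
EOF
grep -q 'gpus: all' /tmp/docker-compose.example.yml && echo "gpus syntax present"
```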
https://claude.ai/code/session_01WQc56fWdK329K11kRGnb5g
- Dockerfile: fix glibc 2.41 patch path (cuda-12.9 -> cuda-12.8 to match
installed packages); remove SAGE_ATTENTION_AVAILABLE env var
- sync-build-release.yml: add always() to the publish job condition so it
  runs even when build-self is skipped (the normal case, where the primary
  GitHub runner build succeeds); previously releases were never created on
  normal builds
- entrypoint.sh: remove SageAttention compilation and GPU detection logic;
simplify to permissions setup, ComfyUI-Manager sync, custom node install,
and launch
- README: update CUDA version references from 12.9/cu129 to 12.8/cu128;
remove SageAttention documentation; fix docker-compose GPU syntax
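The always() change above works around GitHub Actions' default behavior: a job that `needs` a skipped job is itself skipped. A sketch of the pattern, with hypothetical job names (`build`, `build-self`, `publish`), not the repo's actual workflow:

```shell
# Sketch of the publish-job condition change described above. always()
# overrides the default "skip if any needed job was skipped"; the condition
# then checks each dependency's result explicitly, accepting build-self
# being skipped (the normal runner path). Job names are hypothetical.
cat > /tmp/publish-job.example.yml <<'EOF'
  publish:
    needs: [build, build-self]
    if: >-
      always() &&
      needs.build.result == 'success' &&
      (needs.build-self.result == 'success' || needs.build-self.result == 'skipped')
    runs-on: ubuntu-latest
    steps:
      - run: echo "publishing release"
EOF
echo "wrote example publish job"
```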
Updates README to match the Dockerfile and entrypoint:
- Python 3.12 slim trixie with CUDA 12.9 dev libs and PyTorch via cu129 wheels
- SageAttention is built at startup but only enabled when FORCE_SAGE_ATTENTION=1
  and the import test passes
- Compose example uses deploy device reservations with driver: nvidia and
  capabilities: [gpu]
- documents PUID/PGID, COMFY_AUTO_INSTALL, and FORCE_SAGE_ATTENTION
- clarifies port 8188 mapping and how to change ports
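The Compose example described above uses the long-form device reservation rather than the short `gpus:` syntax. A sketch of what such a file could contain (the image name is a placeholder; env values are illustrative):

```shell
# Minimal compose file using deploy device reservations with driver: nvidia
# and capabilities: [gpu], plus the documented env vars. Image name and
# values are placeholders, not this repo's published image.
cat > /tmp/compose-reservation.example.yml <<'EOF'
services:
  comfyui:
    image: example/comfyui:latest        # placeholder
    ports:
      - "8188:8188"                      # change the host side to remap the port
    environment:
      - PUID=1000                        # illustrative values
      - PGID=1000
      - FORCE_SAGE_ATTENTION=1           # opt-in, per the behavior above
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
grep -q 'driver: nvidia' /tmp/compose-reservation.example.yml && echo ok
```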
Update README to reflect that SageAttention 2.2/2++ is compiled into the
image at build time and enabled automatically on launch using
--use-sage-attention. Clarify NVIDIA GPU setup expectations and note that
no extra steps are required to activate SageAttention in container runs.
Changes:
- Features: add “SageAttention 2.2 baked in” and “Auto-enabled at launch”.
- Getting Started: note that SageAttention is compiled during docker build
and requires no manual install.
- Docker Compose: confirm the image launches with SageAttention enabled by default.
- Usage: add a SageAttention subsection with startup log verification notes.
- General cleanup and wording to align with current image behavior.
No functional code changes; documentation only.
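The startup-log verification mentioned above might amount to grepping the container logs for a SageAttention line. The snippet below simulates the log text, since the exact wording depends on ComfyUI's actual startup output; against a real container you would pipe `docker logs <name>` instead:

```shell
# Simulated startup log showing the kind of line to look for; the log
# content here is illustrative, not captured from a real run.
logs='Total VRAM 24576 MB
Using sage attention
Starting server'
printf '%s\n' "$logs" | grep -i 'sage attention' \
  && echo "SageAttention enabled in this run"
```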
* Change the bf16 check, switch non_blocking to off by default (with an
  option to force it on to regain speed on certain classes of iGPUs), and
  refactor the XPU check.
* Turn non_blocking off by default for xpu.
* Update README.md for Intel GPUs.