EasyAI Code Hosting Platform

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-06-24 16:59:29 +08:00

Author	SHA1	Message	Date
Jedrzej Kosinski	aeb3c77ae9	Cube3D: route VAE decode through managed comfy.sd.VAE.decode	2026-06-14 23:28:22 -07:00

    Stop fighting ComfyUI's model management. VAEDecodeCube was manually calling load_models_gpu + .to(vae.device) and the VAE forced disable_offload=True because it bypassed the managed decode path.

    Now CubeShapeVAE.decode(samples) is the entry point that comfy.sd.VAE.decode calls, so loading/device/dtype are handled automatically (like Hunyuan3Dv2):
    - removed disable_offload=True (let the offload system manage weights)
    - removed manual load_models_gpu + .to(device) from the node
    - process_output set to identity (default clamps [0,1] in-place and would destroy the occupancy isosurface)
    - decode() pre-inverts VAE.decode's trailing movedim(1,-1) so the node receives grid logits unchanged (parity preserved)
    - memory_used_decode sized by num_tokens (shape[-1]) for the new latent layout

    Amp-Thread-ID: https://ampcode.com/threads/T-019ec361-addb-70d8-a74b-438ce8a1e096
    Co-authored-by: Amp <amp@ampcode.com>

Jedrzej Kosinski	a6c7397b71	Cube3D: use channels-first 1D latent (B,1,L) like Hunyuan3Dv2	2026-06-14 23:14:17 -07:00

    Replaces the dummy trailing-dim latent with a channels-first 1D latent (B, 1, num_tokens) and a dedicated latent_formats.Cube3D (latent_channels=1, latent_dimensions=1). This mirrors the existing native 3D model Hunyuan3Dv2's (B, C, L) convention and avoids fix_empty_latent_channels truncating the token sequence (it narrows dim=1 to latent_channels for empty latents). Requires no core sampler changes: encode_model_conds sees a valid noise.shape[2].
    - latent_formats.Cube3D added; wired into supported_models.Cube3D
    - EmptyCubeLatent emits (B, 1, num_tokens)
    - sample_cube takes T from x.shape[-1], returns (B, 1, T), and repeats conditioning to the latent batch size

    Amp-Thread-ID: https://ampcode.com/threads/T-019ec361-addb-70d8-a74b-438ce8a1e096
    Co-authored-by: Amp <amp@ampcode.com>

Jedrzej Kosinski	871f7bc390	Cube3D: fix graph integration (3D latent, VAE device, fp32 cond, scikit-image)	2026-06-14 22:59:11 -07:00

    Amp-Thread-ID: https://ampcode.com/threads/T-019ec361-addb-70d8-a74b-438ce8a1e096
    Co-authored-by: Amp <amp@ampcode.com>

Jedrzej Kosinski	01a8783bee	Add native Roblox Cube3D text-to-3D support	2026-06-14 20:21:37 -07:00

    Cube3D is an autoregressive VQ-token shape model (DualStreamRoformer) plus a VQ-VAE shape tokenizer (OneDAutoEncoder), not a diffusion model. It is wired natively following the Causal-WAN AR-video pattern: the GPT loads as a normal MODEL and generation runs through a dedicated 'cube' sampler instead of KSampler.
    - comfy/ldm/cube/gpt.py: DualStreamRoformer port (dual-stream RoPE attention, per-head RMSNorm, SwiGLU, KV cache; rope_theta=10000).
    - comfy/ldm/cube/vae.py: OneDAutoEncoder decode path (codebook lookup, decoder, occupancy decoder, dense-grid extraction + skimage marching cubes).
    - model_detection/supported_models/model_base: register shape_gpt as Cube3D MODEL (dims inferred from state dict; apply_model guarded to point at SamplerCube).
    - sd.py: detect shape_tokenizer and build CubeShapeVAE.
    - k_diffusion/sampling.py: sample_cube autoregressive sampler (decaying CFG + optional top-p), faithful to upstream Engine.run_gpt.
    - comfy_extras/nodes_cube.py: EmptyCubeLatent, CubeCodebookPatch (inject VQ codebook into wte), SamplerCube, VAEDecodeCube (-> MESH).

    Reuses CLIP-L conditioning, CFGGuider/SamplerCustomAdvanced, and SaveGLB.

    Amp-Thread-ID: https://ampcode.com/threads/T-019ec361-addb-70d8-a74b-438ce8a1e096
    Co-authored-by: Amp <amp@ampcode.com>
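
Commit 01a8783bee describes sample_cube as an autoregressive sampler with "decaying CFG + optional top-p". The sketch below illustrates that general idea in plain Python for a single decoding step. It is not the code from the commit: the linear decay schedule, the guidance formula, the greedy fallback when top_p is None, and every function name here are assumptions made for illustration.

```python
import math
import random


def cfg_logits(cond, uncond, scale):
    # Assumed guidance form: move the conditional logits away from the
    # unconditional ones by `scale`.
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def linear_decay(scale, step, total_steps):
    # Hypothetical schedule: full guidance at step 0, none at the last step.
    return scale * (total_steps - step) / total_steps


def top_p_filter(logits, top_p):
    # Softmax with the usual max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalise over the kept tokens; everything else gets probability 0.
    mass = sum(probs[i] for i in kept)
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / mass
    return out


def sample_token(cond, uncond, step, total_steps, scale=3.0, top_p=None, rng=random):
    logits = cfg_logits(cond, uncond, linear_decay(scale, step, total_steps))
    if top_p is None:
        # Assumed fallback: greedy argmax when nucleus sampling is disabled.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = top_p_filter(logits, top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In an autoregressive loop, sample_token would be called once per generated token with step running from 0 to total_steps - 1, so guidance is strongest for the earliest tokens.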

4 Commits
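
Commit aeb3c77ae9 notes that decode() "pre-inverts VAE.decode's trailing movedim(1,-1)" so the node receives the grid logits unchanged. The sketch below shows why that works, tracking only tensor shapes in plain Python. The helper movedim_shape and the example grid shape (2, 3, 4, 5) are hypothetical; the helper just mimics how moving one dimension reorders a shape, and is not the code from the commit.

```python
def movedim_shape(shape, src, dst):
    # Mimic the shape effect of moving one dimension: remove dim `src`
    # and re-insert it at position `dst`, keeping the others in order.
    n = len(shape)
    dims = list(shape)
    moved = dims.pop(src % n)
    dims.insert(dst % n, moved)
    return tuple(dims)


# Hypothetical occupancy-grid logits shape: (batch, x, y, z).
grid_shape = (2, 3, 4, 5)

# Without pre-inversion, the trailing movedim(1, -1) reorders the grid axes.
after_managed_only = movedim_shape(grid_shape, 1, -1)

# With pre-inversion, decode() first applies the opposite move, movedim(-1, 1),
# so the managed path's movedim(1, -1) restores the original layout.
pre_inverted = movedim_shape(grid_shape, -1, 1)
round_trip = movedim_shape(pre_inverted, 1, -1)

print(after_managed_only)  # (2, 4, 5, 3)
print(pre_inverted)        # (2, 5, 3, 4)
print(round_trip)          # (2, 3, 4, 5)
```

Because the two moves are exact inverses of each other, the same reasoning applies to the tensor contents and not only to the shape.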