mirror of
https://github.com/comfyanonymous/ComfyUI.git
synced 2026-06-24 00:39:30 +08:00
Cube3D is an autoregressive VQ-token shape model (DualStreamRoformer) plus a VQ-VAE shape tokenizer (OneDAutoEncoder), not a diffusion model. It is wired natively following the Causal-WAN AR-video pattern: the GPT loads as a normal MODEL and generation runs through a dedicated 'cube' sampler instead of KSampler.

- comfy/ldm/cube/gpt.py: DualStreamRoformer port (dual-stream RoPE attention, per-head RMSNorm, SwiGLU, KV cache; rope_theta=10000).
- comfy/ldm/cube/vae.py: OneDAutoEncoder decode path (codebook lookup, decoder, occupancy decoder, dense-grid extraction + skimage marching cubes).
- model_detection/supported_models/model_base: register shape_gpt as Cube3D MODEL (dims inferred from state dict; apply_model guarded to point at SamplerCube).
- sd.py: detect shape_tokenizer and build CubeShapeVAE.
- k_diffusion/sampling.py: sample_cube autoregressive sampler (decaying CFG + optional top-p), faithful to upstream Engine.run_gpt.
- comfy_extras/nodes_cube.py: EmptyCubeLatent, CubeCodebookPatch (inject VQ codebook into wte), SamplerCube, VAEDecodeCube (-> MESH).

Reuses CLIP-L conditioning, CFGGuider/SamplerCustomAdvanced, and SaveGLB.

Amp-Thread-ID: https://ampcode.com/threads/T-019ec361-addb-70d8-a74b-438ce8a1e096
Co-authored-by: Amp <amp@ampcode.com>

| | | |
|---|---|---|
| .. | ||
| gpt.py | ||
| vae.py | ||
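
The commit message describes `sample_cube` as an autoregressive sampler with decaying CFG and optional top-p. The following is a minimal, dependency-free sketch of one such token-selection step, not the code in `k_diffusion/sampling.py`. It assumes the guidance scale decays linearly to zero over the generation length and that guidance is applied as `(1 + scale) * cond - scale * uncond`; the actual schedule and formula are not stated in the commit message. All function names here (`cfg_scale`, `guided_logits`, `top_p_filter`, `sample_token`) are hypothetical.

```python
import math
import random


def cfg_scale(step, total_steps, base_scale):
    """Guidance scale at a given step (ASSUMED linear decay to 0)."""
    return base_scale * (total_steps - step) / total_steps


def guided_logits(cond, uncond, scale):
    """Classifier-free guidance (ASSUMED form): push cond away from uncond."""
    return [(1.0 + scale) * c - scale * u for c, u in zip(cond, uncond)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest high-probability set whose mass reaches top_p,
    zero out the rest, and renormalize the kept entries."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / total
    return out


def sample_token(cond, uncond, step, total_steps, base_scale,
                 top_p=None, rng=random):
    """Pick the next token id from conditional/unconditional logits."""
    scale = cfg_scale(step, total_steps, base_scale)
    logits = guided_logits(cond, uncond, scale)
    if top_p is None:
        # No top-p given: fall back to greedy argmax (a choice made for this sketch).
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = top_p_filter(softmax(logits), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In a full sampler this step would run once per generated shape token, with `cond` and `uncond` coming from two forward passes (or one batched pass) of the GPT and the KV cache carrying state between steps.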