Yousef Rafat
3dd39efa03
some fixes
2025-12-02 23:40:31 +02:00
Yousef Rafat
a58133f188
updated the hunyuan moe forward
split the forward statement between ready and pending experts
2025-12-01 22:58:53 +02:00
Yousef Rafat
76e14d69b2
operations + device + dtype | checkpoint skip
2025-12-01 18:53:43 +02:00
Yousef Rafat
88c350bfed
corrected img_ratio
2025-11-28 20:43:01 +02:00
Yousef Rafat
bd2c2f7375
removed additional token injection
2025-11-28 18:21:20 +02:00
Yousef Rafat
334041f6a6
efficiency improvements
2025-11-26 21:21:45 +02:00
Yousef Rafat
823870db53
fixed some mistakes/errors
2025-11-25 21:34:03 +02:00
Yousef Rafat
b4001bbd27
input correction and improvements
2025-11-25 00:25:27 +02:00
Yousef Rafat
86e9a7a669
updates to the input
2025-11-23 22:20:24 +02:00
Yousef Rafat
a3ac798d4e
decrease peak memory in moe forward
2025-11-22 23:02:42 +02:00
Yousef Rafat
ae8592ebf5
zero-copy and optimized moe loader
2025-11-22 12:37:10 +02:00
Yousef Rafat
4d982e83f6
added clip encoder
2025-11-21 17:22:37 +02:00
Yousef Rafat
b84af5b947
small attention fix
2025-11-17 23:03:52 +02:00
Yousef Rafat
3f71760913
resblock fix
2025-11-17 06:50:54 +02:00
Yousef Rafat
61b1efdaf0
vectorized correct implementation of moe forward
2025-11-16 19:25:37 +02:00
Yousef Rafat
4a5509a4c5
.
2025-11-16 16:20:35 +02:00
Yousef Rafat
d731c58353
improving performance and fixing race condition
2025-11-16 16:19:39 +02:00
Yousef Rafat
12cc6924ac
meta init
2025-11-14 20:10:52 +02:00
Yousef Rafat
7b4c1e8031
async cache revamp
Added async loading and offloading of moe layers, giving consistent memory usage without oom errors.
Previously hit an oom error after the third layer on a 24 GB gpu; now runs to the end with consistent memory usage and minimal latency.
2025-11-14 09:15:16 +02:00
Yousef Rafat
44346c4251
removed all errors
2025-11-08 19:49:02 +02:00
Yousef Rafat
5056a1f4d4
important fixes
2025-11-06 00:24:49 +02:00
Yousef Rafat
9e9c536c8e
fixes from testing
2025-11-04 23:55:16 +02:00
Yousef Rafat
ca119c44fb
returned kv cache for image generation
2025-11-01 23:06:11 +02:00
Yousef Rafat
10a17dc85d
a bunch of fixes
2025-11-01 16:40:49 +02:00
Yousef Rafat
a2fff60d4c
vectorized implementation of moe/fixes for issues
2025-10-31 23:53:13 +02:00
Yousef Rafat
de43880bdb
Hunyuan Image 3.0
2025-10-31 18:56:20 +02:00