Commit Graph

53 Commits

Author SHA1 Message Date
Rando717
5dcd8d2428
add files via upload
Uploaded nvcuda.zluda_get_nightly_flag.py to get nightly-flag info inside the batch script.
2025-09-21 19:31:18 +02:00
Rando717
4057f2984c
Update zluda.py (MEM_BUS_WIDTH#3)
Lower-cases the lookup inside MEM_BUS_WIDTH, just in case of incorrect casing on Radeon Pro (PRO) GPUs.

Fixed/lower-cased the "Triton device properties" lookup inside MEM_BUS_WIDTH.
2025-09-09 20:04:20 +02:00
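The case-insensitive lookup described above could be sketched like this; the table contents and the helper name `lookup_bus_width` are illustrative assumptions, not the actual zluda.py code:

```python
# Hypothetical MEM_BUS_WIDTH table; entries here are examples only.
MEM_BUS_WIDTH = {
    "amd radeon rx 5700": 256,
    "amd radeon pro w5700": 256,
}

def lookup_bus_width(device_name: str, default: int = 128) -> int:
    # Lower-case the incoming name so "Radeon PRO" casing variants still match.
    return MEM_BUS_WIDTH.get(device_name.lower(), default)
```

With this, a device reported as "AMD Radeon PRO W5700" resolves to 256 instead of falling through to the default.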
Rando717
13ba6a8a8d
Update zluda.py (cleanup print Triton version)
Compacted; raises no exception and stays silent if no version string is found.
2025-09-09 19:30:54 +02:00
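A minimal sketch of the "silent if no version string" behavior described above; the function name is hypothetical and the actual zluda.py code may differ:

```python
def print_triton_version() -> None:
    # Print the Triton version if available; stay silent when Triton
    # is missing or exposes no version string, and raise nothing.
    try:
        import triton
        version = getattr(triton, "__version__", None)
        if version:
            print(f"Triton version: {version}")
    except ImportError:
        pass
```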
Rando717
ce8900fa25
Update zluda.py (gpu_name_to_gfx)
- Changed the function into a list of rules.

- Attached the correct gfx codes to each GPU name.

- Addressed a potentially incorrect designation for the RX 6000 S Series via sort priority.
2025-09-09 18:51:41 +02:00
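The rules-list approach with sort priority could look like the sketch below; the substrings and gfx codes shown are illustrative examples, not the repo's actual table:

```python
# Ordered rules: first match wins, so more specific entries (e.g. the
# "S" suffix variants) must come before their shorter prefixes.
GFX_RULES = [
    ("rx 7600", "gfx1102"),
    ("rx 6700", "gfx1031"),
    ("rx 6800s", "gfx1032"),  # must precede "rx 6800"
    ("rx 6800", "gfx1030"),
    ("rx 5700", "gfx1010"),
]

def gpu_name_to_gfx(name: str, fallback: str = "gfx1030") -> str:
    lowered = name.lower()
    for substring, gfx in GFX_RULES:
        if substring in lowered:
            return gfx
    return fallback
```

Ordering the list this way is what prevents an "RX 6800S" from being mis-designated by the plain "RX 6800" rule.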
Rando717
e7d48450a3
Update zluda.py (removed previously added gfx90c)
The 'radeon graphics' check is not reliable enough,
since 'radeon (tm) graphics' also exists on Vega.

Plus gfx1036 Raphael (Ryzen 7000) is named 'radeon (tm) graphics', as is Granite Ridge (Ryzen 9000).
2025-09-08 21:10:20 +02:00
Rando717
590f46ab41
Update zluda.py (typo) 2025-09-08 20:31:49 +02:00
Rando717
675d6d8f4c
Update zluda.py (gfx gpu names)
- Expanded the GPU gfx names.
- Added RDNA4, RDNA3.5, ...
- Added missing Polaris cards to prevent the 'gfx1010' and 'gfx1030' fallback.
- Kept the gfx designations mostly the same, based on the available custom libs for hip57/62.

Might need some adjustments afterwards.
2025-09-08 17:55:29 +02:00
Rando717
ddb1e3da47
Update zluda.py (typo) 2025-09-08 17:22:41 +02:00
Rando717
a7336ad630
Update zluda.py (MEM_BUS_WIDTH#2)
Added Vega10/20 cards.
Can't test; no idea whether it has a real effect or is just a placebo.
2025-09-08 17:19:03 +02:00
Rando717
40199a5244
Update zluda.py (print Triton version)
Added a check for the Triton version string, if it exists.
Could be useful info for troubleshooting reports.
2025-09-08 17:00:40 +02:00
patientx
3ca065a755
fix 2025-09-05 23:11:57 +03:00
patientx
0488fe3748
rmsnorm patch second try 2025-09-05 23:10:27 +03:00
patientx
8966009181
added rmsnorm patch for torch's older than 2.4 2025-09-05 22:43:39 +03:00
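The version gate behind the rmsnorm patch above can be sketched as follows, assuming the patch targets PyTorch's `torch.nn.functional.rms_norm` (which only exists from torch 2.4 onward); the function name here is hypothetical:

```python
def needs_rmsnorm_patch(torch_version: str) -> bool:
    # torch.nn.functional.rms_norm is only available from torch 2.4,
    # so older installs get a hand-rolled fallback patched in.
    # Handles suffixes like "2.3.1+cu121" by taking only major.minor.
    major, minor = (int(p) for p in torch_version.split(".")[:2])
    return (major, minor) < (2, 4)
```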
Christopher Anderson
cf22cbd8d5 Added env_var for cudnn.benchmark 2025-08-28 09:00:08 +10:00
Christopher Anderson
110cb0a9d9 Deleted torch.backends.cudnn.benchmark line, defaults are fine 2025-08-26 08:43:31 +10:00
Christopher Anderson
1b9a3b12c2 had to move cudnn disablement up much higher 2025-08-25 14:11:54 +10:00
Christopher Anderson
cd3d60254b argggh, white space hell 2025-08-25 09:52:58 +10:00
Christopher Anderson
184fa5921f worst PR ever, really. 2025-08-25 09:42:27 +10:00
Christopher Anderson
33c43b68c3 worst PR ever 2025-08-25 09:38:22 +10:00
Christopher Anderson
2a06dc8e87 Merge remote-tracking branch 'origin/sfink-cudnn-env' into sfink-cudnn-env
# Conflicts:
#	comfy/customzluda/zluda.py
2025-08-25 09:34:32 +10:00
Christopher Anderson
3504eeeb4a rebased onto upstream master (woops) 2025-08-25 09:32:34 +10:00
Christopher Anderson
7eda4587be Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 09:31:12 +10:00
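The env-var toggle from this commit could be wired up as in the sketch below; the variable name TORCH_BACKENDS_CUDNN_ENABLED and its default of 1 come from the commit message, while the helper function is an illustrative assumption:

```python
import os

def cudnn_enabled_from_env(default: str = "1") -> bool:
    # "1" (the default) keeps cuDNN on; any other value disables it.
    # In zluda.py the result would then be applied as:
    #   torch.backends.cudnn.enabled = cudnn_enabled_from_env()
    return os.environ.get("TORCH_BACKENDS_CUDNN_ENABLED", default) == "1"
```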
Christopher Anderson
954644ef83 Added env var TORCH_BACKENDS_CUDNN_ENABLED, defaults to 1. 2025-08-25 08:56:48 +10:00
Rando717
053a6b95e5
Update zluda.py (MEM_BUS_WIDTH)
Added more cards, mostly RDNA(1) and Radeon Pro.

Reasoning: every time zluda.py gets updated I have to manually add 256 for my RX 5700; otherwise it defaults to 128. Also, manual local edits fail at git pull.
2025-08-24 18:39:40 +02:00
patientx
c92a07594b
Update zluda.py 2025-08-24 12:01:20 +03:00
patientx
dba9d20791
Update zluda.py 2025-08-24 10:23:30 +03:00
patientx
2e39e0999f
Update zluda.py 2025-08-05 19:21:20 +03:00
Christopher Anderson
4f853403fe Bad ideas from zluda update. 2025-08-05 06:00:55 +10:00
patientx
37415c40c1
device identification and setting triton arch override 2025-08-04 10:44:18 +03:00
patientx
8ad7604b12
Add files via upload 2025-06-26 01:45:32 +03:00
patientx
e34ff57933
Add files via upload 2025-06-17 00:09:40 +03:00
patientx
a5e9b6729c
Update zluda.py 2025-06-13 01:40:34 +03:00
patientx
7bd5bcd135
Update zluda-default.py 2025-06-13 01:40:11 +03:00
patientx
c06a15d8a5
Update zluda.py 2025-06-13 01:03:45 +03:00
patientx
896bda9003
Update zluda-default.py 2025-06-13 01:03:14 +03:00
patientx
7d5f4074b6
Add files via upload 2025-06-12 16:10:32 +03:00
patientx
08784dc90d
Update zluda.py 2025-06-12 13:19:59 +03:00
patientx
11af025690
Update zluda.py 2025-06-12 13:11:03 +03:00
patientx
9aeff135b2
Update zluda.py 2025-06-02 02:55:19 +03:00
patientx
1cf25c6980
Add files via upload 2025-05-15 13:55:15 +03:00
patientx
44cac886c4
Create quant_per_block.py 2025-05-15 13:54:47 +03:00
patientx
01aae8eddc
Create fwd_prefill.py 2025-05-15 13:52:35 +03:00
patientx
3609a0cf35
Update zluda.py 2025-05-13 18:57:09 +03:00
patientx
8ced886a3d
Update zluda.py 2025-05-13 16:23:24 +03:00
patientx
05d6c876ad
Update zluda.py 2025-05-13 01:06:08 +03:00
patientx
cd7eb9bd36
"Boolean value of Tensor with more than one value is ambiguous" fix 2025-05-11 20:39:42 +03:00
patientx
8abcc4ec4f
Update zluda.py 2025-05-11 16:51:43 +03:00
patientx
8a3424f354
Update zluda.py 2025-05-09 00:17:15 +03:00
patientx
9444d18408
Update zluda.py 2025-05-08 23:44:22 +03:00
patientx
6c2370a577
Update zluda.py 2025-05-08 20:18:26 +03:00