add dense matmul benchmark #256

jorendumoulin · 2024-09-24T06:25:12Z

This adds a benchmark that measures the peak performance of the GEMM accelerator, timed from the moment it is launched until it finishes.

It is configured to run on every push to main and uploads the output report to GitHub artifacts. Sneak peek available here: https://github.com/KULeuven-MICAS/snax-mlir/actions/runs/11012799975
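
As a rough illustration of how a measured cycle count can be related to peak performance, here is a minimal sketch. It assumes the accelerator performs a fixed number of multiply-accumulates per cycle. The default of 512 and the function names `ideal_cycles` and `utilization` are assumptions for illustration, not taken from this PR.

```python
def ideal_cycles(m: int, n: int, k: int, macs_per_cycle: int = 512) -> int:
    """Lower bound on cycles for an (n x k) @ (k x m) matmul.

    macs_per_cycle is a hypothetical throughput figure, not a value from the PR.
    """
    total_macs = m * n * k
    # Ceiling division: a partially filled cycle still costs a full cycle.
    return -(-total_macs // macs_per_cycle)


def utilization(measured_cycles: int, m: int, n: int, k: int,
                macs_per_cycle: int = 512) -> float:
    """Fraction of the ideal throughput reached by a measured run."""
    if measured_cycles <= 0:
        raise ValueError("measured_cycles must be positive")
    return ideal_cycles(m, n, k, macs_per_cycle) / measured_cycles
```

For example, a 16x16x16 matmul needs 4096 MACs, so under these assumptions a run measured at 10 cycles would be reported as 80% utilization.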

JosseVanDelm

Yeah let's do this!
We can probably refactor a bunch of things out of here though.
Furthermore, can you refactor everything that says tiled_matmul to not say it?
This is not a tiled matrix multiplication after all ;)

JosseVanDelm · 2024-09-25T13:16:44Z

benchmarks/dense_matmul/genbenchmark.py

+from util.snax_benchmark import SNAXBenchmark
+
+
+def create_tiled_matrix_multiply(k, m, n):


This is not a tiled_matrix_multiply anymore :)

JosseVanDelm · 2024-09-25T13:17:07Z

benchmarks/dense_matmul/genbenchmark.py

+    return module
+
+
+def write_module_to_file(module, file):


I want to move this to the tools folder at some point

JosseVanDelm · 2024-09-25T13:19:14Z

benchmarks/dense_matmul/gendata.py

+def create_test_data(n, m, k):
+    print(f"Creating test data with n={n}, m={m}, k={k}")
+    # Reset random seed for reproducible behavior
+
+    np.random.seed(0)
+
+    A_size = [n, k]
+    B_size = [k, m]
+
+    # C = A.B
+    low_bound = -128
+    high_bound = 127
+
+    A = np.random.randint(low_bound, high_bound, size=A_size, dtype=np.dtype("int8"))
+    B = np.random.randint(low_bound, high_bound, size=B_size, dtype=np.dtype("int8"))
+
+    # Make sure the product is possible!
+    assert A.shape[1] == B.shape[0]
+    C_golden = np.matmul(A.astype(np.dtype("int32")), B.astype(np.dtype("int32")))
+    C = np.zeros(C_golden.shape, np.dtype("int32"))
+
+    # Perform layout transformations before writing to memory
+
+    # only thing necessary: transform B from row-major to column-major
+    B_new_layout = np.transpose(B)
+
+    # C are just all zeros, so layout not important
+    sizes = {
+        "N_size": A.shape[0],
+        "K_size": A.shape[1],
+        "M_size": B.shape[1],
+    }
+    variables = {
+        "A": A,
+        "B": B_new_layout,
+        "C_golden": C_golden,
+        "C": C,
+    }
+
+    create_header("data.h", sizes, variables)
+    create_data("data.c", variables)


We can probably move this to a more common folder :)
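
If this does move to a shared helper, the core of it might look like the sketch below. The names `make_operands` and `matmul_with_transposed_b` are hypothetical. Note that `np.random.randint` excludes its upper bound, so the diff above never generates the value 127; the sketch uses an inclusive range instead, which is an assumption about the intent.

```python
import numpy as np


def make_operands(n: int, m: int, k: int, seed: int = 0):
    """Return int8 operands A (n x k), B (k x m) and the int32 golden result."""
    rng = np.random.default_rng(seed)  # fixed seed for reproducible data
    info = np.iinfo(np.int8)
    # endpoint=True makes the upper bound inclusive, covering all of int8.
    a = rng.integers(info.min, info.max, size=(n, k), dtype=np.int8, endpoint=True)
    b = rng.integers(info.min, info.max, size=(k, m), dtype=np.int8, endpoint=True)
    c_golden = a.astype(np.int32) @ b.astype(np.int32)
    return a, b, c_golden


def matmul_with_transposed_b(a: np.ndarray, b_t: np.ndarray) -> np.ndarray:
    """Multiply A by B when B is stored transposed (m x k).

    Each output element is the dot product of a row of A and a row of B^T.
    """
    return np.einsum("nk,mk->nm", a.astype(np.int32), b_t.astype(np.int32))
```

A quick sanity check is to transpose `b`, pass it to `matmul_with_transposed_b`, and compare the result against `c_golden` with `np.array_equal`.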

JosseVanDelm · 2024-09-25T13:19:41Z

benchmarks/dense_matmul/main.c

+      }
+    }
+
+    // insert mcycle to show fault in trace


What do you mean by this?

jorendumoulin · 2024-09-25

Just a quick way to check whether an operation succeeds by looking at the trace: if there are 3 mcycles, it was successful; if there are 4, an error has occurred.
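
This check could also be automated rather than done by eye. The sketch below is only an illustration: it assumes the trace is plain text in which every mcycle read appears on its own line containing the string `mcycle`, and the function names are hypothetical.

```python
from typing import Iterable


def count_mcycle_reads(trace_lines: Iterable[str]) -> int:
    """Count trace lines that mention mcycle (assumed one line per read)."""
    return sum(1 for line in trace_lines if "mcycle" in line)


def run_succeeded(trace_lines: Iterable[str], expected: int = 3) -> bool:
    """Per the comment above: 3 mcycles means success, 4 means an error."""
    return count_mcycle_reads(trace_lines) == expected
```

With a trace file on disk, this could be called as `run_succeeded(open(path))`, where `path` points to whatever trace the simulator produced.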

jorendumoulin force-pushed the joren/dense-benchmark branch from f1669ed to 9d1610d Compare September 24, 2024 08:45

jorendumoulin added 3 commits September 24, 2024 13:14

add dense matmul benchmark

8631594

formatting

abaff5d

finish benchmark

00e9e5a

jorendumoulin force-pushed the joren/dense-benchmark branch from 9d1610d to 00e9e5a Compare September 24, 2024 11:14

jorendumoulin added 5 commits September 24, 2024 13:18

add benchmarks workflow

77be0d8

update workflow

1271189

update workflow

9a4e288

update workflow

a22e4cf

only run on push to main

8a02eee

jorendumoulin marked this pull request as ready for review September 24, 2024 11:33

jorendumoulin requested a review from JosseVanDelm September 24, 2024 11:33

JosseVanDelm approved these changes Sep 25, 2024

jorendumoulin added 2 commits September 25, 2024 16:45

resolve comments

57471ba

adjust ideal computation calculation

fd12ec7

jorendumoulin merged commit ae4ec7f into main Sep 25, 2024
14 checks passed

jorendumoulin deleted the joren/dense-benchmark branch September 25, 2024 14:49
