Commit

Fix tensor core instruction shape (#19)
* fix tensor core instruction shape

* remove no-op constraint
nlaanait authored Mar 9, 2025
1 parent 5732f20 commit 5039ede
Showing 2 changed files with 2 additions and 1 deletion.
1 change: 1 addition & 0 deletions custom-ops-matrix-multiplication/benchmarks.mojo
@@ -167,6 +167,7 @@ def matmul():
 bench_matmul_kernel["tiled_register"]()
 bench_matmul_kernel["block_tiled"]()
 bench_matmul_kernel["block_tiled_vectorized"]()
+bench_matmul_kernel["tensor_core"]()
 
 bench.config.verbose_metric_names = False
 print(bench)
@@ -975,7 +975,7 @@ struct MatrixMultiplication[algorithm: StringLiteral]:
 alias WN = 32
 alias MMA_M = 16
 alias MMA_N = 8
-alias MMA_K = 8
+alias MMA_K = 4
 alias NUM_WARPS = (BM // WM) * (BN // WN)
 gpu_ctx.enqueue_function[
 tensor_core_matrix_multiplication[
