AssertionError: Errors of the kernel fp8_gemm in the profiling table #9

vlluvia · 2025-02-26T05:29:04Z

Python 3.12.3
CUDA 12.6
pytorch 2.6
CUTLASS 3.8

LyricZhao · 2025-02-26T06:40:48Z

This means the bench_kineto function does not detect any kernel running.

For every bench_kineto call, you can set suppress_kineto_output=False. This may print more error information, which you can then share with us.
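To make the failing check concrete, here is a minimal sketch of what the assertion in bench_kineto appears to do, based only on the assert line visible in the traceback: it counts the rows of the profiling table that contain the kernel name and expects exactly one. The helper name count_kernel_rows and both sample tables are hypothetical and made up for illustration; they are not DeepGEMM code or real profiler output.

```python
def count_kernel_rows(profile_table: str, kernel_name: str) -> int:
    """Count the rows of a profiler table whose text contains kernel_name."""
    return sum(kernel_name in line for line in profile_table.split("\n"))


# Hypothetical table contents, invented for illustration only.
table_with_kernel = "Name  CUDA total\nfp8_gemm_kernel  12.3us\n"
table_without_kernel = "Name  CPU total\naten::empty  5.0us\n"

print(count_kernel_rows(table_with_kernel, "fp8_gemm"))     # 1: the check would pass
print(count_kernel_rows(table_without_kernel, "fp8_gemm"))  # 0: the check would fail
```

If the profiler records no CUDA activity at all, the count is 0 and the assertion fails, even when the kernel itself ran correctly.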

vlluvia · 2025-02-26T09:48:00Z

Library path:

['/root/DeepGEMM/deep_gemm']

Testing GEMM:
WARNING:2025-02-26 09:43:38 68350:68350 init.cpp:178] function cbapi->getCuptiStatus() failed with error CUPTI_ERROR_NOT_INITIALIZED (15)
WARNING:2025-02-26 09:43:38 68350:68350 init.cpp:179] CUPTI initialization failed - CUDA profiler activities will be missing
INFO:2025-02-26 09:43:38 68350:68350 init.cpp:181] If you see CUPTI_ERROR_INSUFFICIENT_PRIVILEGES, refer to https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-cupti
Traceback (most recent call last):
  File "/root/DeepGEMM/tests/test_core.py", line 156, in <module>
    test_gemm()
  File "/root/DeepGEMM/tests/test_core.py", line 80, in test_gemm
    t = bench_kineto(test_func, 'fp8_gemm', suppress_kineto_output=False)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/DeepGEMM/deep_gemm/utils.py", line 119, in bench_kineto
    assert sum([name in line for line in prof_lines]) == 1, f'Errors of the kernel {name} in the profiling table'
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Errors of the kernel fp8_gemm in the profiling table

LyricZhao · 2025-02-26T09:58:40Z

It seems you do not have sufficient privileges for CUPTI profiling, which is required for microsecond-level accurate timing.

Please follow the instructions in the PyTorch Kineto warning messages above to resolve the issue. Thanks!
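As a starting point for checking the permission side, the sketch below reads the NVIDIA driver parameters on Linux and reports whether GPU profiling is restricted to admin users. It assumes the driver exposes /proc/driver/nvidia/params with a line of the form "RmProfilingAdminOnly: 1", as described on the NVIDIA page linked in the warning. The function name profiling_admin_only is hypothetical. Note that the log above reports CUPTI_ERROR_NOT_INITIALIZED, so this check may not cover every cause of the failure.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the driver parameters on Linux.
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def profiling_admin_only(params_text: str) -> Optional[bool]:
    """Return True if profiling is admin-only, False if not, None if the key is absent."""
    for line in params_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "RmProfilingAdminOnly":
            return value.strip() == "1"
    return None


if __name__ == "__main__":
    if PARAMS_PATH.exists():
        print("Profiling restricted to admin:", profiling_admin_only(PARAMS_PATH.read_text()))
    else:
        print("NVIDIA driver params file not found")
```

A result of True suggests that non-admin processes cannot use CUPTI counters. Inside a container the value reflects the host driver, and the container may still need extra capabilities to profile.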
