Update docker images to upgrade CUDNN from 9.5 to 9.6 #23244
base: main
@tianleiwu, there is a strange error from the cuDNN frontend that was caused by upgrading cuDNN from 9.5 to 9.6. Could you please help me take a look?
I tried upgrading both cudnn-frontend and cuDNN, and submitted a test build: https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1579205&view=results In the worst case, we may add a fallback that calls the cuDNN backend directly, as before, for the cases that the cuDNN frontend cannot handle.
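The fallback idea above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the onnxruntime implementation: `Status`, `ConvPath`, `ConvPlanResult`, `PlanBuilder`, and `BuildConvPlanWithFallback` are all hypothetical names, and the two builders are passed in as callables so that no real cuDNN or cudnn-frontend API is assumed. It also assumes failures are reported as status values, whereas the failing call in the log is made in the throwing mode (`THRW = true`), so a real change would need to catch or avoid that exception.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for a status type such as onnxruntime::common::Status.
struct Status {
  bool ok;
  std::string message;
};

// Records which path (if any) produced a usable convolution plan.
enum class ConvPath { kFrontend, kLegacyBackend, kNone };

struct ConvPlanResult {
  ConvPath path;
  Status status;
};

// Hypothetical callable that tries to build a convolution plan.
using PlanBuilder = std::function<Status()>;

// Try the frontend path first; if it fails (for example with
// HEURISTIC_QUERY_FAILED), fall back to the legacy backend path.
ConvPlanResult BuildConvPlanWithFallback(const PlanBuilder& build_with_frontend,
                                         const PlanBuilder& build_with_legacy_backend) {
  assert(build_with_frontend && build_with_legacy_backend);

  Status frontend_status = build_with_frontend();
  if (frontend_status.ok) {
    return {ConvPath::kFrontend, frontend_status};
  }

  Status legacy_status = build_with_legacy_backend();
  if (legacy_status.ok) {
    return {ConvPath::kLegacyBackend, legacy_status};
  }

  // Both paths failed: keep both messages so the original cause is not lost.
  return {ConvPath::kNone,
          Status{false, frontend_status.message +
                            "; fallback also failed: " + legacy_status.message}};
}
```

Returning the chosen path alongside the status would let the caller log when the fallback was taken, which helps when tracking down version-specific regressions like this one.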
The error was:
[E:onnxruntime:yolov3, sequential_executor.cc:505 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'conv2d_2_0' Status Message: Failed to initialize CUDNN Frontend
/onnxruntime_src/onnxruntime/core/providers/cuda/cudnn_fe_call.cc:99 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnn_frontend::error_object; bool THRW = true; SUCCTYPE = cudnn_frontend::error_code_t; std::conditional_t<THRW, void, common::Status> = void]
/onnxruntime_src/onnxruntime/core/providers/cuda/cudnn_fe_call.cc:91 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnn_frontend::error_object; bool THRW = true; SUCCTYPE = cudnn_frontend::error_code_t; std::conditional_t<THRW, void, common::Status> = void]
CUDNN_FE failure 8: HEURISTIC_QUERY_FAILED ; GPU=0 ; hostname=98d137446008 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=225 ; expr=s_.cudnn_fe_graph->create_execution_plans({heur_mode});
@gedoensmax, @JTischbein, is it a known issue that cuDNN 9.6 has a regression in convolution support for YOLOv3? Here is the cuDNN 9.6 debug log:
c9c52e1 to 5944339
…oft/onnxruntime into snnn/update_docker_images
cuDNN in the Linux Docker images has been upgraded from 9.5.1 to 9.6.0.