fix(cuda): install NVML development library #4621
Merged
Description
Recently, after running `setup_dev_env.sh` and installing the NVIDIA libraries, part of NVML (`nvml.h`) is not installed. This affects the `gpu_monitor` node in `system_monitor`, which uses NVML: `gpu_monitor` determines that NVML does not exist and publishes errors because it is unable to access the GPU.
See also autowarefoundation/autoware.universe#6787.
I'd like to explicitly install the NVML development library as a workaround for this issue.
Tests performed
1. Completely remove NVIDIA drivers and libraries.
2. Confirm that only `hwloc/nvml.h` exists.
3. Run `setup_dev_env.sh` and install the NVIDIA libraries.
4. Confirm that NVIDIA's `nvml.h` is installed.
5. Delete the build and install directories for `system_monitor`.
   rm -rf install/system_monitor/ build/system_monitor/
6. Build `system_monitor` and ensure that the build uses NVML (`GPU PLATFORM: nvml`) and completes successfully.
7. Run Autoware.
   ros2 launch autoware_launch planning_simulator.launch.xml map_path:=/data/sample-map-planning vehicle_model:=sample_vehicle sensor_model:=sample_sensor_kit launch_system_monitor:=true
8. Run runtime_monitor and confirm that `gpu_monitor` does not report an error.
   ros2 run rqt_runtime_monitor rqt_runtime_monitor
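The two header checks in the steps above (only `hwloc/nvml.h` present before the setup, NVIDIA's `nvml.h` present afterwards) could be scripted roughly as follows. This is a minimal sketch, not part of this PR: the function names `find_nvidia_nvml_headers` and `check_nvml_header` are hypothetical, and the search roots passed to them are assumptions that depend on where the NVIDIA packages place their headers on a given machine.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this PR) for checking which nvml.h headers exist.

# Print every nvml.h found under the given root directories, skipping
# hwloc's header of the same name (hwloc/nvml.h), which is not the NVML API header.
find_nvidia_nvml_headers() {
    find "$@" \( -type f -o -type l \) -name nvml.h ! -path '*/hwloc/*' 2>/dev/null || true
}

# Return 0 and list the headers if a non-hwloc nvml.h exists under the given roots;
# otherwise print a message to stderr and return 1.
check_nvml_header() {
    local headers
    headers="$(find_nvidia_nvml_headers "$@")"
    if [ -n "$headers" ]; then
        echo "nvml.h found:"
        echo "$headers"
    else
        echo "no nvml.h found outside hwloc/" >&2
        return 1
    fi
}
```

For example, `check_nvml_header /usr/include /usr/local` (assumed search roots) would be expected to fail before running `setup_dev_env.sh` and to succeed afterwards.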
Effects on system behavior
Not applicable.
Pre-review checklist for the PR author
The PR author must check the checkboxes below when creating the PR.
In-review checklist for the PR reviewers
The PR reviewers must check the checkboxes below before approval.
Post-review checklist for the PR author
The PR author must check the checkboxes below before merging.
After all checkboxes are checked, anyone who has write access can merge the PR.