NVIDIA vGPU with A16 #10076
-
Hi all, I'm trying to enable NVIDIA vGPU with an A16 card, which is essentially multiple A2 chips on a single board (and, therefore, driver-compatible with the A2). With SR-IOV enabled, you can use up to 68 individual profiles on the card at the same time. CloudStack has built-in support for the A2, but it doesn't seem to recognize the A16 as multiple A2s. I've set the offering to be deployed with the A2-1B profile; however, the virtual devices list A16 profiles instead:
The management log confirms that CloudStack doesn't see the appropriate card present in any of the hosts (there are 6 hosts, all with an A16 in them):
The devices are listed in lspci as such:
Has anyone successfully used an A16 with CloudStack? If not, can support for the A16 be added? Thanks,
Replies: 3 comments 3 replies
-
@meisenst-dnd Although I haven't used this specific GPU card, I just want to give you a heads-up that the supported GPUs you see in the compute offerings work only with XenServer. I assume you are using XenServer. I do see 'NVIDIA RTX A2' in the list of supported GPUs. Can the hypervisor report A16 as 2 x A2 to CloudStack? Can you check the table below to see if the GPU is discovered?
Can you run a force reconnect on the XenServer host from CloudStack and check the management server logs to see if the GPUs are being discovered? You can find them in the log with a line matching 'Startup request from directly connected host'.
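If it helps, here is a minimal sketch for pulling those lines out of the log after the reconnect. The log path is an assumption (the usual default location for the management server log; adjust it for your install), and `find_startup_lines` / `scan_log` are just illustrative helper names, not part of CloudStack.

```python
from pathlib import Path
from typing import Iterable, List

# Text quoted in the reply above; the rest of each log line is left untouched.
MARKER = "Startup request from directly connected host"


def find_startup_lines(lines: Iterable[str], marker: str = MARKER) -> List[str]:
    """Return every line that contains the marker, without the trailing newline."""
    return [line.rstrip("\n") for line in lines if marker in line]


def scan_log(path: str) -> List[str]:
    """Read a log file and return the lines that contain the marker."""
    with open(path, errors="replace") as handle:
        return find_startup_lines(handle)


if __name__ == "__main__":
    # Assumed default location of the management server log; change as needed.
    log_path = "/var/log/cloudstack/management/management-server.log"
    if Path(log_path).exists():
        for line in scan_log(log_path):
            print(line)
    else:
        print(f"Log file not found: {log_path}")
```

The matching lines should show what the host reported at startup, which is where any discovered GPU groups would appear.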
-
Don't we need an enterprise license with NVIDIA to use their vGPU? 🤔 We managed to get GPU working via passthrough with Ubuntu as the hypervisor and NVIDIA L4s. But it is done manually by the admin, outside of CloudStack: the admin has to manually assign the GPU to the specified guest VM. After that it works, but the GPU, VM, and host are then effectively married together. We can't shut the VM down and restart it on another hypervisor with the same GPU, because the GPU serial number is not the same.
-
Yes, licenses are required. We have them as we previously used this functionality with VMware. I missed the bit in the docs where XenServer is specifically mentioned. That's on me. Thanks, folks. I will look for another way.
@meisenst-dnd GPU support as a first-class feature is a backlog item for CloudStack. A talk was given at CCC in Madrid last month regarding some of the issues and a feature proposal. The recording will be available on the Apache CloudStack YouTube channel soon.
In the meantime, you can try some workaround methods to enable GPU with CloudStack and KVM. I am sharing my notes, which should help, but I haven't tested these steps.
https://gist.github.com/rajujith/4cc3f17379b63e86f73b041f2be75528
https://gist.github.com/rajujith/f3b3854ed77f2cab8dc4fb5e3ee260c4
References:
https://lab.piszki.pl/cloudstack-kvm-and-running-vm-with-vgpu/
https://www.shapeblue.com/cloudstack-feature-first-look-ena…