On Mon, Jun 14, 2021 at 4:37 PM António Paulo <antonio.paulo@cern.ch> wrote:
Hi!

Has anyone looked into instantiating VMs with NVIDIA's Multi-Instance GPU
(MIG) devices [1] without having to rely on vGPUs? Unfortunately, NVIDIA
vGPUs lack tracing and profiling support that our users need.

I could not find anything specific to MIG in the OpenStack docs but I
was wondering if doing PCI passthrough [2] of MIG devices is an option
that someone has seen or tested?

Maybe some massaging to expose the MIG as a Linux device is required [3]?


NVIDIA's MIG feature is orthogonal to virtual GPUs and is hardware-dependent.
Because it is hardware-dependent, this is not really something we can "support" upstream, as our upstream CI can't verify it.

Some downstream vendors, though, have ongoing efforts to test this with their own solutions, but again, that is not something we can discuss here.

Cheers,
António

[1] https://docs.nvidia.com/datacenter/tesla/mig-user-guide/
[2] https://docs.openstack.org/nova/pike/admin/pci-passthrough.html
[3] https://docs.nvidia.com/datacenter/tesla/mig-user-guide/#device-nodes