[Openstack-operators] How are folks providing GPU instance types?

Blair Bethwaite blair.bethwaite at gmail.com
Thu Jun 9 01:58:00 UTC 2016


Finally circled back to this thread...

Joe - those are great notes!

On 12 May 2016 at 02:51, Joe Topjian <joe at topjian.net> wrote:
> * I found that I didn't have to use EFI-based images. I wonder why that is?

Yeah, we've never run into this as a requirement either.
Peter - can you clarify?

We've had success with Dell R720s and R730s (also about to try C4130s)
passing through various NVIDIA cards: M2070[Q], GRID K1 and K2, K5000,
K40, K80. With the K1s and K2s we define flavors that map onto individual
GPUs of the board (so a K1 can support 4x GPU instances and a K2 can
support 2x GPU instances - NB: this is not vGPU, which requires a
special hypervisor and only supports Windows guests). Current
hypervisor OS is Ubuntu Trusty, using libvirt+KVM.
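
For anyone wanting to reproduce the per-GPU flavors, here is a rough
sketch of the pieces involved, assuming Nova's PCI passthrough options
(pci_alias and pci_passthrough_whitelist in nova.conf, plus the
pci_passthrough:alias flavor extra spec). The helper names, the alias
name "K1" and the product ID are illustrative assumptions, not our
actual config; check the real IDs with lspci -nn on the hypervisor.

```python
import json


def pci_alias(name, vendor_id, product_id):
    """Build the JSON value for nova.conf's pci_alias option."""
    return json.dumps(
        {"name": name, "vendor_id": vendor_id, "product_id": product_id},
        sort_keys=True,
    )


def pci_whitelist(vendor_id, product_id):
    """Build the JSON value for nova.conf's pci_passthrough_whitelist option."""
    return json.dumps(
        {"vendor_id": vendor_id, "product_id": product_id}, sort_keys=True
    )


def flavor_extra_spec(alias_name, gpu_count=1):
    """Extra spec asking for gpu_count devices matching the alias."""
    return {"pci_passthrough:alias": "%s:%d" % (alias_name, gpu_count)}


def max_gpu_instances(gpus_per_board, boards_per_host, gpus_per_flavor=1):
    """How many instances of a flavor fit, counting GPUs only."""
    return (gpus_per_board * boards_per_host) // gpus_per_flavor


if __name__ == "__main__":
    vendor = "10de"   # NVIDIA's PCI vendor ID
    product = "0ff2"  # assumed product ID, verify with lspci -nn
    print("pci_alias = " + pci_alias("K1", vendor, product))
    print("pci_passthrough_whitelist = " + pci_whitelist(vendor, product))
    print(flavor_extra_spec("K1"))
    print(max_gpu_instances(gpus_per_board=4, boards_per_host=1))
```

The scheduler also needs the PciPassthroughFilter enabled for the alias
request in the flavor to be honoured.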

Further to the vGPU comment above - apparently the new AMD FirePro
S7150 cards use SR-IOV for their multi-GPU support, so I hope they may
be more amenable to sharing, with wider hypervisor and guest support
(however the only supported hypervisor listed is VMware). Seems like
they might be good for remote visualisation use-cases (OpenGL and
Linux support). Has anyone tried these cards?

A niggling issue we have with our OpenStack GPU infrastructure is how
to deal with special host consumable resources like GPUs in the
scheduler. In practice we have to reserve the whole host for GPU
instances only, as otherwise it could be filled with non-GPU instances
even when GPU/s are available (not a good outcome when the GPUs make
up >=50% of the value of the box). But the corollary is that the GPU
hypervisors are often not well utilised, so ideally we'd have some way
to limit the quantity of regular non-GPU instances on those boxes so
that there is always capacity for GPU instances.
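
To make the idea concrete, here is a toy sketch of the kind of check I
have in mind. It is plain Python, not Nova's scheduler filter API, and
the Host fields and the vcpus_per_gpu_instance parameter are
hypothetical. It only looks at vCPUs; RAM and disk would need the same
treatment.

```python
from dataclasses import dataclass


@dataclass
class Host:
    total_vcpus: int
    used_vcpus: int
    free_gpus: int


def host_passes(host, req_vcpus, req_gpus, vcpus_per_gpu_instance):
    """Return True if the request may land on this host.

    GPU requests only need a free GPU and enough vCPUs. Non-GPU
    requests must also leave enough vCPUs behind for one GPU instance
    per GPU that is still free.
    """
    free_vcpus = host.total_vcpus - host.used_vcpus
    if req_gpus > host.free_gpus or req_vcpus > free_vcpus:
        return False
    if req_gpus > 0:
        return True
    # Headroom held back for GPU instances that have not been booted yet.
    reserved = host.free_gpus * vcpus_per_gpu_instance
    return free_vcpus - req_vcpus >= reserved


if __name__ == "__main__":
    k1_host = Host(total_vcpus=32, used_vcpus=0, free_gpus=4)
    # A 16 vCPU non-GPU instance still leaves 4 x 4 vCPUs for GPU guests.
    print(host_passes(k1_host, 16, 0, vcpus_per_gpu_instance=4))
    # One more vCPU would eat into the reserved headroom.
    print(host_passes(k1_host, 17, 0, vcpus_per_gpu_instance=4))
```

With something like this the host can still take regular instances, but
never so many that a free GPU becomes unusable.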

-- 
Cheers,
~Blairo


