[openstack-hpc] What's the state of openstack-hpc now?
Andrew J Younge
ajyounge at indiana.edu
Tue Mar 15 18:28:45 UTC 2016
In regard to GPU and InfiniBand performance within KVM, I'd like to
point to a recent publication of ours at VEE15 titled "Supporting High
Performance Molecular Dynamics in Virtualized Clusters using IOMMU,
SR-IOV, and GPUDirect." In the paper we show that using Nvidia GPUs
and SR-IOV Mellanox CX3 InfiniBand adapters, we can support two MPI
based HPC MD simulation applications, LAMMPS and HOOMD, running on a
small cluster. We found overhead to be under 2% when compared to bare
metal (no virtualization) for our HPC applications, which we consider
to be very good. I'll leave the details for the paper itself, but if
anybody has any specific questions, feel free to send me and/or my
co-authors an email.
Andrew J. Younge
School of Informatics & Computing
Indiana University / Bloomington, IN USA
ajyounge at indiana.edu / http://ajyounge.com
On Tue, Mar 15, 2016 at 12:27 PM, Erez Cohen <erezc at mellanox.com> wrote:
> If I may add, on the networking side OpenStack with KVM can get very close to bare-metal performance and functionality. With native support for SR-IOV, supported NICs can provide near bare-metal latency and throughput.
> Another very popular network capability for HPC is RDMA. SR-IOV capable NICs expose RDMA interfaces to the guest, allowing it to run verbs-based applications (like MPI) with efficiency similar to bare metal. RDMA is supported natively over RoCE (RDMA over Converged Ethernet) and is also supported over InfiniBand.
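As a rough illustration of the SR-IOV path described above, the sketch below assembles the two CLI calls usually involved: creating a Neutron port with its vNIC type set to "direct", then booting an instance on that port. The network, flavor, image, and server names are hypothetical placeholders, and the option spellings assume the unified openstack client, which may differ from the client installed on a given deployment.

```python
import shlex


def sriov_port_command(network, port_name):
    """Build the call that creates a Neutron port backed by an SR-IOV VF."""
    return shlex.join([
        "openstack", "port", "create",
        "--network", network,
        "--vnic-type", "direct",  # request a virtual function, not a virtio NIC
        port_name,
    ])


def boot_on_port_command(flavor, image, port_id, server_name):
    """Build the call that boots an instance attached to an existing port."""
    return shlex.join([
        "openstack", "server", "create",
        "--flavor", flavor,
        "--image", image,
        "--nic", "port-id=" + port_id,
        server_name,
    ])


if __name__ == "__main__":
    # Every value below is a placeholder chosen for illustration only.
    print(sriov_port_command("hpc-net", "mpi-port-0"))
    print(boot_on_port_command("hpc.large", "centos-7", "PORT_UUID", "mpi-node-0"))
```

Whether the guest then sees an RDMA-capable device depends on the NIC, its driver, and how the compute host is configured, so treat this only as the OpenStack-facing half of the setup.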
> -----Original Message-----
> From: Blair Bethwaite [mailto:blair.bethwaite at gmail.com]
> Sent: Tuesday, March 15, 2016 3:54 PM
> To: appleorchard2000 at gmail.com
> Cc: openstack-hpc at lists.openstack.org
> Subject: Re: [openstack-hpc] What's the state of openstack-hpc now?
> Apologies for top-posting but I don't intend to answer all the historical project points you've raised. Regarding old things floating around on github, your mileage may vary, but I doubt at this point you want to be looking at any of that in great detail. You haven't really explained what you mean by or want from HPC in this context, so I'm guessing a little based on your other questions...
> OpenStack is many things to different people and organisations, but at the software core is a very flexible infrastructure provisioning framework. HPC requires infrastructure (compute, network, storage), and OpenStack can certainly deliver it - make your deployment choices to suit your use-cases. A major choice would be whether you will use full system virtualisation or bare-metal or containers or <insert next trend> - that choice largely depends on your typical workloads and what style of cluster you want. Beyond that, compared to "typical" cloud hardware you will probably want faster CPUs, faster memory, faster network (probably with much greater east-west capacity), and integration of a suitable parallel file-system.
> However, OpenStack is not a HPC management / scheduling / queuing / middleware system - there are lots of those already and you should pick one that fits your requirements and then (if it helps) run it atop an OpenStack cloud (it might help, e.g., if you want to run multiple logical clusters on the same physical infrastructure, if you want to mix other more traditional cloud workloads in, if you're just doing everything with OpenStack like the other cool kids). There are lots of nuances here, e.g., where one scheduler might lend itself better to more dynamic infrastructure (adding/removing instances), another might be lighter-weight for use with a Cluster-as-a-Service deployment model, whilst another suits a multi-user managed service style cluster. I'm sure there is good experience and opinion hidden on this list if you want to interrogate those sorts of choices more specifically.
> Most of the relevant choices you need to make with respect to running HPC workloads on infrastructure that is provisioned through OpenStack will come down to your hypervisor choices. My preference for now is to stick with the OpenStack community's most popular free OS and hypervisor (Ubuntu and KVM+Libvirt) - when I facilitated the hypervisor-tuning ops session at the Vancouver summit (with a bunch of folks interested in HPC on OpenStack) there was no-one in the room running a different hypervisor, though several were using RHEL. With the right tuning KVM can get you to within a hair's breadth of bare-metal performance for a wide range of CPU, memory and inter-process comms benchmarks, plus you can easily make use of PCI passthrough for latency sensitive or "difficult" devices like NICs/HCAs and GPGPUs. And the "right tuning" is not really some arcane knowledge, it's mainly about exposing host CPU capabilities, pinning vCPUs to pCPUs, and tuning or pinning and exposing NUMA topology - most of this is supported directly through OpenStack-native features now.
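To make the tuning knobs above a little more concrete, here is a minimal sketch that renders them as Nova settings: a libvirt CPU mode for exposing host CPU capabilities, and flavor extra specs for vCPU pinning and guest NUMA topology. The flavor name "hpc.pinned" and the two-node NUMA layout are assumptions for illustration, and which of these settings are available depends on the Nova release in use.

```python
import shlex

# nova.conf fragment for compute hosts: pass the host CPU model and
# feature flags through to guests.
NOVA_CONF_FRAGMENT = "[libvirt]\ncpu_mode = host-passthrough\n"

# Flavor extra specs read by Nova's libvirt driver.
TUNING_SPECS = {
    "hw:cpu_policy": "dedicated",  # pin each vCPU to its own host CPU
    "hw:numa_nodes": "2",          # expose a two-node NUMA topology (assumed)
}


def flavor_set_command(flavor, specs):
    """Build an 'openstack flavor set' call applying the given extra specs."""
    cmd = ["openstack", "flavor", "set"]
    for key in sorted(specs):
        cmd += ["--property", key + "=" + specs[key]]
    cmd.append(flavor)
    return shlex.join(cmd)


if __name__ == "__main__":
    print(NOVA_CONF_FRAGMENT)
    print(flavor_set_command("hpc.pinned", TUNING_SPECS))
```

Instances booted from such a flavor are only scheduled onto hosts that can satisfy the pinning and NUMA request, so it is common to keep pinned and unpinned workloads on separate host aggregates.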
> To answer the GPU question more explicitly - yes you can do this.
> Mainly you need to ensure you're getting compatible hardware (GPU and relevant motherboard components) - most of the typical GPGPU choices (e.g. K80, K40, M60) will work, and you should probably be wary of PCIe switches unless you know exactly what you're doing (recommend trying before buying). At the OpenStack level you just define the PCI devices you want OpenStack Nova to provision and you can then define custom instance-types/flavors that will get a GPU passed through.
> Similar things go for networking.
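The GPU passthrough recipe above (tell Nova which PCI devices it may hand out, then define a flavor that requests one) can be sketched roughly as follows. The alias name "gpu" is hypothetical. The vendor ID 10de is NVIDIA's, while the product ID 102d is assumed here to match a Tesla K80 and should be checked with lspci -nn on the actual host. The nova.conf option names that hold these JSON values have changed between releases, so they are deliberately left out.

```python
import json


def pci_whitelist(vendor_id, product_id):
    """JSON value telling nova-compute which host devices may be passed through."""
    return json.dumps(
        {"vendor_id": vendor_id, "product_id": product_id}, sort_keys=True
    )


def pci_alias(name, vendor_id, product_id):
    """JSON value giving a matching device a name that flavors can refer to."""
    return json.dumps(
        {
            "name": name,
            "vendor_id": vendor_id,
            "product_id": product_id,
            "device_type": "type-PCI",
        },
        sort_keys=True,
    )


def gpu_flavor_property(alias_name, count=1):
    """Flavor extra spec requesting 'count' devices that match the alias."""
    return "pci_passthrough:alias={}:{}".format(alias_name, count)


if __name__ == "__main__":
    # IDs are assumptions for illustration; verify them on the real hardware.
    print(pci_whitelist("10de", "102d"))
    print(pci_alias("gpu", "10de", "102d"))
    print(gpu_flavor_property("gpu", 1))
```

A passed-through NIC or HCA would be declared the same way, with that device's own vendor and product IDs in place of the GPU's.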
> Lastly, just because you can do this doesn't make it a good idea...
> OpenStack is complex, HPC systems are complex, layering one complicated thing on another is a good way to create tricky problems that hide in the interface between the two layers. So make sure you're gaining something from having OpenStack in the mix here.
> On 15 March 2016 at 23:00, <openstack-hpc-request at lists.openstack.org> wrote:
>> Message: 1
>> Date: Tue, 15 Mar 2016 19:05:38 +0800
>> From: "me,apporc" <appleorchard2000 at gmail.com>
>> To: openstack-hpc at lists.openstack.org
>> Subject: [openstack-hpc] What's the state of openstack-hpc now?
>> Hi, all
>> I found this etherpad, which was created a long time ago and which
>> lists some blueprints: support-heterogeneous-archs,
>> heterogeneous-instance-types and
>> schedule-instances-on-heterogeneous-architectures.
>> But those blueprints have been obsolete since 2014, and some of
>> their patches were abandoned.
>> There is, however, a forked branch on github or launchpad, which has
>> diverged far from nova/trunk and has not been updated since 2014 either.
>> Did we simply abandon those blueprints in OpenStack, or is there more to it?
>> Besides, there is a CaaS project called Senlin, which refers to
>> the word "HPC" in its wiki. But it does not seem to be really related.
>> "Cluster" can mean many things, but HPC is a rather different kind.
>> I cannot work out the status of GPU support in nova. As for
>> networking, SR-IOV seems OK. For storage, I don't know what the word
>> "mi2" means in the etherpad.
>> From what I gathered above, it seems we cannot do HPC on
>> OpenStack now. But there are some videos here, here and
>> here. Since we cannot get a GPU in a nova instance, are they just
>> building traditional HPC clusters without GPUs?
>> I need more information, thanks in advance.
>> 1. https://etherpad.openstack.org/p/HVHsTqOQGc
>> s 3.
>> 5. https://github.com/usc-isi/nova
>> 6. https://code.launchpad.net/~usc-isi/nova/hpc-trunk
>> 7. https://wiki.openstack.org/wiki/CaaS
>> 8. https://wiki.openstack.org/wiki/Senlin
>> 9. https://wiki.openstack.org/wiki/SR-IOV-Passthrough-For-Networking
>> OpenStack-HPC mailing list
>> OpenStack-HPC at lists.openstack.org
>> End of OpenStack-HPC Digest, Vol 30, Issue 2