[Openstack] Greatest deployment?

Matt Joyce matt.joyce at cloudscaling.com
Thu May 24 15:34:34 UTC 2012


We did considerable HPC testing when I worked over at NASA Ames with
the Nebula project.  So I think we may have been the first to try out
OpenStack in an HPC capacity.

If you can find Piyush Mehrotra from the NAS division at Ames ( I'll leave
it to you to look him up ), he has comprehensive OpenStack tests from the
Bexar days.  He'd probably be willing to share some of that data if there
were interest ( assuming he hasn't already ).

Several points of interest worth mentioning:

I think fundamentally many of the folks who are used to doing HPC work
dislike working with hypervisors in general.  The memory management and
general I/O latency is something they find to be a bit intolerable.
OpenNebula and OpenStack rely on the same set of open source
hypervisors.  In fact, I believe OpenStack supports more.  What they both
fundamentally do is operate as an orchestration layer on top of the
hypervisor layer of the stack.  So in terms of performance you should not
see much difference between the two at all.  That said, this ignores the
possibility of scheduler customisation and the like.

We ultimately, much like Amazon's HPC offering, ended up handing over VMs
to customers that consumed all the resources on a system, which largely
negated the benefit of VMs.  One primary reason for this is that pinning
the 10 GbE device, or InfiniBand if you have it, to a single VM allows for
direct pass-through and no hypervisor latency.  We were seeing a maximum
throughput on our 10 GbE links of about 8-9 Gbit with virtio / jumbo frames
via KVM, while bare hardware was slightly above 10.  Several vendors in the
area I have spoken with are engaged in efforts to tie physical-layer
provisioning into OpenStack orchestration to bypass the hypervisor
entirely.  LXC is not a good alternative for several obvious reasons.  So
think on all of that.
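
To give a concrete idea of what that pinning looked like, here is a rough
sketch using the libvirt Python bindings to hand a NIC's PCI function
straight to a guest.  The domain name and PCI address are made up, and
nothing in nova automated this for us at the time:

    import libvirt

    # PCI function of the host's 10 GbE NIC (hypothetical address;
    # find the real one with lspci).  managed='yes' lets libvirt
    # detach it from the host driver before handing it to the guest.
    hostdev_xml = """
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    """

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('hpc-guest')   # hypothetical guest name
    dom.attachDeviceFlags(hostdev_xml, libvirt.VIR_DOMAIN_AFFECT_CONFIG)
    conn.close()

The trade-off is exactly the one described above: the guest gets
near-native throughput, but that NIC is no longer shareable with any other
VM on the box.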

GPUs are highly specialised.  Depending on your workloads you may not
benefit from them.  Again you have the hardware pinning issue in VMs.

As far as disk I/O is concerned, large datasets need large disk volumes.
Large, non-immutable disk volumes.  So Swift / LAFS go right out the
window.  nova-volume had some limitations ( or it did at the time ): the
euca tools couldn't handle 1 TB volumes and the API maxed out around 2 TB.
So we had users RAIDing their volumes together and asking how to target
them to nodes to increase I/O.  This was suboptimal.  Lustre or Gluster
would be better options here.  We chose Gluster because we've used Lustre
before, and anyone who has knows it's a pain.
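
By "RAIDing" I mean users were striping several attached volumes together
inside the guest purely for throughput, roughly like this ( device names
and mount point are hypothetical, and mdadm has to be in the image ):

    import subprocess

    # Stripe four attached nova-volumes into a single RAID0 device.
    # RAID0 buys bandwidth, not redundancy -- this is a throughput hack.
    volumes = ['/dev/vdb', '/dev/vdc', '/dev/vdd', '/dev/vde']

    subprocess.check_call(['mdadm', '--create', '/dev/md0', '--level=0',
                           '--raid-devices=%d' % len(volumes)] + volumes)
    subprocess.check_call(['mkfs.ext4', '/dev/md0'])
    subprocess.check_call(['mount', '/dev/md0', '/mnt/scratch'])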

As for node targeting, users cared about specific families of CPUs.  Many
people optimised per CPU and wanted to target Westmeres or Nehalems.  We
had no means to do that at the time.
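
These days you could imagine hanging that off the scheduler.  Purely as a
sketch ( the base class hook and the 'cpu_family' capability key are my
assumptions, not something that shipped ), a custom host filter might look
like:

    from nova.scheduler import filters

    class CpuFamilyFilter(filters.BaseHostFilter):
        """Pass only hosts whose advertised CPU family matches the
        'cpu_family' extra_spec on the flavor, e.g. 'westmere'."""

        def host_passes(self, host_state, filter_properties):
            extra_specs = filter_properties.get('instance_type',
                                                {}).get('extra_specs', {})
            wanted = extra_specs.get('cpu_family')
            if not wanted:
                return True  # flavor doesn't care; any host will do
            # the compute nodes would have to publish 'cpu_family' as a
            # capability -- that reporting is the piece we were missing
            return host_state.capabilities.get('cpu_family') == wanted

Even with that, the reporting side is the part that didn't exist back then.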

Scheduling full-machine instances is somewhat easier so long as all the
nodes in your zone are used for full-machine instances only.

Matt Joyce
Now at Cloudscaling



On Thu, May 24, 2012 at 5:49 AM, John Paul Walters <jwalters at isi.edu> wrote:

> Hi,
>
> On May 24, 2012, at 5:45 AM, Thierry Carrez wrote:
>
> >
> >
> >> OpenNebula also has this advantage, for me, that it's also designed to
> >> provide a scientific cloud, and it's used by a few research centres and
> >> even supercomputing centres. How about OpenStack? Has anyone tried
> >> deploying it in a supercomputing environment? Maybe a huge cluster, a
> >> GPU cluster, or some other scientific group is using OpenStack? Is
> >> anyone using OpenStack in a scientific environment, or is OpenStack's
> >> purpose to create commercial-only clouds (business - large and small
> >> companies)?
> >
> > OpenStack is being used in a number of research clouds, including NeCTAR
> > (Australia's national research cloud). There is huge interest around
> > bridging the gap there, with companies like Nimbis or Bull being
> involved.
> >
> > Hopefully people with more information than I have will comment on this
> > thread.
> >
> >
> We're developing GPU, bare metal, and large SMP (think SGI UV) support for
> OpenStack and we're targeting HPC/scientific computing workloads.  It's a
> work in progress, but we have people using our code and we're talking to
> folks about getting our code onto nodes within FutureGrid.  We have GPU
> support for LXC right now, and we're working on adding support for other
> hypervisors as well.  We're also working on getting the code into shape for
> merging upstream, some of which (the bare metal work) has already been
> done.  We had an HPC session at the most recent Design Summit, and it was
> well-attended with lots of great input.  If there are specific features
> that you're looking for, we'd love to hear about them.
>
> By the way, all of our code is available at
> https://github.com/usc-isi/nova, so if you'd like to try it out before it
> gets merged upstream, go for it.
>
> best,
> JP
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack at lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>

