We did considerable HPC testing when I worked at NASA Ames on the Nebula project, so I think we may have been the first to try out OpenStack in an HPC capacity.<br><br>If you can find Piyush Mehrotra from the NAS division at Ames (I'll leave it to you to look him up), he has comprehensive OpenStack tests from the Bexar days.  He'd probably be willing to share some of that data if there is interest (assuming he hasn't already).<br>

<br>Several points I think are worth mentioning:<br>

<br>I think fundamentally many of the folks who are used to doing HPC work dislike working with hypervisors in general.  They find the memory management overhead and general I/O latency hard to tolerate.  OpenNebula and OpenStack rely on the same set of open source hypervisors; in fact, I believe OpenStack supports more.  Both fundamentally operate as an orchestration layer on top of the hypervisor layer of the stack, so in terms of performance you should not see much difference between the two.  That said, this ignores the possibility of scheduler customisation and the like.<br>


<br>Much like Amazon HPC, we ultimately ended up handing customers VMs that consumed all the resources on a system, which negates much of the benefit of VMs.  One primary reason is that pinning the 10 gig interfaces, or InfiniBand if you have it, to a single VM allows direct passthrough with no hypervisor latency.  We were seeing a maximum throughput of about 8-9 Gbit/s on our 10 gig links with virtio and jumbo frames under KVM, while hardware was slightly above 10.  Several vendors in the area I have spoken with are working to tie physical-layer provisioning into OpenStack orchestration to bypass the hypervisor entirely.  LXC is not a good alternative for several obvious reasons.  So think on all of that.<br>


<br>GPUs are highly specialised.  Depending on your workloads you may not benefit from them.  Again, you have the hardware pinning issue in VMs.  <br><br>As far as disk I/O is concerned, large datasets need large disk volumes: large, mutable disk volumes.  So Swift and Tahoe-LAFS go right out the window.  nova-volume has some limitations (or it did at the time): the euca tools couldn't handle 1 TB volumes and the API maxed out around 2 TB.  So we had users RAIDing their volumes and asking how to target them to nodes to increase I/O.  This was suboptimal.  Lustre or Gluster would be better options here.  We chose Gluster because we've used Lustre before, and anyone who has knows it's a pain.<br>
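<br>To illustrate why users ended up RAIDing volumes, here is a minimal sketch of the arithmetic: with a per-volume size cap, a dataset larger than the cap has to be striped across several volumes.  The function name and the 1 TB default are my own placeholders loosely based on the limits mentioned above, not anything taken from nova-volume or the euca tools.<br>

```python
import math


def volumes_needed(dataset_tb, max_volume_tb=1.0):
    """Smallest number of equal-size volumes whose RAID 0 stripe can hold the dataset.

    max_volume_tb is an assumed per-volume cap (hypothetical default of 1 TB).
    """
    if dataset_tb <= 0 or max_volume_tb <= 0:
        raise ValueError("sizes must be positive")
    # RAID 0 capacity is the sum of its members, so divide and round up.
    return math.ceil(dataset_tb / max_volume_tb)


if __name__ == "__main__":
    for size in (0.5, 2.5, 5):
        print(size, "TB dataset needs", volumes_needed(size), "volume(s)")
```

Every extra stripe member is another volume that has to be attached and kept healthy, which is part of why a shared filesystem looked like the better option to us.<br>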

<br>As for node targeting, users cared about specific families of CPUs.  Many people optimised by CPU and wanted to target Westmeres or Nehalems.  We had no means to do that at the time.  <br><br>Scheduling full instances is somewhat easier, so long as all the nodes in your zone are used for full instances only.  <br>
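<br>For what it's worth, the kind of node targeting users were asking for can be sketched in a few lines.  This is only an illustration of the idea, not nova's scheduler API: the Host record, its fields, and filter_hosts are hypothetical names I made up for the example.<br>

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical host record for illustration; not a nova data structure.
    name: str
    cpu_family: str  # e.g. "westmere" or "nehalem"
    total_cores: int
    used_cores: int = 0


def filter_hosts(hosts, cpu_family=None, whole_node=False):
    """Return the hosts that satisfy the request, preserving input order.

    cpu_family: if given, keep only hosts of that family (case-insensitive).
    whole_node: if True, keep only completely idle hosts, which mimics
                handing a full machine to a single instance.
    """
    wanted = cpu_family.lower() if cpu_family is not None else None
    matches = []
    for host in hosts:
        if wanted is not None and host.cpu_family.lower() != wanted:
            continue
        if whole_node and host.used_cores != 0:
            continue
        matches.append(host)
    return matches


hosts = [
    Host("node-a", "westmere", 12),
    Host("node-b", "nehalem", 8, used_cores=4),
    Host("node-c", "westmere", 12, used_cores=2),
]

if __name__ == "__main__":
    print([h.name for h in filter_hosts(hosts, cpu_family="westmere")])
    print([h.name for h in filter_hosts(hosts, cpu_family="westmere", whole_node=True)])
```

The whole_node flag reflects the point about full instances: if every node in a zone is full-instance only, the check reduces to "is this host idle", which is much simpler than packing partial instances.<br>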

<br>Matt Joyce<br>Now at Cloudscaling<br>

<br><br><br><div class="gmail_quote">On Thu, May 24, 2012 at 5:49 AM, John Paul Walters <span dir="ltr">&lt;<a href="mailto:jwalters@isi.edu" target="_blank">jwalters@isi.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


Hi,<br>

<div><br>

On May 24, 2012, at 5:45 AM, Thierry Carrez wrote:<br>

<br>

><br>

><br>

>> OpenNebula has also this advantage, for me, that it's designed also to<br>

>> provide scientific cloud and it's used by few research centres and even<br>

>> supercomputing centres. How about Openstack? Anyone tried deploy it in<br>

>> supercomputing environment? Maybe huge cluster or GPU cluster or any<br>

>> other scientific group is using Openstack? Is anyone using Openstack in<br>

>> scentific environement or Openstack's purpose is to create commercial<br>

>> only cloud (business - large and small companies)?<br>

><br>

> OpenStack is being used in a number of research clouds, including NeCTAR<br>

> (Australia's national research cloud). There is huge interest around<br>

> bridging the gap there, with companies like Nimbis or Bull being involved.<br>

><br>

> Hopefully people with more information than I have will comment on this<br>

> thread.<br>

><br>

><br>

</div>We're developing GPU, bare metal, and large SMP (think SGI UV) support for Openstack and we're targeting HPC/scientific computing workloads.  It's a work in progress, but we have people using our code and we're talking to folks about getting our code onto nodes within FutureGrid.  We have GPU support for LXC right now, and we're working on adding support for other hypervisors as well.  We're also working on getting the code into shape for merging upstream, some of which (the bare metal work) has already been done.  We had an HPC session at the most recent Design Summit, and it was well-attended with lots of great input.  If there are specific features that you're looking for, we'd love to hear about it.<br>


<br>

By the way, all of our code is available at <a href="https://github.com/usc-isi/nova" target="_blank">https://github.com/usc-isi/nova</a>, so if you'd like to try it out before it gets merged upstream, go for it.<br>


<br>

best,<br>

JP<br>

<div><div><br>

<br>


</div></div></blockquote></div><br>