LXC has a few benefits. As you are likely aware, it is faster than a traditional hypervisor. But I would argue that the price paid for that benefit makes it largely not worthwhile for the HPC use cases where OpenStack would see use.<br>
<br>First, and foremost: using LXC you immediately lose the ability for your users to define their own compute environments, or even to choose from more than one pre-existing environment template. That's huge; it's one of the primary benefits of cloud in this space over existing technologies. Eliminate that, and I have to ask why virtualize at all?<br>
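<br>To make that concrete: with a full hypervisor a user just picks (or uploads) whatever image they want, along the lines of the rough sketch below (the image and flavor names are made up for illustration), whereas with LXC every "instance" shares the host kernel, so anyone whose environment depends on its own kernel or kernel modules is out of luck.<br>

    # user-chosen environment under a KVM-backed cloud (image and flavor names are examples)
    nova image-list
    nova boot --image rhel5-custom-mpi --flavor m1.xlarge my-hpc-node
<br>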
<br>Second, while LXC does provide a lot of native access, it still manages paging internally, just as KVM does. So direct memory management (which some HPC users like) becomes just as problematic as it is under KVM. Lots of overhead.<br>
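<br>For the curious, the container memory limits are just cgroup knobs that the kernel enforces on the container's behalf, roughly as sketched below (container name and values are purely illustrative), which is exactly the kind of indirection the direct-memory-management folks are trying to avoid:<br>

    # LXC container config snippet (values are examples)
    lxc.cgroup.memory.limit_in_bytes       = 48G
    lxc.cgroup.memory.memsw.limit_in_bytes = 48G

    # or adjust the same knob at runtime
    lxc-cgroup -n hpc-container memory.limit_in_bytes 48G
<br>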
<br>Third, the generally espoused major benefit of LXC is disk I/O. If you are using Lustre or Gluster, I don't see why you would care all that much. Mind you, I've been unable to find benchmarks of Lustre under KVM versus LXC; my guess is that LXC is faster, but that the difference is negligible compared to the general cost of either LXC or KVM over a grid / batch solution. Additionally, there's always hardware pinning. <br>
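<br>If someone wants to produce those numbers, the comparison is easy enough to sketch: run the same job inside a KVM guest and inside an LXC container against the same mount, something like the following (mount point and sizes are just examples):<br>

    # sequential write then read against the shared filesystem, once per environment
    fio --name=seqwrite --directory=/mnt/lustre --rw=write --bs=1M --size=8G --direct=1
    fio --name=seqread  --directory=/mnt/lustre --rw=read  --bs=1M --size=8G --direct=1
<br>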
<br>And finally, of course, there is the LXC management issue. LXC has been known to cause a lot of grief for administrators who hand out root to their users. Since the containers call the kernel directly rather than working through a hypervisor, they can conflict with one another. For instance, if one container unloads the NFS module, it can impact other users' access to NFS. <br>
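<br>You can blunt some of that by dropping capabilities in the container config, roughly as sketched below (the capability list is only illustrative), but that is a mitigation, not real isolation:<br>

    # keep container root away from modules, mounts and the clock on the shared kernel
    lxc.cap.drop = sys_module sys_admin sys_time mknod
<br>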
<br>I suppose that, depending on how you perform your allocations, this last concern may not matter. If you are allocating entire systems to users at a time, then obviously it doesn't. But that goes back to my first point: if you are doing that, why not just push a physical image to the box at allocation time with PXE and call it a day? And I remind you that hardware pinning can be done in KVM and Xen, providing near-native access to at least some devices (e.g. the network interface).<br>
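<br>For reference, the pinning I mean is plain PCI passthrough; under libvirt/KVM it is roughly the snippet below in the guest definition (the PCI address is a placeholder for your 10 gig or IB card):<br>

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
<br>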
<br>For the record, I figure more than a couple of folks are avoiding hypervisors entirely for the time being. For some workloads that certainly makes sense. For others, I think it's just a general aversion to overhead rather than particularly strategic thinking.<br>
<br>The opinions expressed here are entirely my own.<br><br>-Matt<br><br><br><div class="gmail_quote">On Tue, May 29, 2012 at 9:31 PM, Michael Chapman <span dir="ltr"><<a href="mailto:michael.chapman@anu.edu.au" target="_blank">michael.chapman@anu.edu.au</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Matt,<div class="im"><div><br></div><div><font color="#6600cc">LXC is not a good alternative for several obvious reasons. So think on all of that.</font></div>
</div><div>Could you expand on why you believe LXC is not a good alternative? As an HPC provider we're currently weighing up options to get the most we can out of our Openstack deployment performance-wise. In particular we have quite a bit of IB, a fairly large Lustre deployment and some GPUs, and are seriously considering going down the LXC route to try to avoid wasting all of that by putting a hypervisor on top.</div>
<div><br></div><div> - Michael Chapman<div><div class="h5"><br><br><div class="gmail_quote">On Fri, May 25, 2012 at 1:34 AM, Matt Joyce <span dir="ltr"><<a href="mailto:matt.joyce@cloudscaling.com" target="_blank">matt.joyce@cloudscaling.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We did considerable HPC testing when I worked over at NASA Ames on the Nebula project, so I think we may have been the first to try out OpenStack in an HPC capacity.<br>
<br>If you can find Piyush Mehrotra from the NAS division at Ames (I'll leave it to you to look him up), he has comprehensive OpenStack tests from the Bexar days. He'd probably be willing to share some of that data if there was interest (assuming he hasn't already).<br>
<br>Several points of interest I think worth mentioning are:<br>
<br>I think fundamentally many of the folks who are used to doing HPC work dislike working with hypervisors in general; the memory management and general I/O latency are things they find hard to tolerate. OpenNebula and OpenStack rely on the same set of open-source hypervisors (in fact, I believe OpenStack supports more). What both fundamentally do is operate as an orchestration layer on top of the hypervisor layer of the stack, so in terms of performance you should not see much difference between the two at all. That said, this ignores the possibility of scheduler customisation and the like.<br>
<br>We ultimately, much like Amazon's HPC offering, ended up handing over VMs to customers that consumed all the resources on a system, thus negating much of the benefit of VMs. One primary reason for this is that pinning the 10 gig NIC, or InfiniBand if you have it, to a single VM allows for direct pass-through and no hypervisor latency. We were seeing a maximum throughput on our 10 gig links of about 8-9 Gbit with virtio / jumbo frames via KVM, while bare hardware was slightly above 10. Several vendors in the area I have spoken with are engaged in efforts to tie physical-layer provisioning into OpenStack orchestration and bypass the hypervisor entirely. LXC is not a good alternative for several obvious reasons. So think on all of that.<br>
<br>GPUs are highly specialised; depending on your workloads you may not benefit from them, and again you have the hardware pinning issue in VMs. <br><br>As far as disk I/O is concerned, large datasets need large disk volumes, and large non-immutable ones at that, so Swift / LAFS go right out the window. nova-volume has some limitations (or it did at the time): the euca tools couldn't handle 1 TB volumes and the API maxed out around 2. So we had users RAIDing their volumes and asking how to target them to specific nodes to increase I/O, which was suboptimal. Lustre or Gluster would be better options here. We chose Gluster because we've used Lustre before, and anyone who has knows it's a pain.<br>
<br>As for node targeting, users cared about specific families of CPUs; many people optimised per CPU and wanted to target Westmeres or Nehalems, and we had no means to do that at the time. <br><br>Scheduling full-machine instances is somewhat easier so long as all the nodes in your zone are for full-instance use only. <br>
<span><font color="#888888">
<br>Matt Joyce<br>Now at Cloudscaling</font></span><div><div><br>
<br><br><br><div class="gmail_quote">On Thu, May 24, 2012 at 5:49 AM, John Paul Walters <span dir="ltr"><<a href="mailto:jwalters@isi.edu" target="_blank">jwalters@isi.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<div><br>
On May 24, 2012, at 5:45 AM, Thierry Carrez wrote:<br>
<br>
><br>
><br>
>> OpenNebula has also this advantage, for me, that it's designed also to<br>
>> provide scientific cloud and it's used by few research centres and even<br>
>> supercomputing centres. How about Openstack? Anyone tried deploy it in<br>
>> supercomputing environment? Maybe huge cluster or GPU cluster or any<br>
>> other scientific group is using Openstack? Is anyone using Openstack in<br>
>> scientific environment or Openstack's purpose is to create commercial<br>
>> only cloud (business - large and small companies)?<br>
><br>
> OpenStack is being used in a number of research clouds, including NeCTAR<br>
> (Australia's national research cloud). There is huge interest around<br>
> bridging the gap there, with companies like Nimbis or Bull being involved.<br>
><br>
> Hopefully people with more information than I have will comment on this<br>
> thread.<br>
><br>
><br>
</div>We're developing GPU, bare metal, and large SMP (think SGI UV) support for Openstack and we're targeting HPC/scientific computing workloads. It's a work in progress, but we have people using our code and we're talking to folks about getting our code onto nodes within FutureGrid. We have GPU support for LXC right now, and we're working on adding support for other hypervisors as well. We're also working on getting the code into shape for merging upstream, some of which (the bare metal work) has already been done. We had an HPC session at the most recent Design Summit, and it was well-attended with lots of great input. If there are specific features that you're looking for, we'd love to hear about it.<br>
<br>
By the way, all of our code is available at <a href="https://github.com/usc-isi/nova" target="_blank">https://github.com/usc-isi/nova</a>, so if you'd like to try it out before it gets merged upstream, go for it.<br>
<br>
best,<br>
JP<br>
<div><div><br>
<br>
_______________________________________________<br>
Mailing list: <a href="https://launchpad.net/%7Eopenstack" target="_blank">https://launchpad.net/~openstack</a><br>
Post to : <a href="mailto:openstack@lists.launchpad.net" target="_blank">openstack@lists.launchpad.net</a><br>
Unsubscribe : <a href="https://launchpad.net/%7Eopenstack" target="_blank">https://launchpad.net/~openstack</a><br>
More help : <a href="https://help.launchpad.net/ListHelp" target="_blank">https://help.launchpad.net/ListHelp</a><br>
</div></div></blockquote></div><br>
</div></div><br>_______________________________________________<br>
Mailing list: <a href="https://launchpad.net/%7Eopenstack" target="_blank">https://launchpad.net/~openstack</a><br>
Post to : <a href="mailto:openstack@lists.launchpad.net" target="_blank">openstack@lists.launchpad.net</a><br>
Unsubscribe : <a href="https://launchpad.net/%7Eopenstack" target="_blank">https://launchpad.net/~openstack</a><br>
More help : <a href="https://help.launchpad.net/ListHelp" target="_blank">https://help.launchpad.net/ListHelp</a><br>
<br></blockquote></div><br><br clear="all"><div><br></div></div></div><span class="HOEnZb"><font color="#888888">-- <br><span style="border-collapse:collapse;font-family:arial,sans-serif;font-size:13px"><div><span style="border-collapse:collapse;font-family:arial,sans-serif;font-size:13px">Michael Chapman</span></div>
<div><span style="border-collapse:collapse;font-family:arial,sans-serif;font-size:13px"><i>Cloud Computing Services</i></span></div><div><span style="border-collapse:collapse;font-family:arial,sans-serif;font-size:13px">ANU Supercomputer Facility<br>
Room 318, Leonard Huxley Building (#56), Mills Road<br>The Australian National University<br>Canberra ACT 0200 Australia</span></div><div><span style="border-collapse:collapse;font-family:arial,sans-serif;font-size:13px">Tel: <i><a href="tel:%2B61%202%206125%C2%A07106" value="+61261257106" target="_blank">+61 2 6125 7106</a></i><br>
Web: <a href="http://nci.org.au" target="_blank">http://nci.org.au</a></span></div></span><br>
</font></span></div>
</blockquote></div><br>