[Openstack-operators] Reference architecture for medium sized environment

Joe Topjian joe.topjian at cybera.ca
Mon Aug 19 19:56:34 UTC 2013


Hi Mitch,


On Mon, Aug 19, 2013 at 10:22 AM, Mitch Anderson <mitch at metauser.net> wrote:

> Thanks Joe,
>
> I have a few servers currently, and a minimal environment running.  I was
> just asked what my "ideal" setup would be for an HA environment.  But
> "ideal" and what the budget will ever allow are still two different
> things, so I'm trying to make my "ideal" something I know would be
> achievable.
>

I know where you're coming from -- I run into this situation a lot.


> What type of backend NFS setup are you using?  I look for "enterprise"
> solutions (by that I really just mean a failover head, mostly) and
> come away not really liking any of them... but maybe I'm missing something
> with them, or I've only seen a very narrow segment of that market....
>

In one cloud, I'm using a NetApp 3240 appliance. The deciding factor on
this at the time was the availability of an Essex driver. After running it
in production for a year, it's been the best purchase we could have ever
made. I'd recommend it to anyone who can afford it.

On other clouds, I'm using standard Linux NFS. There is drive redundancy,
but no proper HA -- simply because the SLA doesn't require it. Building a
NetApp-type storage backend with deduplication and multiple controllers is
something on my list.


> Great tip on the compute nodes!  Currently I'm using a pair of refurb Dell
> R610s with 6-core Xeons.  I was looking at buying several Dell R515s: some
> for the controller HA cluster and a few for the Ceph cluster...  I don't
> plan on running anything other than Xeons for compute, but the AMDs are a
> bit cheaper, so I'd use them for ancillary roles.
>

If you can afford it, look at the R720xd. It's a monster in terms of how
much storage and RAM it can support. You can purchase a "lightly"
configured one and upgrade later. For example, purchase it with two
internal 2.5" drives for the OS and four 3.5" drives for storage (Ceph or
whatever you decide to go with). It can handle an additional eight 3.5"
drives for later expansion.

I've worked with both R610s and R510s, too, and they're great servers.

I've also worked with C6100s and C6220s. They're very interesting form
factors and they can save you a good amount of money in the long run if
they fit your environment. They're a bit difficult to troubleshoot, though,
simply because the nodes share a chassis, and the chassis is the first thing
Dell wants to troubleshoot (but it's never actually the issue).


>
> thanks for the help
>
>
> On Mon, Aug 19, 2013 at 8:28 AM, Joe Topjian <joe.topjian at cybera.ca> wrote:
>
>> Hi Mitch,
>>
>> I totally understand limited budgets -- I work for a non-profit. :) With
>> your budget, are you utilizing existing hardware or purchasing new
>> hardware? If the latter, maybe look at cutting back on the compute expense
>> in favor of storage. For example, a 6-core CPU costs approximately $400
>> while an 8-core costs $1300:
>>
>> http://ark.intel.com/products/64584
>> http://ark.intel.com/products/64594
>>
>> I came across that through a typo in a quote I recently received. After
>> review, we decided to go with the 6-core and used the money we saved to buy
>> two extra compute nodes!
>>
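A rough sketch of that price comparison in Python. The prices are the
approximate figures quoted above; the dual-socket node (sockets=2) and all
of the names are assumptions made only for illustration.

```python
# Approximate per-CPU prices quoted in the message above (USD).
CPU_OPTIONS = {
    "6-core": {"cores": 6, "price_usd": 400},
    "8-core": {"cores": 8, "price_usd": 1300},
}


def price_per_core(option):
    """Return the approximate price per physical core for a CPU option."""
    cpu = CPU_OPTIONS[option]
    return cpu["price_usd"] / cpu["cores"]


def savings_per_node(sockets=2):
    """Money saved per node by choosing the 6-core part over the 8-core.

    sockets=2 is an assumption (a dual-socket server), not a quoted figure.
    """
    diff = CPU_OPTIONS["8-core"]["price_usd"] - CPU_OPTIONS["6-core"]["price_usd"]
    return sockets * diff


if __name__ == "__main__":
    for name in CPU_OPTIONS:
        print(name, round(price_per_core(name), 2), "USD per core")
    print("saved per dual-socket node:", savings_per_node(), "USD")
```

Under those assumptions the 8-core part costs well over twice as much per
core, and the difference adds up quickly across a rack of compute nodes.
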
>> Regarding the instances that each compute node will be running: you said
>> there will be different sizes and needs. I come across two very distinct
>> needs in my environments: infrastructure instances (web, db servers) and
>> research (data processing).
>>
>> With the former, hosting 40-60 instances per compute node is a piece of
>> cake; you could probably double that. Using a 6-core CPU as an example, if
>> a server has two of those, that's 12 cores, multiplied by 2 for
>> hyper-threading (24), then multiplied by 16 (which is the default CPU
>> overcommit in OpenStack) for a grand total of 384 vcpus. Divide that by 60
>> and you get an average of about 6 vcpus per instance. Then factor in how
>> many of those instances are going to be idle in terms of CPU.
>>
>> (With the 8-core, you would get about 8 vcpus per instance.)
>>
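The arithmetic above can be sketched in a few lines of Python. The
function and parameter names are made up for illustration; the defaults
(two sockets, hyper-threading counted as 2 threads per core, a 16x CPU
overcommit, 60 instances per node) are the figures from the example.

```python
def vcpus_per_node(cores_per_socket, sockets=2, threads_per_core=2,
                   cpu_overcommit=16):
    """Schedulable vcpus on one compute node.

    physical cores -> hardware threads -> overcommitted vcpus
    """
    physical_cores = cores_per_socket * sockets
    hardware_threads = physical_cores * threads_per_core
    return hardware_threads * cpu_overcommit


def avg_vcpus_per_instance(cores_per_socket, instances=60, **node_kwargs):
    """Average vcpus available to each instance on a node."""
    return vcpus_per_node(cores_per_socket, **node_kwargs) / instances


if __name__ == "__main__":
    for cores in (6, 8):
        total = vcpus_per_node(cores)
        average = avg_vcpus_per_instance(cores)
        print(f"{cores}-core: {total} vcpus, {average:.1f} per instance")
```

For CPU-bound research workloads, the same function with
threads_per_core=1 and cpu_overcommit=1 gives the physical core count
(12 for the dual 6-core example), which is the number to plan around.
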
>> Research tasks, on the other hand, should be sized by physical cores, and
>> you should expect instances to peg cores for hours while they do work. I
>> know Tim Bell has been blogging a bit about CERN's OpenStack design, but
>> I'm not sure if he's talked much about this aspect. One of my clouds gets
>> some CERN-related work and I've had compute nodes completely lock up and
>> require a reboot.
>>
>> So that's CPU. Memory is pretty much the same. There's been some
>> discussion on this list in the past about using KSM and overcommitting;
>> try to look those threads up, as they are good topics. My $0.02 is that I
>> use KSM and memory overcommit.
>>
>> Storage is a whole other beast.
>>
>> I can't comment on Ceph, and don't want to disregard it, but from having
>> worked with Gluster, distributed storage is still something I'm not
>> comfortable using in production.
>>
>> All clouds I've built as of a year ago are using a centralized NFS server
>> for instance and ephemeral storage. One cloud uses the same NFS server for
>> iSCSI-based block storage while the rest use the NFS driver. I plan on
>> migrating that iSCSI cloud to NFS. I'll probably take an IO hit, but I
>> enjoy not working with iSCSI. :)
>>
>> Glance is stored on a plain file backend that is rsync'd every few hours
>> to another server as a cold-backup.
>>
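A minimal sketch of that kind of cold backup, written as a small Python
wrapper around rsync that could be run from cron every few hours. The
source path /var/lib/glance/images, the destination
backup-host:/srv/glance-backup/ and the function names are assumptions
for illustration, not the actual setup described above.

```python
import subprocess


def build_rsync_command(src_dir, dest):
    """Build an rsync command that mirrors the contents of src_dir to dest.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    src = src_dir.rstrip("/") + "/"
    return ["rsync", "-a", src, dest]


def backup_images(src_dir, dest, dry_run=True):
    """Return the rsync command; run it only when dry_run is False."""
    cmd = build_rsync_command(src_dir, dest)
    if not dry_run:
        # check=True raises CalledProcessError if rsync exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and host name; adjust to the real environment.
    print(backup_images("/var/lib/glance/images",
                        "backup-host:/srv/glance-backup/"))
```

The sketch deliberately leaves out --delete, so images removed from the
source stay on the backup server until they are cleaned up by hand.
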
>> Whatever you choose to do, make sure you can live migrate instances one
>> way or another. In terms of operational / administrative responsibilities,
>> this will help in so many ways. Block storage HA is important, too, but I'd
>> rank live migration as a #1 priority.
>>
>> In terms of cost, I used to treat storage as an afterthought, but now it
>> gets 50% or more of the budget.
>>
>> Hope that helps
>>
>>
>> On Sat, Aug 17, 2013 at 10:55 AM, Mitch Anderson <mitch at metauser.net> wrote:
>>
>>> I would like to think of the compute as 'failure expectant' but the
>>> instance store is a huge holdback.  I have a limited budget and would like
>>> to get the best environment possible on it.  With that, consolidating
>>> storage is a huge priority.  Running Glance from the Ceph cluster would
>>> definitely be a plus.  However, needing shared storage for
>>> /var/lib/nova/instances as well as the Ceph cluster means I need an HA NFS
>>> setup as well as the Ceph storage nodes.  I think I will only be able to
>>> get one or the other approved.  Which I assume means I need this:
>>> https://blueprints.launchpad.net/nova/+spec/bring-rbd-support-libvirt-images-type to
>>> get approved and implemented for Havana... what shared ephemeral instance
>>> stores is everyone using?
>>>
>>>
>>> On Sat, Aug 17, 2013 at 10:33 AM, Abel Lopez <alopgeek at gmail.com> wrote:
>>>
>>>> I believe the general consensus for production systems is to not run
>>>> Ceph on compute nodes. Compute nodes should be used solely as instance
>>>> resources. Plus, compute nodes should be 'failure expectant': you should
>>>> be able to just pull one out and replace it with a blank box. Adding a
>>>> storage cluster to the mix just complicates maintenance planning, etc.
>>>> Plus, the rule of thumb for Ceph is 1GHz per OSD, which can be
>>>> significant depending on the number of disks you're planning on.
>>>>
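To put a number on the 1GHz-per-OSD rule of thumb, here is a small Python
sketch. It assumes one OSD per data disk; the 12-disk node, the 2.0 GHz
core clock and the function names are purely illustrative assumptions.

```python
import math


def osd_cpu_ghz(num_osds, ghz_per_osd=1.0):
    """Total CPU budget (in GHz) for the OSDs on one storage node."""
    return num_osds * ghz_per_osd


def cores_for_osds(num_osds, core_clock_ghz, ghz_per_osd=1.0):
    """Whole cores needed to cover the OSD budget at a given clock speed."""
    return math.ceil(osd_cpu_ghz(num_osds, ghz_per_osd) / core_clock_ghz)


if __name__ == "__main__":
    disks = 12  # hypothetical node with one OSD per disk
    print(osd_cpu_ghz(disks), "GHz reserved for OSDs")
    print(cores_for_osds(disks, core_clock_ghz=2.0), "cores at 2.0 GHz")
```

On a node that also runs instances, those cores come straight out of
what is available for vcpus.
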
>>>> Since you're starting from scratch, I would recommend having your
>>>> glance utilize the ceph cluster you're planning. You get added benefits by
>>>> using qcow2 disk images in ceph, as new instances are launched as COW
>>>> clones.
>>>>
>>>> As for 'minimal storage' on your compute nodes, I assume that you're
>>>> intending to have a shared '/var/lib/nova/instances/' directory, as each vm
>>>> will need a disk file. This has the added benefit of being a prerequisite
>>>> for vm migration.
>>>>
>>>> Hope that helps.
>>>>
>>>> On Aug 16, 2013, at 11:05 PM, Mitch Anderson <mitch at metauser.net>
>>>> wrote:
>>>>
>>>> > Hi all,
>>>> >
>>>> > I've been looking around for example architectures, with types of
>>>> > systems and numbers, for an HA setup of OpenStack Grizzly (it probably
>>>> > won't go live until after Havana is released).
>>>> >
>>>> > I've done a simple non-HA setup with Mirantis' Fuel, which has worked
>>>> > out well.  Their documented production HA setup is 3 controllers and
>>>> > N compute nodes...
>>>> >
>>>> > If I were to use Ceph for storage I would need a minimum of 3
>>>> > nodes.  I was looking to make my compute nodes have minimal disk space
>>>> > so only my controllers would have storage (for Glance, DBs, etc.) and
>>>> > the Ceph storage nodes would have the rest.  Is this solution
>>>> > preferred?  Or do I run Ceph on the compute nodes?  If so, what size
>>>> > nodes should they be then?  I'd like to run 40-60 VMs per compute node
>>>> > of varying sizes and needs.
>>>> >
>>>> > Any pointers would be much appreciated!
>>>> >
>>>> > -Mitch Anderson
>>>> > _______________________________________________
>>>> > OpenStack-operators mailing list
>>>> > OpenStack-operators at lists.openstack.org
>>>> >
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Joe Topjian
>> Systems Architect
>> Cybera Inc.
>>
>> www.cybera.ca
>>
>> Cybera is a not-for-profit organization that works to spur and support
>> innovation, for the economic benefit of Alberta, through the use
>> of cyberinfrastructure.
>>
>
>


-- 
Joe Topjian
Systems Architect
Cybera Inc.

www.cybera.ca

Cybera is a not-for-profit organization that works to spur and support
innovation, for the economic benefit of Alberta, through the use
of cyberinfrastructure.