[Openstack-operators] Hyper-converged OpenStack with Ceph

Nick Jones nick.jones at datacentred.co.uk
Thu Mar 19 22:02:33 UTC 2015


I think the answer is:  It depends.  Echoing Warren's experience below,
we've also seen extraordinarily high load averages on our OSD nodes during
certain recovery scenarios with Ceph.  Putting instances on there would
simply not be viable under such circumstances.

Ceph's documentation was recently(-ish) updated to say that "[..] during
recovery they [OSDs] need significantly more RAM (e.g., ~1GB per 1TB of
storage per daemon)."  On nodes where you have multiple 4TB drives
provisioned (for example), it's easy to chew through all available memory,
leaving precious little for anything else.

But of course, if you were to build your converged compute / storage nodes
with smaller amounts of disk per daemon, use only SSDs, and ensure there's
plenty of memory left over after taking the above calculation into account,
then the answer becomes 'maybe'.
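
To make that back-of-the-envelope sum concrete, here's a rough sketch in
Python (purely illustrative: the ~1GB per 1TB per daemon figure is the one
from the Ceph docs above, and the node size is a made-up example):

    def osd_recovery_ram_gb(osd_count, osd_size_tb, gb_per_tb=1.0):
        """Rough RAM the OSD daemons may want during recovery, using the
        ~1GB per 1TB of storage per daemon guidance from the Ceph docs."""
        return osd_count * osd_size_tb * gb_per_tb

    # Hypothetical converged node: 12 x 4TB OSDs, 256GB RAM, 8GB kept for the OS.
    total_ram_gb = 256
    os_overhead_gb = 8
    recovery_gb = osd_recovery_ram_gb(osd_count=12, osd_size_tb=4)   # ~48GB
    left_for_instances_gb = total_ram_gb - os_overhead_gb - recovery_gb

    print("Recovery headroom ~%dGB, leaving ~%dGB for instances"
          % (recovery_gb, left_for_instances_gb))

Swap in more or bigger drives, or less RAM, and that 'maybe' swings towards
'no' pretty quickly.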


-- 

-Nick

On 19 March 2015 at 19:07, Warren Wang <warren at wangspeed.com> wrote:

> I would avoid co-locating Ceph and compute processes. Memory on compute
> nodes is a scarce resource if you're not running with any overcommit, which
> you shouldn't be. Ceph requires a fair amount of guaranteed memory (2GB per
> OSD to be safe) to deal with recovery. You can certainly overprovision memory
> and reserve it, but it is just going to make things difficult to manage and
> troubleshoot. I'll give an example. I have 2 Ceph clusters that were
> experiencing aggressive page scanning and page cache reclamation under a
> moderate workload, enough to drive the load on an OSD server to 4 digits.
> If that had occurred on a box also running compute resources, we would have
> had tickets rolling in. As it was, all we did was slow down some of the
> storage, so it largely went unnoticed.
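>
> For what it's worth, one way to "reserve it" on a converged node would be to
> hold memory back from the Nova scheduler; a sketch only, assuming the
> 2GB-per-OSD figure above and a made-up count of 12 OSDs per host:
>
>     # /etc/nova/nova.conf on a converged compute + OSD node (illustrative values)
>     [DEFAULT]
>     # 12 OSDs x 2GB = 24GB held back for Ceph, plus ~4GB for the host OS.
>     reserved_host_memory_mb = 28672
>     # No memory overcommit on nodes that also run OSDs.
>     ram_allocation_ratio = 1.0
>
> Even with that in place, the page cache pressure described above would still
> land on the same box, which is the harder part to contain.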
>
> There may also come a time when package dependencies cause conflicts that
> will be difficult to reconcile. OVS, kernel, Ceph, etc. It's possible to
> attempt to dedicate resources on a single host to various processes, but I
> personally don't think it's worth the effort.
>
> Warren
>
> On Thu, Mar 19, 2015 at 12:33 PM, Fox, Kevin M <Kevin.Fox at pnnl.gov> wrote:
>
>> We're running it both ways. We have clouds with dedicated storage nodes,
>> and clouds sharing storage/compute.
>>
>> The storage/compute solution with Ceph is working OK for us. But that
>> particular cloud has only a 1 gigabit interconnect and seems very slow
>> compared to our other clouds, which are 40 gigabit, so it's not clear
>> whether it's slow because of running storage and compute together or
>> simply because of the slower interconnect. Could be some of both.
>>
>> I'd be very curious if anyone else had a feeling for storage/compute
>> together on a faster interconnect.
>>
>> Thanks,
>> Kevin
>>
>> ________________________________________
>> From: Jesse Keating [jlk at bluebox.net]
>> Sent: Thursday, March 19, 2015 9:20 AM
>> To: openstack-operators at lists.openstack.org
>> Subject: Re: [Openstack-operators] Hyper-converged OpenStack with Ceph
>>
>> On 3/19/15 9:08 AM, Jared Cook wrote:
>> > Hi, I'm starting to see a number of vendors push hyper-converged
>> > OpenStack solutions where compute and Ceph OSD nodes are one and the
>> > same.  In addition, Ceph monitors are placed on OpenStack controller
>> > nodes in these architectures.
>> >
>> > Recommendations I have read in the past have been to keep these things
>> > separate, but some vendors are now saying that this actually works out
>> > OK in practice.
>> >
>> > The biggest concern I have is that the compute node functions will
>> > compete with Ceph functions, and one over-utilized node will slow down
>> > the entire Ceph cluster, which will slow down the entire cloud.  Is this
>> > an unfounded concern?
>> >
>> > Does anyone have experience running in this mode?  Experience at scale?
>> >
>> >
>>
>> Not Ceph related, but it's a known tradeoff that compute resources on
>> control nodes can cause resource competition. This gets traded off against
>> the total cost of the cluster and the expected use case. If the use case
>> plans to scale out to many compute nodes, we suggest upgrading to
>> dedicated control nodes. This is higher cost, but somewhat necessary for
>> matching performance to capacity.
>>
>> We may start small, but we can scale up to match the (growing) needs.
>>
>>
>> --
>> -jlk
>>

-- 
DataCentred Limited registered in England and Wales no. 05611763