[Openstack-operators] [openstack-operators][ceph][nova] How do you handle Nova on Ceph?

Adam Kijak adam.kijak at corp.ovh.com
Wed Oct 12 12:23:41 UTC 2016


> ________________________________________
> From: Xav Paice <xavpaice at gmail.com>
> Sent: Monday, October 10, 2016 8:41 PM
> To: openstack-operators at lists.openstack.org
> Subject: Re: [Openstack-operators] [openstack-operators][ceph][nova] How do you handle Nova on Ceph?
> 
> On Mon, 2016-10-10 at 13:29 +0000, Adam Kijak wrote:
> > Hello,
> >
> > We use a Ceph cluster for Nova (Glance and Cinder as well) and, over
> > time, more and more data is stored there. We can't let the cluster
> > grow indefinitely because of Ceph's limitations: sooner or later it
> > has to be closed to new instances, images and volumes. Not to mention
> > it's a big failure domain.
> 
> I'm really keen to hear more about those limitations.

Basically it's all related to the failure domain ("blast radius") and risk management.
A bigger Ceph cluster means more users inside a single failure domain.
Growing the Ceph cluster temporarily slows it down (backfill and rebalancing compete with client I/O), so many users are affected at once.
There are also bugs in Ceph which can cause data corruption. It's rare, but when it happens
it can affect many (maybe all) users of the Ceph cluster.
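
On the slowdown point: when we do have to grow a cluster, throttling backfill and
recovery takes some of the sting out of it. Roughly something like the commands
below (the values, the OSD id and the target weight are only illustrative, not a
recommendation):

  # Rough sketch: throttle backfill/recovery before adding OSDs so client I/O
  # suffers less while the cluster rebalances (values are placeholders).
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'

  # Bring new OSDs in at a low CRUSH weight and raise it in small steps,
  # e.g. for a hypothetical osd.42:
  ceph osd crush reweight osd.42 0.2
  # ...repeat in small increments until the full weight is reached...

  # Re-enable scrubbing once the cluster is back to HEALTH_OK:
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub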

> >
> > How do you handle this issue?
> > What is your strategy to divide Ceph clusters between compute nodes?
> > How do you solve VM snapshot placement and migration issues then
> > (snapshots will be left on older Ceph)?
> 
> Having played with Ceph and compute on the same hosts, I'm a big fan of
> separating them and having dedicated Ceph hosts, and dedicated compute
> hosts.  That allows me a lot more flexibility with hardware
> configuration and maintenance, easier troubleshooting for resource
> contention, and also allows scaling at different rates.

Exactly, I consider it the best practice as well.
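
As for mapping different sets of compute nodes to different Ceph clusters, one
possible approach is host aggregates, with each aggregate's nova.conf pointing at
its own cluster. A minimal sketch of the [libvirt] section (the pool name, the
ceph.conf path and the secret UUID below are placeholders, not our actual values):

  [libvirt]
  images_type = rbd
  # each aggregate of compute nodes points at its own cluster config and pool
  images_rbd_pool = vms
  images_rbd_ceph_conf = /etc/ceph/ceph-cluster2.conf
  rbd_user = cinder
  rbd_secret_uuid = 00000000-0000-0000-0000-000000000000

Snapshots and images left on the older cluster can, in principle, be moved with
rbd export / rbd import (or export-diff for incremental copies), but it's a manual
process and Glance/Cinder records still have to be updated afterwards.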



