[openstack-dev] Scheduler proposal

Clint Byrum clint at fewbar.com
Wed Oct 7 21:19:43 UTC 2015

Excerpts from Zane Bitter's message of 2015-10-07 12:28:36 -0700:
> On 07/10/15 13:36, Ed Leafe wrote:
> > Several months ago I proposed an experiment [0] to see if switching the data model for the Nova scheduler to use Cassandra as the backend would be a significant improvement as opposed to the current design using multiple copies of the same data (compute_node in MySQL DB, HostState in memory in the scheduler, ResourceTracker in memory in the compute node) and trying to keep them all in sync via passing messages.
> It seems to me (disclaimer: not a Nova dev) that which database to use 
> is completely irrelevant to your proposal, which is really about moving 
> the scheduling from a distributed collection of Python processes with 
> ad-hoc (or sometimes completely missing) synchronisation into the 
> database to take advantage of its well-defined semantics. But you've 
> framed it in such a way as to guarantee that this never gets discussed, 
> because everyone will be too busy arguing about whether or not Cassandra 
> is better than Galera.

Your point is valid, Zane: the idea is more about having a
synchronized view of the scheduling state than about Cassandra.

I think Cassandra makes the proposal more realistic and easier to think
about, though, as Cassandra is focused on problems of the scale that this
represents. Galera won't do this well at any kind of scale, without
the added complexity and inefficiency of cells. So the maximum capacity
of one cell would be whatever write churn from a truly synchronized
scheduler a single Galera node can handle.

I like the concrete nature of this proposal, and suggest people review
it as a whole, and not try to reduce it to its components without an
extremely strong reason to do so.
