Open Stack

Wed Jul 15 18:25:50 UTC 2015

On 16 July 2015 at 02:18, Ed Leafe <ed at leafe.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
...
> What I'd like to investigate is replacing the current design of having
> the compute nodes communicating with the scheduler via message queues.
> This design is overly complex and has several known scalability
> issues. My thought is to replace this with a Cassandra [1] backend.
> Compute nodes would update their state to Cassandra whenever they
> change, and that data would be read by the scheduler to make its host
> selection. When the scheduler chooses a host, it would post the claim
> to Cassandra wrapped in a lightweight transaction, which would ensure
> that no other scheduler has tried to claim those resources. When the
> host has built the requested VM, it will delete the claim and update
> Cassandra with its current state.

+1 on doing an experiment.

Some semi-random thoughts here. Well, not random at all, I've been
mulling on this for a while.

I think Kafka may fit our model for updating state significantly more
closely than Cassandra does. It would be neat if we could do a few
different sketchy implementations and test them head-to-head. I love
Cassandra in a lot of ways, but "lightweight transaction" are two
words that I'd really not expect to see in Cassandra (yes, I know it
has them in the official docs and design :)) - it's a full Paxos
interaction to do SERIAL consistency, which is more work than either
QUORUM or LOCAL_QUORUM. A sharded approach - there is only one compute
node in question for the update needed - can be less work than either
and still be race-free.
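
To make the sharded idea concrete, here is a minimal in-process sketch.
It assumes each compute node's resource state has exactly one owner
that serializes claims against it; HostShard, claim() and the RAM-only
accounting are hypothetical names for illustration, not Nova or
Cassandra APIs.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class HostShard:
    """Hypothetical single owner of one compute node's resource state."""
    free_ram_mb: int
    claims: dict = field(default_factory=dict)
    _lock: threading.Lock = field(
        default_factory=threading.Lock, init=False, repr=False)

    def claim(self, claim_id, ram_mb):
        # Every claim for this host goes through this one owner, so a
        # local check-and-update is race-free without a consensus round
        # across replicas.
        with self._lock:
            if ram_mb > self.free_ram_mb:
                return False
            self.free_ram_mb -= ram_mb
            self.claims[claim_id] = ram_mb
            return True


def race(shard, n_schedulers, ram_mb):
    """Let n_schedulers try to claim ram_mb on the same host at once."""
    results = []
    threads = [
        threading.Thread(
            target=lambda i=i: results.append(shard.claim(f"c{i}", ram_mb)))
        for i in range(n_schedulers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However many schedulers race for the same host, at most the number of
claims that actually fit can succeed, because the check and the update
happen under the one owner's lock.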

I too very much want to see us move to brokerless RPC,
systematically, for all the reasons :). You might need a little of
that mixed in to the experiments, depending on the scale reached.

In terms of quantification: are you looking to test scalability (e.g.
scheduling some N events per second without races) [there are huge
improvements possible by rewriting the current scheduler's innards to
be less wasteful, but that doesn't address active-active setups],
latency (e.g. 99th percentile time-to-schedule) or <...> ?
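
For the latency side, a rough sketch of how a 99th percentile
time-to-schedule could be computed from per-request timings. It assumes
the nearest-rank definition of a percentile; the function name and the
sample values are made up for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct percent of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, to avoid float rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical time-to-schedule measurements, in milliseconds.
timings_ms = list(range(1, 101))
p99 = percentile(timings_ms, 99)
```

Comparing the 99th percentile against the median gives some feel for
how long the tail is under a given load.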

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

[openstack-dev] [nova] Proposal for an Experiment