[openstack-dev] [nova] Proposal for an Experiment

John Garbutt john at johngarbutt.com
Thu Jul 16 11:23:07 UTC 2015


On 15 July 2015 at 19:25, Robert Collins <robertc at robertcollins.net> wrote:
> On 16 July 2015 at 02:18, Ed Leafe <ed at leafe.com> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA512
> ...
>> What I'd like to investigate is replacing the current design of having
>> the compute nodes communicating with the scheduler via message queues.
>> This design is overly complex and has several known scalability
>> issues. My thought is to replace this with a Cassandra [1] backend.
>> Compute nodes would update their state to Cassandra whenever they
>> change, and that data would be read by the scheduler to make its host
>> selection. When the scheduler chooses a host, it would post the claim
>> to Cassandra wrapped in a lightweight transaction, which would ensure
>> that no other scheduler has tried to claim those resources. When the
>> host has built the requested VM, it will delete the claim and update
>> Cassandra with its current state.
>
> +1 on doing an experiment.
>
> Some semi-random thoughts here. Well, not random at all, I've been
> mulling on this for a while.
>
> I think Kafka may fit our model for updating state significantly
> more closely than Cassandra does. It would be neat if we could do a
> few different sketchy implementations and test them head-to-head. I
> love Cassandra in a lot of ways, but "lightweight transaction" are two
> words that I'd really not expect to see in Cassandra (yes, I know it
> has them in the official docs and design :)) - it's a full Paxos
> interaction to do SERIAL consistency, which is more work than either
> QUORUM or LOCAL_QUORUM. A sharded approach - there is only one compute
> node in question for the update needed - can be less work than either
> and still race free.
>
> I too very much want to see us move to brokerless RPC,
> systematically, for all the reasons :). You might need a little of
> that mixed in to the experiments, depending on the scale reached.
>
> In terms of quantification; are you looking to test scalability (e.g.
> scheduling some N events per second without races), [there are huge
> improvements possible by rewriting the current scheduler's innards to
> be less wasteful, but that doesn't address active-active setups],
> latency (e.g. 99th percentile time-to-schedule) or <...> ?

+1 for trying Kafka
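
As a rough illustration of why a log fits the state-update side, here
is a minimal in-memory sketch. It does not use Kafka itself and is not
taken from any spec; StateLog, SchedulerView, HostUpdate and the field
names are all made up for the example. The idea it shows is only that
compute nodes append their state to an ordered log and each scheduler
replays it from its own offset to build a local view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostUpdate:
    """One state report from a compute node (hypothetical fields)."""
    host: str
    free_ram_mb: int
    free_vcpus: int


class StateLog:
    """Append-only list standing in for a single log partition."""

    def __init__(self):
        self._records = []

    def append(self, update):
        """Append an update and return its offset in the log."""
        self._records.append(update)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return every record at or after the given offset."""
        return self._records[offset:]


class SchedulerView:
    """A scheduler's local view, rebuilt by replaying the log."""

    def __init__(self, log):
        self._log = log
        self._next_offset = 0
        self.hosts = {}

    def catch_up(self):
        """Apply unseen updates; the latest update per host wins."""
        new_records = self._log.read_from(self._next_offset)
        for update in new_records:
            self.hosts[update.host] = update
        self._next_offset += len(new_records)
        return len(new_records)
```

Two schedulers reading the same log end up with the same view without
talking to each other, which is the property that matters for running
several of them side by side. It says nothing about how claims are
resolved.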

I have tried to write up my thoughts on the Kafka approach (and a few
related things) in here:
https://review.openstack.org/#/c/191914/5/specs/backlog/approved/parallel-scheduler.rst,cm

It's trying to describe what I want to prototype for the next
scheduler; it's also possibly one of the worst specs I have ever seen.
There may be some ideas worth nicking in there (there may not be!)
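
For anyone comparing the approaches, the claim step in Ed's proposal
boils down to a compare-and-set: write the claim only if nobody else
has one. Below is a minimal in-memory sketch of that semantics. It is
not Cassandra code; ClaimStore and its methods are hypothetical, a
lock stands in for the lightweight transaction, and it simplifies to
one claim per host, where the proposal talks about claiming
resources.

```python
import threading


class ClaimStore:
    """In-memory stand-in for a store with conditional writes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._claims = {}  # host name -> id of the scheduler holding it

    def claim_if_absent(self, host, scheduler_id):
        """Record the claim only if no claim exists for this host."""
        with self._lock:
            if host in self._claims:
                return False  # another scheduler got there first
            self._claims[host] = scheduler_id
            return True

    def release(self, host, scheduler_id):
        """Delete the claim, but only if this scheduler holds it."""
        with self._lock:
            if self._claims.get(host) == scheduler_id:
                del self._claims[host]
                return True
            return False
```

A scheduler that gets False back has lost the race and needs to pick
another host. In CQL the conditional write would be something like an
INSERT ... IF NOT EXISTS, which is where the Paxos round Robert
mentions comes in.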

John

PS
I also cover my wish for multiple schedulers living in Nova, long term.
(We already have 2.5 schedulers, depending on how you count them.)
I can see some of these schedulers being the "best" for a subset of
deployments.


