Open Stack

Wed Jul 15 18:08:55 UTC 2015

On 07/15/15 20:40, Clint Byrum wrote:
> What you describe is a spike. It's a grand plan, and you don't need
> anyone's permission, so huzzah for the spike!
>
> As far as what should be improved, I hear a lot that having multiple
> schedulers does not scale well, so I'd suggest that as a primary target
> (maybe measure the _current_ problem, and then set the target as a 10x
> improvement over what we have now).
>
> Things to consider while pushing on that goal:
>
> * Do not backslide the resilience in the system. The code is just now
> starting to be fault tolerant when talking to RabbitMQ, so make sure
> to also consider how tolerant of failures this will be. Cassandra is
> typically chosen for its resilience and performance, but Cassandra does
> a neat trick in that clients can switch its CAP theorem profile from
> Consistent and Available (but slow) to Available and Performant when
> reading things. That might be useful in the context of trying to push
> the performance _UP_ for schedulers, while not breaking anything else.
>
> * Consider the cost of introducing a brand new technology into the
> deployer space. If there _is_ a way to get the desired improvement with,
> say, just MySQL and some clever sharding, then that might be a smaller
> pill to swallow for deployers.
+1000 to this part regarding introducing a new technology
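
On the consistency trade-off mentioned in the first bullet: a minimal sketch of the usual replica-overlap rule, assuming a replication factor N with reads touching R replicas and writes touching W. A read is guaranteed to see the latest acknowledged write only when R + W > N. The function names here are made up for illustration and are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for the given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           read_replicas: int,
                           write_replicas: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


N = 3  # assumed replication factor, for illustration only

# Majority reads and majority writes: 2 + 2 > 3, so the sets always overlap.
strong = reads_see_latest_write(N, quorum(N), quorum(N))

# Single-replica reads and writes: 1 + 1 <= 3, faster but possibly stale.
fast = reads_see_latest_write(N, 1, 1)
```

A scheduler could, under this assumption, read host state at the cheaper level and reserve the stricter level for the claim itself.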
>
> Anyway, I wish you well on this endeavor and hope to see your results
> soon!
>
> Excerpts from Ed Leafe's message of 2015-07-15 07:18:42 -0700:
>>
>> Changing the architecture of a complex system such as Nova is never
>> easy, even when we know that the design isn't working as well as we
>> need it to. And it's even more frustrating because when the change is
>> complete, it's hard to know if the improvement, if any, was worth it.
>>
>> So I had an idea: what if we ran a test of that architecture change
>> out-of-tree? In other words, create a separate deployment, and rip out
>> the parts that don't work well, replacing them with an alternative
>> design. There would be no Gerrit reviews or anything that would slow
>> down the work or add load to the already overloaded reviewers. Then we
>> could see if this modified system is a significant-enough improvement
>> to justify investing the time in implementing it in-tree. And, of
>> course, if the test doesn't show what was hoped for, it is scrapped
>> and we start thinking anew.
>>
>> The important part in this process is defining up front what level of
>> improvement would be needed to make such a change worth considering,
>> and what sort of tests would demonstrate whether or not this level
>> was met. I'd like to discuss such an experiment next week at the Nova
>> mid-cycle.
>>
>> What I'd like to investigate is replacing the current design, in which
>> the compute nodes communicate with the scheduler via message queues.
>> This design is overly complex and has several known scalability
>> issues. My thought is to replace this with a Cassandra [1] backend.
>> Compute nodes would update their state to Cassandra whenever they
>> change, and that data would be read by the scheduler to make its host
>> selection. When the scheduler chooses a host, it would post the claim
>> to Cassandra wrapped in a lightweight transaction, which would ensure
>> that no other scheduler has tried to claim those resources. When the
>> host has built the requested VM, it will delete the claim and update
>> Cassandra with its current state.
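
The claim step described above can be sketched as a compare-and-set: the write only applies if the host state is still what the scheduler read. This is an in-memory stand-in, not Cassandra client code; `HostStateStore`, `claim`, and the free-RAM model are hypothetical names chosen for illustration, and treating a claim as a conditional decrement is a simplification of the claim/delete cycle in the proposal. In CQL the same idea would be a conditional `UPDATE ... IF <column> = <value>` statement.

```python
import threading


class HostStateStore:
    """In-memory stand-in for a table that supports conditional updates."""

    def __init__(self, free_ram_mb):
        self._free = dict(free_ram_mb)
        self._lock = threading.Lock()

    def read(self, host):
        with self._lock:
            return self._free[host]

    def compare_and_set(self, host, expected, new):
        """Apply the write only if the stored value still equals `expected`."""
        with self._lock:
            if self._free[host] != expected:
                return False  # another scheduler changed the host first
            self._free[host] = new
            return True


def claim(store, host, ram_mb, max_attempts=3):
    """Try to reserve `ram_mb` on `host`; re-read and retry on a lost race."""
    for _ in range(max_attempts):
        seen = store.read(host)
        if seen < ram_mb:
            return False  # host no longer fits the request
        if store.compare_and_set(host, seen, seen - ram_mb):
            return True
    return False


store = HostStateStore({"compute-1": 4096})
first = claim(store, "compute-1", 3072)   # succeeds, 1024 MB left
second = claim(store, "compute-1", 3072)  # rejected, not enough RAM left
```

The point of the sketch is that two schedulers working from the same stale view cannot both win: the second conditional write fails and that scheduler has to re-read before trying again.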
>>
>> One main motivation for using Cassandra over the current design is
>> that it will enable us to run multiple schedulers without increasing
>> the raciness of the system. Another is that it will greatly simplify a
>> lot of the internal plumbing we've set up to implement in Nova what we
>> would get out of the box with Cassandra. A third is that if this
>> proves to be a success, it could also be used further down
>> the road to simplify inter-cell communication (but this is getting
>> ahead of ourselves...). I've worked with Cassandra before and it has
>> been rock-solid to run and simple to set up. I've also had preliminary
>> technical reviews with the engineers at DataStax [2], the company
>> behind Cassandra, and they agreed that this was a good fit.
>>
>> At this point I'm sure that most of you are filled with thoughts on
>> how this won't work, or how much trouble it will be to switch, or how
>> much more of a pain it will be, or how you hate non-relational DBs, or
>> any of a zillion other negative thoughts. FWIW, I have them too. But
>> instead of ranting, I would ask that we acknowledge for now that:
>>
>> a) it will be disruptive and painful to switch something like this at
>> this point in Nova's development
>> b) it would have to provide *significant* improvement to make such a
>> change worthwhile
>>
>> So what I'm asking from all of you is to help define the second part:
>> what we would want improved, and how to measure those benefits. In
>> other words, what results would you have to see in order to make you
>> reconsider your initial "nah, this'll never work" reaction, and start
>> to think that this will be a worthwhile change to make to Nova.
>>
>> I'm also asking that you refrain from talking about why this can't
>> work for now. I know it'll be difficult to do that, since nobody likes
>> ranting about stuff more than I do, but right now it won't be helpful.
>> There will be plenty of time for that later, assuming that this
>> experiment yields anything worthwhile. Instead, think of the current
>> pain points in the scheduler design, and what sort of improvement you
>> would have to see in order to seriously consider undertaking this
>> change to Nova.
>>
>> I've gotten the OK from my management to pursue this, and several
>> people in the community have expressed support for both the approach
>> and the experiment, even though most don't have spare cycles to
>> contribute. I'd love to have anyone who is interested become involved.
>>
>> I hope that this will be a positive discussion at the Nova mid-cycle
>> next week. I know it will be a lively one. :)
>>
>> [1] http://cassandra.apache.org/
>> [2] http://www.datastax.com/
>>

-- 
Best Regards,
Maish Saidel-Keesing

[openstack-dev] [nova] Proposal for an Experiment