Open Stack

Wed Sep 28 15:01:23 UTC 2016

On 9/28/2016 12:10 AM, Joshua Harlow wrote:
>>>> ACTION: we should make sure workarounds are advertised better
>>>> ACTION: we should have some document about "when cells"?
>>> This is a difficult question to answer because "it depends." It's akin
>>> to asking "how many nova-api/nova-conductor processes should I run?"
>>> Well, what hardware is being used, how much traffic do you get, is it
>>> bursty or sustained, are instances created and left alone or are they
>>> torn down regularly, do you prune your database, what version of rabbit
>>> are you using, etc...
>>>
>>> I would expect the best answer(s) to this question are going to come
>>> from the operators themselves. What I've seen with cellsv1 is that
>>> someone will decide for themselves that they should put no more than X
>>> computes in a cell and that information filters out to other operators.
>>> That provides a starting point for a new deployment to tune from.
>>
>> I don't think we need "don't go larger than N nodes" kind of advice. But
>> we should probably know what kinds of things we expect to be hot spots,
>> such as MySQL load (possibly indicated by system load or a high level of
>> DB conflicts), RabbitMQ load, or something along those lines.
>>
>> Basically the things to look out for that indicate you are approaching
>> a scale point where cells is going to help. That also helps in defining
>> what kind of scaling issues cells won't help on, which need to be
>> addressed in other ways (such as optimizations).
>
> Big +1. If we can really get out of the pattern of guessing at the
> overall system characteristics *somehow*, I think it would be great for
> our own community's maturity and for each project. Even though I know
> such things are hard, it scares the bejeezus out of me when we (as a
> group) create software but can't give recommendations on its behavioral
> characteristics (we aren't doing quantum physics here, the last time I
> checked).
>
> Just some ideas:
>
> * Rally maybe can help here?
> * Fixing a standard set of configuration options and testing that at
> scale (using the intel lab?) - and then possibly using rally (or other)
> to probe the system characteristics and then giving recommendations
> before releasing the software for general consumption based on observed
> system characteristics (this is basically what operators are going to
> have to do anyway to qualify a release, especially if the community
> isn't doing it and/or is shying away from doing it).
>
> I just have a hard time accepting that tribal knowledge about scale that
> has to filter from operator to operator (yes, I know from personal
> experience this is how things trickled down) is a good way to go. It
> reminds me of the medicine and practices of the late 1800s, when all
> sorts of quack science was happening; IMHO we can do better than
> this :)

Hmm, that reminds me that I'm running low on leeches...

>
> Anyway, back to your regularly scheduled programming,
>
> -Josh
>

-- 

Thanks,

Matt Riedemann

[openstack-dev] [nova] ops meetup feedback
