[Openstack-operators] [nova] FYI, local conductor mode is deprecated, pending removal in N
mriedem at linux.vnet.ibm.com
Fri Nov 13 15:09:24 UTC 2015
On 11/12/2015 5:34 PM, Joshua Harlow wrote:
> Ok, so the following is starting to form:
> Hopefully we can get to the bottom of this (especially for clouds that
> run a large number of computes in a single cell/only one cell).
> Andrew Laski wrote:
>> On 11/12/15 at 10:53am, Clint Byrum wrote:
>>> Excerpts from Joshua Harlow's message of 2015-11-12 10:35:21 -0800:
>>>> Mike Dorman wrote:
>>>> > We do have a backlog story to investigate this more deeply, we just
>>>> > have not had the time to do it yet. For us, it’s been easier/faster
>>>> > to add more hardware to conductor to get over the hump temporarily.
>>>> > We kind of have that work earmarked for after the Liberty upgrade,
>>>> > in hopes that maybe it’ll be fixed there.
>>>> > If anybody else has done even some trivial troubleshooting already,
>>>> > it’d be great to get that info as a starting point. I.e. which
>>>> > specific calls to conductor are causing the load, etc.
>>>> > Mike
>>>> +1 I think we in the #openstack-performance channel really need to
>>>> investigate this, because it really worries me personally from hearing
>>>> many many rumors about how the remote conductor falls over. Please join
>>>> there and we can try to work through a plan to figure out what to do
>>>> about this situation. It would be great if the nova people also joined
>>>> there (because in the end, likely something in nova will need to be
>>>> fixed/changed/something else to resolve what appears to be a problem
>>>> for many operators).
>>> Falling over is definitely a bad sign. ;)
>>> The concept of pushing messages over a bus instead of just making local
>>> calls shouldn't result in much extra load. Perhaps we just have too many
>>> layers of unoptimized encapsulation. I have to wonder if something like
>>> protobuf would help.
>> Falling over is also a very broad description and doesn't let us know
>> what the actual issue is.
>> From my experience the performance concern with conductor has been in
>> not understanding the ratio of conductor nodes to computes that are
>> necessary for our usage. Conductor doesn't add much extra load, but it
>> concentrates it on a smaller number of services. If we ran one conductor
>> per compute I suspect we would have no performance issues, but that's a
>> lot of capacity to use for this.
>> I am curious what conductor/compute ratios others are trying to
>> achieve, given equal hardware types for each, and what are the barriers
>> to this happening?
>>> OpenStack-operators mailing list
>>> OpenStack-operators at lists.openstack.org
Cool, that's helpful for taking notes. I've posted some questions in there.
I also added this to the next performance team meeting agenda. I have a
conflict at that time so I might not be able to join, but I'm assuming
notes will be put back into the etherpad.