[openstack-dev] [Neutron] DHCP Agent Reliability

Carl Baldwin carl at ecbaldwin.net
Sat Dec 7 00:20:13 UTC 2013


Pasting a few things from IRC here to fill out the context...

<marun> carl_baldwin: but according to markmcclain and salv-orlando,
it isn't possible to trivially use multiple workers for rpc because
processing rpc requests out of sequence can be dangerous

<carl_baldwin> marun: I think it is already possible to run more than
one RPC message processor.  If the neutron server process is run on
multiple hosts in active/active I think you end up getting multiple
independent RPC processing threads unless I'm missing something.

<marun> carl_baldwin: is active/active an option?

I checked one of my environments where there are two API servers
running.  It is clear from the logs that both servers are consuming
and processing RPC messages independently.  I have not yet identified
any problems resulting from this, and I've been running this way for
months.  That said, a latent problem could still be lurking in there.

I'm suddenly keenly interested in understanding the problems with
processing RPC messages out of order.  I tried reading the IRC backlog
for information about this but it was not clear to me.  Mark or
Salvatore, can you comment?
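
To make the question concrete, here is a minimal sketch of one way
out-of-order processing could go wrong.  It is purely hypothetical and
is not Neutron code: the message shapes, the "port-1" id, and the
apply_messages helper are all assumptions made for illustration.  It
applies the same three messages twice, once in the order sent and once
with the last two swapped, as might happen if two independent workers
each picked up one of them.

```python
# Hypothetical illustration only; none of these names come from Neutron.

def apply_messages(messages):
    """Apply (port_id, operation) messages to a dict standing in for a DB."""
    state = {}
    for port_id, op in messages:
        if op == "delete":
            # Deleting an unknown port is a no-op.
            state.pop(port_id, None)
        else:
            # Any other operation is treated as an upsert of the port status.
            state[port_id] = op
    return state


# Messages in the order the sender emitted them.
sent = [("port-1", "DOWN"), ("port-1", "ACTIVE"), ("port-1", "delete")]

# Processed strictly in order: the port ends up deleted.
in_order = apply_messages(sent)

# The delete is handled before the status update: the port is resurrected.
reordered = apply_messages([sent[0], sent[2], sent[1]])

print(in_order)
print(reordered)
```

Whether the real RPC handlers are exposed to anything like this depends
on how each handler treats stale or repeated messages, which is exactly
the part I would like Mark or Salvatore to clarify.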

Not only is RPC being handled by both physical servers in my
environment, but each of the API server worker processes is also
consuming and processing RPC messages independently.  So I am already
running a multi-process RPC scenario.

I did not intend for this to happen.  My environment differs from
current upstream: I confirmed that with current upstream code and the
ML2 plugin, only the parent process consumes RPC messages.  The
difference is probably because this environment is still using an
older version of my multi-process API worker patch.  Still looking
into it.

Carl

On Thu, Dec 5, 2013 at 7:32 AM, Carl Baldwin <carl at ecbaldwin.net> wrote:
> Creating separate processes for API workers does allow a bit more room
> for RPC message processing in the main process.  If this isn't enough
> and the main process is still bound on CPU and/or green
> thread/SQLAlchemy blocking, then creating separate worker processes for
> RPC processing may be the next logical step to scale.  I'll give it
> some thought today and possibly create a blueprint.
>
> Carl
>
> On Thu, Dec 5, 2013 at 7:13 AM, Maru Newby <marun at redhat.com> wrote:
>>
>> On Dec 5, 2013, at 6:43 AM, Carl Baldwin <carl at ecbaldwin.net> wrote:
>>
>>> I have offered up https://review.openstack.org/#/c/60082/ as a
>>> backport to Havana.  Interest was expressed in the blueprint for doing
>>> this even before this thread.  If there is consensus for this as the
>>> stop-gap then it is there for the merging.  However, I do not want to
>>> discourage discussion of other stop-gap solutions like what Maru
>>> proposed in the original post.
>>>
>>> Carl
>>
>> Awesome.  No worries, I'm still planning on submitting a patch to improve notification reliability.
>>
>> We seem to be CPU-bound now in processing RPC messages.  Do you think it would be reasonable to run multiple processes for RPC?
>>
>>
>> m.
