[openstack-dev] [Neutron] DHCP Agent Reliability

Ashok Kumaran ashokkumaran.b at gmail.com
Wed Dec 4 16:12:19 UTC 2013


On Wed, Dec 4, 2013 at 8:30 PM, Maru Newby <marun at redhat.com> wrote:

>
> On Dec 4, 2013, at 8:55 AM, Carl Baldwin <carl at ecbaldwin.net> wrote:
>
> > Stephen, all,
> >
> > I agree that there may be some opportunity to split things out a bit.
> > However, I'm not sure what the best way will be.  I recall that Mark
> > mentioned breaking out the processes that handle API requests and RPC
> > from each other at the summit.  Anyway, it is something that has been
> > discussed.
> >
> > I actually wanted to point out that the neutron server now has the
> > ability to run a configurable number of sub-processes to handle a
> > heavier load.  Introduced with this commit:
> >
> > https://review.openstack.org/#/c/37131/
> >
> > Set api_workers to something > 1 and restart the server.
> >
> > The server can also be run on more than one physical host in
> > combination with multiple child processes.
>
> I completely misunderstood the import of the commit in question.  Being
> able to run the wsgi server(s) out of process is a nice improvement, thank
> you for making it happen.  Has there been any discussion around making the
> default for api_workers > 0 (at least 1) to ensure that the default
> configuration separates wsgi and rpc load?  This also seems like a great
> candidate for backporting to havana and maybe even grizzly, although
> api_workers should probably be defaulted to 0 in those cases.
>

+1 for backporting the api_workers feature to Havana as well as Grizzly :)
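For anyone who wants to try this out, the option lives in neutron.conf
(api_workers under [DEFAULT], as far as I can tell) followed by a restart
of neutron-server.  Just to illustrate the idea rather than the actual
neutron code, here is a toy pre-fork sketch of what spawning API workers
buys you: the parent opens the listening socket once, forks N children,
and each child accepts API connections on the shared socket while the
parent stays free for other work.

    # Toy pre-fork sketch, similar in spirit to the api_workers change.
    # This is NOT the neutron implementation, just an illustration.
    import os
    import socket

    API_WORKERS = 2  # analogous to the api_workers option

    def serve(listener):
        # Each worker accepts on the shared socket, so the kernel spreads
        # incoming API requests across the worker processes.
        while True:
            conn, _addr = listener.accept()
            conn.recv(65536)  # read (and ignore) the request
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
            conn.close()

    if __name__ == '__main__':
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(('0.0.0.0', 9696))
        listener.listen(128)
        for _ in range(API_WORKERS):
            if os.fork() == 0:       # child inherits the listening socket
                serve(listener)
                os._exit(0)
        os.waitpid(-1, 0)            # parent would handle RPC etc. here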

>
> FYI, I re-ran the test that attempted to boot 75 micro VMs simultaneously
> with api_workers = 2, with mixed results.  The increased wsgi throughput
> resulted in almost half of the boot requests failing with 500 errors due to
> QueuePool errors (https://bugs.launchpad.net/neutron/+bug/1160442) in
> Neutron.  It also appears that maximizing the number of wsgi requests has
> the side-effect of increasing the RPC load on the main process, and this
> means that the problem of DHCP notifications being dropped is little
> improved.  I intend to submit a fix that ensures that notifications are
> sent regardless of agent status, in any case.
>
>
> m.
>
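Regarding the QueuePool errors: those are what SQLAlchemy raises when every
pooled connection is checked out and max_overflow is exhausted, so more API
workers hitting the database concurrently makes them more likely.  A minimal
way to see the same failure outside of Neutron (sqlite used purely for
illustration):

    # Minimal reproduction of QueuePool exhaustion -- nothing
    # Neutron-specific here, just SQLAlchemy.
    from sqlalchemy import create_engine
    from sqlalchemy.pool import QueuePool

    engine = create_engine('sqlite:///demo.db', poolclass=QueuePool,
                           pool_size=2, max_overflow=0, pool_timeout=1)

    held = [engine.connect() for _ in range(2)]   # exhaust the pool
    try:
        engine.connect()   # no free connection -> TimeoutError after 1s
    except Exception as exc:
        print("QueuePool error: %s" % exc)

Raising the pool limits in neutron.conf (the max_pool_size / max_overflow
options in the [database] section, if I have the names right) should at
least push the ceiling out, though it doesn't address the underlying load.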
> >
> > Carl
> >
> > On Tue, Dec 3, 2013 at 9:47 AM, Stephen Gran
> > <stephen.gran at theguardian.com> wrote:
> >> On 03/12/13 16:08, Maru Newby wrote:
> >>>
> >>> I've been investigating a bug that is preventing VMs from receiving IP
> >>> addresses when a Neutron service is under high load:
> >>>
> >>> https://bugs.launchpad.net/neutron/+bug/1192381
> >>>
> >>> High load causes the DHCP agent's status updates to be delayed,
> >>> causing the Neutron service to assume that the agent is down.  This
> >>> results in the Neutron service not sending notifications of port
> >>> addition to the DHCP agent.  At present, the notifications are simply
> >>> dropped.  A simple fix is to send notifications regardless of agent
> >>> status.  Does anybody have any objections to this stop-gap approach?
> >>> I'm not clear on the implications of sending notifications to agents
> >>> that are down, but I'm hoping for a simple fix that can be backported
> >>> to both havana and grizzly (yes, this bug has been with us that long).
> >>>
> >>> Fixing this problem for real, though, will likely be more involved.
> >>> The proposal to replace the current wsgi framework with Pecan may
> >>> increase the Neutron service's scalability, but should we continue to
> >>> use a 'fire and forget' approach to notification?  Being able to track
> >>> the success or failure of a given action outside of the logs would
> >>> seem pretty important, and allow for more effective coordination with
> >>> Nova than is currently possible.
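Just to make the stop-gap concrete, the shape of the change would presumably
be something like the sketch below - not the actual notifier code, and the
names (notify_agent, agent['alive'], cast) are purely illustrative:

    # Hypothetical sketch of 'send the notification regardless of agent
    # status'.  Not Neutron code; names are made up for illustration.
    import logging

    LOG = logging.getLogger(__name__)

    def notify_agent(cast, context, agent, event, payload):
        if not agent.get('alive'):
            # Previously the notification would be dropped at this point.
            # With the stop-gap we only warn, since a late heartbeat does
            # not necessarily mean the agent is actually down.
            LOG.warning("DHCP agent on %s looks down; sending %s anyway",
                        agent.get('host'), event)
        cast(context, event, payload, host=agent.get('host'))

That keeps the fire-and-forget behaviour, which is why tracking delivery
somewhere other than the logs still seems worth discussing.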
> >>
> >>
> >> It strikes me that we ask an awful lot of a single neutron-server
> >> instance - it has to take state updates from all the agents, it has to
> >> do scheduling, it has to respond to API requests, and it has to
> >> communicate about actual changes with the agents.
> >>
> >> Maybe breaking some of these out the way nova has a scheduler and a
> >> conductor and so on might be a good model (I know there are things
> >> people are unhappy about with nova-scheduler, but imagine how much
> >> worse it would be if it was built into the API).
> >>
> >> Doing all of those tasks, and doing it largely single threaded, is just
> >> asking for overload.
> >>
> >> Cheers,
> >> --
> >> Stephen Gran
> >> Senior Systems Integrator - theguardian.com
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>