[openstack-dev] [neutron] Mechanism drivers and Neutron server forking?

Neil Jerram Neil.Jerram at metaswitch.com
Wed May 13 12:35:59 UTC 2015


Hi Salvatore,

Thanks for your reply...

On 08/05/15 09:20, Salvatore Orlando wrote:
> Just like the Neutron plugin manager, the ML2 driver manager ensures
> that drivers are loaded only once, regardless of the number of workers.
> What Kevin did proves that drivers are correctly loaded before forking
> (I reckon).

Yes, up to a point.  It seems clear that we can rely on the following 
events being ordered:

1. Mechanism drivers are instantiated (__init__) and initialized 
(initialize).

2. The Neutron server forks (into a number of copies as dictated by 
api_workers and rpc_workers).

3. Mechanism driver entry points such as create_port_pre/postcommit are 
called.
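As a sketch of what that ordering implies for a driver (method names modelled on the ML2 interface, but this is a standalone illustration, not code runnable against Neutron itself):

```python
import os


class SketchMechanismDriver(object):
    """Minimal sketch of the ML2 mechanism driver lifecycle above."""

    def __init__(self):
        # Step 1a: runs once, in the parent process, before forking.
        self.init_pid = os.getpid()

    def initialize(self):
        # Step 1b: also runs once, pre-fork.  Anything started here
        # (eventlet greenlets, threads, sockets) is shared with -- or
        # broken by -- every worker forked in step 2.
        self.initialized = True

    def create_port_postcommit(self, context):
        # Step 3: runs in whichever API worker handled the request,
        # so os.getpid() here may differ from self.init_pid.
        return os.getpid()
```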

However...

> However, forking is something to be careful about especially when using
> eventlet. For the plugin my team maintains we were creating a periodic
> task during plugin initialisation.
> This led to an interesting condition where API workers were hanging
> [1]. This situation was fixed with a rather pedestrian fix - by adding a
> delay.

Yes!  This is precisely the situation that I have.  Currently I am also 
planning to 'fix' it by adding a delay of a few seconds.  However that 
is not a great fix: if there is something that a mechanism driver needs 
to do on startup, it would probably rather do it as soon as possible; 
and the delay involves guessing how long steps (1) and (2) above will 
take.
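In code, the delay 'fix' amounts to no more than this (a sketch; threading.Timer stands in for eventlet's spawn_after, and the delay constant is a pure guess):

```python
import threading

# Guess at how long driver init plus server forking will take.
START_DELAY_SECONDS = 5


def delayed_startup(task, delay=START_DELAY_SECONDS):
    """Run `task` after `delay` seconds, hoping forking has finished.

    Fragile on both sides: too short and the task races the fork;
    too long and the driver needlessly delays its startup work.
    """
    timer = threading.Timer(delay, task)
    timer.daemon = True
    timer.start()
    return timer
```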

Readers may be wondering why a mechanism driver needs to do something on 
startup at all.  In general, the answer is to recheck the Neutron DB - 
i.e. any VMs/ports that should already exist - and ensure that the 
driver's downstream components are all correctly in sync with it.  In 
Calico's case, that means auditing that the routing and iptables on each 
compute host match the current VM and security configuration.

This need is implied by the existence of the _postcommit entry points. 
When a mechanism driver is implemented using those entry points, the 
driver or downstream software can crash after the Neutron DB believes 
that a transaction has been committed, leaving the dataplane state 
wrong.  Clearly, then, when the driver or downstream software is 
restarted, it needs to resync against the standing Neutron DB.
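Such a resync boils down to diffing the DB's desired state against the dataplane's actual state and repairing the difference.  A minimal, driver-agnostic sketch (all names hypothetical; program_fn/deprogram_fn stand for whatever pushes or removes per-port state such as routes and iptables rules):

```python
def resync(db_ports, dataplane_ports, program_fn, deprogram_fn):
    """Reconcile dataplane state with the Neutron DB after a restart.

    db_ports / dataplane_ports: dicts mapping port id -> config.
    """
    # Ports the DB knows about that the dataplane is missing, or has
    # in a stale configuration.
    for port_id, desired in db_ports.items():
        if dataplane_ports.get(port_id) != desired:
            program_fn(port_id, desired)
    # Ports still programmed in the dataplane but gone from the DB.
    for port_id in dataplane_ports:
        if port_id not in db_ports:
            deprogram_fn(port_id)
```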

> Generally speaking, I would find it useful to have a way to "identify" an
> API worker in order to designate a specific one for processing that
> should not be made redundant.
> On the other hand, I self-object to the above statement by saying that
> API workers are not supposed to do this kind of processing, which should
> be deferred to some other helper process.

+1 on both points :-)

There could be a post_fork() mechanism driver entry point.  It wouldn't 
matter which worker or helper process called it; the requirement would 
simply be that it is called only once, after all the forking has 
occurred.
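As a sketch of what that might look like (entirely hypothetical: no such hook exists in ML2 today):

```python
class ResyncingDriver(object):
    """Hypothetical driver that defers startup work to post_fork()."""

    def initialize(self):
        # Pre-fork: safe for config parsing only; start no
        # threads, greenlets, or network connections here.
        self.resynced = False

    def post_fork(self):
        # Proposed hook: called exactly once, in exactly one process,
        # after all api_workers/rpc_workers have forked.  A safe place
        # to kick off the singleton resync/audit processing.
        self.resynced = True


class DriverManager(object):
    """Hypothetical manager side, fanning the hook out to drivers."""

    def __init__(self, drivers):
        self.drivers = drivers
        self._post_fork_done = False

    def notify_post_fork(self):
        # Idempotent within this process.
        if self._post_fork_done:
            return
        self._post_fork_done = True
        for driver in self.drivers:
            driver.post_fork()
```

Note that an in-process flag like this only guarantees once per process; since the requirement is once across all workers, the server itself (which knows when it has finished forking) would have to invoke the hook in exactly one process.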

Regards,
	Neil


> Salvatore
>
> [1] https://bugs.launchpad.net/vmware-nsx/+bug/1420278
>
> On 8 May 2015 at 09:43, Kevin Benton <blak111 at gmail.com
> <mailto:blak111 at gmail.com>> wrote:
>
>     I'm not sure I understand the behavior you are seeing. When your
>     mechanism driver gets initialized and kicks off processing, all of
>     that should be happening in the parent PID. I don't know why your
>     child processes start executing code that wasn't invoked. Can you
>     provide a pointer to the code or give a sample that reproduces the
>     issue?
>
>     I modified the linuxbridge mech driver to try to reproduce it:
>     http://paste.openstack.org/show/216859/
>
>     In the output, I never received any of the init code output I added
>     more than once, including the function spawned using eventlet.
>
>     The only time I ever saw anything executed by a child process was
>     actual API requests (e.g. the create_port method).
>
>
>     On Thu, May 7, 2015 at 6:08 AM, Neil Jerram
>     <Neil.Jerram at metaswitch.com <mailto:Neil.Jerram at metaswitch.com>> wrote:
>
>         Is there a design for how ML2 mechanism drivers are supposed to
>         cope with the Neutron server forking?
>
>         What I'm currently seeing, with api_workers = 2, is:
>
>         - my mechanism driver gets instantiated and initialized, and
>         immediately kicks off some processing that involves
>         communicating over the network
>
>         - the Neutron server process then forks into multiple copies
>
>         - multiple copies of my driver's network processing then
>         continue, and interfere badly with each other :-)
>
>         I think what I should do is:
>
>         - wait until any forking has happened
>
>         - then decide (somehow) which mechanism driver is going to kick
>         off that processing, and do that.
>
>         But how can a mechanism driver know when the Neutron server
>         forking has happened?
>
>         Thanks,
>                  Neil
>
>         __________________________________________________________________________
>         OpenStack Development Mailing List (not for usage questions)
>         Unsubscribe:
>         OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>         <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
>     --
>     Kevin Benton
>
>
>
>
>
>


