[neutron][oslo] CI issue related to pyroute2 and latest oslo.privsep

Ben Nemec openstack at nemebean.com
Thu Jan 17 20:37:03 UTC 2019


I think it's worth noting that this has actually demonstrated a rather 
significant issue with threaded privsep, which is that forking from a 
Python thread is really not a safe thing to do.[1][2]
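
To make the hazard concrete, here is a minimal sketch (not privsep code; the function name and the timeout are made up for illustration, and it assumes a POSIX system with os.fork). A second thread holds a lock when the fork happens. Only the forking thread exists in the child, so the copied lock stays locked forever:

```python
import os
import threading


def fork_while_lock_held(timeout=0.5):
    """Return True if a forked child can acquire a lock that another
    thread of the parent was holding at the moment of the fork."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Take the lock and keep it until the parent says otherwise.
        with lock:
            held.set()
            release.wait()

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()  # the lock is now definitely held by the other thread

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, but the lock was
        # copied in its locked state, so nobody will ever release it.
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then let the holder finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    worker.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", fork_while_lock_held())
```

Without the timeout the child would simply hang. The same thing can happen with locks buried inside libraries, which is why it is hard to rule out from the caller's side.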

Sure, we could just say "don't fork in privileged code", but in this 
case the fork wasn't even in our code, it was in a library we were 
using. There are a few options, none of which I'm crazy about at this point:

* Provide a way for callers to specify that a call needs to run 
in-process rather than in the thread-pool. Two problems with this: 1) It 
requires the callers to know that forking is happening and 2) I'm not 
sure it actually fixes all of the potential problems. You might need to 
have a completely separate privsep daemon to avoid the potential bad 
fork/thread interactions.

* Switch to multiprocessing so calls execute in their own process. I may 
be wrong, but I think this requires all of the parameters passed in to 
be picklable, which I bet is not remotely the case right now.
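
As a rough way to gauge the second option, the sketch below checks whether a given argument survives pickle.dumps, which is how multiprocessing's pool-style APIs ship arguments to worker processes. The helper name and the sample arguments are hypothetical and are not taken from privsep or Neutron:

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
    except Exception:
        # pickle raises different exception types depending on the
        # object (TypeError, PicklingError, AttributeError, ...).
        return False
    return True


# Hypothetical examples of things a privileged call might receive.
plain_args = ("eth0", {"mtu": 1500})   # plain data
lock_arg = threading.Lock()            # OS-level resource
callback_arg = lambda: None            # anonymous function

if __name__ == "__main__":
    for name, value in [("plain_args", plain_args),
                        ("lock_arg", lock_arg),
                        ("callback_arg", callback_arg)]:
        print(name, is_picklable(value))
```

Running a check like this over the arguments of existing privileged functions would give a first idea of how much would need to change.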

I'm open to suggestions that are better than playing whack-a-mole with 
these bugs using a threaded and un-threaded daemon.

-Ben

1: https://rachelbythebay.com/w/2011/06/07/forked/
2: https://rachelbythebay.com/w/2014/08/16/forkenv/

On 1/17/19 2:12 PM, Slawomir Kaplonski wrote:
> Hi,
> 
> Recently we had one more issue related to oslo.privsep and pyroute2. It caused many failures in Neutron CI. See [1] for details. A fix (more like a workaround) for this issue has now been merged [2]. So if you saw failing tempest/scenario jobs on your patch, and the failed tests had issues with SSH to an instance through a floating IP, please rebase your patch now. It should be better :)
> 
> [1] https://bugs.launchpad.net/neutron/+bug/1811515
> [2] https://review.openstack.org/#/c/631275/
> 
> Slawek Kaplonski
> Senior software engineer
> Red Hat
> 
> 


