[openstack-dev] [Oslo][Neutron] Fork() safety and oslo.messaging

Ken Giusti kgiusti at gmail.com
Tue Nov 25 15:47:23 UTC 2014

Hi Mehdi

On Tue, Nov 25, 2014 at 5:38 AM, Mehdi Abaakouk <sileht at sileht.net> wrote:
> Hi,
> I think the main issue is the behavior of the API
> of oslo-incubator/openstack/common/service.py, specially:
>  * ProcessLauncher.launch_service(MyService())
> And then the MyService have this behavior:
> class MyService:
>    def __init__(self):
>        # CODE DONE BEFORE os.fork()
>    def start(self):
>        # CODE DONE AFTER os.fork()
> So if an application created a FD inside MyService.__init__ or before ProcessLauncher.launch_service, it will be shared between
> processes and we got this kind of issues...
> For the rabbitmq/qpid driver, the first connection is created when the rpc server is started or when the first rpc call/cast/... is done.
> So if the application doesn't do that inside MyService.__init__ or before ProcessLauncher.launch_service everything works as expected.
> But if the issue is raised I think this is an application issue (rpc stuff done before the os.fork())
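To make the hazard Mehdi describes concrete, here's a minimal, self-contained sketch (plain Python, no oslo code - the scratch file just stands in for a broker socket): an FD opened before os.fork() refers to the same underlying file description in parent and child, so the two processes even share a file offset.

```python
import os
import tempfile

# A scratch file standing in for a broker socket.
path = tempfile.mkstemp()[1]
with open(path, "wb") as f:
    f.write(b"0123456789")

# Opened BEFORE fork: parent and worker share one file description.
fd = os.open(path, os.O_RDONLY)

pid = os.fork()
if pid == 0:
    os.read(fd, 5)            # the child consumes bytes 0-4 ...
    os._exit(0)
os.waitpid(pid, 0)

rest = os.read(fd, 16)        # ... so the parent resumes at offset 5
print(rest)                   # b'56789'
os.close(fd)
os.remove(path)
```

With a socket instead of a file it's worse: whichever process reads first steals the data from the other.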

Mmmm... I don't think it's that clear (re: an application issue).  I
mean, yes - the application is doing the os.fork() at the 'wrong'
time, but where is that made clear in the oslo.messaging API?
I think this is the real issue here: what is the "official" guidance
on using os.fork() together with the oslo libraries?

In the case of oslo.messaging, I can't find any mention of os.fork()
in the API docs (I may have missed it - please correct me if so).
That would imply - at least to me - that there are _no_ restrictions
on using os.fork() together with oslo.messaging.

But in the case of qpid, that is definitely _not_ the case.

The legacy qpid driver - impl_qpid - imports a 3rd party library, the
qpid.messaging API.  This library uses threading.Thread internally,
and we (the consumers of this library) have no control over how that
thread is managed.  So for impl_qpid, os.fork()'ing after the driver
is loaded can't be guaranteed to work.  In fact, I'd say os.fork() and
impl_qpid will not work - full stop.
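The core problem can be shown without qpid at all (this is a generic
Python sketch, not oslo code): after os.fork(), only the thread that
called fork() exists in the child, so a library's internal I/O thread
silently disappears - while any locks or buffers it owned remain, in
whatever state they were in at the instant of the fork.

```python
import os
import threading

stop = threading.Event()
# Stand-in for a library's internal I/O thread (e.g. qpid.messaging's).
io_thread = threading.Thread(target=stop.wait, daemon=True)
io_thread.start()

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Only the forking thread survives in the child.
    os.write(w, str(threading.active_count()).encode())
    os._exit(0)
os.waitpid(pid, 0)
child_count = int(os.read(r, 16))

print("threads in parent:", threading.active_count())  # 2 (main + io_thread)
print("threads in child :", child_count)               # 1 (main only)
stop.set()
```

Recent CPython versions go as far as emitting a DeprecationWarning when
a multi-threaded process calls os.fork(), for exactly this reason.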

> For the amqp1 driver case, I think this is the same things, it seems to do lazy creation of the connection too.

We have more flexibility here, since the driver directly controls when
the thread is spawned.  But the very fact that the thread is used
places a restriction on how oslo.messaging and os.fork() can be used
together, which isn't made clear in the documentation for the library.
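Under that restriction, the safe ordering is the one the service.py
split implies: keep __init__ (and everything else that runs pre-fork)
free of FDs and threads, and open connections in start(), after the
worker has been forked.  A toy stand-in for ProcessLauncher (the names
are illustrative, not the real oslo API):

```python
import os

class MyService:
    def __init__(self):
        self.conn = None          # pre-fork: create no FDs, spawn no threads

    def start(self):
        self.conn = os.pipe()     # post-fork: a per-worker "connection"

def launch_service(service, workers=4):
    """Toy stand-in for ProcessLauncher.launch_service()."""
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            service.start()       # connect only AFTER the fork
            os._exit(0)
        pids.append(pid)
    return [os.waitpid(pid, 0)[1] for pid in pids]

statuses = launch_service(MyService())
print(statuses)                   # [0, 0, 0, 0]
```

Each worker then owns its own connection FD, so nothing is shared
across the parent/child boundary.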

I'm not familiar with the rabbit driver, but I've seen a patch that
introduces threading for heartbeating in rabbit, so there may be an
implication there as well.

> I will take a look to the neutron code, if I found a rpc usage
> before the os.fork().

I've done some tracing of neutron-server's behavior in this case - you
may want to take a look at


> Personally, I don't like this API, because the behavior difference between
> '__init__' and 'start' is too implicit.

That's true, but I'd say that the problem of implicitness re:
os.fork() needs to be clarified at the library level as well.



> Cheers,
> ---
> Mehdi Abaakouk
> mail: sileht at sileht.net
> irc: sileht
> Le 2014-11-24 20:27, Ken Giusti a écrit :
>> Hi all,
>> As far as oslo.messaging is concerned, should it be possible for the
>> main application to safely os.fork() when there is already an active
>> connection to a messaging broker?
>> I ask because I'm hitting what appears to be fork-related issues with
>> the new AMQP 1.0 driver.  I think the same problems have been seen
>> with the older impl_qpid driver as well [0]
>> Both drivers utilize a background threading.Thread that handles all
>> async socket I/O and protocol timers.
>> In the particular case I'm trying to debug, rpc_workers is set to 4 in
>> neutron.conf.  As far as I can tell, this causes neutron.service to
>> os.fork() four workers, but does so after it has created a listener
>> (and therefore a connection to the broker).
>> This results in multiple processes all select()'ing the same set of
>> networks sockets, and stuff breaks :(
>> Even without the background process, wouldn't this use still result in
>> sockets being shared across the parent/child processes?   Seems
>> dangerous.
>> Thoughts?
>> [0] https://bugs.launchpad.net/oslo.messaging/+bug/1330199
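For reference, the failure mode described above - multiple processes
all reading one inherited socket - reproduces in a few lines of plain
Python; the socketpair here is just a stand-in for the broker
connection:

```python
import os
import socket

server, client = socket.socketpair()   # stand-in for the broker connection

pid = os.fork()                        # the worker inherits both ends
if pid == 0:
    # The worker happens to read first and consumes the request.
    assert server.recv(64) == b"rpc-request"
    os._exit(0)

client.sendall(b"rpc-request")
os.waitpid(pid, 0)                     # the worker has read and exited

server.setblocking(False)
try:
    data = server.recv(64)
except BlockingIOError:
    data = None
print("parent saw:", data)             # None: the request vanished
```

With four workers select()'ing the same socket, which process gets any
given frame is effectively random - hence "stuff breaks".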

Ken Giusti  (kgiusti at gmail.com)
