[openstack-dev] [oslo][barbican][sahara] start RPC service before launcher wait?
aspiers at suse.com
Tue Aug 1 01:32:34 UTC 2017
Ken Giusti <kgiusti at gmail.com> wrote:
>On Mon, Jul 31, 2017 at 10:01 AM, Adam Spiers <aspiers at suse.com> wrote:
>> I recently discovered a bug where barbican-worker would hang on
>> shutdown if queue.asynchronous_workers was changed from 1 to 2,
>> resulting in a warning like this:
>> WARNING oslo_messaging.server [-] Possible hang: stop is waiting for
>> start to complete
>> I found a similar bug in Sahara, where the fix was to call start()
>> on the RPC service before making the launcher wait() on it, so I
>> ported the fix to Barbican, and it seems to work fine.
>> I noticed that both projects use ProcessLauncher; barbican uses
>> oslo_service.service.launch() which has:
>>     if workers is None or workers == 1:
>>         launcher = ServiceLauncher(conf, restart_method=restart_method)
>>     else:
>>         launcher = ProcessLauncher(conf, restart_method=restart_method)
>> However, I'm not an expert in oslo.service or oslo.messaging, and one
>> of Barbican's core reviewers (thanks Kaitlin!) noted that not many
>> other projects start the task before calling wait() on the launcher,
>> so I thought I'd check here whether that is the correct fix, or
>> whether there's something else odd going on.
>> Any oslo gurus able to shed light on this?
>As far as an oslo.messaging server is concerned, the order of operations is:
>    server.start()
>    # do stuff until ready to stop the server...
>    server.stop()
>    server.wait()
>The final wait blocks until all requests that are in progress when stop()
>is called finish and clean up.
Thanks - that makes sense. So the question is, why would
barbican-worker only hang on shutdown when there are multiple workers?
Maybe the real bug is somewhere in oslo_service.service.ProcessLauncher
and it's not calling start() correctly?
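
To make the start/stop/wait ordering concrete, here is a minimal sketch
using only the Python standard library. ToyServer, its handle() method and
the returned status strings are hypothetical stand-ins for illustration;
this is not the oslo.messaging or oslo.service API, and it only loosely
models the "stop is waiting for start" condition behind the warning above.

```python
import threading
import time


class ToyServer:
    """Toy stand-in for an RPC server (NOT the oslo.messaging API)."""

    def __init__(self):
        self._started = threading.Event()
        self._stopping = threading.Event()
        self._workers = []
        self.completed = []

    def start(self):
        # Mark the server as ready to accept requests.
        self._started.set()

    def handle(self, request, duration=0.05):
        # Accept a request only while running; process it in a thread.
        if not self._started.is_set() or self._stopping.is_set():
            raise RuntimeError("server is not accepting requests")
        worker = threading.Thread(target=self._process,
                                  args=(request, duration))
        self._workers.append(worker)
        worker.start()

    def _process(self, request, duration):
        time.sleep(duration)  # simulate work on an in-flight request
        self.completed.append(request)

    def stop(self, timeout=0.2):
        # stop() cannot complete until start() has; if start() never
        # happens, report a possible hang instead of blocking forever.
        if not self._started.wait(timeout):
            return "possible hang: stop is waiting for start to complete"
        self._stopping.set()  # refuse new requests from now on
        return "stopped"

    def wait(self):
        # Block until every request in flight at stop() has finished.
        for worker in self._workers:
            worker.join()


# Correct ordering: start, do work, stop, then wait.
server = ToyServer()
server.start()
server.handle("req-1")
server.handle("req-2")
status = server.stop()
server.wait()
result = sorted(server.completed)

# Wrong ordering: stop() on a server whose start() was never called.
never_started = ToyServer()
hang_status = never_started.stop(timeout=0.05)
```

In this toy model, a server that is stopped without ever being started
reports a possible hang, which is the same symptom that calling start()
before the launcher's wait() appears to avoid in Barbican and Sahara.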