[openstack-dev] Should RPC consume_in_thread() be more fault tolerant?

Qing He Qing.He at radisys.com
Tue Jun 25 20:59:45 UTC 2013


To clarify: the operator should not have to go through a long log to find the issue. Instead, he/she needs to be notified that something severe or unexpected has just happened and that he/she needs to check it out.
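As a rough illustration of the kind of notification meant here, a minimal sketch using Python's standard logging module is below. The AlertHandler class and the notify callable are hypothetical names invented for this example, not part of the rpc code or the patch; notify stands in for whatever actually reaches the operator (an SNMP trap sender, a pager, email).

```python
import logging


class AlertHandler(logging.Handler):
    """Forward records at or above `level` to a notify callable.

    Hypothetical helper for illustration only; `notify` is an assumed
    callable that takes the formatted message as a string.
    """

    def __init__(self, notify, level=logging.ERROR):
        super().__init__(level=level)
        self._notify = notify

    def emit(self, record):
        try:
            self._notify(self.format(record))
        except Exception:
            # Let logging's standard error handling deal with a
            # failing notifier instead of raising into the caller.
            self.handleError(record)
```

Attached to the logger the rpc code writes to, a handler like this would leave routine messages in the log file and push only errors to the operator.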

From: Qing He
Sent: Tuesday, June 25, 2013 1:09 PM
To: 'OpenStack Development Mailing List'
Subject: RE: [openstack-dev] Should RPC consume_in_thread() be more fault tolerant?

Does the log alert the operator, with something like an SNMP trap?

From: Ray Pekowski [mailto:pekowski at gmail.com]
Sent: Tuesday, June 25, 2013 12:16 PM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] Should RPC consume_in_thread() be more fault tolerant?


On Jun 25, 2013 1:09 PM, "Qing He" <Qing.He at radisys.com> wrote:
>
> Basically, when something 'unexpected' happens, someone (e.g., the operator) needs to know about it and look into it to see whether it is benign or fatal. If it is masked, the system may degrade over time, unnoticed, until it is unusable.

The approach implemented in the patch is to log the exception and retry at a rate of one attempt per second.  An alternative would be to log the exception and call sys.exit() to kill the entire process.  Be aware that the code affected by this patch runs in RPC-created, dispatcher-like threads.  Let's have a vote on which option is preferable.
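For anyone voting, the two options can be sketched roughly as follows. This is a simplified illustration, not the code in the patch: consume_loop, consume_once, policy, and should_stop are hypothetical names chosen for this example, and the real consumer loop in the rpc code will look different.

```python
import logging
import sys
import time

LOG = logging.getLogger(__name__)


def consume_loop(consume_once, policy="retry", retry_delay=1.0,
                 sleep=time.sleep, should_stop=lambda: False):
    """Call consume_once() repeatedly, handling unexpected exceptions.

    policy="retry": log the exception, wait retry_delay seconds, retry.
    policy="exit":  log the exception, then call sys.exit(1).
    Returns the number of failures seen once should_stop() is true.
    """
    failures = 0
    while not should_stop():
        try:
            consume_once()
        except Exception:
            failures += 1
            # Always record the traceback so the failure is not masked.
            LOG.exception("Unexpected exception in consumer loop")
            if policy == "exit":
                sys.exit(1)
            # One retry per retry_delay seconds (one per second by default).
            sleep(retry_delay)
    return failures
```

One caveat for the exit option: sys.exit() works by raising SystemExit in the calling thread, so when called from a thread other than the main one it does not by itself terminate the process. Killing the whole process from a consumer thread would need something more, such as os._exit() or signalling the main thread.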