Open Stack

Thu Oct 15 09:23:58 UTC 2015

On 15/10/15 12:42, Matt Fischer wrote:
> 
> 
> On Thu, Oct 8, 2015 at 5:38 AM, Vladimir Kuklin <vkuklin at mirantis.com
> <mailto:vkuklin at mirantis.com>> wrote:
> 
>     Hi, folks
> 
>     * Intro
> 
>     Per our discussion at Meeting #54 [0] I would like to propose the
>     uniform approach of exception handling for all puppet-openstack
>     providers accessing any types of OpenStack APIs.
> 
>     * Problem Description
> 
>     While working on Fuel during deployment of multi-node HA-aware
>     environments we faced many intermittent operational issues, e.g.:
> 
>     401/403 authentication failures when we were doing scaling of
>     OpenStack controllers due to difference in hashing view between
>     keystone instances
>     503/502/504 errors due to temporary connectivity issues

The 5xx errors are not connectivity issues:

500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported

I believe nothing should be done to trap them.

Connectivity issues are a different matter (to be addressed as Matt
mentioned).
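
To make the distinction concrete, here is a minimal Ruby sketch, assuming a provider that talks HTTP through Net::HTTP. A 5xx arrives as a normal response object, whereas a connectivity problem surfaces as an exception before any response exists. The helper name failure_kind and the list of exception classes are illustrative assumptions, not part of any existing puppet-openstack code, and the list is not exhaustive.

```ruby
require 'net/http'
require 'socket'

# Transport-level errors: the request never produced an HTTP response.
# Illustrative assumption only; a real provider may need a different list.
CONNECTIVITY_ERRORS = [
  Errno::ECONNREFUSED,
  Errno::ECONNRESET,
  Errno::EHOSTUNREACH,
  SocketError,
  Net::OpenTimeout,
  Net::ReadTimeout
].freeze

# Hypothetical helper: runs the block and reports what kind of failure,
# if any, it produced.
#   :connectivity  the block raised a transport-level error
#   :server_error  the block returned a 5xx response
#   :other         anything else
def failure_kind
  response = yield
  response.is_a?(Net::HTTPServerError) ? :server_error : :other
rescue *CONNECTIVITY_ERRORS
  :connectivity
end
```

With this split, only the :connectivity case would be a candidate for a health check or a wait, while a 5xx is reported as-is.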

>     non-idempotent operations like deletion or creation - e.g. if you
>     are deleting an endpoint and someone is deleting on the other node
>     and you get 404 - you should continue with success instead of
>     failing. 409 Conflict error should also signal us to re-fetch
>     resource parameters and then decide what to do with them.
> 
>     Obviously, it is not optimal to rerun puppet to correct such errors
>     when we can just handle an exception properly.
> 
>     * Current State of the Art
> 
>     There is some exception handling, but it does not cover all the
>     aforementioned use cases.
> 
>     * Proposed solution
> 
>     Introduce a library of exception handling methods which should be
>     the same for all puppet openstack providers as these exceptions seem
>     to be generic. Then, for each of the providers we can introduce
>     provider-specific libraries that will inherit from this one.
> 
>     Our mos-puppet team could add this into their backlog and could work
>     on that in upstream or downstream and propose it upstream.
> 
>     What do you think on that, puppet folks?
> 

The real issue is that we're dealing with openstackclient, a CLI tool,
not an API, so no error propagation can be expected.

Using the REST interfaces of the OpenStack APIs directly would expose all
HTTP errors:

Check for "HTTP Response Classes" in
http://ruby-doc.org/stdlib-2.2.3/libdoc/net/http/rdoc/Net/HTTP.html
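
As a rough illustration of what those response classes buy us, the sketch below branches on the class of a Net::HTTP response after a DELETE, covering the 404 and 409 cases from the proposal. The helper names delete_outcome and fake_response and the returned symbols are hypothetical; this is one possible shape under those assumptions, not an existing provider API.

```ruby
require 'net/http'

# Hypothetical helper: decide what a provider could do after a DELETE,
# based only on the class of the response object.
def delete_outcome(response)
  case response
  when Net::HTTPSuccess      then :deleted         # 2xx
  when Net::HTTPNotFound     then :already_absent  # 404: already removed elsewhere
  when Net::HTTPConflict     then :refetch         # 409: re-read the resource, then decide
  when Net::HTTPUnauthorized,
       Net::HTTPForbidden    then :auth_failure    # 401/403
  when Net::HTTPServerError  then :server_error    # 5xx: report, do not trap
  else :unexpected
  end
end

# Hypothetical helper for trying this out without a network: build a
# response object of the right class from a numeric status code.
def fake_response(code)
  klass = Net::HTTPResponse::CODE_TO_OBJ.fetch(code.to_s)
  klass.new('1.1', code.to_s, '')
end
```

For example, delete_outcome(fake_response(404)) yields :already_absent, so a concurrent delete on another node would count as success instead of a failure.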

>     [0] http://eavesdrop.openstack.org/meetings/puppet_openstack/2015/puppet_openstack.2015-10-06-15.00.html
> 
> 
> I think that we should look into some solutions here as I'm generally
> for something we can solve once and re-use. Currently we solve some of
> this at TWC by serializing our deploys and disabling puppet site wide
> while we do so. This avoids the issue of Keystone on one node removing
> an endpoint while the other nodes (which still have old code) keep trying
> to add it back.
> 
> For connectivity issues especially after service restarts, we're using
> puppet-healthcheck [0] and I'd like to discuss that more in Tokyo as an
> alternative to explicit retries and delays. It's in the etherpad so
> hopefully you can attend.

+1

> 
> [0] - https://github.com/puppet-community/puppet-healthcheck


[openstack-dev] [puppet][Fuel] OpenstackLib Client Provider Better Exception Handling
