[openstack-dev] [puppet][Fuel] OpenstackLib Client Provider Better Exception Handling

Vladimir Kuklin vkuklin at mirantis.com
Thu Oct 15 10:10:09 UTC 2015


Gilles,

5xx errors such as 502, 503 and 504 can very well be intermittent operational
issues. E.g. when you access your keystone backends through some proxy and
there is a connectivity issue between the proxy and the backends which
disappears within 10 seconds, you do not need to rerun puppet completely -
just retry the request.
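
To illustrate what "just retry the request" could look like inside a provider, here is a minimal sketch. The helper name `with_retries` and its parameters are hypothetical and not part of openstacklib; the only assumption about the response is that it answers `#code` the way Net::HTTPResponse does (a String such as "503").

```ruby
# Hypothetical helper, not an existing openstacklib method.
# Calls the block until it returns a response whose status is not in
# +retry_on+, or until +attempts+ calls have been made. The block must
# return an object responding to #code (a String for Net::HTTPResponse).
def with_retries(attempts: 5, delay: 2, retry_on: [502, 503, 504])
  response = nil
  attempts.times do |i|
    response = yield
    # Anything that is not a transient gateway error goes back to the caller.
    return response unless retry_on.include?(response.code.to_i)
    # Wait before the next attempt, but not after the last one.
    sleep(delay) if i < attempts - 1
  end
  # Still failing after all attempts: hand back the last response.
  response
end
```

With Net::HTTP this could be used as `with_retries { Net::HTTP.get_response(uri) }`, where `uri` is whatever endpoint the provider talks to. Only the single request is repeated, not the whole puppet run.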

Regarding "REST interfaces for all Openstack API" - this is very close to
another topic that I raised ([0]) - using a native Ruby client and
handling its exceptions. Otherwise, whenever an OpenStack client
(generic or neutron/glance/etc. one) sends us a message like '[111]
Connection refused', the text of this message is determined by the framework
that OpenStack uses for its clients within this release. It could be
`requests` or any other framework, which sends a different text
message depending on its version. So it is very bothersome to write a bunch
of 'if' clauses or gigantic regexps instead of handling a simple Ruby
exception. So I agree with you here - we need to work with the API
directly. And, by the way, if you also support switching to a native Ruby
OpenStack API client, please feel free to support the move towards it in
the thread [0].
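
A rough sketch of the difference: with a native Ruby client, the failure arrives as an exception class, so there is no message text to parse. The helper name `classify_failure` and the returned symbols are made up for illustration; the exception classes are the ones Ruby's standard library raises for socket and timeout failures.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical helper: runs a request block and maps low-level failures
# to symbols by exception class, never by message text.
def classify_failure
  yield
  :ok
rescue Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::EHOSTUNREACH
  # Socket-level failures, whatever wording the message carries.
  :connection_error
rescue Timeout::Error
  # Net::OpenTimeout and Net::ReadTimeout are subclasses of Timeout::Error.
  :timeout
end
```

A provider could then decide to retry on `:connection_error` or `:timeout` without a single regexp, and the decision would not depend on which client framework version produced the message.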

Matt and Gilles,

Regarding puppet-healthcheck - I do not think that puppet-healthcheck
handles exactly what I am describing here - it does not run at exactly
the same time as we run the request.

E.g. 10 seconds ago everything was OK, then we had a temporary connectivity
issue, then everything is OK again 10 seconds later. Could you please describe
how puppet-healthcheck can help us solve this problem?

Or another example - there was an issue with keystone accessing the token
database when you have several keystone instances running, or there was
some desync between these instances, e.g. you fetched the token at keystone
#1 and then you verify it against keystone #2. Keystone #2 failed to
verify it not because the token was bad, but because
keystone #2 itself had some issues. We would get a 401 error, and instead
of rerunning puppet we would just need to handle this issue
locally by retrying the request.
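
For this keystone case, handling it locally could be sketched roughly as follows. Everything here is hypothetical: `request_with_reauth`, `fetch_token` and `request` are illustrative names, and the two callables stand in for whatever the provider actually uses to obtain a token and to send the request.

```ruby
# Hypothetical sketch, not an existing provider method.
# +fetch_token+ is a callable returning a fresh token.
# +request+ is a callable taking a token and returning an Integer HTTP status.
def request_with_reauth(fetch_token:, request:, attempts: 3, delay: 1)
  status = nil
  attempts.times do |i|
    # Fetch a token for every attempt, so a token that one keystone
    # instance could not verify is not reused.
    status = request.call(fetch_token.call)
    return status unless status == 401
    # Give the keystone instances a moment before trying again.
    sleep(delay) if i < attempts - 1
  end
  # Still 401 after all attempts: this looks like a real auth failure.
  status
end
```

A 401 that survives all attempts is returned to the caller, so genuinely bad credentials still fail the run.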

[0] http://permalink.gmane.org/gmane.comp.cloud.openstack.devel/66423

On Thu, Oct 15, 2015 at 12:23 PM, Gilles Dubreuil <gilles at redhat.com> wrote:

>
>
> On 15/10/15 12:42, Matt Fischer wrote:
> >
> >
> > On Thu, Oct 8, 2015 at 5:38 AM, Vladimir Kuklin <vkuklin at mirantis.com> wrote:
> >
> >     Hi, folks
> >
> >     * Intro
> >
> >     Per our discussion at Meeting #54 [0] I would like to propose the
> >     uniform approach of exception handling for all puppet-openstack
> >     providers accessing any types of OpenStack APIs.
> >
> >     * Problem Description
> >
> >     While working on Fuel during deployment of multi-node HA-aware
> >     environments we faced many intermittent operational issues, e.g.:
> >
> >     401/403 authentication failures when we were doing scaling of
> >     OpenStack controllers due to difference in hashing view between
> >     keystone instances
> >     503/502/504 errors due to temporary connectivity issues
>
> The 5xx errors are not connectivity issues:
>
> 500 Internal Server Error
> 501 Not Implemented
> 502 Bad Gateway
> 503 Service Unavailable
> 504 Gateway Timeout
> 505 HTTP Version Not Supported
>
> I believe nothing should be done to trap them.
>
> The connectivity issues are different matter (to be addressed as
> mentioned by Matt)
>
> >     non-idempotent operations like deletion or creation - e.g. if you
> >     are deleting an endpoint and someone is deleting on the other node
> >     and you get 404 - you should continue with success instead of
> >     failing. 409 Conflict error should also signal us to re-fetch
> >     resource parameters and then decide what to do with them.
> >
> >     Obviously, it is not optimal to rerun puppet to correct such errors
> >     when we can just handle an exception properly.
> >
> >     * Current State of the Art
> >
> >     There is some exception handling, but it does not cover all the
> >     aforementioned use cases.
> >
> >     * Proposed solution
> >
> >     Introduce a library of exception handling methods which should be
> >     the same for all puppet openstack providers as these exceptions seem
> >     to be generic. Then, for each of the providers we can introduce
> >     provider-specific libraries that will inherit from this one.
> >
> >     Our mos-puppet team could add this into their backlog and could work
> >     on that in upstream or downstream and propose it upstream.
> >
> >     What do you think on that, puppet folks?
> >
>
> The real issue is because we're dealing with openstackclient, a CLI tool
> and not an API. Therefore no error propagation is expected.
>
> Using REST interfaces for all Openstack API would provide all HTTP errors:
>
> Check for "HTTP Response Classes" in
> http://ruby-doc.org/stdlib-2.2.3/libdoc/net/http/rdoc/Net/HTTP.html
>
>
> >     [0] http://eavesdrop.openstack.org/meetings/puppet_openstack/2015/puppet_openstack.2015-10-06-15.00.html
> >
> >
> > I think that we should look into some solutions here as I'm generally
> > for something we can solve once and re-use. Currently we solve some of
> > this at TWC by serializing our deploys and disabling puppet site wide
> > while we do so. This avoids the issue of Keystone on one node removing
> > an endpoint while the other nodes (which still have old code) keep trying
> > to add it back.
> >
> > For connectivity issues especially after service restarts, we're using
> > puppet-healthcheck [0] and I'd like to discuss that more in Tokyo as an
> > alternative to explicit retries and delays. It's in the etherpad so
> > hopefully you can attend.
>
> +1
>
> >
> > [0] - https://github.com/puppet-community/puppet-healthcheck
> >
> >
> >
> >
> > __________________________________________________________________________
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>



-- 
Yours Faithfully,
Vladimir Kuklin,
Fuel Library Tech Lead,
Mirantis, Inc.
+7 (495) 640-49-04
+7 (926) 702-39-68
Skype kuklinvv
35bk3, Vorontsovskaya Str.
Moscow, Russia,
www.mirantis.com
www.mirantis.ru
vkuklin at mirantis.com