[openstack-dev] [neutron] Supporting retries in neutronclient

Paul Ward wpward at linux.vnet.ibm.com
Thu Jun 5 17:19:27 UTC 2014


Carl,

I haven't been able to try this yet, as it requires us to run a pretty
large-scale test.

But to try to summarize the current feeling on this thread: the retry logic
is already being put into neutronclient (via
https://review.openstack.org/#/c/71464/); it's just that it's not "automatic"
and it is left up to the invoker to decide when to use retry.  Doing the
retries automatically isn't the way to go because it is dangerous for
non-idempotent operations.
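
To illustrate the idempotency concern, here is a minimal Python sketch of how
a retry count could be gated on the HTTP method.  The names allowed_retries
and IDEMPOTENT_METHODS are hypothetical and are not part of neutronclient; the
choice of GET, PUT, and DELETE only reflects what was suggested earlier in
this thread.

```python
# Hypothetical helper for illustration; not part of python-neutronclient.
# Methods that this thread suggested are probably safe to retry.
IDEMPOTENT_METHODS = frozenset({"GET", "PUT", "DELETE"})


def allowed_retries(method, requested_retries):
    """Return how many retries to permit for the given HTTP method.

    POST is treated as non-idempotent: if the response is lost after the
    server has already acted, a retry could repeat the operation.
    """
    if requested_retries < 0:
        raise ValueError("requested_retries must be >= 0")
    if method.upper() in IDEMPOTENT_METHODS:
        return requested_retries
    # Anything else (e.g. POST) gets a single attempt and no retries.
    return 0
```

With this gate, an invoker asking for 3 retries would get 3 on a GET and 0 on
a POST.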

So... I think we leave the proposed change as is, and we will potentially
need to enhance the invokers of neutronclient as we see fit.  The invoker in
our failure case is nova trying to get network info, so this seems like a
good first one to try out.
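
As a rough sketch of what enhancing an invoker might look like, the caller
could wrap a read-only call and retry only on ssl errors, a bounded number of
times.  call_with_retries is a hypothetical helper, not the code in the review
above, and the function passed in stands for whatever read-only client call
the invoker makes.

```python
import ssl
import time


def call_with_retries(func, retries=2, delay=0.0, retry_on=(ssl.SSLError,)):
    """Call func(), retrying up to `retries` more times on `retry_on` errors.

    Hypothetical invoker-side helper.  Any exception outside `retry_on`
    propagates immediately, so the caller still sees other failures.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= retries:
                # Retries exhausted: surface the last error to the caller.
                raise
            attempt += 1
            time.sleep(delay)
```

Because the wrapper lives in the invoker, it is the invoker that decides which
calls are safe to repeat; a write operation would simply not be wrapped.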

Thoughts?

Thanks,
   Paul

Quoting Carl Baldwin <carl at ecbaldwin.net>:

> Paul,
>
> I'm curious.  Have you been able to update to a client using requests?
>  Has it solved your problem?
>
> Carl
>
> On Thu, May 29, 2014 at 11:15 AM, Paul Ward <wpward at us.ibm.com> wrote:
>> Yes, we're still on a code level that uses httplib2.  I noticed that as
>> well, but wasn't sure if that would really help here, as it seems like an
>> ssl thing itself.  But... who knows??  I'm not sure how consistently we can
>> recreate this, but if we can, I'll try using that patch to use requests and
>> see if that helps.
>>
>>
>>
>> "Armando M." <armamig at gmail.com> wrote on 05/29/2014 11:52:34 AM:
>>
>>> From: "Armando M." <armamig at gmail.com>
>>
>>
>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev at lists.openstack.org>,
>>> Date: 05/29/2014 11:58 AM
>>
>>> Subject: Re: [openstack-dev] [neutron] Supporting retries in neutronclient
>>>
>>> Hi Paul,
>>>
>>> Just out of curiosity, I am assuming you are using the client that
>>> still relies on httplib2. Patch [1] replaced httplib2 with requests,
>>> but I believe that a new client that incorporates this change has not
>>> yet been published. I wonder if the failures you are referring to
>>> manifest themselves with the former http library rather than the
>>> latter. Could you clarify?
>>>
>>> Thanks,
>>> Armando
>>>
>>> [1] - https://review.openstack.org/#/c/89879/
>>>
>>> On 29 May 2014 17:25, Paul Ward <wpward at us.ibm.com> wrote:
>>> > Well, for my specific error, it was an intermittent ssl handshake error
>>> > before the request was ever sent to the neutron-server.  In our case, we
>>> > saw that 4 out of 5 resize operations worked; the fifth failed with this
>>> > ssl handshake error in neutronclient.
>>> >
>>> > I certainly think a GET is safe to retry, and I agree with your
>>> > statement that PUTs and DELETEs probably are as well.  This still leaves
>>> > a change needing to be made in nova to actually a) specify a conf option
>>> > and b) pass it to neutronclient where appropriate.
>>> >
>>> >
>>> > Aaron Rosen <aaronorosen at gmail.com> wrote on 05/28/2014 07:38:56 PM:
>>> >
>>> >> From: Aaron Rosen <aaronorosen at gmail.com>
>>> >
>>> >
>>> >> To: "OpenStack Development Mailing List (not for usage questions)"
>>> >> <openstack-dev at lists.openstack.org>,
>>> >> Date: 05/28/2014 07:44 PM
>>> >
>>> >> Subject: Re: [openstack-dev] [neutron] Supporting retries in
>>> >> neutronclient
>>> >>
>>> >> Hi,
>>> >>
>>> >> I'm curious whether other openstack clients implement this type of
>>> >> retry. I think retrying on GETs, DELETEs, and PUTs should probably be
>>> >> okay.
>>> >>
>>> >> What types of errors do you see in the neutron-server when it fails
>>> >> to respond? I think it would be better to move the retry logic into
>>> >> the server around the failures rather than the client (or better yet
>>> >> if we fixed the server :)). Most of the times I've seen this type of
>>> >> failure, it was due to deadlock errors (between sqlalchemy and
>>> >> eventlet, *I think*), which cause the client to eventually time out.
>>> >>
>>> >> Best,
>>> >>
>>> >> Aaron
>>> >>
>>> >
>>> >> On Wed, May 28, 2014 at 11:51 AM, Paul Ward <wpward at us.ibm.com> wrote:
>>> >> Would it be feasible to make the retry logic only apply to read-only
>>> >> operations?  This would still require a nova change to specify the
>>> >> number of retries, but it'd also prevent invokers from shooting
>>> >> themselves in the foot if they call for a write operation.
>>> >>
>>> >>
>>> >>
>>> >> Aaron Rosen <aaronorosen at gmail.com> wrote on 05/27/2014 09:40:00 PM:
>>> >>
>>> >> > From: Aaron Rosen <aaronorosen at gmail.com>
>>> >>
>>> >> > To: "OpenStack Development Mailing List (not for usage questions)"
>>> >> > <openstack-dev at lists.openstack.org>,
>>> >> > Date: 05/27/2014 09:44 PM
>>> >>
>>> >> > Subject: Re: [openstack-dev] [neutron] Supporting retries in
>>> >> > neutronclient
>>> >> >
>>> >> > Hi,
>>> >>
>>> >> >
>>> >> > Is it possible to detect when the ssl handshaking error occurs on
>>> >> > the client side (and only retry for that)? If so I think we should
>>> >> > do that rather than retrying multiple times. The danger here is
>>> >> > mostly for POST operations (as Eugene pointed out) where it's
>>> >> > possible for the response to not make it back to the client and for
>>> >> > the operation to actually succeed.
>>> >> >
>>> >> > Having this retry logic nested in the client also prevents things
>>> >> > like nova from handling these types of failures individually since
>>> >> > this retry logic is happening inside of the client. I think it would
>>> >> > be better not to have this internal mechanism in the client and
>>> >> > instead make the user of the client implement retry so they are
>>> >> > aware of failures.
>>> >> >
>>> >> > Aaron
>>> >> >
>>> >>
>>> >> > On Tue, May 27, 2014 at 10:48 AM, Paul Ward <wpward at us.ibm.com>
>>> >> > wrote:
>>> >> > Currently, neutronclient is hardcoded to only try a request once in
>>> >> > retry_request by virtue of the fact that it uses self.retries as the
>>> >> > retry count, and that's initialized to 0 and never changed.  We've
>>> >> > seen an issue where we get an ssl handshaking error intermittently
>>> >> > (seems like more of an ssl bug) and a retry would probably have
>>> >> > worked.  Yet, since neutronclient only tries once and gives up, it
>>> >> > fails the entire operation.  Here is the code in question:
>>> >> >
>>> >> > https://github.com/openstack/python-neutronclient/blob/master/neutronclient/v2_0/client.py#L1296
>>> >> >
>>> >> > Does anybody know if there's some explicit reason we don't currently
>>> >> > allow configuring the number of retries?  If not, I'm inclined to
>>> >> > propose a change for just that.
>>> >> >
>>> >> > _______________________________________________
>>> >> > OpenStack-dev mailing list
>>> >> > OpenStack-dev at lists.openstack.org
>>> >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



