<div dir="ltr"><div>Hello Zane, we applied the patch and modified our haproxy; unfortunately, it does not solve the DB deadlock issue.</div><div>Ignazio & Gianpiero<br></div></div><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr">On Wed, 2 Jan 2019 at 07:28, Zane Bitter <<a href="mailto:zbitter@redhat.com" target="_blank">zbitter@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 21/12/18 2:07 AM, Jay Pipes wrote:<br>

> On 12/20/2018 02:01 AM, Zane Bitter wrote:<br>

>> On 19/12/18 6:49 AM, Jay Pipes wrote:<br>

>>> On 12/18/2018 11:06 AM, Mike Bayer wrote:<br>

>>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano<br>

>>>> <<a href="mailto:ignaziocassano@gmail.com" target="_blank">ignaziocassano@gmail.com</a>> wrote:<br>

>>>>><br>

>>>>> Yes, I tried it yesterday and this workaround solved the issue.<br>

>>>>> Thanks<br>

>>>>> Ignazio<br>

>>>><br>

>>>> OK, so that means this "deadlock" is not really a deadlock but it is a<br>

>>>> write-conflict between two Galera masters. I have a long term<br>

>>>> goal to begin relaxing this common requirement that OpenStack apps<br>

>>>> only refer to one Galera master at a time. If this is a particular<br>

>>>> hotspot for Heat (no pun intended) can we pursue adding a transaction<br>

>>>> retry decorator for this operation?  This is the standard approach for<br>

>>>> other applications that are subject to galera multi-master writeset<br>

>>>> conflicts such as Neutron.<br>

>><br>

>> The weird thing about this issue is that we actually have a retry <br>

>> decorator on the operation that I assume is the problem. It was added <br>

>> in Queens and largely fixed this issue in the gate:<br>

>><br>

>> <a href="https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py" rel="noreferrer" target="_blank">https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py</a><br>

>><br>

>>> Correct.<br>

>>><br>

>>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big <br>

>>> cause of the aforementioned "deadlocks".<br>

>><br>

>> AFAIK, no. In fact we were quite careful to design stuff that is <br>

>> expected to be subject to write contention to use UPDATE ... WHERE (by <br>

>> doing query().filter_by().update() in sqlalchemy), but it turned out <br>

>> to be those very statements that were most prone to causing deadlocks <br>

>> in the gate (i.e. we added retry decorators in those two places and <br>

>> the failures went away), according to me in the commit message for <br>

>> that patch: <a href="https://review.openstack.org/521170" rel="noreferrer" target="_blank">https://review.openstack.org/521170</a><br>

>><br>

>> Are we Doing It Wrong(TM)?<br>

> <br>

> No, it looks to me like you're doing things correctly. The OP mentioned <br>

> that this only happens when deleting a Magnum cluster -- and that it <br>

> doesn't occur in normal Heat template usage.<br>

> <br>

> I wonder (as I really don't know anything about Magnum, unfortunately), <br>

> is there something different about the Magnum cluster resource handling <br>

> in Heat that might be causing the wonkiness?<br>

<br>

There's no special-casing for Magnum within Heat. It's likely to be just <br>

that there's a lot of resources in a Magnum cluster - or more <br>

specifically, a lot of edges in the resource graph, which leads to more <br>

write contention (and, in a multi-master setup, more write conflicts). <br>

I'd assume that any similarly-complex template would have the same <br>

issues, and that Ignazio just didn't have anything else that complex to <br>

hand.<br>

<br>

That gives me an idea, though. I wonder if this would help:<br>

<br>

<a href="https://review.openstack.org/627914" rel="noreferrer" target="_blank">https://review.openstack.org/627914</a><br>

<br>

Ignazio, could you possibly test with that ^ patch in multi-master mode <br>

to see if it resolves the issue?<br>

<br>

cheers,<br>

Zane.<br>

<br>

</blockquote></div></div>
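<div>For readers following the thread, the two techniques discussed above (a transaction retry decorator, and an atomic UPDATE ... WHERE for contended rows) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Heat's actual code: the <code>WriteConflict</code> exception, the <code>retry_on_conflict</code> decorator, the <code>resource</code> table and its columns, and the simulated failures are all hypothetical, and SQLite stands in for a Galera cluster so the example is self-contained.</div>

```python
import functools
import sqlite3
import time


class WriteConflict(Exception):
    """Hypothetical stand-in for the 'deadlock' error a Galera node
    reports when a writeset conflicts with one from another master."""


def retry_on_conflict(max_retries=5, base_delay=0.01):
    """Retry the wrapped function whenever it raises WriteConflict."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except WriteConflict:
                    if attempt == max_retries:
                        raise  # out of retries: surface the error
                    # Exponential backoff before the next attempt.
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator


def make_db():
    """Create an in-memory database with one unclaimed resource row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource ("
        "id INTEGER PRIMARY KEY, engine_id TEXT, atomic_key INTEGER)")
    conn.execute("INSERT INTO resource VALUES (1, NULL, 0)")
    conn.commit()
    return conn


calls = {"count": 0}  # counts attempts, only for the demonstration


@retry_on_conflict(max_retries=3, base_delay=0.001)
def claim_resource(conn, res_id, expected_key, engine_id, fail_first=0):
    """Claim a row with a single UPDATE ... WHERE (no SELECT FOR UPDATE).

    Returns True if this caller won the row, False if the key was stale.
    """
    calls["count"] += 1
    if calls["count"] <= fail_first:
        conn.rollback()
        raise WriteConflict("simulated multi-master write conflict")
    cur = conn.execute(
        "UPDATE resource SET engine_id = ?, atomic_key = atomic_key + 1 "
        "WHERE id = ? AND atomic_key = ?",
        (engine_id, res_id, expected_key))
    conn.commit()
    return cur.rowcount == 1
```

<div>Calling <code>claim_resource(make_db(), 1, 0, "engine-a", fail_first=2)</code> fails twice with the simulated conflict and then succeeds on the third attempt; a later call that still passes the old key 0 matches no row and returns False. In a real deployment the retried exception would be whatever deadlock error the database layer raises, which this sketch does not model.</div>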