<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 3 August 2015 at 20:53, Clint Byrum <span dir="ltr">&lt;<a href="mailto:clint@fewbar.com" target="_blank">clint@fewbar.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Excerpts from Devananda van der Veen's message of 2015-08-03 08:53:21 -0700:<br>Also on a side note, I think Cinder's need for this is really subtle,<br>
and one could just accept that sometimes it's going to break when it does<br>
two things to one resource from two hosts. The error rate there might<br>
even be lower than the false-error rate that would be caused by a twitchy<br>
DLM with timeouts a little low. So there's a core cinder discussion that<br>
keeps losing to the shiny DLM discussion, and I'd like to see it played<br>
out fully: Could Cinder just not do anything, and let the few drivers<br>
that react _really_ badly, implement their own concurrency controls?<br></blockquote></div><br><br></div><div class="gmail_extra">The problem here is data corruption, and lots of our races can cause it. Not 'my instance didn't come up', not 'my network is screwed and I need to tear everything down and do it again', but 'my 1 TB customer database is now missing its second half'. This means that we *really* need to understand, and have confidence in, whatever we do. The idea of locks timing out and being stolen without fencing is frankly scary, and it is begging for data corruption unless we're very careful. I'd rather use a persistent lock (e.g. a DB record change) with manual recovery than a lock timeout that might cause corruption. <br></div><div class="gmail_extra"><br><br><div class="gmail_signature"><div dir="ltr"><div>-- <br>Duncan Thomas</div></div></div>
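<div><br>To make the persistent-lock idea above concrete, here is a minimal sketch of a lock held as a DB record change, with no timeout and an explicit manual recovery step. It is not Cinder code: the <code>volumes</code> table, the <code>locked_by</code> column and every function name are hypothetical, and SQLite only stands in for whatever database a real deployment uses.<br></div>

```python
import sqlite3


def make_db():
    """Create an in-memory database with one unlocked volume (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, locked_by TEXT)")
    conn.execute("INSERT INTO volumes (id, locked_by) VALUES ('vol-1', NULL)")
    conn.commit()
    return conn


def acquire(conn, volume_id, owner):
    """Take the lock only if nobody holds it. The lock never expires."""
    cur = conn.execute(
        "UPDATE volumes SET locked_by = ? WHERE id = ? AND locked_by IS NULL",
        (owner, volume_id),
    )
    conn.commit()
    # Exactly one modified row means this caller won the conditional update.
    return cur.rowcount == 1


def release(conn, volume_id, owner):
    """Release the lock, but only if this owner is the current holder."""
    cur = conn.execute(
        "UPDATE volumes SET locked_by = NULL WHERE id = ? AND locked_by = ?",
        (volume_id, owner),
    )
    conn.commit()
    return cur.rowcount == 1


def force_release(conn, volume_id):
    """Manual recovery: an operator clears the lock after confirming
    that the previous holder is really dead (assumed out-of-band check)."""
    cur = conn.execute(
        "UPDATE volumes SET locked_by = NULL WHERE id = ?",
        (volume_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

<div>The design choice is that a second host can never steal the lock just because time has passed. If the holder dies, the volume stays locked until a human (or a fencing mechanism) decides it is safe to call something like <code>force_release</code>.<br></div>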
</div></div>