<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Please see inline.<br>
<br>
cheers,<br>
<br>
Rossella<br>
<br>
<div class="moz-cite-prefix">On 05/20/2014 12:26 AM, Salvatore
Orlando wrote:<br>
</div>
<blockquote
cite="mid:CAGR=i3jkoR=Sgks=LOiJ9GnwqSht8CF-QwzVfpzrmpa1eOjycg@mail.gmail.com"
type="cite">
<div dir="ltr">Some comments inline.
<div><br>
</div>
<div>Salvatore<br>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 19 May 2014 20:32, sridhar basam
<span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:sridhar.basam@gmail.com" target="_blank">sridhar.basam@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div style="font-size:large"><br>
</div>
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote">
<div class="">On Mon, May 19, 2014 at 1:30 PM, Jay
Pipes <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:jaypipes@gmail.com"
target="_blank">jaypipes@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Stackers,<br>
<br>
On Friday in Atlanta, I had the pleasure of
moderating the database session at the Ops
Meetup track. We had lots of good discussions
and heard important feedback from operators on
DB topics.<br>
<br>
For the record, I would not bring this point
up so publicly unless I believed it was a
serious problem affecting a large segment of
users. When doing an informal survey of the
users/operators in the room at the start of
the session, out of approximately 200 people
in the room, only a single person was using
PostgreSQL, about a dozen were using standard
MySQL master/slave replication, and the rest
were using MySQL Galera clustering. So, this
is a real issue for a large segment of the
operators -- or at least the ones at the
session. :)<br>
<br>
</blockquote>
<div><br>
</div>
</div>
<div>
<div style="font-size:large">We are one of
those operators that use Galera for
replicating our MySQL databases. We used to
see deadlocks when we had multiple MySQL
writers in our cluster. As a workaround, we
run HAProxy in an active-standby
configuration for our MySQL VIP. </div>
<div style="font-size:large"><br>
</div>
<div style="font-size:large">I seem to recall that
many of the deadlocks came through
Neutron. When we go through our Icehouse
testing, we will redo our multi-master MySQL
setup and provide feedback on the issues we
see.</div>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>The SELECT ... FOR UPDATE issue is going to be a
non-trivial one for Neutron as well. Some components, like
IPAM, rely heavily on it.</div>
<div>However, Neutron is a lot more susceptible to
deadlock problems than Nova because it does not
currently implement a retry mechanism.</div>
<div>This is something which should be added during the
Juno release cycle regardless of all the other
enhancements currently being planned, such as
task-oriented operations. <br>
</div>
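<div>To illustrate the kind of retry mechanism meant here, a minimal
sketch follows. It is only an assumption of how such a wrapper could
look, not code from Nova or Neutron: DeadlockError, retry_on_deadlock
and allocate_ip are hypothetical names, and the deadlock is simulated
rather than raised by a real database driver.</div>
<pre>
```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the deadlock error raised by a DB driver."""


def retry_on_deadlock(max_attempts=3, delay=0.0):
    """Re-run the wrapped function when it raises DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Give up once the attempt budget is exhausted.
                    if attempt == max_attempts:
                        raise
                    # Back off a little longer after each failure.
                    time.sleep(delay * attempt)
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_deadlock(max_attempts=3)
def allocate_ip():
    # Simulated transaction: fails twice with a deadlock, then succeeds.
    calls["count"] += 1
    if calls["count"] != 3:
        raise DeadlockError("simulated deadlock")
    return "10.0.0.5"
```
</pre>
<div>The wrapped function has to contain the whole transaction, so
that every attempt starts from a fresh read of the data.</div>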
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div style="font-size:large"><br>
</div>
<div style="font-size:large">thanks,</div>
<div style="font-size:large"> Sridhar</div>
<br>
</div>
<div class="">
<div> </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Peter
Boros, from Percona, was able to provide some
insight on MySQL Galera topics, and one issue
came up that is likely the cause of a lot of
heartache for operators who use MySQL Galera
(or Percona XtraDB Cluster).<br>
<br>
We were discussing whether people had seen
deadlock issues [1] when using MySQL Galera in
their deployment, and were brainstorming on
why deadlocks might be seen. I had suggested
that perhaps Nova's use of autoincrementing
primary keys may have been the cause. Peter
pretty quickly dispatched that notion, saying
that Galera automatically handles
autoincrementing keys using managed
auto_increment_increment and
auto_increment_offset config options.<br>
<br>
I think at that point I mentioned that there
were a number of places that were using the
SELECT ... FOR UPDATE construct in Nova (in
SQLAlchemy, it's the with_lockmode('update')
modification of the query object). Peter
promptly said that was a problem. MySQL Galera
does not support SELECT ... FOR UPDATE, since
it has no concept of cross-node locking of
records and results are non-deterministic.<br>
<br>
So... what to do?<br>
<br>
For starters, some information on the use of
with_lockmode() in Nova and Neutron...<br>
<br>
Within Nova, there are actually only a few
places where with_lockmode('update') is used.
Unfortunately, the use of
with_lockmode('update') is in the quota code,
which tends to wrap largish blocks of code
within the Nova compute execution code.<br>
<br>
Within Neutron, however, the use of
with_lockmode('update') is all over the place.
There are 44 separate uses of it in 11
different files.<br>
<br>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I will report on this in a separate thread, so that
we can assess where locking statements are used and
why.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div class="">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We
have a number of options:<br>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I think option 0 should be to rework/redesign the
code, where possible, to avoid DB-level locking altogether.</div>
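<div>One common way to avoid DB-level locking is a compare-and-swap
update: read the row without a lock, then write it back only if it
still holds the value that was read, and retry otherwise. The sketch
below is an illustration under stated assumptions, not the Nova or
Neutron code: the quota_usages table, its columns and the reserve
function are hypothetical, and sqlite3 is used only to keep the
example self-contained.</div>
<pre>
```python
import sqlite3


def reserve(conn, project_id, amount, max_attempts=5):
    """Add `amount` to a usage counter without SELECT ... FOR UPDATE."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT in_use FROM quota_usages WHERE project_id = ?",
            (project_id,),
        ).fetchone()
        current = row[0]
        # Compare-and-swap: the UPDATE matches only if nobody changed
        # the row since we read it.
        cur = conn.execute(
            "UPDATE quota_usages SET in_use = ? "
            "WHERE project_id = ? AND in_use = ?",
            (current + amount, project_id, current),
        )
        conn.commit()
        if cur.rowcount == 1:
            return current + amount
        # Another writer won the race; re-read and try again.
    raise RuntimeError("too many concurrent updates")


# Hypothetical schema with a single usage row, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quota_usages (project_id TEXT PRIMARY KEY, in_use INTEGER)"
)
conn.execute("INSERT INTO quota_usages VALUES ('demo', 2)")
conn.commit()
```
</pre>
<div>Calling reserve(conn, "demo", 3) on the row above moves in_use
from 2 to 5. A concurrent writer makes the UPDATE match zero rows, so
the loop re-reads instead of holding a lock.</div>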
</div>
</div>
</div>
</div>
</blockquote>
<br>
I totally agree. Is anybody already coordinating this rework? I'd
like to help. After redesigning, it is gonna be easier to make a
decision regarding a distributed lock manager.<br>
<br>
<blockquote
cite="mid:CAGR=i3jkoR=Sgks=LOiJ9GnwqSht8CF-QwzVfpzrmpa1eOjycg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div class="">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
1) Stop using MySQL Galera for databases of
projects that contain with_lockmode('update')<br>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>This looks hideous, but I am afraid this is what
everyone wishing to deploy Icehouse should consider doing.</div>
<div> <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div class="">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
2) Put a big old warning in the docs somewhere
about the problem of potential deadlocks or
odd behaviour with Galera in these projects<br>
<br>
3) For Nova and Neutron, remove the use of
with_lockmode('update') and instead use a
coarse-grained file lock or a distributed lock
manager for those areas where we need
deterministic reads or quiescence.<br>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>We had an attempt at implementing a sort of
distributed lock for neutron: <a moz-do-not-send="true"
href="https://review.openstack.org/#/c/34695/">https://review.openstack.org/#/c/34695/</a></div>
<div>Beyond the implementation reservations on this patch,
one thing worth noting, though it probably goes without
saying, is that distributed coordination should never be
taken lightly.</div>
<div>Once all the non-locking solutions have been ruled
out, distributed coordination among processes could be
considered. In that case I think it might be better to
use some off-the-shelf (OTS) software rather than working
out a home-grown solution (I surely do not see space for a
new project here).</div>
<div>On a side note, I'm rather ignorant about Python
frameworks for distributed coordination... concoord? Is
ZooKeeper something that should be ruled out because of
language restrictions?</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div class="">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
4) For the Nova db quota driver, refactor the
driver to either use a non-locking method for
reservation and quota queries or move the
driver out into its own projects (or use
something like Climate and make sure that
Climate uses a non-blocking algorithm for
those queries...)<br>
<br>
Thoughts?<br>
<br>
-jay<br>
<br>
[1] <a moz-do-not-send="true"
href="http://lists.openstack.org/pipermail/openstack/2014-May/007202.html"
target="_blank">http://lists.openstack.org/pipermail/openstack/2014-May/007202.html</a><br>
<br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a moz-do-not-send="true"
href="mailto:OpenStack-dev@lists.openstack.org"
target="_blank">OpenStack-dev@lists.openstack.org</a><br>
<a moz-do-not-send="true"
href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"
target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</blockquote>
</div>
</div>
<br>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>