<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Please see inline.<br>
    <br>
    cheers,<br>
    <br>
    Rossella<br>
    <br>
    <div class="moz-cite-prefix">On 05/20/2014 12:26 AM, Salvatore
      Orlando wrote:<br>
    </div>
    <blockquote
cite="mid:CAGR=i3jkoR=Sgks=LOiJ9GnwqSht8CF-QwzVfpzrmpa1eOjycg@mail.gmail.com"
      type="cite">
      <div dir="ltr">Some comments inline.
        <div><br>
        </div>
        <div>Salvatore<br>
          <div class="gmail_extra"><br>
            <br>
            <div class="gmail_quote">On 19 May 2014 20:32, sridhar basam
              <span dir="ltr"><<a moz-do-not-send="true"
                  href="mailto:sridhar.basam@gmail.com" target="_blank">sridhar.basam@gmail.com</a>></span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div style="font-size:large"><br>
                  </div>
                  <div class="gmail_extra">
                    <br>
                    <br>
                    <div class="gmail_quote">
                      <div class="">On Mon, May 19, 2014 at 1:30 PM, Jay
                        Pipes <span dir="ltr"><<a
                            moz-do-not-send="true"
                            href="mailto:jaypipes@gmail.com"
                            target="_blank">jaypipes@gmail.com</a>></span>
                        wrote:<br>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Stackers,<br>
                          <br>
                          On Friday in Atlanta, I had the pleasure of
                          moderating the database session at the Ops
                          Meetup track. We had lots of good discussions
                          and heard important feedback from operators on
                          DB topics.<br>
                          <br>
                          For the record, I would not bring this point
                          up so publicly unless I believed it was a
                          serious problem affecting a large segment of
                          users. When doing an informal survey of the
                          users/operators in the room at the start of
                          the session, out of approximately 200 people
                          in the room, only a single person was using
                          PostgreSQL, about a dozen were using standard
                          MySQL master/slave replication, and the rest
                          were using MySQL Galera clustering. So, this
                          is a real issue for a large segment of the
                          operators -- or at least the ones at the
                          session. :)<br>
                          <br>
                        </blockquote>
                        <div><br>
                        </div>
                      </div>
                      <div>
                        <div style="font-size:large">We are one of
                          those operators that use Galera for
                          replicating our MySQL databases. We used to
                          see deadlocks when we had
                          multiple MySQL writers in our cluster.
                          As a workaround we run haproxy
                          in an active-standby
                          configuration for our MySQL VIP.</div>
                        <div style="font-size:large"><br>
                        </div>
                        <div style="font-size:large">I seem to recall that
                          a lot of the deadlocks happened through
                          Neutron. When we go through our Icehouse
                          testing, we will redo our multi-master MySQL
                          setup and provide feedback on the issues we
                          see.</div>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <div><br>
              </div>
              <div>The SELECT ... FOR UPDATE issue is going to be a
                non-trivial one for Neutron as well. Some components, like
                IPAM, rely heavily on it.</div>
              <div>However, Neutron is a lot more susceptible to
                deadlock problems than Nova because it does not
                currently implement a retry mechanism.</div>
              <div>This is something which should be added during the
                Juno release cycle regardless of all the other
                enhancements currently being planned, such as
                task-oriented operations. <br>
              </div>
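              <div>A minimal sketch of the kind of retry mechanism meant
                here, not actual Neutron or Nova code: DeadlockError,
                retry_on_deadlock and allocate_port are hypothetical
                names, and the deadlock is simulated rather than raised by
                a real database.</div>
              <pre wrap="">
```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the deadlock error raised by the DB layer."""


def retry_on_deadlock(max_attempts=3, delay=0.0):
    """Re-run the wrapped function when it raises DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Out of attempts: let the caller see the failure.
                    if attempt == max_attempts:
                        raise
                    # Linear back-off before the next attempt.
                    time.sleep(delay * attempt)
        return wrapper
    return decorator


calls = []


@retry_on_deadlock(max_attempts=3)
def allocate_port():
    # Simulated transaction: deadlocks twice, then commits.
    calls.append(1)
    if len(calls) != 3:
        raise DeadlockError("simulated deadlock")
    return "port-created"


result = allocate_port()
```
</pre>
              <div>The point of the sketch is that the whole transaction
                is re-run, not only the statement that failed; the number
                of attempts and the back-off are open choices.</div>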
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <div>
                        <div style="font-size:large"><br>
                        </div>
                        <div style="font-size:large">thanks,</div>
                        <div style="font-size:large">         Sridhar</div>
                        <br>
                      </div>
                      <div class="">
                        <div> </div>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Peter
                          Boros, from Percona, was able to provide some
                          insight on MySQL Galera topics, and one issue
                          came up that is likely the cause of a lot of
                          heartache for operators who use MySQL Galera
                          (or Percona XtraDB Cluster).<br>
                          <br>
                          We were discussing whether people had seen
                          deadlock issues [1] when using MySQL Galera in
                          their deployment, and were brainstorming on
                          why deadlocks might be seen. I had suggested
                          that perhaps Nova's use of autoincrementing
                          primary keys may have been the cause. Peter
                          pretty quickly dispatched that notion, saying
                          that Galera automatically handles
                          autoincrementing keys using managed
                          auto_increment_increment and
                          auto_increment_offset config options.<br>
                          <br>
                          I think at that point I mentioned that there
                          were a number of places that were using the
                          SELECT ... FOR UPDATE construct in Nova (in
                          SQLAlchemy, it's the with_lockmode('update')
                          modification of the query object). Peter
                          promptly said that was a problem. MySQL Galera
                          does not support SELECT ... FOR UPDATE, since
                          it has no concept of cross-node locking of
                          records and results are non-deterministic.<br>
                          <br>
                          So... what to do?<br>
                          <br>
                          For starters, some information on the use of
                          with_lockmode() in Nova and Neutron...<br>
                          <br>
                          Within Nova, there are actually only a few
                          places where with_lockmode('update') is used.
                          Unfortunately, the use of
                          with_lockmode('update') is in the quota code,
                          which tends to wrap largish blocks of code
                          within the Nova compute execution code.<br>
                          <br>
                          Within Neutron, however, the use of
                          with_lockmode('update') is all over the place.
                          There are 44 separate uses of it in 11
                          different files.<br>
                          <br>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <div><br>
              </div>
              <div>I will report on this in a separate thread, so that
                we can have an assessment of where locking statements
                are used and why.</div>
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <div class="">
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">We
                          have a number of options:<br>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <div><br>
              </div>
              <div>I think option 0 should be to rework/redesign the
                code, where possible, to avoid DB-level locking altogether.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    I totally agree. Is anybody already coordinating this rework? I'd
    like to help. After redesigning, it is gonna be easier to make a
    decision regarding a distributed lock manager.<br>
    <br>
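    As a minimal sketch of what avoiding DB-level locking could look like
    for something like IP allocation: instead of SELECT ... FOR UPDATE,
    read a candidate row without locking it and then make the write
    conditional on the row still being free. The ip_pool table and the
    claim_address function are hypothetical, not Neutron's schema or API,
    and sqlite3 is used only to keep the example self-contained.<br>
    <pre wrap="">
```python
import sqlite3


def claim_address(conn, owner):
    """Claim one free address for owner; return it, or None if none is left."""
    while True:
        # Plain read, no row lock taken.
        row = conn.execute(
            "SELECT address FROM ip_pool WHERE owner IS NULL "
            "ORDER BY address LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Conditional write: succeeds only if the row is still unowned.
        cur = conn.execute(
            "UPDATE ip_pool SET owner = ? WHERE address = ? AND owner IS NULL",
            (owner, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Another writer won the race; loop and try the next free address.


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip_pool (address TEXT PRIMARY KEY, owner TEXT)")
conn.executemany(
    "INSERT INTO ip_pool (address) VALUES (?)",
    [("10.0.0.2",), ("10.0.0.3",)],
)
conn.commit()

first = claim_address(conn, "port-a")
second = claim_address(conn, "port-b")
third = claim_address(conn, "port-c")
```
</pre>
    Whether a given use of with_lockmode('update') can be reshaped this
    way would have to be checked case by case.<br>
    <br>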
    <blockquote
cite="mid:CAGR=i3jkoR=Sgks=LOiJ9GnwqSht8CF-QwzVfpzrmpa1eOjycg@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <div class="">
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
                          1) Stop using MySQL Galera for databases of
                          projects that contain with_lockmode('update')<br>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <div><br>
              </div>
              <div>This looks hideous, but I am afraid this is what all
                people wishing to deploy Icehouse should consider doing.</div>
              <div> <br>
              </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <div class="">
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
                          2) Put a big old warning in the docs somewhere
                          about the problem of potential deadlocks or
                          odd behaviour with Galera in these projects<br>
                          <br>
                          3) For Nova and Neutron, remove the use of
                          with_lockmode('update') and instead use a
                          coarse-grained file lock or a distributed lock
                          manager for those areas where we need
                          deterministic reads or quiescence.<br>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
              <div><br>
              </div>
              <div>We had an attempt at implementing a sort of
                distributed lock for neutron: <a moz-do-not-send="true"
                  href="https://review.openstack.org/#/c/34695/">https://review.openstack.org/#/c/34695/</a></div>
              <div>Beyond the implementation reservations on this patch,
                one thing that should be noted, probably needless to
                say, is that distributed coordination is something that
                should never be taken lightly.</div>
              <div>Once all the non-locking solutions have been ruled
                out, distributed coordination among processes could be
                considered. In that case I think it might be better to
                use some off-the-shelf software rather than working out
                some home-grown solution (I surely do not see space for
                a new project here).</div>
              <div>On a side note, I'm rather ignorant of Python
                frameworks for distributed coordination... concoord? Is
                ZooKeeper something that should be ruled out because of
                language restrictions?</div>
              <div><br>
              </div>
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div dir="ltr">
                  <div class="gmail_extra">
                    <div class="gmail_quote">
                      <div class="">
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
                          4) For the Nova db quota driver, refactor the
                          driver to either use a non-locking method for
                          reservation and quota queries or move the
                          driver out into its own project (or use
                          something like Climate and make sure that
                          Climate uses a non-blocking algorithm for
                          those queries...)<br>
                          <br>
                          Thoughts?<br>
                          <br>
                          -jay<br>
                          <br>
                          [1] <a moz-do-not-send="true"
href="http://lists.openstack.org/pipermail/openstack/2014-May/007202.html"
                            target="_blank">http://lists.openstack.org/pipermail/openstack/2014-May/007202.html</a><br>
                          <br>
                          _______________________________________________<br>
                          OpenStack-dev mailing list<br>
                          <a moz-do-not-send="true"
                            href="mailto:OpenStack-dev@lists.openstack.org"
                            target="_blank">OpenStack-dev@lists.openstack.org</a><br>
                          <a moz-do-not-send="true"
                            href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"
                            target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
                        </blockquote>
                      </div>
                    </div>
                    <br>
                  </div>
                </div>
                <br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>