<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

span.EmailStyle17

        {mso-style-type:personal-reply;

        font-family:"Courier New";

        color:#1F497D;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-family:"Calibri","sans-serif";}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Li Ma, Mike [Wilson | Bayer], and Roman Podoliaka,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>A similar topic came up in Atlanta at a database panel I participated in. Jay Pipes had organized it as part of the ops track, and Peter Boros (of Percona) and I were on the panel. The issue was what to do about the database under OpenStack in the face of high load from components such as ceilometer.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Splitting reads and writes is fraught with challenges, because it requires the application to know where it wrote, where it should read from, what the replication latency is, and so on. At the heart of the issue is that you want to scale the database.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>I suggested at that panel that those who want to solve this problem should try the Database Virtualization Engine[1] product from Tesora. In the interest of full disclosure, I work for Tesora. 
<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>The solution is a simple way to horizontally scale a MySQL (or Percona or MariaDB) database across a collection of database servers. It exposes a MySQL-compatible interface and takes care of all the minutiae of where to store data, partitioning it across the various database servers, and executing queries on behalf of an application irrespective of the location of the data. It natively provides such capabilities as distributed joins, aggregation and sorting. Architecturally, it is a traditional parallel database built from a collection of unmodified MySQL (or variant) databases. <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>It is open source and available for free download.[2] <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Percona recently tested[3] the DVE product and confirmed that the solution provided horizontal scalability and linear (and in some cases better than linear) performance improvements[4] with scale. You can get a copy of their full test report here.[5] <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Ingesting data at very high volume is often an area of considerable pain for large systems, and in one demonstration of our product we were required to ingest 1 million CDR-style records per second. 
We demonstrated that with just 15 Amazon RDS servers (m1.xlarge, standard EBS volumes, no provisioned IOPS) and two c1.xlarge servers to run the Tesora DVE software, we could in fact ingest a sustained stream of over 1 million CDRs a second![6]<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>To Mike Wilson’s and Roman’s point, the solution I’m proposing would be entirely transparent to the developer, and it would be both elastic and scalable with the workload placed on it. In addition, standard SQL queries will continue to work unmodified, irrespective of which database server physically holds a row of data.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>To Mike Bayer’s point about data distribution and transaction management: yes, we handle all the details of maintaining data consistency and providing atomic transactions during Insert/Update/Delete operations.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>As a company, we at Tesora are committed to OpenStack and are significant participants in Trove (the database-as-a-service project for OpenStack). You can verify this yourself on Stackalytics [7] or [8]. If you would like to consider DVE as a part of your solution for oslo.db, we’d be thrilled to work with the OpenStack community to make this work, both from a technical and a business/licensing perspective. 
You can catch most of our dev team on either #openstack-trove or #tesora.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Some of us from Tesora, Percona and Mirantis are planning an ops panel similar to the one at Atlanta, for the Summit in Paris. I would definitely like to meet with more of you in Paris and discuss how we address issues of scale in the database that powers an OpenStack implementation.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Thanks,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>-amrith<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>--<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Amrith Kumar, CTO Tesora (www.tesora.com)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Twitter: @amrithkumar  <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>IRC: amrith @freenode <o:p></o:p></span></p><p class=MsoNormal><span 
style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[1] <a href="http://www.tesora.com/solutions/database-virtualization-engine">http://www.tesora.com/solutions/database-virtualization-engine</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[2] <a href="http://www.tesora.com/solutions/downloads/products">http://www.tesora.com/solutions/downloads/products</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[3] <a href="http://www.mysqlperformanceblog.com/2014/06/24/benchmarking-tesoras-database-virtualisation-engine-sysbench/">http://www.mysqlperformanceblog.com/2014/06/24/benchmarking-tesoras-database-virtualisation-engine-sysbench/</a> <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[4] <a href="http://www.tesora.com/blog/perconas-evaluation-our-database-virtualization-engine">http://www.tesora.com/blog/perconas-evaluation-our-database-virtualization-engine</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[5] <a href="http://resources.tesora.com/site/download/percona-benchmark-whitepaper">http://resources.tesora.com/site/download/percona-benchmark-whitepaper</a> <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[6] <a href="http://www.tesora.com/blog/ingesting-over-1000000-rows-second-mysql-aws-cloud">http://www.tesora.com/blog/ingesting-over-1000000-rows-second-mysql-aws-cloud</a> <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[7] <a 
href="http://stackalytics.com/?module=trove-group&metric=commits">http://stackalytics.com/?module=trove-group&metric=commits</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>[8] <a href="http://stackalytics.com/?module=trove-group&metric=marks">http://stackalytics.com/?module=trove-group&metric=marks</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'><o:p> </o:p></span></p><div style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt'><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> Mike Wilson [mailto:geekinutah@gmail.com] <br><b>Sent:</b> Friday, August 08, 2014 7:35 PM<br><b>To:</b> OpenStack Development Mailing List (not for usage questions)<br><b>Subject:</b> Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>Li Ma,<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>This is interesting. In general, I am in favor of expanding the scope of any read/write separation capabilities that we have. I'm not clear on what exactly you are proposing; hopefully you can answer some of my questions inline. 
The thing I had thought of immediately was detection of whether an operation is a read or a write and integrating that into oslo.db or sqlalchemy. Mike Bayer has some thoughts on that[1] and there are other approaches around that can be copied/learned from. These sorts of things are clear to me and, while they move towards more transparency for the developer, they still require context. Please share with us more details on your proposal.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>-Mike<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>[1] <a href="http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/">http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/</a><o:p></o:p></p></div><div><p class=MsoNormal>[2] <a href="http://www.percona.com/doc/percona-xtradb-cluster/5.5/wsrep-system-index.html">http://www.percona.com/doc/percona-xtradb-cluster/5.5/wsrep-system-index.html</a><o:p></o:p></p></div><div><p class=MsoNormal style='margin-bottom:12.0pt'><o:p> </o:p></p><div><p class=MsoNormal>On Thu, Aug 7, 2014 at 10:03 PM, Li Ma <<a href="mailto:skywalker.nick@gmail.com" target="_blank">skywalker.nick@gmail.com</a>> wrote:<o:p></o:p></p><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><p class=MsoNormal>Getting a massive amount of information from data storage to be displayed is<br>where most of the activity happens in OpenStack. The two activities of reading<br>data and writing (creating, updating and deleting) data are fundamentally<br>different.<br><br>The optimization for these two opposite database activities can be done by<br>physically separating the databases that service these two different<br>activities. 
All the writes go to database servers, which then replicate the<br>written data to the database server(s) dedicated to servicing the reads.<o:p></o:p></p></blockquote><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><p class=MsoNormal><br>Currently, AFAIK, many OpenStack deployments in production try to take<br>advantage of a MySQL (including Percona or MariaDB) multi-master Galera cluster.<br>It is possible to design and implement a read/write separation schema<br>for such a DB cluster.<o:p></o:p></p></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I just want to clarify here: are you suggesting that _all_ reads and _all_ writes would hit different databases? It would be interesting to see a relational schema design that would allow that to work. That seems like something that you wouldn't try in a relational database at all.<o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><p class=MsoNormal><br>Actually, OpenStack has a method for read scalability via defining<br>master_connection and slave_connection in configuration, but this method<br>lacks flexibility because the choice of master or slave is made in the logical<br>context (code). It's not transparent to the application developer.<br>As a result, it is not widely used across the OpenStack projects.<br><br>So, I'd like to propose a transparent read/write separation method<br>for oslo.db that every project can happily take advantage of<br>without any code modification.<o:p></o:p></p></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The problem with making it transparent to the developer is that, well, you can't unless your application is tolerant of old data in an asynchronous replication world. 
If you are in a fully synchronous world you could fully separate writes and reads, but what would be the point, since your database performance is now trash anyway? Please note that although Galera is considered a synchronous model, it's not actually all the way there. You can break the certification of course, but there are also things that are done to keep the performance to an acceptable level. Take for example the wsrep_causal_reads configuration parameter[2]. Without this sucker being turned on you can't make read/write separation transparent to the developer. Turning it on causes a significant performance degradation, unfortunately. <o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I feel like this is a problem fundamental to a consistent relational dataset. If you are okay with eventual consistency, fine: you can make things transparent to the developer. But by their very nature relational datasets are, well, relational; they need all the other pieces, and those pieces need to be consistent. I guess what I am saying is that your proposal needs more details. 
Please respond with specifics and examples to move the discussion forward.<o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><p class=MsoNormal><br>Moreover, I'd like to put it in the mailing list in advance to<br>make sure it is acceptable for oslo.db.<br><br>I'd appreciate any comments.<br><br>br.<br>Li Ma<br><br><br>_______________________________________________<br>OpenStack-dev mailing list<br><a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><o:p></o:p></p></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></div></div></body></html>