[Openstack-operators] operator challenges with MySQL

Jay Pipes jaypipes at gmail.com
Wed Jul 10 16:37:12 UTC 2013

On 07/10/2013 11:49 AM, Yapeng Wu wrote:
> Hello, Jay,
> Could you please elaborate on this one:
>>> We've found that the message queue layer
>>> and limitations with software-defined networking components are a much
>>> larger scaling problem than the database layer, frankly.

Sure. I meant that the scaling issues we've hit have been in Quantum/NVP: 
tenant VMs can fairly easily overwhelm NVP with new flow creations and 
essentially cause NVP to become unresponsive. Kevin Bringard 
and other folks in our team have spent a number of weeks attempting to 
determine the best combination of plugins and configuration in Quantum 
to provide SDN for tenants with an acceptable level of redundancy and 
bandwidth that comes remotely close to the performance and stability we 
see with our nova-network multi-host/VLAN setups. Unfortunately, Quantum 
isn't my specialty and my colleague Kevin is out on vacation for a few 
weeks; hopefully he can add more information about the issues we've seen 
when he returns.

As for message queue scaling, we've had issues with Nova components 
getting "stuck" trying to reconnect to the zone's RabbitMQ load-balanced 
VIP, with the component needing a service restart in order to free 
itself up. This happens when there is a network partition or a restart 
of one of the Rabbit nodes that is balanced by the LB. We've seen the 
various daemons consume 100% of CPU and hundreds of MB of RAM when they 
get into this state. There's a bug about this (something to do with 
eventlet and RPC) but for the life of me I can't find it right now. Ray 
P from Dell had some patches in the Grizzly timeframe around making the 
RPC code more scalable; we're just rolling out Grizzly now and hope 
that will make a difference.
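
To illustrate the general idea of a reconnect loop that cannot spin the CPU, here is a minimal sketch of capped exponential backoff with jitter. It is not Nova's RPC code and not the fix for the bug mentioned above; `reconnect`, `backoff_delays`, and the `connect` callable are hypothetical names, and the delay values are assumptions chosen for illustration.

```python
import random
import time


def backoff_delays(base=1.0, cap=30.0, factor=2.0):
    """Yield an endless series of exponentially growing delays, capped at cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect(connect, max_attempts=8, sleep=time.sleep, jitter=random.random):
    """Call connect() until it succeeds or max_attempts is exhausted.

    connect is a hypothetical callable that returns a connection object or
    raises ConnectionError. Sleeping between attempts keeps the loop from
    busy-waiting while the broker or load balancer is unreachable.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                # Give up and let the caller decide what to do next.
                raise
            # Sleep a random fraction of the current delay so that many
            # services do not all hit the VIP again at the same instant.
            sleep(next(delays) * jitter())
```

The `sleep` and `jitter` parameters are injectable only so the behaviour can be checked without real waiting; a service would normally leave them at their defaults.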

We have not seen database scaling issues with one exception: 
Ceilometer's API (at least early versions of it) provided no paging or 
filtering functionality, which meant that tools that were hitting its 
API /resources and /meters endpoints would easily kill the endpoint as 
hundreds of thousands of records would be passed into JSON serialization 
and passed back to the caller... even a few of those happening 
concurrently was enough to cause havoc.
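
As a rough illustration of the kind of paging that was missing, the sketch below serializes one bounded page at a time using a limit and a marker. It is not Ceilometer's API; `paginate`, `render_page`, the `id` sort key, and the `next_marker` field are assumptions made for the example, and an in-memory list stands in for the database query.

```python
import json


def paginate(records, limit=100, marker=None, key="id"):
    """Return one page of records sorted by key, plus the next marker.

    records is a hypothetical in-memory list of dicts; a real API would push
    limit and marker down into the database query. limit is assumed to be a
    positive integer.
    """
    ordered = sorted(records, key=lambda r: r[key])
    if marker is not None:
        # Resume strictly after the last record the caller has already seen.
        ordered = [r for r in ordered if r[key] > marker]
    page = ordered[:limit]
    # Only hand out a marker when more records remain after this page.
    next_marker = page[-1][key] if len(ordered) > limit else None
    return page, next_marker


def render_page(records, limit=100, marker=None):
    """Serialize a single bounded page instead of the whole result set."""
    page, next_marker = paginate(records, limit=limit, marker=marker)
    return json.dumps({"resources": page, "next_marker": next_marker})
```

A caller would request the first page with no marker and keep passing back the returned `next_marker` until it is null, so no single response has to carry hundreds of thousands of records.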

Hope this helps,

> Thanks,
> Yapeng Wu
> -----Original Message-----
> From: Jay Pipes [mailto:jaypipes at gmail.com]
> Sent: Tuesday, July 09, 2013 4:46 PM
> To: openstack-operators at lists.openstack.org
> Subject: Re: [Openstack-operators] operator challenges with MySQL
> On 07/08/2013 06:47 AM, Sushil Suresh wrote:
>> Hi Matt,
>> Welcome to the list.
> Indeed, welcome to the OpenStack community, Matt. Sushil did a good job
> outlining the major pain points below. I added a few things inline, but
> on a general note, once an operator stops using MySQL for Keystone token
> storage and separates heavy-write-few-read-pattern database traffic
> (like Ceilometer) from heavy-read-few-write-pattern database traffic
> (like pretty much everything else in OpenStack), I think most operators
> find that the database itself is not the number one (or even number
> three) bottleneck in OpenStack. We've found that the message queue layer
> and limitations with software-defined networking components are a much
> larger scaling problem than the database layer, frankly.
> Anyway, some more comments inline...
>> DB Migrations.
>> ----------------------
>> OpenStack is a very fast-evolving project, which is great. However, that
>> means there are quite a lot of DB migrations which add and remove
>> columns to existing tables and perform similar operations with indexes
>> etc.
>> If you have a production environment that is heavily used like ours, you
>> are looking at having millions of rows in each of these tables.
>> As a developer writing code and testing, these migrations work perfectly
>> well. Typical tests are performed with test databases containing only test
>> data which never comes up to millions of records.
>> Furthermore, there is no production load on the test database actively
>> trying to write stuff when you are altering the tables.
>> Database abstraction with SQLAlchemy is great, but it generally means
>> your schema alterations end up having the standard ALTER TABLE syntax.
>> I have personally used Percona's pt-online-schema-change to get me
>> out of some of these sticky situations.
>> http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html
>> BTW, Thank you very much for the above tool.
>> There is a risk in using such approaches, as you want to make sure that
>> you are not deviating in even the slightest manner from what would
>> have been achieved using the standard ALTER TABLE. There is some work
>> being done to introduce archiving of data etc., which will help keep data
>> growth in check. Also, using an online schema change becomes tricky when
>> you have foreign key constraints on tables or if you have set up triggers
>> for your table (thankfully no triggers yet).
> Total agreement with Sushil here...
>> ---------------------------------------------
>> High Availability is also something that I think could do with some
>> attention.
>> http://docs.openstack.org/trunk/openstack-ha/content/ch-intro.html
>> The above page does detail the current recommended approach for setting
>> up highly available MySQL database servers.
>> The current approach is great if you are using dedicated hardware and
>> have separate physical switches etc.
>> It does work reasonably well in a virtualised environment too, but one
>> needs to take into account several other factors.
>> With MySQL 5.6 supporting global transaction IDs (GTID) and improvements in
>> Galera and PXC (Percona XtraDB Cluster),
>> I think there is definitely room to review the current recommended solutions.
> I think one of the biggest areas that Percona folks could really help
> with is detailed tutorials on how to appropriately split the read/write
> database traffic I spoke about earlier and how to make the most
> effective use of PXC/Galera. We use Galera internally for all of our
> identity and image database traffic, synchronously replicate between our
> deployment zones, and it's excellent. We use Galera for our other
> databases internal to a deployment zone as well (with the exception of
> Ceilometer, which is better utilized (IMO) with standard MySQL
> master/slave setups).
> I'd be happy to collaborate with you or someone from Percona in the
> coming months on such an article. Feel free to email me directly if you
> have interest.
> In addition to the above topics, it might be good to have a couple
> articles on backup and recovery best practices in relation to the
> database layer... specifically around what needs to be "taken down"
> during a recovery and what can stay online (for instance, a failed or
> corrupted DB doesn't necessarily need to mean loss of service or
> connectivity to tenant VMs...)
> All the best,
> -jay
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
