[openstack-dev] Does DB schema hygiene warrant long migrations?

Monty Taylor mordred at inaugust.com
Fri Oct 25 03:52:21 UTC 2013



On 10/24/2013 05:19 PM, Johannes Erdfelt wrote:
> On Fri, Oct 25, 2013, Michael Still <mikal at stillhq.com> wrote:
>> Because I am a grumpy old man I have just -2'ed
>> https://review.openstack.org/#/c/39685/ and I wanted to explain my
>> rationale. Mostly I am hoping for a consensus to form -- if I am wrong
>> then I'll happy remove my vote from this patch.
>>
>> This patch does the reasonably sensible thing of converting two
>> columns from being text to varchar, which reduces their expense to the
>> database. Given the data stored is already of limited length, it
>> doesn't impact our functionality at all either.
>>
>> However, when I run it with medium sized (30 million instances)
>> databases, the change does cause a 10 minute downtime. I don't
>> personally think the change is worth such a large outage, but perhaps
>> everyone else disagrees.
> 
> I'm not sure how you could have 30 million instances. That's a lot of
> hardware! :)
> 
> However, in our Rackspace sized deploys (less than 30 million
> instances), we've seen many migrations take longer than 10 minutes.
> 
> DB migrations are one of the biggest problems we've been facing lately.
> Especially since a lot of migrations have been done over the past number
> of months ended up causing a lot of pain considering the value they
> bring.
> 
> For instance, migration 185 was particularly painful. It only "renamed"
> the indexes, but it required rebuilding them. This took a long time for
> such a simple task.
> 
> So I'm very interested in figuring out some sort of solution that makes
> database migrations much less painful.
> 
> That said, I'm hesitant to say that cleanups like these shouldn't be
> done. At a certain point we'll build a significant amount of technical
> debt around the database that we're afraid to touch.
> 
>> PS: I could see a more complicated approach where we did these changes
>> "in flight" by adding columns, using a periodic task to copy data to
>> the new columns, and then dropping the old. That's a lot more
>> complicated to implement though.
> 
> You mean an expand/contract style of migrations?
> 
> It's been discussed at previous summits, but it's a lot of work.
> 
> It's also at the mercy of the underlying database engine. For instance,
> MySQL (depending the version and the underlying database engine) will
> recreate the table when adding columns. This will grab a lock and take
> a long time.

http://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html
http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html#innodb-online-ddl-summary-grid

Add column is an online operation in modern MySQL. If you are running a
real production system, you should ALWAYS use current MySQL.

If you are out there, and you have a schema large enough for this to be
an issue, you need to be running modern MySQL.

That said - I TOTALLY support all of the statements above about doing
the schema upgrades in a sane manner. It's the right thing to do.



More information about the OpenStack-dev mailing list