[openstack-dev] [nova] Adding new features to Kilo and future releases - DB upgrades

Johannes Erdfelt johannes at erdfelt.com
Mon Jan 26 21:33:51 UTC 2015


On Thu, Jan 22, 2015, Kekane, Abhishek <Abhishek.Kekane at nttdata.com> wrote:
> With online schema changes/No downtime DB upgrades things would be
> much lot easier for OpenStack deployments.
> Big kudos to Johannes who initiated this feature. But as a service
> provider, I'm curious to understand what is the development process
> of adding new features to Kilo and future releases once the online
> schema changes is in.
> 
> 
> 1. Will the committer be responsible of adding new procedures of
> upgrading db with minimal or zero downtime? or the online schema
> changes framework itself will detect whatever db changes are required
> on its own and the decision to apply db changes online or offline
> will be left solely with the service provider?

The online schema change code will compare the running schema and the
model and figure out what changes are needed to make the running schema
match the model (it actually leverages alembic for most of this). (This
automates much of the work currently done in sqlalchemy-migrate
scripts)

The scheduling of changes into the three phases is handled completely
internally to the online schema change patch. It understands which
changes are semantically safe (that can be safely applied when nova is
running) and locking safe (so it doesn't block access to the table for a
long time).

Unless you are working on the code that implements the online schema
changes, then a developer need not know how it operates.

Developers just need to make changes to the model and write
sqlalchemy-migrate scripts as we have always required.

Eventually, developers will no longer need to write sqlalchemy-migrate
scripts. This is likely to be 1 or 2 cycles away (certainly not in
Kilo).

There will be some minor restrictions on what kind of schema changes
will be allowed. As an example column renames won't be allowed because
they appear as a delete/add and we'll potentially lose data. However,
this can be done as an explicit add/delete along with a data migration,
if it's needed. Same thing with some column type changes.

I'll clarify these in the patch when it's put up for review.

> 2. Is it possible to predict how much time it would take to upgrade db
> (expand/migrate/contract phases) for adding a new column, constraint.
> For example, adding a new column with NULL constraint would take less
> time than adding a default value.

This is difficult to estimate. It varies on how fast the database server
is (CPU, disk I/O, etc), the current load of the database, the number of
rows in the table and the size of the data in the columns.

However, I can develop some relative estimates. For instance, adding an
index on MySQL 5.6 only acquires a lock very briefly and does the rest
of the work in the background. It still produces load, but it doesn't
block access to the table. This would be considered online safe and
scheduled to the expand phase. The CREATE INDEX would appear to finish
very quickly.

However, other changes could potentially require a table rewrite which
could be long if it's a large table (eg instance_system_metadata table),
but very short if the table only has a handful of rows (eg
instance_types table).

I've written a test suite which takes a variety of database software
(MySQL, PostgreSQL, etc), versions and storage engines (InnoDB, TokuDB,
etc) and does tests to figure out which changes are online safe (as far
as locking goes).

I will be using this data to make better decisions on scheduling of
operations to ensure the expand and contract phases don't cause any
problems.

I can also take that data and make it available somewhere as well
and/or possibly annotate the output from the --dry-run option to explain
why some operations are scheduled to migrate instead of the expand and
contract phases.

JE




More information about the OpenStack-dev mailing list