[openstack-dev] [Solum] Some initial code copying for db/migration

Clayton Coleman ccoleman at redhat.com
Mon Nov 11 18:26:34 UTC 2013

----- Original Message -----
> > 1) Using objects as an abstraction for versioned data:
> >    This seems like a good direction overall, especially from the
> >    point-of-view of backwards compatibility of consumers of the object.
> >    However, after looking through some of the objects defined in
> >    nova/objects/, I am not sure if I understand how this works.
> >    Specifically, it is not clear to me how might the consumer of the
> >    object be able to query different versions of it at runtime.
> The object registry is handled by the metaclass in objects/base.py. It
> automatically collects objects that inherit from NovaObject, and allows
> multiple versions of the same object to exist. We don't have anything
> that needs to specifically ask the registry directly for "foo object
> version X", so there's no interface for doing that right now. We do,
> however, have incoming objects over RPC that need to be re-hydrated,
> with an "is this compatible" version check. We also have the ability to
> downlevel an object using the obj_make_compatible() method. We are
> planning to always roll the conductor code first, which means it can
> take the newest version of an object from the schema (in whatever state
> it's in) and backlevel it to the version being asked for by a remote RPC
> client.

For places where we may not have an RPC isolation layer, it's similar - the code knows what version of the schema to ask for, and the object abstraction hides the details of converting between older and newer versions.
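
To make that concrete, here is a rough sketch of the pattern as I understand it: a metaclass-driven registry, an "is this compatible" check on re-hydration, and a downlevel hook. It is not Nova's actual objects/base.py. Every name in it (REGISTRY, VersionedObject, make_compatible, hydrate, the Assembly object and its fields) is made up for illustration.

```python
# Hypothetical sketch only; names and structure are assumptions, not Nova code.
REGISTRY = {}  # object name -> {version tuple: class}


class VersionedMeta(type):
    """Registers every subclass that declares its own VERSION."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        if "VERSION" in attrs:
            REGISTRY.setdefault(cls.obj_name, {})[cls.VERSION] = cls


class VersionedObject(metaclass=VersionedMeta):
    obj_name = None

    def __init__(self, **fields):
        self.fields = dict(fields)

    def make_compatible(self, primitive, target_version):
        """Subclasses strip or rewrite fields newer than target_version."""
        return primitive

    def to_primitive(self, target_version=None):
        primitive = dict(self.fields)  # copy, so the object itself is untouched
        if target_version is not None and target_version < self.VERSION:
            primitive = self.make_compatible(primitive, target_version)
        return {"name": self.obj_name,
                "version": target_version or self.VERSION,
                "data": primitive}


class Assembly(VersionedObject):
    obj_name = "Assembly"
    VERSION = (1, 1)  # assumption: 1.1 added the 'description' field

    def make_compatible(self, primitive, target_version):
        if target_version < (1, 1):
            primitive.pop("description", None)
        return primitive


def hydrate(payload, supported):
    """Rebuild an object from a wire payload after a compatibility check."""
    major, minor = payload["version"]
    if major != supported[0] or minor > supported[1]:
        raise ValueError("incompatible version %r" % (payload["version"],))
    versions = REGISTRY[payload["name"]]
    cls = versions[max(versions)]  # newest registered class for this name
    return cls(**payload["data"])
```

A caller that only understands 1.0 would ask for to_primitive(target_version=(1, 0)) and never see the newer field. The side that rolled first does the backleveling.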

We probably need to map out the scenarios where multiversion is enabled for live upgrade - that's the most critical place where you need to ask for specific versions.  The Google F1 live schema change paper [1] has a good summary of the core issues with live schema migration (and a great diagram on page 9), and those issues apply to generic SQL dbs as well.  There are five distinct phases:

  1) new code is available that can read the old schema and the new schema, but continues to read the old schema
  2) additive elements of the new schema are enabled
  3) new code begins copying/deleting data as it's read or updated, and a background process is converting the rest of the data
  4) new code starts reading/querying the new fields
  5) old schema data is dropped once all code is reading the new schema

The new code has to know whether it should read the new or old schema - it can't read the new schema (query by new column names, by updated data, etc.) until all of the reads are complete and in place in #4.  That could be a config value, something in memory triggered by an admin, etc.
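
As a rough illustration of the phases above, here is a sketch where the current phase is a value passed in (standing in for a config value or admin-triggered flag). The column names (fullname, display_name), the Phase enum, and the helper functions are all hypothetical, and plain dicts stand in for database rows.

```python
import enum


class Phase(enum.IntEnum):
    OLD_ONLY = 1   # new code deployed, still reads and writes the old schema
    ADDITIVE = 2   # new columns exist but nothing populates them yet
    BACKFILL = 3   # writes populate both; a background job converts the rest
    READ_NEW = 4   # reads switch over to the new fields
    DROP_OLD = 5   # old columns are gone


def read_name(row, phase):
    """Read from the new column only once the switch to it has been made."""
    if phase >= Phase.READ_NEW:
        return row["display_name"]
    return row["fullname"]


def write_name(row, value, phase):
    """Keep the old column current until it is dropped; fill the new one
    as soon as copying is allowed."""
    if phase < Phase.DROP_OLD:
        row["fullname"] = value
    if phase >= Phase.BACKFILL:
        row["display_name"] = value
    return row


def backfill(rows):
    """Background conversion of rows that normal traffic has not touched."""
    for row in rows:
        if row.get("display_name") is None:
            row["display_name"] = row["fullname"]
```

The point is only that the read path and the write path consult the phase separately, so writes can start filling the new column well before any reader depends on it.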

> > 2) Using objects as an abstraction to support different kinds of backends
> >    (SQL and non-SQL backends):
> >    - Again, a good direction overall. From implementation point-of-view
> >    though this may become tricky, in the sense that the object layer
> >    would need to be designed with just the right amount of logic so as
> >    to be able to work with either a SQL or a non-SQL backend. It will be
> >    good to see some examples of how this might be done if there are any
> >    existing examples somewhere.
> We don't have any examples of using a non-SQL backend for general
> persistence in Nova, which means we don't have an example of using
> objects to hide it. If what NovaObject currently provides is not
> sufficient to hide the intricacies of a non-SQL persistence layer, I
> think it's probably best to build that on top of what we've got in the
> object model.

The abstraction will probably always leak something, but in general it comes down to isolating a full transaction behind a coarse method.  I was going to try to demonstrate that as I went.
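
A minimal sketch of what I mean by a coarse method, assuming a made-up "transfer ownership" operation: callers see one method, and each backend decides how to make it atomic. The class names, table layout, and method names are all hypothetical. sqlite3 stands in for a SQL backend and a plain dict stands in for a document store.

```python
import copy
import sqlite3


class SqlAssemblyStore:
    """SQL backend: the coarse method wraps one database transaction."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS assembly "
                     "(id TEXT PRIMARY KEY, owner TEXT)")
        conn.execute("CREATE TABLE IF NOT EXISTS audit "
                     "(assembly_id TEXT, note TEXT)")

    def add(self, assembly_id, owner):
        with self.conn:
            self.conn.execute("INSERT INTO assembly VALUES (?, ?)",
                              (assembly_id, owner))

    def owner(self, assembly_id):
        row = self.conn.execute("SELECT owner FROM assembly WHERE id = ?",
                                (assembly_id,)).fetchone()
        return row[0]

    def transfer_ownership(self, assembly_id, new_owner):
        with self.conn:  # commits on success, rolls back on an exception
            cur = self.conn.execute(
                "UPDATE assembly SET owner = ? WHERE id = ?",
                (new_owner, assembly_id))
            if cur.rowcount != 1:
                raise KeyError(assembly_id)
            self.conn.execute("INSERT INTO audit VALUES (?, ?)",
                              (assembly_id, "owner -> " + new_owner))


class DictAssemblyStore:
    """Non-SQL stand-in: one document per assembly, replaced in one step."""

    def __init__(self):
        self.docs = {}

    def add(self, assembly_id, owner):
        self.docs[assembly_id] = {"owner": owner, "audit": []}

    def owner(self, assembly_id):
        return self.docs[assembly_id]["owner"]

    def transfer_ownership(self, assembly_id, new_owner):
        doc = copy.deepcopy(self.docs[assembly_id])  # KeyError if missing
        doc["owner"] = new_owner
        doc["audit"].append("owner -> " + new_owner)
        self.docs[assembly_id] = doc  # single swap stands in for an atomic write
```

Neither caller nor object layer needs to know that one backend uses two tables and the other uses an embedded list. That detail only leaks if someone wants to query the audit trail directly.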

[1] http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/41376.pdf
