[openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

Nikola Đipanov ndipanov at redhat.com
Mon Jul 8 12:15:10 UTC 2013


On 05/07/13 14:26, Boris Pavlovic wrote:
> Hi all,
> 
> I would like to explain the very high-level steps of our work:
> 1) Sync the DB code in all projects (we have what we have; let it be
> in one place)
> 2) Refactor the DB code in one place (not independently in every
> project)
> 
> So I understand that our code around the DB is not ideal, but let it
> be in one place first.
> 

This is fine in principle; however, I don't think we should push it
without considering the details (which is where the devil apparently
is). I am arguing that DB archiving is conceptually broken and should
be re-done (example below), and I think it would be suboptimal (to say
the least) to get it everywhere first and then fix it.

Just saying a hand-wavy "yeah, but once it's in Oslo we can fix it" is
not enough - especially for functionality that is younger than the time
it will likely take it to graduate from Oslo.

> ----------
> About DB archiving. 
> ----------
> Let me describe how it works, for contributors not familiar with it:
> 
> For each table (which has columns, indexes, unique constraints, FKs,
> etc.) we have a shadow table that has only the columns (no indexes,
> unique constraints or FKs).
> 
> Then we have a utility that does the following: move records that are
> marked as "deleted" from the original table to the shadow table.
> 
> This was done by David Ripton in Nova in Grizzly.
> 
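
For readers who have not looked at that code: stripped of batching and
error handling, the utility boils down to roughly the sketch below
(plain SQLAlchemy, not the actual Nova code; it assumes the usual 'id'
primary key and the soft-delete 'deleted' column):

    from sqlalchemy import MetaData, Table, select

    def archive_deleted_rows(engine, table_name, max_rows=1000):
        meta = MetaData(bind=engine)
        table = Table(table_name, meta, autoload=True)
        shadow = Table('shadow_' + table_name, meta, autoload=True)

        # Pick a batch of soft-deleted rows from the live table ...
        rows = engine.execute(
            select([table]).where(table.c.deleted != 0).limit(max_rows)
        ).fetchall()
        for row in rows:
            # ... copy each one into the shadow table (same columns,
            # but no indexes, FKs or unique constraints) ...
            engine.execute(shadow.insert().values(**dict(row)))
            # ... and remove it from the live table.
            engine.execute(table.delete().where(table.c.id == row['id']))
        return len(rows)
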
> -----
> 
> After a few months I found that there are tons of migrations for the
> "original" tables and no migrations for the shadow tables.
> So I implemented this BP,
> https://blueprints.launchpad.net/nova/+spec/db-improve-archiving, which
> does the following:
> a) syncs the shadow tables with the originals
> b) adds a test that checks that:
>   1) for each "original" table we have a shadow table
>   2) we don't have extra shadow tables
>   3) shadow tables have the same columns as the "original" ones
> 
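
(For readers following along: check (b) is essentially a reflection-based
comparison, something like the sketch below - the real test is more
thorough and also compares column types.)

    from sqlalchemy import MetaData

    def check_shadow_tables(engine):
        meta = MetaData(bind=engine)
        meta.reflect()
        live = dict((name, t) for name, t in meta.tables.items()
                    if not name.startswith('shadow_')
                    and name != 'migrate_version')
        shadow = dict((name, t) for name, t in meta.tables.items()
                      if name.startswith('shadow_'))

        for name, table in live.items():
            counterpart = shadow.pop('shadow_' + name, None)
            # 1) for each "original" table we must have a shadow table
            assert counterpart is not None, 'no shadow table for %s' % name
            # 3) ... with exactly the same set of columns
            assert set(table.c.keys()) == set(counterpart.c.keys()), \
                'columns out of sync for %s' % name
        # 2) and there must be no extra shadow tables left over
        assert not shadow, 'extra shadow tables: %s' % list(shadow)
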
> Why is this so important:
> 1) If the "shadow" and "original" tables are not in sync, there are two
> possible outcomes after the archiving utility is run:
>   a) it will fail
>   b) (worse) it will corrupt the data in the shadow table
> 
> ------
> 
> Also, there is no exponential growth of JOINs when we are using shadow
> tables:
> 
> In migrations we should:
> a) do the same actions on columns (drop, alter) in the main and shadow
> tables
> b) do the same actions on tables (create/drop/rename)
> c) do the same actions on the data in the tables
> 
> So you perform the actions on the main tables and the shadow tables
> separately, but after the migration the tables should be in sync.
> 
> And it is easier to perform the same actions twice, on the "main" and
> "shadow" tables, in one migration than in separate migrations.
> 
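
(For reference, the pattern being described looks roughly like this in
a Nova migration - sqlalchemy-migrate style, with a made-up column name
for the example.)

    from sqlalchemy import Column, MetaData, String, Table

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        # The same ALTER, applied once to the live table and once to
        # its shadow counterpart, in a single migration.
        for table_name in ('instances', 'shadow_instances'):
            table = Table(table_name, meta, autoload=True)
            # A fresh Column object per table - a Column can only be
            # attached to a single Table.
            table.create_column(Column('example_hint', String(255)))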

This is only true if you have one table with no relations that need to
be considered.

Here is an example of when it gets tricky. Say you have a table T1 with
an FK pointing to T2, and a migration that adds a column c1 to T1 that
relies on some data from T2. And say, for the sake of argument, that
the objects represented by rows in T1 and T2 have different lifetimes
in the system (think instances and devices, groups, quotas,
networks... this is common in our data model).

In order to properly migrate and assign values to the newly created c1
you will need to:

* Add the column c1 to the live T1
* Join on live T2 *and* shadow T2 to get the data needed and populate
the new column
* Add the column c1 to the shadow T1
* Join on live T2 *and* shadow T2 again to get the data needed and
populate the new column

Hence - exponentially more joins, as I stated in my previous email.
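
Spelled out as (deliberately naive) migration code, those four steps
look something like this - t1/t2/c1 as in the example, with a made-up
t2.value column standing in for whatever data c1 is derived from, and
row-by-row updates kept dumb on purpose:

    from sqlalchemy import Column, Integer, MetaData, Table, select

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        tables = dict((name, Table(name, meta, autoload=True))
                      for name in ('t1', 'shadow_t1', 't2', 'shadow_t2'))

        for t1_name in ('t1', 'shadow_t1'):
            t1 = tables[t1_name]
            # Add c1 to the live t1, and then again to shadow_t1.
            t1.create_column(Column('c1', Integer))
            # The t2 row that a t1 row points at may be live *or*
            # already archived, so every populate pass has to join
            # against both t2 and shadow_t2.
            for t2_name in ('t2', 'shadow_t2'):
                t2 = tables[t2_name]
                rows = migrate_engine.execute(
                    select([t1.c.id, t2.c.value])
                    .where(t1.c.t2_id == t2.c.id))
                for row in rows:
                    migrate_engine.execute(
                        t1.update().where(t1.c.id == row['id'])
                                   .values(c1=row['value']))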

Now - this was the *simplest* possible example - things get potentially
much more complicated if the new column relies on the previous state of
the data (say, counters of some sort), if you need to get data from a
third table (think many-to-many relationships), etc.

If you need a real example - take a look at migration 186 in the current
trunk.

As I said in the previous email, and based on the examples above - this
design decision (unconstrained rows) makes it difficult to reason about
data in the system!

I personally - as a developer working on the codebase - am not happy
making this trade-off in favour of archiving done this way, and would
like to see some design decisions changed, or at the very least a
broader consensus that the state as-is is actually OK and we don't need
to worry about it.

> -----
> 
> About the db_sync "downtime" (upgrading from one DB version to another)
> (from IRC):
> 
> DB archiving just helps us to reduce this time. One possible variant
> (high level):
> 1) Move our "deleted" rows to the shadow tables

This step is, in the case of the workflow you describe here:
  1) mandatory
  2) completely defeating the purpose of unconstrained rows, if in
order to migrate we have to move *all* of them to the shadow tables,
which may take a non-trivial amount of time.

> 2) Copy the shadow tables from the schema to a tmp_schema
> 3) Drop the data from the shadow tables
> 4) Run the migrations on the schema:
> a) As the shadow tables are empty, all migrations will run really fast
> b) As our original tables have only non-"deleted" rows, the migration
> will also be much faster
> 5) Run Nova
> 6) Run the migrations on tmp_schema
> 7) Copy from tmp_schema back to the schema (if it is required for some
> reason)
> 
> So, for example, writing utilities that are able to do this would be
> very useful.
> ------
> 

Yes, this is true - you can write tools to do this, and I am all for
iterative development, so I am not opposed to having an early model with
rough edges. However, I am questioning some basic design decisions here
that I think are not worth the trade-offs.

> 
> So, what I think about DB archiving:
> It is a great thing that helps us:
> 1) to reduce migration downtime
> 2) to reduce the number of rows in the original tables and improve
> performance
> 
> And I think that tests checking that the "original" and "shadow"
> tables are in sync are required here.
> 

As is probably clear from all of the above - I respectfully disagree on
all points except 2), given the current state of the db migrations.

Thanks,

N.

> 
> Best regards,
> Boris Pavlovic
> 
> 
> 
> 
> 
> On Fri, Jul 5, 2013 at 3:41 PM, Nikola Đipanov
> <ndipanov at redhat.com> wrote:
> 
>     On 02/07/13 19:50, Boris Pavlovic wrote:
>     >
>     >   *) DB Archiving
>     >      a) create shadow tables
>     >      b) add tests that check that the shadow and main tables are synced.
>     >      c) add code that works with the shadow tables.
>     >
> 
>     Hi Boris & all,
> 
>     I have a few points regarding the db archiving work that I am growing
>     more concerned about, so I thought I might mention them on this thread.
>     I pointed them out ad hoc on a recent review
>     https://review.openstack.org/#/c/34643/ and there is some discussion
>     there already, although it was not very fruitful.
> 
>     I feel that there were a few design oversights, and as a result the
>     work has a couple of rough edges that I have noticed.
> 
>     First issue is about the fact that shadow tables do not present a "view
>     of the world" themselves but are just unconstrained rows copied from
>     live tables.
> 
>     This is understandably done for performance reasons while archiving
>     (with the current design ideas in place), but it also causes issues
>     when migrations affect more than one table. Especially if data
>     migrations need to look at more than one table at once, the actual
>     number of table joins needed in order to consider everything grows
>     exponentially. It could be argued that such migrations are not that
>     common, but this is something that will make development more
>     difficult and migrations painful once it comes up.
> 
>     To put it shortly - this property generally makes it harder to reason
>     about data.
> 
>     Second point (and it ties in with the first one since it makes it
>     difficult to fix) - Maybe shadow table migrations should be kept
>     separate, and made optional? Currently there is a check that will fail
>     the tests unless the migration is done on both tables, which I think
>     should be removed in favour of separate migrations. Developers should
>     still migrate both of course - but deployers should be able to choose
>     not to do it according to their needs/scale. I am sure there are people
>     on this list that can chip in more on this subject (I've had a brief
>     discussion with lifeless on this topic on IRC).
> 
>     I'm afraid that if you agree that these are in fact problems - you might
>     also agree that we might want to go back on some of the design decisions
>     made around db archiving (like having unconstrained tables in the same
>     db for example).
> 
>     I'd be happy to hear some of the angles that I may have missed,
> 
>     Cheers,
> 
>     Nikola
> 
> 



