[openstack-dev] [nova] if by "archived" you mean, "wipes out your tables completely", then sure, it works fine

Attila Fazekas afazekas at redhat.com
Fri Mar 13 09:19:14 UTC 2015


The archiving has issues since very long time [1],
something like this [2] is expected to replace it.

The archiving just move trash to the other side of the desk,
usually just permanently deleting everything what is deleted
for more than 7 day is better for everyone.

For now, maybe just wiping out the shadow tables and the existing nova-mange 
functionality is better choice. [3]

[1] https://bugs.launchpad.net/nova/+bug/1305892
[2] https://blueprints.launchpad.net/nova/+spec/db-purge-engine
[3] https://bugs.launchpad.net/nova/+bug/1426873

----- Original Message -----
> From: "Mike Bayer" <mbayer at redhat.com>
> To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Sent: Friday, March 13, 2015 12:29:55 AM
> Subject: [openstack-dev] [nova] if by "archived" you mean,	"wipes out your tables completely", then sure, it works
> fine
> 
> Hello Nova -
> 
> Not sure if I’m just staring at this for too long, or if
> archive_deleted_rows_for_table() is just not something we ever use.
> Because it looks like it’s really, really broken very disastrously, and I’m
> wondering if I’m just missing something in front of me.
> 
> Let’s look at what it does!
> 
> First, archive_deleted_rows() calls it with a table name. These names are
> taken by collecting every single table name from nova.db.sqlalchemy.models.
> 
> Then, the function uses table reflection (that is, doesn’t look in the model
> at all, just goes right to the database) to load the table definitions:
> 
>     table = Table(tablename, metadata, autoload=True)
>     shadow_tablename = _SHADOW_TABLE_PREFIX + tablename
>     rows_archived = 0
>     try:
>         shadow_table = Table(shadow_tablename, metadata, autoload=True)
>     except NoSuchTableError:
>         # No corresponding shadow table; skip it.
>         return rows_archived
> 
> this is pretty heavy handed and wasteful from an efficiency point of view,
> and I’d like to fix this too, but let’s go with it. Now we have the two
> tables.
> 
> Then we do this:
> 
>     deleted_column = table.c.deleted
>     query_insert = sql.select([table],
>                           deleted_column != deleted_column.default).\
>                           order_by(column).limit(max_rows)
>     query_delete = sql.select([column],
>                           deleted_column != deleted_column.default).\
>                           order_by(column).limit(max_rows)
> 
> We make some SELECT statements that we’re going to use to find “soft
> deleted” rows, and these will be embedded into an INSERT
> and a DELETE. It is trying to make a statement like “SELECT .. FROM
> table WHERE deleted != <deleted_default>”, so that it finds rows where
> “deleted” has been changed to something, e.g. the row was
> soft deleted.
> 
> But what’s the value of “deleted_default” ?   Remember, all this
> table knows is what the database just told us about it, because it only
> uses reflection.  Let’s see what the “deleted” column in a table like
> instance_types looks like:
> 
> MariaDB [nova]> show create table instance_types;
> | instance_types | CREATE TABLE `instance_types` (
>   `created_at` datetime DEFAULT NULL,
> 
>   …  [omitted] ...
> 
>   `deleted` int(11) DEFAULT NULL,
> )
> 
> The default that we get for this column is NULL. That is very interesting!
> Because, if we look at the *Python-side value of deleted*, we see something
> that is quite the opposite of NULL, e.g. a thing that is most certainly not
> null:
> 
> class SoftDeleteMixin(object):
>     deleted_at = Column(DateTime)
>     deleted = Column(Integer, default=0)
> 
> See that zero there? That’s a ***Python-side default***. It is **not the
> server default**!! You will **not** get it from reflection, the database has
> no clue about it (oddly enough, this entire subject matter is fully
> documented in SQLAlchemy’s documentation, and guess what, the docs are free!
> Read them all you like, I won’t ask for a dime, no questions asked!).
> 
> So, all of our INSERTS **will** put a zero, not NULL, into that column.
> Let’s look in instance_types and see:
> 
> MariaDB [nova]> select id, name, deleted from instance_types;
> +----+-----------+---------+
> | id | name      | deleted |
> +----+-----------+---------+
> |  3 | m1.large  |       0 |
> |  1 | m1.medium |       0 |
> |  7 | m1.micro  |       0 |
> |  6 | m1.nano   |       0 |
> |  5 | m1.small  |       0 |
> |  2 | m1.tiny   |       0 |
> |  4 | m1.xlarge |       0 |
> +----+-----------+---------+
> 7 rows in set (0.00 sec)
> 
> No NULLs.  The value of non-deleted rows is zero.
> 
> What does this all mean?
> 
> It means, when this archival routine runs, it runs queries like this:
> 
> INSERT INTO shadow_quota_usages SELECT quota_usages.created_at,
> quota_usages.updated_at, quota_usages.deleted_at, quota_usages.id,
> quota_usages.project_id, quota_usages.resource, quota_usages.in_use,
> quota_usages.reserved, quota_usages.until_refresh, quota_usages.deleted,
> quota_usages.user_id
> FROM quota_usages
> WHERE quota_usages.deleted IS NOT NULL ORDER BY quota_usages.id
>  LIMIT ? OFFSET ?
> 2015-03-12 17:01:01,218 INFO [sqlalchemy.engine.base.Engine] (7, 0)
> 2015-03-12 17:01:01,219 INFO [sqlalchemy.engine.base.Engine] DELETE FROM
> quota_usages WHERE quota_usages.id in (SELECT T1.id FROM (SELECT
> quota_usages.id
> FROM quota_usages
> WHERE quota_usages.deleted IS NOT NULL ORDER BY quota_usages.id
>  LIMIT ? OFFSET ?) as T1)
> 
> The second query is to DELETE rows from a table like quota_usages based on
> looking at rows where the “deleted” column is “NOT NULL”. Which means, all
> of them! They are all zeros, not NULL!
> 
> If we assume that all of our tables are filled up with zeroes for those
> deleted columns, because that’s the default, this **wipes the whole table
> clean**.
> 
> How do the tests pass? Well the tests are in test_db_api->ArchiveTestCase,
> and actually, they don’t. But they don’t fail every time, because the test
> suite here runs with a database that is almost completely empty anyway, so
> the broken archival routine doesn’t find too many rows to blow away except
> for the rows in “instance_types”, which it only finds sometimes because the
> tests are only running it with a small number of things to delete and the
> order of the tables is non-deterministic.
> 
> I’ve posted the bug report at https://bugs.launchpad.net/nova/+bug/1431571
> where I started out not knowing much about how this worked except that my
> tests were failing, and slowly stumbled my way to come to this conclusion. A
> patch is at https://review.openstack.org/#/c/164009/, where we look at the
> actual Python-side default. However I’d recommend that we just hardcode the
> zero here, since that’s how our soft-delete columns work.
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 



More information about the OpenStack-dev mailing list