[openstack-dev] [nova] nova-manage db archive_deleted_rows broken

Matt Riedemann mriedem at linux.vnet.ibm.com
Wed Nov 18 03:51:50 UTC 2015



On 11/17/2015 7:43 PM, Matt Riedemann wrote:
>
>
> On 10/9/2015 1:16 PM, Matt Riedemann wrote:
>>
>>
>> On 10/9/2015 12:03 PM, Jay Pipes wrote:
>>> On 10/07/2015 11:04 AM, Matt Riedemann wrote:
>>>> I'm wondering why we don't reverse sort the tables using the sqlalchemy
>>>> metadata object before processing the tables for delete?  That's the
>>>> same thing I did in the 267 migration since we needed to process the
>>>> tree starting with the leaves and then eventually get back to the
>>>> instances table (since most roads lead to the instances table).
>>>
>>> Yes, that would make a lot of sense to me if we used the SA metadata
>>> object for reverse sorting.
>>
>> When I get some free time next week I'm going to play with this.
>>
>>>
>>>> Another thing that's really weird is how max_rows is used in this code.
>>>> There is cumulative tracking of the max_rows value so if the value you
>>>> pass in is too small, you might not actually be removing anything.
>>>>
>>>> I figured max_rows meant up to max_rows from each table, not max_rows
>>>> *total* across all tables. By my count, there are 52 tables in the nova
>>>> db model. The way I read the code, if I pass in max_rows=10 and say it
>>>> processes table A and archives 7 rows, then when it processes table B it
>>>> will pass max_rows=(max_rows - rows_archived), which would be 3 for
>>>> table B. If we archive 3 rows from table B, rows_archived >= max_rows
>>>> and we quit. So to really make this work, you have to pass in something
>>>> big for max_rows, like 1000, which seems completely random.
>>>>
>>>> Does this seem odd to anyone else?
>>>
>>> Uhm, yes it does.
>>>
>>>> Given the relationships between
>>>> tables, I'd think you'd want to try and delete max_rows for all tables,
>>>> so archive 10 instances, 10 block_device_mapping, 10 pci_devices, etc.
>>>>
>>>> I'm also bringing this up now because there is a thread in the operators
>>>> list which pointed me to a set of scripts that operators at GoDaddy are
>>>> using for archiving deleted rows:
>>>>
>>>> http://lists.openstack.org/pipermail/openstack-operators/2015-October/008392.html
>>>>
>>>> Presumably because the command in nova doesn't work. We should either
>>>> make this thing work or just punt and delete it because no one cares.
>>>
>>> The db archive code in Nova just doesn't make much sense to me at all.
>>> The algorithm for purging stuff, like you mention above, does not take
>>> into account the relationships between tables; instead of diving into
>>> the children relations and archiving those first, the code just uses a
>>> simplistic "well, if we hit a foreign key error, just ignore and
>>> continue archiving other things, we will eventually repeat the call to
>>> delete this row" strategy:
>>>
>>> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L6021-L6023
>>>
>>>
>>
>> Yeah, I noticed that too and I don't think it actually does anything. We
>> never actually come back since that would require some
>> tracking/stack/recursion stuff to retry failed tables, which we don't do.
>>
>>>
>>>
>>> I had a proposal [1] to completely rework the whole shadow table mess
>>> and db archiving functionality. I continue to believe that is the
>>> appropriate solution for this, and that we should rip out the existing
>>> functionality because it simply does not work properly.
>>>
>>> Best,
>>> -jay
>>>
>>> [1] https://review.openstack.org/#/c/137669/
>>
>> Are you going to pick that back up? Or sic some minions on it?
>>
>>>
>>
>
> I found some time to work on a reverse sort of nova's tables for the db
> archive command, that looks like [1].  It works fine in the unit tests,
> but fails because the deleted instances are referenced by
> instance_actions that aren't deleted.  I see any DB APIs for deleting
> instance actions.

I *don't* see any DB APIs for deleting instance actions.

Kind of an important difference there.  Jay got it at least. :)
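
For anyone following along, here is a minimal sketch of the children-first
ordering idea, not the code in the review. The table map is a hypothetical,
simplified slice of the schema, and it uses the stdlib graphlib module
(Python 3.9+) rather than SQLAlchemy so it runs on its own. With SQLAlchemy,
reversing MetaData.sorted_tables should give the same kind of ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified slice of the schema (not the real nova model):
# each table maps to the set of parent tables it references by foreign key.
FOREIGN_KEYS = {
    "instances": set(),
    "block_device_mapping": {"instances"},
    "instance_actions": {"instances"},
    "instance_actions_events": {"instance_actions"},
}


def archive_order(foreign_keys):
    """Return table names with leaf tables first and parent tables last."""
    # TopologicalSorter treats the mapped values as predecessors, so
    # static_order() yields parents before the tables that reference them.
    parents_first = list(TopologicalSorter(foreign_keys).static_order())
    # Reverse it so rows in child tables are processed before their parents.
    return list(reversed(parents_first))


if __name__ == "__main__":
    for name in archive_order(FOREIGN_KEYS):
        print(name)
```

Note that ordering alone does not help when a child table still holds rows
that are not soft deleted, which is the instance_actions problem above.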

>
> Were we just planning on instance_actions living forever in the database?
>
> Should we soft delete instance_actions when we delete the referenced
> instance?
>
> Or should we (hard) delete instance_actions when we archive (move to
> shadow tables) soft deleted instances?
>
> This is going to be a blocker to getting nova-manage db
> archive_deleted_rows working.
>
> [1] https://review.openstack.org/#/c/246635/
>
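
To make the max_rows behavior discussed earlier in the thread concrete, here
is a rough sketch, assuming the cumulative accounting works the way I read
it. The function names and row counts are made up for illustration and this
is not the nova code. It compares a single budget shared across all tables
with a per-table limit.

```python
def archive_cumulative(deleted_rows, max_rows):
    """Archive at most max_rows rows in total across all tables."""
    archived = {}
    total = 0
    for table, available in deleted_rows.items():
        if total >= max_rows:
            # Budget is used up, so the remaining tables are never touched.
            break
        taken = min(available, max_rows - total)
        archived[table] = taken
        total += taken
    return archived


def archive_per_table(deleted_rows, max_rows):
    """Archive up to max_rows rows from each table independently."""
    return {
        table: min(available, max_rows)
        for table, available in deleted_rows.items()
    }


if __name__ == "__main__":
    # Hypothetical counts of soft-deleted rows per table.
    deleted = {"table_a": 7, "table_b": 5, "table_c": 4}
    print(archive_cumulative(deleted, 10))
    print(archive_per_table(deleted, 10))
```

With max_rows=10 the cumulative version takes 7 rows from table_a, 3 from
table_b and never reaches table_c, while the per-table version takes 7, 5
and 4.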

-- 

Thanks,

Matt Riedemann
