[openstack-dev] [nova] nova-manage db archive_deleted_rows broken

Matt Riedemann mriedem at linux.vnet.ibm.com
Wed Nov 18 01:43:06 UTC 2015



On 10/9/2015 1:16 PM, Matt Riedemann wrote:
>
>
> On 10/9/2015 12:03 PM, Jay Pipes wrote:
>> On 10/07/2015 11:04 AM, Matt Riedemann wrote:
>>> I'm wondering why we don't reverse sort the tables using the sqlalchemy
>>> metadata object before processing the tables for delete?  That's the
>>> same thing I did in the 267 migration since we needed to process the
>>> tree starting with the leaves and then eventually get back to the
>>> instances table (since most roads lead to the instances table).
>>
>> Yes, that would make a lot of sense to me if we used the SA metadata
>> object for reverse sorting.
>
> When I get some free time next week I'm going to play with this.
>
>>
>>> Another thing that's really weird is how max_rows is used in this code.
>>> There is cumulative tracking of the max_rows value so if the value you
>>> pass in is too small, you might not actually be removing anything.
>>>
>>> I figured max_rows meant up to max_rows from each table, not max_rows
>>> *total* across all tables. By my count, there are 52 tables in the nova
>>> db model. The way I read the code, if I pass in max_rows=10 and say it
>>> processes table A and archives 7 rows, then when it processes table B it
>>> will pass max_rows=(max_rows - rows_archived), which would be 3 for
>>> table B. If we archive 3 rows from table B, rows_archived >= max_rows
>>> and we quit. So to really make this work, you have to pass in something
>>> big for max_rows, like 1000, which seems completely random.
>>>
>>> Does this seem odd to anyone else?
>>
>> Uhm, yes it does.
>>
>>> Given the relationships between
>>> tables, I'd think you'd want to try and delete max_rows for all tables,
>>> so archive 10 instances, 10 block_device_mapping, 10 pci_devices, etc.
>>>
>>> I'm also bringing this up now because there is a thread in the operators
>>> list which pointed me to a set of scripts that operators at GoDaddy are
>>> using for archiving deleted rows:
>>>
>>> http://lists.openstack.org/pipermail/openstack-operators/2015-October/008392.html
>>>
>>>
>>>
>>> Presumably because the command in nova doesn't work. We should either
>>> make this thing work or just punt and delete it because no one cares.
>>
>> The db archive code in Nova just doesn't make much sense to me at all.
>> The algorithm for purging stuff, like you mention above, does not take
>> into account the relationships between tables; instead of diving into
>> the children relations and archiving those first, the code just uses a
>> simplistic "well, if we hit a foreign key error, just ignore and
>> continue archiving other things, we will eventually repeat the call to
>> delete this row" strategy:
>>
>> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L6021-L6023
>>
>
> Yeah, I noticed that too and I don't think it actually does anything. We
> never actually come back since that would require some
> tracking/stack/recursion stuff to retry failed tables, which we don't do.
>
>>
>>
>> I had a proposal [1] to completely rework the whole shadow table mess
>> and db archiving functionality. I continue to believe that is the
>> appropriate solution for this, and that we should rip out the existing
>> functionality because it simply does not work properly.
>>
>> Best,
>> -jay
>>
>> [1] https://review.openstack.org/#/c/137669/
>
> Are you going to pick that back up? Or sic some minions on it?
>

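To make the max_rows behavior quoted above concrete, here is a rough
sketch in plain Python. It is not the nova code; the function names and
the per-table row counts are made up for illustration. It only contrasts
a max_rows budget shared across all tables with max_rows applied to each
table separately.

```python
def archive_cumulative(deleted_per_table, max_rows):
    """Simulate one max_rows budget shared across all tables."""
    archived = {}
    total = 0
    for table, available in deleted_per_table.items():
        remaining = max_rows - total
        if remaining <= 0:
            break  # budget used up; later tables are never visited
        count = min(available, remaining)
        archived[table] = count
        total += count
    return archived


def archive_per_table(deleted_per_table, max_rows):
    """Simulate max_rows applied independently to each table."""
    return {table: min(available, max_rows)
            for table, available in deleted_per_table.items()}


# Hypothetical counts of soft-deleted rows per table.
deleted = {"table_a": 7, "table_b": 20, "table_c": 15}

cumulative = archive_cumulative(deleted, max_rows=10)
per_table = archive_per_table(deleted, max_rows=10)
print(cumulative)
print(per_table)
```

With the shared budget, table_a takes 7 rows, table_b gets the remaining
3, and table_c is never processed, which matches the example in the
quoted text.
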
I found some time to work on a reverse sort of nova's tables for the db
archive command; it looks like [1].  It works fine in the unit tests,
but fails because the deleted instances are referenced by
instance_actions that aren't deleted.  I don't see any DB APIs for
deleting instance actions.
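
For anyone following along, the ordering idea can be sketched without
nova at all. This is not the code in [1]. It uses the standard library's
graphlib as a stand-in, and the foreign-key map is a hand-written,
heavily reduced assumption rather than the real schema. With SQLAlchemy,
reversing MetaData.sorted_tables should give a comparable children-first
order.

```python
from graphlib import TopologicalSorter

# Hypothetical, reduced map: table -> set of parent tables it references.
foreign_keys = {
    "instances": set(),
    "instance_actions": {"instances"},
    "instance_actions_events": {"instance_actions"},
    "block_device_mapping": {"instances"},
}

# static_order() yields each table after the tables it references,
# so parents come first.
parents_first = list(TopologicalSorter(foreign_keys).static_order())

# Reverse it so the leaves are processed first and instances last.
children_first = list(reversed(parents_first))
print(children_first)
```

Processing tables in children_first order only helps if the child rows
are themselves eligible for archiving, which is the problem with
instance_actions.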

Were we just planning on instance_actions living forever in the database?

Should we soft delete instance_actions when we delete the referenced 
instance?

Or should we (hard) delete instance_actions when we archive (move to 
shadow tables) soft deleted instances?
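
To show what the second option would mean, here is a small sketch using
an in-memory SQLite database. The schema is a toy assumption with only a
couple of columns, and archive_instances is a hypothetical helper, not a
nova DB API. It shows the archive being blocked by a live
instance_actions row, then succeeding once those rows are hard deleted
in the same transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FKs off by default
conn.executescript("""
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, deleted INTEGER DEFAULT 0);
    CREATE TABLE shadow_instances (uuid TEXT PRIMARY KEY, deleted INTEGER);
    CREATE TABLE instance_actions (
        id INTEGER PRIMARY KEY,
        instance_uuid TEXT REFERENCES instances(uuid),
        deleted INTEGER DEFAULT 0
    );
    INSERT INTO instances VALUES ('inst-1', 1);
    INSERT INTO instance_actions VALUES (1, 'inst-1', 0);
""")


def archive_instances(conn, purge_actions):
    """Move soft-deleted instances to the shadow table; return rows moved."""
    soft_deleted = "SELECT uuid FROM instances WHERE deleted != 0"
    try:
        with conn:  # commits on success, rolls back on error
            if purge_actions:
                # Hard delete the actions that reference the instances.
                conn.execute(
                    "DELETE FROM instance_actions "
                    "WHERE instance_uuid IN (%s)" % soft_deleted)
            conn.execute(
                "INSERT INTO shadow_instances "
                "SELECT uuid, deleted FROM instances WHERE deleted != 0")
            cur = conn.execute("DELETE FROM instances WHERE deleted != 0")
            return cur.rowcount
    except sqlite3.IntegrityError:
        return 0  # blocked by a foreign key reference


blocked = archive_instances(conn, purge_actions=False)
archived = archive_instances(conn, purge_actions=True)
print(blocked, archived)
```

The first option (soft deleting instance_actions along with the
instance) would instead let the normal archive pass pick those rows up,
provided the tables are processed children first.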

This is going to be a blocker to getting nova-manage db 
archive_deleted_rows working.

[1] https://review.openstack.org/#/c/246635/

-- 

Thanks,

Matt Riedemann



