[openstack-dev] [nova] nova-manage db archive_deleted_rows broken

Matt Riedemann mriedem at linux.vnet.ibm.com
Fri Oct 9 18:16:28 UTC 2015



On 10/9/2015 12:03 PM, Jay Pipes wrote:
> On 10/07/2015 11:04 AM, Matt Riedemann wrote:
>> I'm wondering why we don't reverse sort the tables using the sqlalchemy
>> metadata object before processing the tables for delete?  That's the
>> same thing I did in the 267 migration since we needed to process the
>> tree starting with the leaves and then eventually get back to the
>> instances table (since most roads lead to the instances table).
>
> Yes, that would make a lot of sense to me if we used the SA metadata
> object for reverse sorting.

When I get some free time next week I'm going to play with this.
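
For illustration, a minimal sketch of the reverse sort, assuming a toy
three-table schema (the column names here are simplified and hypothetical,
not the real nova models). SQLAlchemy's MetaData.sorted_tables returns
tables in foreign key dependency order, parents first, so reversing it
yields the leaves first and instances last:

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, Table

# Toy schema (an assumption for illustration): two child tables that
# both reference a parent "instances" table.
metadata = MetaData()
Table("instances", metadata,
      Column("id", Integer, primary_key=True))
Table("block_device_mapping", metadata,
      Column("id", Integer, primary_key=True),
      Column("instance_id", Integer, ForeignKey("instances.id")))
Table("pci_devices", metadata,
      Column("id", Integer, primary_key=True),
      Column("instance_id", Integer, ForeignKey("instances.id")))

# sorted_tables lists parents before children; reversing it gives an
# order in which child rows are handled before the rows they point at.
archive_order = [table.name for table in reversed(metadata.sorted_tables)]
print(archive_order)
```

Whether this is enough for the real schema is untested; tables with
circular references would need separate handling.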

>
>> Another thing that's really weird is how max_rows is used in this code.
>> There is cumulative tracking of the max_rows value so if the value you
>> pass in is too small, you might not actually be removing anything.
>>
>> I figured max_rows meant up to max_rows from each table, not max_rows
>> *total* across all tables. By my count, there are 52 tables in the nova
>> db model. The way I read the code, if I pass in max_rows=10 and say it
>> processes table A and archives 7 rows, then when it processes table B it
>> will pass max_rows=(max_rows - rows_archived), which would be 3 for
>> table B. If we archive 3 rows from table B, rows_archived >= max_rows
>> and we quit. So to really make this work, you have to pass in something
>> big for max_rows, like 1000, which seems completely random.
>>
>> Does this seem odd to anyone else?
>
> Uhm, yes it does.
>
>> Given the relationships between
>> tables, I'd think you'd want to try and delete max_rows for all tables,
>> so archive 10 instances, 10 block_device_mapping, 10 pci_devices, etc.
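
To make the difference concrete, here is a small simulation of the two
readings of max_rows. It does not touch a database; the function names
and row counts are hypothetical, and the cumulative version only mirrors
the behaviour as described above, not the actual nova code:

```python
def archive_cumulative(deleted_rows, max_rows):
    """One shared budget: each table gets whatever is left of max_rows."""
    archived = {}
    total = 0
    for table, available in deleted_rows.items():
        if total >= max_rows:
            break  # budget exhausted, remaining tables are never visited
        taken = min(available, max_rows - total)
        archived[table] = taken
        total += taken
    return archived


def archive_per_table(deleted_rows, max_rows):
    """Independent budget: up to max_rows from every table."""
    return {table: min(available, max_rows)
            for table, available in deleted_rows.items()}


# Hypothetical counts of soft-deleted rows per table.
deleted = {"table_a": 7, "table_b": 5, "table_c": 4}

cumulative = archive_cumulative(deleted, max_rows=10)
per_table = archive_per_table(deleted, max_rows=10)
print(cumulative)
print(per_table)
```

With max_rows=10 the cumulative version takes 7 rows from table_a, 3 from
table_b, and never reaches table_c, while the per-table version takes all
7, 5 and 4.
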
>>
>> I'm also bringing this up now because there is a thread in the operators
>> list which pointed me to a set of scripts that operators at GoDaddy are
>> using for archiving deleted rows:
>>
>> http://lists.openstack.org/pipermail/openstack-operators/2015-October/008392.html
>>
>>
>> Presumably because the command in nova doesn't work. We should either
>> make this thing work or just punt and delete it because no one cares.
>
> The db archive code in Nova just doesn't make much sense to me at all.
> The algorithm for purging stuff, like you mention above, does not take
> into account the relationships between tables; instead of diving into
> the children relations and archiving those first, the code just uses a
> simplistic "well, if we hit a foreign key error, just ignore and
> continue archiving other things, we will eventually repeat the call to
> delete this row" strategy:
>
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L6021-L6023

Yeah, I noticed that too, and I don't think it actually does anything. We
never come back to the failed rows, since that would require some
tracking (a stack or recursion) to retry failed tables, which we don't do.
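
A rough sketch of what that tracking could look like, using a queue that
puts a blocked table back at the end and gives up once a full pass makes
no progress. Everything here is hypothetical: ForeignKeyBlocked stands in
for the database's foreign key error, and archive_table is a fake that
only checks an in-memory dependency map:

```python
from collections import deque


class ForeignKeyBlocked(Exception):
    """Hypothetical stand-in for a foreign key constraint failure."""


def archive_with_retry(tables, archive_table):
    """Try each table; requeue the ones blocked by a foreign key.

    Stops when everything is archived or when every pending table has
    failed once since the last success (no progress is possible).
    """
    pending = deque(tables)
    done = []
    failures_in_a_row = 0
    while pending and failures_in_a_row < len(pending):
        table = pending.popleft()
        try:
            archive_table(table)
        except ForeignKeyBlocked:
            pending.append(table)  # come back to it later
            failures_in_a_row += 1
        else:
            done.append(table)
            failures_in_a_row = 0
    return done, list(pending)


# Fake dependency map: instances cannot go until its children are gone.
children = {"instances": {"block_device_mapping", "pci_devices"}}
archived = set()


def fake_archive_table(table):
    if children.get(table, set()) - archived:
        raise ForeignKeyBlocked(table)
    archived.add(table)


done, stuck = archive_with_retry(
    ["instances", "block_device_mapping", "pci_devices"], fake_archive_table)
print(done, stuck)
```

Here instances fails on the first attempt, is requeued, and succeeds after
the two child tables have been archived.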

>
>
> I had a proposal [1] to completely rework the whole shadow table mess
> and db archiving functionality. I continue to believe that is the
> appropriate solution for this, and that we should rip out the existing
> functionality because it simply does not work properly.
>
> Best,
> -jay
>
> [1] https://review.openstack.org/#/c/137669/

Are you going to pick that back up? Or sic some minions on it?


-- 

Thanks,

Matt Riedemann
