[nova] super long online_data_migrations

Matt Riedemann mriedemos at gmail.com
Wed Apr 3 16:11:58 UTC 2019


Following up on bug https://bugs.launchpad.net/nova/+bug/1822613 there 
are a few things:

1. https://review.openstack.org/#/c/649648/ drops the problematic 
migrate_keypairs_to_api_db data migration since we shouldn't need it 
anymore.

2. As noted in the bug and this thread, some online data migrations 
track a marker for paging and some don't, but there is no generic marker 
to say an online data migration is 'done' and doesn't need to be run 
again (like the migrations table for the sqlalchemy-migrate based schema 
migrations). Persisting something like that would require more thought 
for the online data migrations.
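To make the idea concrete, persisting a completion marker could look something like this minimal sketch. The table name, schema, and helper functions here are all hypothetical (nova has nothing like this today); sqlite is just used so the snippet is self-contained:

```python
import sqlite3

# Hypothetical sketch: persist a per-migration "done" flag, analogous to
# the `migrations` table that sqlalchemy-migrate keeps for schema
# migrations. The table and function names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE online_migration_state ("
    " name TEXT PRIMARY KEY, completed INTEGER NOT NULL DEFAULT 0)")

def mark_done(name):
    # Record that a migration has fully completed.
    conn.execute(
        "INSERT OR REPLACE INTO online_migration_state (name, completed)"
        " VALUES (?, 1)", (name,))

def is_done(name):
    # A runner could check this and skip the migration entirely.
    row = conn.execute(
        "SELECT completed FROM online_migration_state WHERE name = ?",
        (name,)).fetchone()
    return bool(row and row[0])

mark_done("migrate_keypairs_to_api_db")
print(is_done("migrate_keypairs_to_api_db"))            # True
print(is_done("populate_missing_availability_zones"))   # False
```

The open question is when it is safe to set the flag, since some migrations do not track a paging marker and only know they are "done" by finding zero rows.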

3. The other complaint in the bug is that --max-rows defaulting to 50 is 
too small in a large cloud (similar to bug 1742649 for --max-count on 
the map_instances command). I think a compromise here could be something 
like what Dan added for the archive_deleted_rows command:

https://review.openstack.org/#/c/378718/

Where you could specify a batch size using --max-rows but also specify 
an --until-complete option to run through all of the online data 
migrations rather than just stopping after --max-rows records are 
migrated. If people like that idea, then I think we just need a 
specless blueprint for it, like the one used for the 
archive_deleted_rows change above.
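As a rough sketch of the batching behavior I have in mind (the function names and the fake migration factory below are invented for illustration; the real runner lives in nova-manage):

```python
# Sketch of the proposed --until-complete behavior, modeled on what
# archive_deleted_rows does: keep running batches of --max-rows until
# every migration reports zero rows found. `migrations` maps a name to
# a callable returning (found, done) counts, like the real runner's
# migration functions; the fakes below are stand-ins.
def run_migrations(migrations, max_rows, until_complete=False):
    total = {name: 0 for name in migrations}
    while True:
        ran_any = False
        for name, fn in migrations.items():
            found, done = fn(max_rows)
            total[name] += done
            if found:
                ran_any = True
        # Without --until-complete, stop after a single pass; with it,
        # stop only when no migration found anything left to do.
        if not until_complete or not ran_any:
            break
    return total

def make_fake_migration(remaining):
    # Stand-in migration that "processes" up to `count` rows per call.
    state = {"left": remaining}
    def fn(count):
        batch = min(count, state["left"])
        state["left"] -= batch
        return batch, batch
    return fn

result = run_migrations(
    {"populate_missing_availability_zones": make_fake_migration(120)},
    max_rows=50, until_complete=True)
print(result)  # {'populate_missing_availability_zones': 120}
```

With until_complete=False this behaves like today's single bounded pass; with it set, the 120 fake rows are drained in batches of 50.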

On 3/31/2019 9:21 PM, Mohammed Naser wrote:
> Hi there,
> 
> During upgrades, I've noticed that when running online_data_migrations
> with "infinite-until-done" mode, it loops over all of the migrations
> one by one.
> 
> However, one of the online data migrations
> (instance_obj.populate_missing_availability_zones) makes a query that
> takes a really long time because it seems inefficient (eventually it
> returns 0, because the migration has already run). Since the command
> loops in batches of 50, each pass adds a 2-3 to 8 minute wait in
> really large environments.
> 
> The specific query is:
> 
> SELECT count(*) AS count_1
> FROM (SELECT instance_extra.created_at AS instance_extra_created_at,
> instance_extra.updated_at AS instance_extra_updated_at,
> instance_extra.deleted_at AS instance_extra_deleted_at,
> instance_extra.deleted AS instance_extra_deleted, instance_extra.id AS
> instance_extra_id, instance_extra.instance_uuid AS
> instance_extra_instance_uuid
> FROM instance_extra
> WHERE instance_extra.keypairs IS NULL AND instance_extra.deleted = 0) AS anon_1
> 
> The explain for the DB query in this example:
> 
> +------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
> | id   | select_type | table          | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
> +------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
> |    1 | SIMPLE      | instance_extra | ALL  | NULL          | NULL | NULL    | NULL | 382473 | Using where |
> +------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
> 
> It's possible that it can be even worse; this number is from
> another very long-running environment.
> 
> +------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
> | id   | select_type | table          | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
> +------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
> |    1 | SIMPLE      | instance_extra | ALL  | NULL          | NULL | NULL    | NULL | 3008741 | Using where |
> +------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
> 
> I'm not an SQL expert; could we optimize this? Alternatively, could
> we update the online data migrations code to "pop out" any migration
> that returns 0 from the next iteration, so that it only works on the
> online_data_migrations that *have* to be done and skips those it
> knows are done?
> 
> Thanks,
> Mohammed
> 
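Mohammed's "pop out" idea above could be sketched roughly like this (the runner function and the fake migrations are invented for illustration, not nova code):

```python
# Sketch of the "pop out" suggestion: once a migration reports zero
# rows found, drop it from subsequent passes so completed (but
# expensive-to-count) migrations are not re-queried every iteration.
def run_until_done(migrations, batch_size):
    pending = dict(migrations)
    totals = {name: 0 for name in migrations}
    while pending:
        for name in list(pending):
            found, done = pending[name](batch_size)
            totals[name] += done
            if found == 0:
                # Nothing left for this migration: stop calling it.
                del pending[name]
    return totals

def make_fake_migration(remaining):
    # Stand-in migration that "processes" up to `count` rows per call.
    state = {"left": remaining}
    def fn(count):
        batch = min(count, state["left"])
        state["left"] -= batch
        return batch, batch
    return fn

totals = run_until_done(
    {"slow_to_count": make_fake_migration(75),
     "already_done": make_fake_migration(0)},
    batch_size=50)
print(totals)  # {'slow_to_count': 75, 'already_done': 0}
```

Here "already_done" is only queried once, which is exactly the behavior Mohammed is asking for; the caveat is that without a persisted marker (point 2 above) the first expensive count still has to run once per invocation.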


-- 

Thanks,

Matt


