[nova] super long online_data_migrations
Mohammed Naser
mnaser at vexxhost.com
Mon Apr 1 02:21:22 UTC 2019
Hi there,
During upgrades, I've noticed that when running online_data_migrations
with "infinite-until-done" mode, it loops over all of the migrations
one by one.
However, one of the online data migrations
(instance_obj.populate_missing_availability_zones) makes a query that
takes a really long time as it seems inefficient (which eventually
results in 0, cause it already ran), which means as it loops in
"blocks" of 50, there's almost a 2-3 to 8 minute wait in really large
environments.
The question ends up in specific:
SELECT count(*) AS count_1
FROM (SELECT instance_extra.created_at AS instance_extra_created_at,
instance_extra.updated_at AS instance_extra_updated_at,
instance_extra.deleted_at AS instance_extra_deleted_at,
instance_extra.deleted AS instance_extra_deleted, instance_extra.id AS
instance_extra_id, instance_extra.instance_uuid AS
instance_extra_instance_uuid
FROM instance_extra
WHERE instance_extra.keypairs IS NULL AND instance_extra.deleted = 0) AS anon_1
The explain for the DB query in this example:
+------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key |
key_len | ref | rows | Extra |
+------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | instance_extra | ALL | NULL | NULL |
NULL | NULL | 382473 | Using where |
+------+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
It's possible that it can be ever worse, as this number is from
another very-long running environments.
+------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key |
key_len | ref | rows | Extra |
+------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | instance_extra | ALL | NULL | NULL |
NULL | NULL | 3008741 | Using where |
+------+-------------+----------------+------+---------------+------+---------+------+---------+-------------+
I'm not the SQL expert, could we not optimize this? Alternatively,
could we update the online data migrations code to "pop out" any of
the migrations that return 0 for the next iteration, that way it only
works on those online_data_migrations that *have* to be done, and
ignore those it knows are done?
Thanks,
Mohammed
--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. http://vexxhost.com
More information about the openstack-discuss
mailing list