[openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"

Matt Riedemann mriedemos at gmail.com
Mon Oct 22 16:59:48 UTC 2018


On 10/22/2018 11:25 AM, Sergio A. de Carvalho Jr. wrote:
> Hi,
> 
> While troubleshooting a production issue we identified that the Nova 
> metadata API is fetching a lot more raw data from the database than 
> seems necessary. The problem appears to be caused by the SQL query used 
> to fetch instance data that joins the "instance" table with, among 
> others, two metadata tables: "instance_metadata" and 
> "instance_system_metadata". Below is a simplified version of this query 
> (I've added the full query at the end of this message for reference):
> 
> SELECT ...
>    FROM (SELECT ...
>            FROM `instances`
>           WHERE `instances` . `deleted` = ?
>             AND `instances` . `uuid` = ?
>           LIMIT ?) AS `anon_1`
>    LEFT OUTER JOIN `instance_system_metadata` AS 
> `instance_system_metadata_1`
>      ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . 
> `instance_uuid`
>    LEFT OUTER JOIN (`security_group_instance_association` AS 
> `security_group_instance_association_1`
>                     INNER JOIN `security_groups` AS `security_groups_1`
>                     ON `security_groups_1` . `id` = 
> `security_group_instance_association_1` . `security_group_id`
>                     AND `security_group_instance_association_1` . 
> `deleted` = ?
>                     AND `security_groups_1` . `deleted` = ? )
>      ON `security_group_instance_association_1` . `instance_uuid` = 
> `anon_1` . `instances_uuid`
>     AND `anon_1` . `instances_deleted` = ?
>    LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
>      ON `security_group_rules_1` . `parent_group_id` = 
> `security_groups_1` . `id`
>     AND `security_group_rules_1` . `deleted` = ?
>    LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
>      ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . 
> `instances_uuid`
>    LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
>      ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
>    LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
>      ON `instance_metadata_1` . `instance_uuid` = `anon_1` . 
> `instances_uuid`
>     AND `instance_metadata_1` . `deleted` = ?
> 
> The instance table has a 1-to-many relationship to both 
> "instance_metadata" and "instance_system_metadata" tables, so the query 
> is effectively producing a cross join of both metadata tables.
> 
> To illustrate the impact of this query, I have an instance that has 2 
> records in "instance_metadata" and 5 records in "instance_system_metadata":
> 
>  > select instance_uuid,`key`,value from instance_metadata where 
> instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
> +--------------------------------------+-----------+--------+
> | instance_uuid                        | key       | value  |
> +--------------------------------------+-----------+--------+
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value  |
> +--------------------------------------+-----------+--------+
> 2 rows in set (0.61 sec)
> 
>  > select instance_uuid,`key`,valusystem_metadata where instance_uuid = 
> 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
> +------------------------+--------------------------------------+
> | key                    | value                                |
> +------------------------+--------------------------------------+
> | image_disk_format      | qcow2                                |
> | image_min_ram          | 0                                    |
> | image_min_disk         | 20                                   |
> | image_base_image_ref   | 39cd564f-6a29-43e2-815b-62097968486a |
> | image_container_format | bare                                 |
> +------------------------+--------------------------------------+
> 5 rows in set (0.00 sec)
> 
> For this particular instance, the query used by the metadata API will 
> fetch 10 records from the database:
> 
> +--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
> | anon_1_instances_uuid                | instance_metadata_1_key | 
> instance_metadata_1_value | instance_system_metadata_1_key | 
> instance_system_metadata_1_value     |
> +--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | 
> value1                    | image_disk_format              | qcow2       
>                           |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value 
>                      | image_disk_format              | qcow2           
>                       |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | 
> value1                    | image_min_ram                  | 0           
>                           |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value 
>                      | image_min_ram                  | 0               
>                       |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | 
> value1                    | image_min_disk                 | 20         
>                            |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value 
>                      | image_min_disk                 | 20               
>                      |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | 
> value1                    | image_base_image_ref           | 
> 39cd564f-6a29-43e2-815b-62097968486a |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value 
>                      | image_base_image_ref           | 
> 39cd564f-6a29-43e2-815b-62097968486a |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | 
> value1                    | image_container_format         | bare       
>                            |
> | a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value 
>                      | image_container_format         | bare             
>                      |
> +--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
> 10 rows in set (0.00 sec)
> 
> Of course this is only a problem when instances have a lot of metadata 
> records. An instance with 50 records in "instance_metadata" and 50 
> records in "instance_system_metadata" will fetch 50 x 50 = 2,500 rows 
> from the database. It's not difficult to see how this can escalate 
> quickly. This can be a particularly significant problem in a HA scenario 
> with multiple API nodes pulling data from multiple database nodes.
> 
> This issue is affecting our clusters running OpenStack Mitaka. I 
> verified that this is not an issue on clusters running OpenStack 
> Icehouse because, in Icehouse, instance data is pulled as needed, 
> executing separate queries for each table, but I as far as I could see, 
> this issue could be affecting every release since Mitaka.
> 
> Since both metadata tables are structurally identical, a UNION of both 
> metadata tables could be performed before joining but of course the 
> Instance class would need to be updated to reflect this change.
> 
> I did not find any bug report related to this problem and I'm wondering 
> if this is a known issue or if anybody has any ideas of how this can be 
> mitigated.
> 
> Any help here would be hugely appreciated.
> 
> Regards,
> 
> Sergio
> 
> ---
> 
> Full SQL query:
> 
> SELECT `anon_1` . `instances_created_at` AS `anon_1_instances_created_at`,
>         `anon_1` . `instances_updated_at` AS `anon_1_instances_updated_at`,
>         `anon_1` . `instances_deleted_at` AS `anon_1_instances_deleted_at`,
>         `anon_1` . `instances_deleted` AS `anon_1_instances_deleted`,
>         `anon_1` . `instances_id` AS `anon_1_instances_id`,
>         `anon_1` . `instances_user_id` AS `anon_1_instances_user_id`,
>         `anon_1` . `instances_project_id` AS `anon_1_instances_project_id`,
>         `anon_1` . `instances_image_ref` AS `anon_1_instances_image_ref`,
>         `anon_1` . `instances_kernel_id` AS `anon_1_instances_kernel_id`,
>         `anon_1` . `instances_ramdisk_id` AS `anon_1_instances_ramdisk_id`,
>         `anon_1` . `instances_hostname` AS `anon_1_instances_hostname`,
>         `anon_1` . `instances_launch_index` AS 
> `anon_1_instances_launch_index`,
>         `anon_1` . `instances_key_name` AS `anon_1_instances_key_name`,
>         `anon_1` . `instances_key_data` AS `anon_1_instances_key_data`,
>         `anon_1` . `instances_power_state` AS 
> `anon_1_instances_power_state`,
>         `anon_1` . `instances_vm_state` AS `anon_1_instances_vm_state`,
>         `anon_1` . `instances_task_state` AS `anon_1_instances_task_state`,
>         `anon_1` . `instances_memory_mb` AS `anon_1_instances_memory_mb`,
>         `anon_1` . `instances_vcpus` AS `anon_1_instances_vcpus`,
>         `anon_1` . `instances_root_gb` AS `anon_1_instances_root_gb`,
>         `anon_1` . `instances_ephemeral_gb` AS 
> `anon_1_instances_ephemeral_gb`,
>         `anon_1` . `instances_ephemeral_key_uuid` AS 
> `anon_1_instances_ephemeral_key_uuid`,
>         `anon_1` . `instances_host` AS `anon_1_instances_host`,
>         `anon_1` . `instances_node` AS `anon_1_instances_node`,
>         `anon_1` . `instances_instance_type_id` AS 
> `anon_1_instances_instance_type_id`,
>         `anon_1` . `instances_user_data` AS `anon_1_instances_user_data`,
>         `anon_1` . `instances_reservation_id` AS 
> `anon_1_instances_reservation_id`,
>         `anon_1` . `instances_launched_at` AS 
> `anon_1_instances_launched_at`,
>         `anon_1` . `instances_terminated_at` AS 
> `anon_1_instances_terminated_at`,
>         `anon_1` . `instances_availability_zone` AS 
> `anon_1_instances_availability_zone`,
>         `anon_1` . `instances_display_name` AS 
> `anon_1_instances_display_name`,
>         `anon_1` . `instances_display_description` AS 
> `anon_1_instances_display_description`,
>         `anon_1` . `instances_launched_on` AS 
> `anon_1_instances_launched_on`,
>         `anon_1` . `instances_locked` AS `anon_1_instances_locked`,
>         `anon_1` . `instances_locked_by` AS `anon_1_instances_locked_by`,
>         `anon_1` . `instances_os_type` AS `anon_1_instances_os_type`,
>         `anon_1` . `instances_architecture` AS 
> `anon_1_instances_architecture`,
>         `anon_1` . `instances_vm_mode` AS `anon_1_instances_vm_mode`,
>         `anon_1` . `instances_uuid` AS `anon_1_instances_uuid`,
>         `anon_1` . `instances_root_device_name` AS 
> `anon_1_instances_root_device_name`,
>         `anon_1` . `instances_default_ephemeral_device` AS 
> `anon_1_instances_default_ephemeral_device`,
>         `anon_1` . `instances_default_swap_device` AS 
> `anon_1_instances_default_swap_device`,
>         `anon_1` . `instances_config_drive` AS 
> `anon_1_instances_config_drive`,
>         `anon_1` . `instances_access_ip_v4` AS 
> `anon_1_instances_access_ip_v4`,
>         `anon_1` . `instances_access_ip_v6` AS 
> `anon_1_instances_access_ip_v6`,
>         `anon_1` . `instances_auto_disk_config` AS 
> `anon_1_instances_auto_disk_config`,
>         `anon_1` . `instances_progress` AS `anon_1_instances_progress`,
>         `anon_1` . `instances_shutdown_terminate` AS 
> `anon_1_instances_shutdown_terminate`,
>         `anon_1` . `instances_disable_terminate` AS 
> `anon_1_instances_disable_terminate`,
>         `anon_1` . `instances_cell_name` AS `anon_1_instances_cell_name`,
>         `anon_1` . `instances_internal_id` AS 
> `anon_1_instances_internal_id`,
>         `anon_1` . `instances_cleaned` AS `anon_1_instances_cleaned`,
>         `instance_system_metadata_1` . `created_at` AS 
> `instance_system_metadata_1_created_at`,
>         `instance_system_metadata_1` . `updated_at` AS 
> `instance_system_metadata_1_updated_at`,
>         `instance_system_metadata_1` . `deleted_at` AS 
> `instance_system_metadata_1_deleted_at`,
>         `instance_system_metadata_1` . `deleted` AS 
> `instance_system_metadata_1_deleted`,
>         `instance_system_metadata_1` . `id` AS 
> `instance_system_metadata_1_id`,
>         `instance_system_metadata_1` . `key` AS 
> `instance_system_metadata_1_key`,
>         `instance_system_metadata_1` . `value` AS 
> `instance_system_metadata_1_value`,
>         `instance_system_metadata_1` . `instance_uuid` AS 
> `instance_system_metadata_1_instance_uuid`,
>         `security_groups_1` . `created_at` AS 
> `security_groups_1_created_at`,
>         `security_groups_1` . `updated_at` AS 
> `security_groups_1_updated_at`,
>         `security_groups_1` . `deleted_at` AS 
> `security_groups_1_deleted_at`,
>         `security_groups_1` . `deleted` AS `security_groups_1_deleted`,
>         `security_groups_1` . `id` AS `security_groups_1_id`,
>         `security_groups_1` . `name` AS `security_groups_1_name`,
>         `security_groups_1` . `description` AS 
> `security_groups_1_description`,
>         `security_groups_1` . `user_id` AS `security_groups_1_user_id`,
>         `security_groups_1` . `project_id` AS 
> `security_groups_1_project_id`,
>         `security_group_rules_1` . `created_at` AS 
> `security_group_rules_1_created_at`,
>         `security_group_rules_1` . `updated_at` AS 
> `security_group_rules_1_updated_at`,
>         `security_group_rules_1` . `deleted_at` AS 
> `security_group_rules_1_deleted_at`,
>         `security_group_rules_1` . `deleted` AS 
> `security_group_rules_1_deleted`,
>         `security_group_rules_1` . `id` AS `security_group_rules_1_id`,
>         `security_group_rules_1` . `parent_group_id` AS 
> `security_group_rules_1_parent_group_id`,
>         `security_group_rules_1` . `protocol` AS 
> `security_group_rules_1_protocol`,
>         `security_group_rules_1` . `from_port` AS 
> `security_group_rules_1_from_port`,
>         `security_group_rules_1` . `to_port` AS 
> `security_group_rules_1_to_port`,
>         `security_group_rules_1` . `cidr` AS `security_group_rules_1_cidr`,
>         `security_group_rules_1` . `group_id` AS 
> `security_group_rules_1_group_id`,
>         `instance_info_caches_1` . `created_at` AS 
> `instance_info_caches_1_created_at`,
>         `instance_info_caches_1` . `updated_at` AS 
> `instance_info_caches_1_updated_at`,
>         `instance_info_caches_1` . `deleted_at` AS 
> `instance_info_caches_1_deleted_at`,
>         `instance_info_caches_1` . `deleted` AS 
> `instance_info_caches_1_deleted`,
>         `instance_info_caches_1` . `id` AS `instance_info_caches_1_id`,
>         `instance_info_caches_1` . `network_info` AS 
> `instance_info_caches_1_network_info`,
>         `instance_info_caches_1` . `instance_uuid` AS 
> `instance_info_caches_1_instance_uuid`,
>         `instance_extra_1` . `flavor` AS `instance_extra_1_flavor`,
>         `instance_extra_1` . `created_at` AS `instance_extra_1_created_at`,
>         `instance_extra_1` . `updated_at` AS `instance_extra_1_updated_at`,
>         `instance_extra_1` . `deleted_at` AS `instance_extra_1_deleted_at`,
>         `instance_extra_1` . `deleted` AS `instance_extra_1_deleted`,
>         `instance_extra_1` . `id` AS `instance_extra_1_id`,
>         `instance_extra_1` . `instance_uuid` AS 
> `instance_extra_1_instance_uuid`,
>         `instance_metadata_1` . `created_at` AS 
> `instance_metadata_1_created_at`,
>         `instance_metadata_1` . `updated_at` AS 
> `instance_metadata_1_updated_at`,
>         `instance_metadata_1` . `deleted_at` AS 
> `instance_metadata_1_deleted_at`,
>         `instance_metadata_1` . `deleted` AS `instance_metadata_1_deleted`,
>         `instance_metadata_1` . `id` AS `instance_metadata_1_id`,
>         `instance_metadata_1` . `key` AS `instance_metadata_1_key`,
>         `instance_metadata_1` . `value` AS `instance_metadata_1_value`,
>         `instance_metadata_1` . `instance_uuid` AS 
> `instance_metadata_1_instance_uuid`
>    FROM (SELECT `instances` . `created_at` AS `instances_created_at`,
>                 `instances` . `updated_at` AS `instances_updated_at`,
>                 `instances` . `deleted_at` AS `instances_deleted_at`,
>                 `instances` . `deleted` AS `instances_deleted`,
>                 `instances` . `id` AS `instances_id`,
>                 `instances` . `user_id` AS `instances_user_id`,
>                 `instances` . `project_id` AS `instances_project_id`,
>                 `instances` . `image_ref` AS `instances_image_ref`,
>                 `instances` . `kernel_id` AS `instances_kernel_id`,
>                 `instances` . `ramdisk_id` AS `instances_ramdisk_id`,
>                 `instances` . `hostname` AS `instances_hostname`,
>                 `instances` . `launch_index` AS `instances_launch_index`,
>                 `instances` . `key_name` AS `instances_key_name`,
>                 `instances` . `key_data` AS `instances_key_data`,
>                 `instances` . `power_state` AS `instances_power_state`,
>                 `instances` . `vm_state` AS `instances_vm_state`,
>                 `instances` . `task_state` AS `instances_task_state`,
>                 `instances` . `memory_mb` AS `instances_memory_mb`,
>                 `instances` . `vcpus` AS `instances_vcpus`,
>                 `instances` . `root_gb` AS `instances_root_gb`,
>                 `instances` . `ephemeral_gb` AS `instances_ephemeral_gb`,
>                 `instances` . `ephemeral_key_uuid` AS 
> `instances_ephemeral_key_uuid`,
>                 `instances` . `host` AS `instances_host`,
>                 `instances` . `node` AS `instances_node`,
>                 `instances` . `instance_type_id` AS 
> `instances_instance_type_id`,
>                 `instances` . `user_data` AS `instances_user_data`,
>                 `instances` . `reservation_id` AS 
> `instances_reservation_id`,
>                 `instances` . `launched_at` AS `instances_launched_at`,
>                 `instances` . `terminated_at` AS `instances_terminated_at`,
>                 `instances` . `availability_zone` AS 
> `instances_availability_zone`,
>                 `instances` . `display_name` AS `instances_display_name`,
>                 `instances` . `display_description` AS 
> `instances_display_description`,
>                 `instances` . `launched_on` AS `instances_launched_on`,
>                 `instances` . `locked` AS `instances_locked`,
>                 `instances` . `locked_by` AS `instances_locked_by`,
>                 `instances` . `os_type` AS `instances_os_type`,
>                 `instances` . `architecture` AS `instances_architecture`,
>                 `instances` . `vm_mode` AS `instances_vm_mode`,
>                 `instances` . `uuid` AS `instances_uuid`,
>                 `instances` . `root_device_name` AS 
> `instances_root_device_name`,
>                 `instances` . `default_ephemeral_device` AS 
> `instances_default_ephemeral_device`,
>                 `instances` . `default_swap_device` AS 
> `instances_default_swap_device`,
>                 `instances` . `config_drive` AS `instances_config_drive`,
>                 `instances` . `access_ip_v4` AS `instances_access_ip_v4`,
>                 `instances` . `access_ip_v6` AS `instances_access_ip_v6`,
>                 `instances` . `auto_disk_config` AS 
> `instances_auto_disk_config`,
>                 `instances` . `progress` AS `instances_progress`,
>                 `instances` . `shutdown_terminate` AS 
> `instances_shutdown_terminate`,
>                 `instances` . `disable_terminate` AS 
> `instances_disable_terminate`,
>                 `instances` . `cell_name` AS `instances_cell_name`,
>                 `instances` . `internal_id` AS `instances_internal_id`,
>                 `instances` . `cleaned` AS `instances_cleaned`
>            FROM `instances`
>           WHERE `instances` . `deleted` = ?
>             AND `instances` . `uuid` = ?
>           LIMIT ?) AS `anon_1`
>    LEFT OUTER JOIN `instance_system_metadata` AS 
> `instance_system_metadata_1`
>      ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . 
> `instance_uuid`
>    LEFT OUTER JOIN (`security_group_instance_association` AS 
> `security_group_instance_association_1`
>                     INNER JOIN `security_groups` AS `security_groups_1`
>                     ON `security_groups_1` . `id` = 
> `security_group_instance_association_1` . `security_group_id`
>                     AND `security_group_instance_association_1` . 
> `deleted` = ?
>                     AND `security_groups_1` . `deleted` = ? )
>      ON `security_group_instance_association_1` . `instance_uuid` = 
> `anon_1` . `instances_uuid`
>     AND `anon_1` . `instances_deleted` = ?
>    LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
>      ON `security_group_rules_1` . `parent_group_id` = 
> `security_groups_1` . `id`
>     AND `security_group_rules_1` . `deleted` = ?
>    LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
>      ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . 
> `instances_uuid`
>    LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
>      ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
>    LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
>      ON `instance_metadata_1` . `instance_uuid` = `anon_1` . 
> `instances_uuid`
>     AND `instance_metadata_1` . `deleted` = ?
> 
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

Thanks for this. Have you debugged to the point of knowing where the 
initial DB query is starting from?

Looking at history, my guess is this is the change which introduced it 
for all requests:

https://review.openstack.org/#/c/276861/

-- 

Thanks,

Matt



More information about the OpenStack-dev mailing list