[openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"

Sergio A. de Carvalho Jr. scarvalhojr at gmail.com
Mon Oct 22 16:25:03 UTC 2018


Hi,

While troubleshooting a production issue we identified that the Nova
metadata API is fetching a lot more raw data from the database than seems
necessary. The problem appears to be caused by the SQL query used to fetch
instance data that joins the "instance" table with, among others, two
metadata tables: "instance_metadata" and "instance_system_metadata". Below
is a simplified version of this query (I've added the full query at the end
of this message for reference):

SELECT ...
  FROM (SELECT ...
          FROM `instances`
         WHERE `instances` . `deleted` = ?
           AND `instances` . `uuid` = ?
         LIMIT ?) AS `anon_1`
  LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
    ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` .
`instance_uuid`
  LEFT OUTER JOIN (`security_group_instance_association` AS
`security_group_instance_association_1`
                   INNER JOIN `security_groups` AS `security_groups_1`
                   ON `security_groups_1` . `id` =
`security_group_instance_association_1` . `security_group_id`
                   AND `security_group_instance_association_1` . `deleted`
= ?
                   AND `security_groups_1` . `deleted` = ? )
    ON `security_group_instance_association_1` . `instance_uuid` = `anon_1`
. `instances_uuid`
   AND `anon_1` . `instances_deleted` = ?
  LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
    ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` .
`id`
   AND `security_group_rules_1` . `deleted` = ?
  LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
    ON `instance_info_caches_1` . `instance_uuid` = `anon_1` .
`instances_uuid`
  LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
    ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
    ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `instance_metadata_1` . `deleted` = ?

The instance table has a 1-to-many relationship to both "instance_metadata"
and "instance_system_metadata" tables, so the query is effectively
producing a cross join of both metadata tables.

To illustrate the impact of this query, I have an instance that has 2
records in "instance_metadata" and 5 records in "instance_system_metadata":

> select instance_uuid,`key`,value from instance_metadata where
instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+--------------------------------------+-----------+--------+
| instance_uuid                        | key       | value  |
+--------------------------------------+-----------+--------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value  |
+--------------------------------------+-----------+--------+
2 rows in set (0.61 sec)

> select instance_uuid,`key`,valusystem_metadata where instance_uuid =
'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+------------------------+--------------------------------------+
| key                    | value                                |
+------------------------+--------------------------------------+
| image_disk_format      | qcow2                                |
| image_min_ram          | 0                                    |
| image_min_disk         | 20                                   |
| image_base_image_ref   | 39cd564f-6a29-43e2-815b-62097968486a |
| image_container_format | bare                                 |
+------------------------+--------------------------------------+
5 rows in set (0.00 sec)

For this particular instance, the query used by the metadata API will fetch
10 records from the database:

+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| anon_1_instances_uuid                | instance_metadata_1_key |
instance_metadata_1_value | instance_system_metadata_1_key |
instance_system_metadata_1_value     |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1
                 | image_disk_format              | qcow2
             |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value
                  | image_disk_format              | qcow2
               |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1
                 | image_min_ram                  | 0
             |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value
                  | image_min_ram                  | 0
               |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1
                 | image_min_disk                 | 20
              |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value
                  | image_min_disk                 | 20
              |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1
                 | image_base_image_ref           |
39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value
                  | image_base_image_ref           |
39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1               | value1
                 | image_container_format         | bare
              |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2               | value
                  | image_container_format         | bare
              |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
10 rows in set (0.00 sec)

Of course this is only a problem when instances have a lot of metadata
records. An instance with 50 records in "instance_metadata" and 50 records
in "instance_system_metadata" will fetch 50 x 50 = 2,500 rows from the
database. It's not difficult to see how this can escalate quickly. This can
be a particularly significant problem in a HA scenario with multiple API
nodes pulling data from multiple database nodes.

This issue is affecting our clusters running OpenStack Mitaka. I verified
that this is not an issue on clusters running OpenStack Icehouse because,
in Icehouse, instance data is pulled as needed, executing separate queries
for each table, but I as far as I could see, this issue could be affecting
every release since Mitaka.

Since both metadata tables are structurally identical, a UNION of both
metadata tables could be performed before joining but of course the
Instance class would need to be updated to reflect this change.

I did not find any bug report related to this problem and I'm wondering if
this is a known issue or if anybody has any ideas of how this can be
mitigated.

Any help here would be hugely appreciated.

Regards,

Sergio

---

Full SQL query:

SELECT `anon_1` . `instances_created_at` AS `anon_1_instances_created_at`,
       `anon_1` . `instances_updated_at` AS `anon_1_instances_updated_at`,
       `anon_1` . `instances_deleted_at` AS `anon_1_instances_deleted_at`,
       `anon_1` . `instances_deleted` AS `anon_1_instances_deleted`,
       `anon_1` . `instances_id` AS `anon_1_instances_id`,
       `anon_1` . `instances_user_id` AS `anon_1_instances_user_id`,
       `anon_1` . `instances_project_id` AS `anon_1_instances_project_id`,
       `anon_1` . `instances_image_ref` AS `anon_1_instances_image_ref`,
       `anon_1` . `instances_kernel_id` AS `anon_1_instances_kernel_id`,
       `anon_1` . `instances_ramdisk_id` AS `anon_1_instances_ramdisk_id`,
       `anon_1` . `instances_hostname` AS `anon_1_instances_hostname`,
       `anon_1` . `instances_launch_index` AS
`anon_1_instances_launch_index`,
       `anon_1` . `instances_key_name` AS `anon_1_instances_key_name`,
       `anon_1` . `instances_key_data` AS `anon_1_instances_key_data`,
       `anon_1` . `instances_power_state` AS `anon_1_instances_power_state`,
       `anon_1` . `instances_vm_state` AS `anon_1_instances_vm_state`,
       `anon_1` . `instances_task_state` AS `anon_1_instances_task_state`,
       `anon_1` . `instances_memory_mb` AS `anon_1_instances_memory_mb`,
       `anon_1` . `instances_vcpus` AS `anon_1_instances_vcpus`,
       `anon_1` . `instances_root_gb` AS `anon_1_instances_root_gb`,
       `anon_1` . `instances_ephemeral_gb` AS
`anon_1_instances_ephemeral_gb`,
       `anon_1` . `instances_ephemeral_key_uuid` AS
`anon_1_instances_ephemeral_key_uuid`,
       `anon_1` . `instances_host` AS `anon_1_instances_host`,
       `anon_1` . `instances_node` AS `anon_1_instances_node`,
       `anon_1` . `instances_instance_type_id` AS
`anon_1_instances_instance_type_id`,
       `anon_1` . `instances_user_data` AS `anon_1_instances_user_data`,
       `anon_1` . `instances_reservation_id` AS
`anon_1_instances_reservation_id`,
       `anon_1` . `instances_launched_at` AS `anon_1_instances_launched_at`,
       `anon_1` . `instances_terminated_at` AS
`anon_1_instances_terminated_at`,
       `anon_1` . `instances_availability_zone` AS
`anon_1_instances_availability_zone`,
       `anon_1` . `instances_display_name` AS
`anon_1_instances_display_name`,
       `anon_1` . `instances_display_description` AS
`anon_1_instances_display_description`,
       `anon_1` . `instances_launched_on` AS `anon_1_instances_launched_on`,
       `anon_1` . `instances_locked` AS `anon_1_instances_locked`,
       `anon_1` . `instances_locked_by` AS `anon_1_instances_locked_by`,
       `anon_1` . `instances_os_type` AS `anon_1_instances_os_type`,
       `anon_1` . `instances_architecture` AS
`anon_1_instances_architecture`,
       `anon_1` . `instances_vm_mode` AS `anon_1_instances_vm_mode`,
       `anon_1` . `instances_uuid` AS `anon_1_instances_uuid`,
       `anon_1` . `instances_root_device_name` AS
`anon_1_instances_root_device_name`,
       `anon_1` . `instances_default_ephemeral_device` AS
`anon_1_instances_default_ephemeral_device`,
       `anon_1` . `instances_default_swap_device` AS
`anon_1_instances_default_swap_device`,
       `anon_1` . `instances_config_drive` AS
`anon_1_instances_config_drive`,
       `anon_1` . `instances_access_ip_v4` AS
`anon_1_instances_access_ip_v4`,
       `anon_1` . `instances_access_ip_v6` AS
`anon_1_instances_access_ip_v6`,
       `anon_1` . `instances_auto_disk_config` AS
`anon_1_instances_auto_disk_config`,
       `anon_1` . `instances_progress` AS `anon_1_instances_progress`,
       `anon_1` . `instances_shutdown_terminate` AS
`anon_1_instances_shutdown_terminate`,
       `anon_1` . `instances_disable_terminate` AS
`anon_1_instances_disable_terminate`,
       `anon_1` . `instances_cell_name` AS `anon_1_instances_cell_name`,
       `anon_1` . `instances_internal_id` AS `anon_1_instances_internal_id`,
       `anon_1` . `instances_cleaned` AS `anon_1_instances_cleaned`,
       `instance_system_metadata_1` . `created_at` AS
`instance_system_metadata_1_created_at`,
       `instance_system_metadata_1` . `updated_at` AS
`instance_system_metadata_1_updated_at`,
       `instance_system_metadata_1` . `deleted_at` AS
`instance_system_metadata_1_deleted_at`,
       `instance_system_metadata_1` . `deleted` AS
`instance_system_metadata_1_deleted`,
       `instance_system_metadata_1` . `id` AS
`instance_system_metadata_1_id`,
       `instance_system_metadata_1` . `key` AS
`instance_system_metadata_1_key`,
       `instance_system_metadata_1` . `value` AS
`instance_system_metadata_1_value`,
       `instance_system_metadata_1` . `instance_uuid` AS
`instance_system_metadata_1_instance_uuid`,
       `security_groups_1` . `created_at` AS `security_groups_1_created_at`,
       `security_groups_1` . `updated_at` AS `security_groups_1_updated_at`,
       `security_groups_1` . `deleted_at` AS `security_groups_1_deleted_at`,
       `security_groups_1` . `deleted` AS `security_groups_1_deleted`,
       `security_groups_1` . `id` AS `security_groups_1_id`,
       `security_groups_1` . `name` AS `security_groups_1_name`,
       `security_groups_1` . `description` AS
`security_groups_1_description`,
       `security_groups_1` . `user_id` AS `security_groups_1_user_id`,
       `security_groups_1` . `project_id` AS `security_groups_1_project_id`,
       `security_group_rules_1` . `created_at` AS
`security_group_rules_1_created_at`,
       `security_group_rules_1` . `updated_at` AS
`security_group_rules_1_updated_at`,
       `security_group_rules_1` . `deleted_at` AS
`security_group_rules_1_deleted_at`,
       `security_group_rules_1` . `deleted` AS
`security_group_rules_1_deleted`,
       `security_group_rules_1` . `id` AS `security_group_rules_1_id`,
       `security_group_rules_1` . `parent_group_id` AS
`security_group_rules_1_parent_group_id`,
       `security_group_rules_1` . `protocol` AS
`security_group_rules_1_protocol`,
       `security_group_rules_1` . `from_port` AS
`security_group_rules_1_from_port`,
       `security_group_rules_1` . `to_port` AS
`security_group_rules_1_to_port`,
       `security_group_rules_1` . `cidr` AS `security_group_rules_1_cidr`,
       `security_group_rules_1` . `group_id` AS
`security_group_rules_1_group_id`,
       `instance_info_caches_1` . `created_at` AS
`instance_info_caches_1_created_at`,
       `instance_info_caches_1` . `updated_at` AS
`instance_info_caches_1_updated_at`,
       `instance_info_caches_1` . `deleted_at` AS
`instance_info_caches_1_deleted_at`,
       `instance_info_caches_1` . `deleted` AS
`instance_info_caches_1_deleted`,
       `instance_info_caches_1` . `id` AS `instance_info_caches_1_id`,
       `instance_info_caches_1` . `network_info` AS
`instance_info_caches_1_network_info`,
       `instance_info_caches_1` . `instance_uuid` AS
`instance_info_caches_1_instance_uuid`,
       `instance_extra_1` . `flavor` AS `instance_extra_1_flavor`,
       `instance_extra_1` . `created_at` AS `instance_extra_1_created_at`,
       `instance_extra_1` . `updated_at` AS `instance_extra_1_updated_at`,
       `instance_extra_1` . `deleted_at` AS `instance_extra_1_deleted_at`,
       `instance_extra_1` . `deleted` AS `instance_extra_1_deleted`,
       `instance_extra_1` . `id` AS `instance_extra_1_id`,
       `instance_extra_1` . `instance_uuid` AS
`instance_extra_1_instance_uuid`,
       `instance_metadata_1` . `created_at` AS
`instance_metadata_1_created_at`,
       `instance_metadata_1` . `updated_at` AS
`instance_metadata_1_updated_at`,
       `instance_metadata_1` . `deleted_at` AS
`instance_metadata_1_deleted_at`,
       `instance_metadata_1` . `deleted` AS `instance_metadata_1_deleted`,
       `instance_metadata_1` . `id` AS `instance_metadata_1_id`,
       `instance_metadata_1` . `key` AS `instance_metadata_1_key`,
       `instance_metadata_1` . `value` AS `instance_metadata_1_value`,
       `instance_metadata_1` . `instance_uuid` AS
`instance_metadata_1_instance_uuid`
  FROM (SELECT `instances` . `created_at` AS `instances_created_at`,
               `instances` . `updated_at` AS `instances_updated_at`,
               `instances` . `deleted_at` AS `instances_deleted_at`,
               `instances` . `deleted` AS `instances_deleted`,
               `instances` . `id` AS `instances_id`,
               `instances` . `user_id` AS `instances_user_id`,
               `instances` . `project_id` AS `instances_project_id`,
               `instances` . `image_ref` AS `instances_image_ref`,
               `instances` . `kernel_id` AS `instances_kernel_id`,
               `instances` . `ramdisk_id` AS `instances_ramdisk_id`,
               `instances` . `hostname` AS `instances_hostname`,
               `instances` . `launch_index` AS `instances_launch_index`,
               `instances` . `key_name` AS `instances_key_name`,
               `instances` . `key_data` AS `instances_key_data`,
               `instances` . `power_state` AS `instances_power_state`,
               `instances` . `vm_state` AS `instances_vm_state`,
               `instances` . `task_state` AS `instances_task_state`,
               `instances` . `memory_mb` AS `instances_memory_mb`,
               `instances` . `vcpus` AS `instances_vcpus`,
               `instances` . `root_gb` AS `instances_root_gb`,
               `instances` . `ephemeral_gb` AS `instances_ephemeral_gb`,
               `instances` . `ephemeral_key_uuid` AS
`instances_ephemeral_key_uuid`,
               `instances` . `host` AS `instances_host`,
               `instances` . `node` AS `instances_node`,
               `instances` . `instance_type_id` AS
`instances_instance_type_id`,
               `instances` . `user_data` AS `instances_user_data`,
               `instances` . `reservation_id` AS `instances_reservation_id`,
               `instances` . `launched_at` AS `instances_launched_at`,
               `instances` . `terminated_at` AS `instances_terminated_at`,
               `instances` . `availability_zone` AS
`instances_availability_zone`,
               `instances` . `display_name` AS `instances_display_name`,
               `instances` . `display_description` AS
`instances_display_description`,
               `instances` . `launched_on` AS `instances_launched_on`,
               `instances` . `locked` AS `instances_locked`,
               `instances` . `locked_by` AS `instances_locked_by`,
               `instances` . `os_type` AS `instances_os_type`,
               `instances` . `architecture` AS `instances_architecture`,
               `instances` . `vm_mode` AS `instances_vm_mode`,
               `instances` . `uuid` AS `instances_uuid`,
               `instances` . `root_device_name` AS
`instances_root_device_name`,
               `instances` . `default_ephemeral_device` AS
`instances_default_ephemeral_device`,
               `instances` . `default_swap_device` AS
`instances_default_swap_device`,
               `instances` . `config_drive` AS `instances_config_drive`,
               `instances` . `access_ip_v4` AS `instances_access_ip_v4`,
               `instances` . `access_ip_v6` AS `instances_access_ip_v6`,
               `instances` . `auto_disk_config` AS
`instances_auto_disk_config`,
               `instances` . `progress` AS `instances_progress`,
               `instances` . `shutdown_terminate` AS
`instances_shutdown_terminate`,
               `instances` . `disable_terminate` AS
`instances_disable_terminate`,
               `instances` . `cell_name` AS `instances_cell_name`,
               `instances` . `internal_id` AS `instances_internal_id`,
               `instances` . `cleaned` AS `instances_cleaned`
          FROM `instances`
         WHERE `instances` . `deleted` = ?
           AND `instances` . `uuid` = ?
         LIMIT ?) AS `anon_1`
  LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
    ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` .
`instance_uuid`
  LEFT OUTER JOIN (`security_group_instance_association` AS
`security_group_instance_association_1`
                   INNER JOIN `security_groups` AS `security_groups_1`
                   ON `security_groups_1` . `id` =
`security_group_instance_association_1` . `security_group_id`
                   AND `security_group_instance_association_1` . `deleted`
= ?
                   AND `security_groups_1` . `deleted` = ? )
    ON `security_group_instance_association_1` . `instance_uuid` = `anon_1`
. `instances_uuid`
   AND `anon_1` . `instances_deleted` = ?
  LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
    ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` .
`id`
   AND `security_group_rules_1` . `deleted` = ?
  LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
    ON `instance_info_caches_1` . `instance_uuid` = `anon_1` .
`instances_uuid`
  LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
    ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
    ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `instance_metadata_1` . `deleted` = ?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20181022/d52347ca/attachment.html>


More information about the OpenStack-dev mailing list