Hi Julia,
Thanks once again. We understood your point, but we are still facing the same issue on our TripleO Train HA setup, even with the settings configured as per your recommendations.
The error we are seeing is again "No valid host was found":
(overcloud) [stack@undercloud v4]$ openstack server show bm-server --fit-width
+-------------------------------------+----------------------------------------------------------------------------------------+
| Field | Value |
+-------------------------------------+----------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hostname | bm-server |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | instance-00000014 |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-npd6m9ah |
| OS-EXT-SRV-ATTR:root_device_name | None |
| OS-EXT-SRV-ATTR:user_data | I2Nsb3VkLWNvbmZpZwpkaXNhYmxlX3Jvb3Q6IGZhbHNlCnBhc3N3b3JkOiBoc2MzMjEKc3NoX3B3YXV0aDogdH |
| | J1ZQptYW5hZ2VfZXRjX2hvc3RzOiB0cnVlCmNocGFzc3dkOiB7ZXhwaXJlOiBmYWxzZSB9Cg== |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| config_drive | True |
| created | 2022-02-14T10:20:48Z |
| description | None |
| fault | {'code': 500, 'created': '2022-02-14T10:20:49Z', 'message': 'No valid host was found. |
| | There are not enough hosts available.', 'details': 'Traceback (most recent call |
| | last):\n File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line |
| | 1379, in schedule_and_build_instances\n instance_uuids, return_alternates=True)\n |
| | File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 839, in |
| | _schedule_instances\n return_alternates=return_alternates)\n File |
| | "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in |
| | select_destinations\n instance_uuids, return_objects, return_alternates)\n File |
| | "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in |
| | select_destinations\n return cctxt.call(ctxt, \'select_destinations\', |
| | **msg_args)\n File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", |
| | line 181, in call\n transport_options=self.transport_options)\n File |
| | "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send\n |
| | transport_options=transport_options)\n File "/usr/lib/python3.6/site- |
| | packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send\n |
| | transport_options=transport_options)\n File "/usr/lib/python3.6/site- |
| | packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send\n raise |
| | result\nnova.exception_Remote.NoValidHost_Remote: No valid host was found. There are |
| | not enough hosts available.\nTraceback (most recent call last):\n\n File |
| | "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 235, in inner\n |
| | return func(*args, **kwargs)\n\n File "/usr/lib/python3.6/site- |
| | packages/nova/scheduler/manager.py", line 214, in select_destinations\n |
| | allocation_request_version, return_alternates)\n\n File "/usr/lib/python3.6/site- |
| | packages/nova/scheduler/filter_scheduler.py", line 96, in select_destinations\n |
| | allocation_request_version, return_alternates)\n\n File "/usr/lib/python3.6/site- |
| | packages/nova/scheduler/filter_scheduler.py", line 265, in _schedule\n |
| | claimed_instance_uuids)\n\n File "/usr/lib/python3.6/site- |
| | packages/nova/scheduler/filter_scheduler.py", line 302, in _ensure_sufficient_hosts\n |
| | raise exception.NoValidHost(reason=reason)\n\nnova.exception.NoValidHost: No valid |
| | host was found. There are not enough hosts available.\n\n'} |
| flavor | disk='470', ephemeral='0', |
| | extra_specs.capabilities='boot_mode:uefi,boot_option:local', |
| | extra_specs.resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', |
| | extra_specs.resources:DISK_GB='0', extra_specs.resources:MEMORY_MB='0', |
| | extra_specs.resources:VCPU='0', original_name='bm-flavor', ram='63700', swap='0', |
| | vcpus='20' |
| hostId | |
| host_status | |
| id | 49944a1f-7758-4522-9ef1-867ede44b3fc |
| image | whole-disk-centos (80724772-c760-4136-b453-754456d7c549) |
| key_name | None |
| locked | False |
| locked_reason | None |
| name | bm-server |
| project_id | 8dde31e24eba41bfb7212ae154d61268 |
| properties | |
| server_groups | [] |
| status | ERROR |
| tags | [] |
| trusted_image_certificates | None |
| updated | 2022-02-14T10:20:49Z |
| user_id | f689d147221549f1a6cbd1310078127d |
| volumes_attached | |
+-------------------------------------+----------------------------------------------------------------------------------------+
(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$
For your reference, our updated flavor and baremetal node properties are as follows:
(overcloud) [stack@undercloud v4]$ openstack flavor show bm-flavor --fit-width
+----------------------------+-------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+-------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | None |
| description | None |
| disk | 470 |
| extra_specs | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', |
| | 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities': |
| | 'boot_mode:uefi,boot_option:local'} |
| id | 021c3021-56ec-4eba-bf57-c516ee9b2ee3 |
| name | bm-flavor |
| os-flavor-access:is_public | True |
| properties | capabilities='boot_mode:uefi,boot_option:local', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', |
| | resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' |
| ram | 63700 |
| rxtx_factor | 1.0 |
| swap | 0 |
| vcpus | 20 |
+----------------------------+-------------------------------------------------------------------------------------------------+
(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width
+------------------------+-----------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------------+-----------------------------------------------------------------------------------------------------+
| allocation_uuid | None |
| automated_clean | None |
| bios_interface | no-bios |
| boot_interface | ipxe |
| chassis_uuid | None |
| clean_step | {} |
| conductor | overcloud-controller-0.localdomain |
| conductor_group | |
| console_enabled | False |
| console_interface | ipmitool-socat |
| created_at | 2022-02-14T10:05:32+00:00 |
| deploy_interface | iscsi |
| deploy_step | {} |
| description | None |
| driver | ipmi |
| driver_info | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', |
| | 'deploy_kernel': '95a5b644-c04e-4a66-8f2b-e1e9806bed6e', 'deploy_ramdisk': |
| | '17644220-e623-4981-ae77-d789657851ba'} |
| driver_internal_info | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, |
| | 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, |
| | 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-14T10:15:05.062161', |
| | 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-14T10:14:59.666025', |
| | 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': |
| | {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': |
| | False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': |
| | 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', |
| | 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': |
| | 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': |
| | True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-14 10:14:58.093777', 'clean_steps': None} |
| extra | {} |
| fault | None |
| inspect_interface | inspector |
| inspection_finished_at | None |
| inspection_started_at | None |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | False |
| maintenance_reason | None |
| management_interface | ipmitool |
| name | baremetal-node |
| network_interface | flat |
| owner | None |
| power_interface | ipmitool |
| power_state | power off |
| properties | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': |
| | 'boot_mode:uefi,boot_option:local', 'vendor': 'hewlett-packard'} |
| protected | False |
| protected_reason | None |
| provision_state | available |
| provision_updated_at | 2022-02-14T10:15:27+00:00 |
| raid_config | {} |
| raid_interface | no-raid |
| rescue_interface | agent |
| reservation | None |
| resource_class | baremetal-resource-class |
| storage_interface | noop |
| target_power_state | None |
| target_provision_state | None |
| target_raid_config | {} |
| traits | [] |
| updated_at | 2022-02-14T10:15:27+00:00 |
| uuid | cd021878-40eb-407c-87c5-ce6ef92d29eb |
| vendor_interface | ipmitool |
+------------------------+-----------------------------------------------------------------------------------------------------+
(overcloud) [stack@undercloud v4]$
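For completeness, our understanding from the Ironic documentation is that the node's resource_class is exposed to placement as a custom resource class (upper-cased, dashes replaced by underscores, prefixed with CUSTOM_), so the two outputs above should already line up, e.g.:

# node resource_class 'baremetal-resource-class' maps to CUSTOM_BAREMETAL_RESOURCE_CLASS,
# which is what the flavor requests via resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1'
openstack baremetal node show baremetal-node -f value -c resource_class
openstack flavor show bm-flavor -f value -c properties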
On further debugging, we found the following in the nova-scheduler logs:
2022-02-14 12:58:22.830 7 WARNING keystoneauth.discover [-] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
2022-02-14 12:58:23.438 7 WARNING keystoneauth.discover [req-ad5801e4-efd7-4159-a601-68e72c0d651f - - - - -] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
Here, 172.16.2.224 is the internal IP.
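As a quick sanity check, the placement endpoint itself can be probed directly (illustrative command only, using the IP and port from the warning above; a reachable placement API should return its version document):

# probe the placement root URL seen in the scheduler warning
curl -sH "X-Auth-Token: $token" http://172.16.2.224:8778/placement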
Going by the document, the commands given are as follows:
(overcloud) [root@overcloud-controller-0 ~]# token=$(openstack token issue -f value -c id)
(overcloud) [root@overcloud-controller-0 ~]# curl -sH "X-Auth-Token: $token" $endpoint/resource_providers/<node id> | jq .inventories
null
The result is the same even if we run the curl command against the public endpoint.
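(For clarity, $endpoint in the curl command above stands for the placement endpoint URL; it can be looked up with, for example:)

# illustrative only: pick up the placement endpoint URL for the internal or public interface
endpoint=$(openstack endpoint list --service placement --interface internal -f value -c URL)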
Please advise.