Scheduler sends VM to HV that lacks resources
fsbiz at yahoo.com
Thu Nov 14 08:03:45 UTC 2019
I am running stable Queens with hundreds of ironic baremetal nodes.
Things are mostly stable, but occasionally some baremetal node provisions fail. These failures have been traced to nova placement failures leading to 409 errors.
My nova and baremetal filters do NOT have the 3 filters you mention.
[root at sc-control03 objects]# grep filter /etc/nova/nova.conf | grep filters
# * enabled_filters
#enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
#use_baremetal_filters=false
#baremetal_enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ExactRamFilter,ExactDiskFilter,ExactCoreFilter
The baremetal nodes all use resource classes. My image does NOT have the changes for
https://review.opendev.org/#/c/565841
Ultimately, nova-conductor reports "NoValidHost: No valid host was found. There are not enough hosts available".
This has been traced to nova-placement-api: "Allocation for CUSTOM_RRR430 on resource provider 3cacac3f-9af0-4e39-9bc8-d1f362bdb730 violates min_unit, max_unit, or step_size. Requested: 2, min_unit: 1, max_unit: 1, step_size: 1"
Any pointers on what next steps I should be looking at ?
Thanks,
Fred.
Relevant logs:
nova-conductor.log
2019-11-12 10:26:02.593 1666486 ERROR nova.conductor.manager [req-fa1bfb2e-c765-432d-aa66-e16db8329312 - - - - -] Failed to schedule instances: NoValidHost_Remote: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 154, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 91, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 243, in _schedule
    claimed_instance_uuids)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 280, in _ensure_sufficient_hosts
    raise exception.NoValidHost(reason=reason)
NoValidHost: No valid host was found. There are not enough hosts available.
nova-placement-api.log
3cacac3f-9af0-4e39-9bc8-d1f362bdb730 = resource ID of baremetal node
84ea2b90-06b2-489e-92ea-24b859b3c997 = instance ID
2019-11-12 10:26:02.427 4161131 INFO nova.api.openstack.placement.requestlog [req-66a6dc45-8326-4e24-9216-fc77099303ba 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] 10.33.24.13 "GET /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997" status: 200 len: 111 microversion: 1.0
2019-11-12 10:26:02.461 4161129 WARNING nova.objects.resource_provider [req-6d79841e-6abe-490e-b79b-8d88b04215af 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] Allocation for CUSTOM_Z370_A on resource provider 3cacac3f-9af0-4e39-9bc8-d1f362bdb730 violates min_unit, max_unit, or step_size. Requested: 2, min_unit: 1, max_unit: 1, step_size: 1
2019-11-12 10:26:02.568 4161129 INFO nova.api.openstack.placement.requestlog [req-6d79841e-6abe-490e-b79b-8d88b04215af 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] 10.33.24.13 "PUT /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997" status: 409 len: 383 microversion: 1.17
http_access_log
10.33.24.13 - - [12/Nov/2019:10:26:02 -0800] "GET /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997 HTTP/1.1" 200 111 "-" "nova-scheduler keystoneauth1/3.4.0 python-requests/2.14.2 CPython/2.7.5"
10.33.24.13 - - [12/Nov/2019:10:26:02 -0800] "PUT /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997 HTTP/1.1" 409 383 "-" "nova-scheduler keystoneauth1/3.4.0 python-requests/2.14.2 CPython/2.7.5"
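For readers following the logs, the WARNING line in nova-placement-api.log names three inventory limits: min_unit, max_unit, and step_size. The snippet below is only a minimal sketch of the kind of check that message implies, not the placement source code; the function name violates_unit_constraints is hypothetical. It shows why a request for 2 units of a resource class whose max_unit is 1 is rejected, while a request for 1 unit would pass.

```python
def violates_unit_constraints(requested, min_unit, max_unit, step_size):
    """Return True if the requested amount breaks any of the three limits.

    Hypothetical helper for illustration only: it mirrors the wording of the
    placement warning (min_unit, max_unit, step_size), not the real code.
    """
    return (
        requested < min_unit            # below the smallest allowed allocation
        or requested > max_unit         # above the largest allowed allocation
        or requested % step_size != 0   # not a multiple of the step size
    )


# Values taken from the WARNING line in the log above.
print(violates_unit_constraints(2, min_unit=1, max_unit=1, step_size=1))
# A single unit of the same resource class stays within the limits.
print(violates_unit_constraints(1, min_unit=1, max_unit=1, step_size=1))
```

With the logged values, the first call reports a violation because 2 exceeds max_unit: 1, which matches the 409 returned for the PUT /allocations request.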
On Wednesday, November 13, 2019, 11:36:35 AM PST, Albert Braden <albert.braden at synopsys.com> wrote:
Removing these 3 obsolete filters appears to have fixed the problem. Thank you for your advice!
-----Original Message-----
From: Matt Riedemann <mriedemos at gmail.com>
Sent: Tuesday, November 12, 2019 1:14 PM
To: openstack-discuss at lists.openstack.org
Subject: Re: Scheduler sends VM to HV that lacks resources
On 11/12/2019 2:47 PM, Albert Braden wrote:
> It's probably a config error. Where should I be looking? This is our nova config on the controllers:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__paste.fedoraproject.org_paste_kNe1eRimk4ifrAuuN790bg&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=TZI4wT8_y-RAnwbbXaWBhdvAhhcbY1qymxKLRVpPt2U&s=3aQNqwtEMfOC7U_QUTqNqXiZv4yJy6ceB4kCuZKuL0o&e=
If your deployment is pike or newer (I'm guessing rocky because your
other email says rocky), then you don't need these filters:
RetryFilter - alternate hosts bp in queens release makes this moot
CoreFilter - placement filters on VCPU
RamFilter - placement filters on MEMORY_MB
--
Thanks,
Matt
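The cleanup described in the reply above amounts to removing three names from the comma-separated enabled_filters value. As a rough sketch, assuming the value is available as a plain string (the helper name drop_obsolete_filters and the sample string are illustrative, not part of nova), it could look like this:

```python
# Filters named in the reply above as unnecessary on pike or newer.
OBSOLETE_FILTERS = {"RetryFilter", "CoreFilter", "RamFilter"}


def drop_obsolete_filters(enabled_filters):
    """Return the comma-separated filter list without the obsolete names.

    Hypothetical helper: it only edits a string and does not touch nova.conf.
    """
    names = [name.strip() for name in enabled_filters.split(",") if name.strip()]
    return ",".join(name for name in names if name not in OBSOLETE_FILTERS)


# Sample value for illustration only (an assumption, not a real deployment).
sample = "RetryFilter,AvailabilityZoneFilter,CoreFilter,RamFilter,ComputeFilter"
print(drop_obsolete_filters(sample))
```

The order of the remaining filters is preserved, so the result can be compared directly against the existing setting before any change is made by hand.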