Scheduler sends VM to HV that lacks resources

fsbiz at yahoo.com fsbiz at yahoo.com
Thu Nov 14 08:03:45 UTC 2019


I am running stable Queens with hundreds of ironic baremetal nodes.
Things are mostly stable, but occasionally some baremetal node provisions fail. These failures have been traced to nova placement failures that lead to 409 errors.

My nova and baremetal filters do NOT have the 3 filters you mention.
[root at sc-control03 objects]# grep filter /etc/nova/nova.conf | grep filters
# * enabled_filters
#enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
#use_baremetal_filters=false
#baremetal_enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ExactRamFilter,ExactDiskFilter,ExactCoreFilter



The baremetal nodes all use a resource class. My image does NOT have the changes for
https://review.opendev.org/#/c/565841



Ultimately, nova-conductor reports "NoValidHost: No valid host was found. There are not enough hosts available".
This has been traced to nova-placement-api: "Allocation for CUSTOM_RRR430 on resource provider 3cacac3f-9af0-4e39-9bc8-d1f362bdb730 violates min_unit, max_unit, or step_size. Requested: 2, min_unit: 1, max_unit: 1, step_size: 1"
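
To make the placement message concrete, here is a minimal Python sketch of the kind of unit check it describes. This is an illustration under assumptions, not placement's actual code: the function name is hypothetical, and only the four numbers come from the warning quoted above.

```python
def violates_unit_constraints(requested, min_unit, max_unit, step_size):
    """Return True if the requested amount breaks the inventory's unit limits.

    Hypothetical helper: it mirrors the wording of the warning
    ("violates min_unit, max_unit, or step_size"), nothing more.
    """
    return (
        requested < min_unit            # below the smallest allowed allocation
        or requested > max_unit         # above the largest allowed allocation
        or requested % step_size != 0   # not a multiple of the step size
    )


# Values from the warning: Requested: 2, min_unit: 1, max_unit: 1, step_size: 1
print(violates_unit_constraints(requested=2, min_unit=1, max_unit=1, step_size=1))
```

With these numbers the request fails on max_unit alone: the inventory allows at most 1 unit of the resource class per allocation, and 2 were requested.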
Any pointers on what next steps I should be looking at?

Thanks,
Fred.

Relevant logs: 
nova-conductor.log
2019-11-12 10:26:02.593 1666486 ERROR nova.conductor.manager [req-fa1bfb2e-c765-432d-aa66-e16db8329312 - - - - -] Failed to schedule instances: NoValidHost_Remote: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 154, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 91, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 243, in _schedule
    claimed_instance_uuids)
  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 280, in _ensure_sufficient_hosts
    raise exception.NoValidHost(reason=reason)
NoValidHost: No valid host was found. There are not enough hosts available.





nova-placement-api.log 
3cacac3f-9af0-4e39-9bc8-d1f362bdb730 = resource ID of baremetal node
84ea2b90-06b2-489e-92ea-24b859b3c997 = instance ID



2019-11-12 10:26:02.427 4161131 INFO nova.api.openstack.placement.requestlog [req-66a6dc45-8326-4e24-9216-fc77099303ba 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] 10.33.24.13 "GET /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997" status: 200 len: 111 microversion: 1.0

2019-11-12 10:26:02.461 4161129 WARNING nova.objects.resource_provider [req-6d79841e-6abe-490e-b79b-8d88b04215af 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] Allocation for CUSTOM_Z370_A on resource provider 3cacac3f-9af0-4e39-9bc8-d1f362bdb730 violates min_unit, max_unit, or step_size. Requested: 2, min_unit: 1, max_unit: 1, step_size: 1
2019-11-12 10:26:02.568 4161129 INFO nova.api.openstack.placement.requestlog [req-6d79841e-6abe-490e-b79b-8d88b04215af 1ee9f9bf77294e8e8bf50bb35c581689 acf8cd411e5e4751a61d1ed54e8e874d - default default] 10.33.24.13 "PUT /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997" status: 409 len: 383 microversion: 1.17

http_access_log
10.33.24.13 - - [12/Nov/2019:10:26:02 -0800] "GET /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997 HTTP/1.1" 200 111 "-" "nova-scheduler keystoneauth1/3.4.0 python-requests/2.14.2 CPython/2.7.5"
10.33.24.13 - - [12/Nov/2019:10:26:02 -0800] "PUT /allocations/84ea2b90-06b2-489e-92ea-24b859b3c997 HTTP/1.1" 409 383 "-" "nova-scheduler keystoneauth1/3.4.0 python-requests/2.14.2 CPython/2.7.5"

    On Wednesday, November 13, 2019, 11:36:35 AM PST, Albert Braden <albert.braden at synopsys.com> wrote:  
 
 Removing these 3 obsolete filters appears to have fixed the problem. Thank you for your advice!

-----Original Message-----
From: Matt Riedemann <mriedemos at gmail.com> 
Sent: Tuesday, November 12, 2019 1:14 PM
To: openstack-discuss at lists.openstack.org
Subject: Re: Scheduler sends VM to HV that lacks resources

On 11/12/2019 2:47 PM, Albert Braden wrote:
> It's probably a config error. Where should I be looking? This is our nova config on the controllers:
> 
https://paste.fedoraproject.org/paste/kNe1eRimk4ifrAuuN790bg

If your deployment is Pike or newer (I'm guessing Rocky because your
other email says Rocky), then you don't need these filters:

RetryFilter - the alternate hosts blueprint in the Queens release makes this moot
CoreFilter - placement filters on VCPU
RamFilter - placement filters on MEMORY_MB
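
As an illustration only, here is a minimal Python sketch of pruning those three names from an enabled_filters value. The helper name and the example string are hypothetical and are not taken from anyone's actual nova.conf; the sketch only does string handling and does not touch nova itself.

```python
# The three filters named above as unnecessary on newer releases.
OBSOLETE_FILTERS = {"RetryFilter", "CoreFilter", "RamFilter"}


def drop_obsolete_filters(enabled_filters):
    """Return the comma-separated filter list without the obsolete names.

    Hypothetical helper: order of the remaining filters is preserved.
    """
    names = [name.strip() for name in enabled_filters.split(",") if name.strip()]
    return ",".join(name for name in names if name not in OBSOLETE_FILTERS)


# Made-up example value, for illustration only.
example = "RetryFilter,AvailabilityZoneFilter,CoreFilter,RamFilter,ComputeFilter"
print(drop_obsolete_filters(example))  # AvailabilityZoneFilter,ComputeFilter
```

The resulting string would still have to be written back to the enabled_filters option by hand, followed by a restart of the scheduler service for it to take effect.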

-- 

Thanks,

Matt

  