[openstack-dev] [nova] How to debug no valid host failures with placement

Ben Nemec openstack at nemebean.com
Wed Aug 1 17:17:36 UTC 2018



On 08/01/2018 11:23 AM, Chris Friesen wrote:
> On 08/01/2018 09:58 AM, Andrey Volkov wrote:
>> Hi,
>>
>> It seems you first need to check what placement knows about the
>> resources in your cloud.
>> This can be done either with the REST API [1] or with osc-placement [2].
>> For osc-placement you could use:
>>
>> pip install osc-placement
>> openstack allocation candidate list --resource DISK_GB=20 --resource MEMORY_MB=2048 --resource VCPU=1 --os-placement-api-version 1.10
>>
>> You can also explore placement state with other commands such as
>> openstack resource provider list, openstack resource provider
>> inventory list, and openstack resource provider usage show.
>>
> 
> Unfortunately this doesn't help figure out what the missing resources 
> were *at the time of the failure*.
> 
> The fact that there is no real way to get the equivalent of the old 
> detailed scheduler logs is a known shortcoming in placement, and will 
> become more of a problem if/when we move more complicated things like 
> CPU pinning, hugepages, and NUMA-awareness into placement.
> 
> The problem is that getting useful logs out of placement would require 
> significant development work.

Yeah, in my case I only had one compute node so it was obvious what the 
problem was, but if I had a scheduling failure on a busy cloud with 
hundreds of nodes I don't see how you would ever track it down.  Maybe 
we need to have a discussion with operators about how often they do 
post-mortem debugging of this sort of thing?
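
For what it's worth, here is a rough sketch of the kind of per-node check I
have in mind. It is not anything placement or nova provides today. It assumes
you have already fetched a provider's inventories and usages (for example via
the resource provider inventory list and usage show commands above) and turned
them into dicts shaped like the REST API responses. The function name
shortfalls is made up, and it ignores min_unit/max_unit/step_size, traits,
and anything NUMA-related.

```python
def shortfalls(inventories, usages, requested):
    """Return {resource_class: (available, wanted)} for every requested
    resource class that this one provider cannot satisfy.

    inventories: {"VCPU": {"total": 4, "reserved": 0, "allocation_ratio": 16.0}, ...}
    usages:      {"VCPU": 10, ...}
    requested:   {"VCPU": 1, ...}
    """
    missing = {}
    for rc, wanted in requested.items():
        inv = inventories.get(rc)
        if inv is None:
            # The provider has no inventory of this class at all.
            missing[rc] = (0, wanted)
            continue
        # Capacity as placement computes it: (total - reserved) * allocation_ratio.
        capacity = int(
            (inv["total"] - inv.get("reserved", 0)) * inv.get("allocation_ratio", 1.0)
        )
        available = capacity - usages.get(rc, 0)
        if wanted > available:
            missing[rc] = (available, wanted)
    return missing


# Made-up numbers for a single compute node.
inventories = {
    "VCPU": {"total": 4, "reserved": 0, "allocation_ratio": 16.0},
    "MEMORY_MB": {"total": 2048, "reserved": 512, "allocation_ratio": 1.0},
    "DISK_GB": {"total": 100, "reserved": 0, "allocation_ratio": 1.0},
}
usages = {"VCPU": 10, "MEMORY_MB": 0, "DISK_GB": 90}
requested = {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}

for rc, (available, wanted) in sorted(shortfalls(inventories, usages, requested).items()):
    print(f"{rc}: wanted {wanted}, only {available} available")
```

Of course this only tells you about the state when you run it, not the state
at the time of the failure, which is exactly the gap Chris describes. Getting
the same answer after the fact would need the inventories and usages to be
logged or snapshotted somewhere.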


