[openstack-dev] [nova] How to debug no valid host failures with placement

melanie witt melwittt at gmail.com
Wed Aug 1 17:32:55 UTC 2018


On Wed, 1 Aug 2018 12:17:36 -0500, Ben Nemec wrote:
> 
> 
> On 08/01/2018 11:23 AM, Chris Friesen wrote:
>> On 08/01/2018 09:58 AM, Andrey Volkov wrote:
>>> Hi,
>>>
>>> It seems you first need to check what placement knows about the
>>> resources of your cloud.
>>> This can be done either with the REST API [1] or with osc-placement [2].
>>> For osc-placement you could use:
>>>
>>> pip install osc-placement
>>> openstack allocation candidate list --resource DISK_GB=20 \
>>>     --resource MEMORY_MB=2048 --resource VCPU=1 \
>>>     --os-placement-api-version 1.10
>>>
>>> And you can explore placement state with other commands, such as
>>> openstack resource provider list, openstack resource provider
>>> inventory list, and openstack resource provider usage show.
>>>
>>
>> Unfortunately this doesn't help figure out what the missing resources
>> were *at the time of the failure*.
>>
>> The fact that there is no real way to get the equivalent of the old
>> detailed scheduler logs is a known shortcoming in placement, and will
>> become more of a problem if/when we move more complicated things like
>> CPU pinning, hugepages, and NUMA-awareness into placement.
>>
>> The problem is that getting useful logs out of placement would require
>> significant development work.
> 
> Yeah, in my case I only had one compute node so it was obvious what the
> problem was, but if I had a scheduling failure on a busy cloud with
> hundreds of nodes I don't see how you would ever track it down.  Maybe
> we need to have a discussion with operators about how often they do
> post-mortem debugging of this sort of thing?

I think it's definitely a significant issue that troubleshooting "No
allocation candidates returned" from placement is so difficult. However,
it's not straightforward to log detail in placement when the request for
allocation candidates is essentially "SELECT * FROM nodes WHERE
free_cpu >= needed AND free_disk >= needed AND free_memory >= needed"
and the result is returned from the API, so an empty result alone does
not say which condition excluded a given node.
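
To make the difficulty concrete, here is a minimal Python sketch. It is
not placement's code: Node, candidates() and per_resource_counts() are
hypothetical names, and the in-memory list stands in for the database.
It only illustrates that a single combined filter returns an empty list
with no explanation, while evaluating each resource class separately
would give the kind of per-resource counts a log line could report.

```python
from dataclasses import dataclass


# Hypothetical in-memory model; not placement's real schema or code.
@dataclass
class Node:
    name: str
    free: dict  # resource class -> unused capacity


def candidates(nodes, request):
    """Combined filter: keep nodes that satisfy every requested resource."""
    return [
        n for n in nodes
        if all(n.free.get(rc, 0) >= amount for rc, amount in request.items())
    ]


def per_resource_counts(nodes, request):
    """Count, per resource class, how many nodes could satisfy it alone."""
    return {
        rc: sum(1 for n in nodes if n.free.get(rc, 0) >= amount)
        for rc, amount in request.items()
    }


nodes = [
    Node("compute1", {"VCPU": 4, "MEMORY_MB": 8192, "DISK_GB": 10}),
    Node("compute2", {"VCPU": 0, "MEMORY_MB": 4096, "DISK_GB": 100}),
]
request = {"VCPU": 1, "MEMORY_MB": 2048, "DISK_GB": 20}

print(candidates(nodes, request))           # [] with no hint as to why
print(per_resource_counts(nodes, request))  # {'VCPU': 1, 'MEMORY_MB': 2, 'DISK_GB': 1}
```

In this toy case each resource is satisfiable by at least one node, but
no single node satisfies all three, which the empty result by itself
does not reveal. Doing the equivalent at scale would mean extra queries
per request, which is part of the cost being discussed here.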

I think better logging is something we want to have, so if anyone has 
ideas around it, do share them.

-melanie





