[openstack-dev] [nova] How to debug no valid host failures with placement

Chris Friesen chris.friesen at windriver.com
Wed Aug 1 17:24:08 UTC 2018


On 08/01/2018 11:17 AM, Ben Nemec wrote:
>
>
> On 08/01/2018 11:23 AM, Chris Friesen wrote:

>> The fact that there is no real way to get the equivalent of the old detailed
>> scheduler logs is a known shortcoming in placement, and will become more of a
>> problem if/when we move more complicated things like CPU pinning, hugepages,
>> and NUMA-awareness into placement.
>>
>> The problem is that getting useful logs out of placement would require
>> significant development work.
>
> Yeah, in my case I only had one compute node so it was obvious what the problem
> was, but if I had a scheduling failure on a busy cloud with hundreds of nodes I
> don't see how you would ever track it down.  Maybe we need to have a discussion
> with operators about how often they do post-mortem debugging of this sort of thing?

For Wind River's Titanium Cloud it was enough of an issue that we customized the
scheduler to emit detailed logs whenever scheduling fails.

We started upstreaming it[1] but the effort stalled out when the upstream folks 
requested major implementation changes.
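
To give a rough idea of what "detailed logs on failure" can look like, here is a
minimal sketch. It is not the Titanium Cloud code and it does not use nova's real
filter API: the host dicts, the two filter functions, and filter_hosts() are all
hypothetical names made up for illustration. The idea is only to record what each
filter rejected and to emit that record when no host survives.

```python
import logging
from typing import Callable, Dict, List, Tuple

LOG = logging.getLogger("sched_debug")

# Hypothetical stand-ins: a host is a dict of free resources, and a filter
# is a function returning True when the host can satisfy the request.
Host = Dict[str, int]
Request = Dict[str, int]
Filter = Callable[[Host, Request], bool]


def ram_filter(host: Host, request: Request) -> bool:
    return host["free_ram_mb"] >= request["ram_mb"]


def vcpu_filter(host: Host, request: Request) -> bool:
    return host["free_vcpus"] >= request["vcpus"]


def filter_hosts(
    hosts: Dict[str, Host], filters: List[Filter], request: Request
) -> Tuple[List[str], List[str]]:
    """Run the filters in order, remembering which hosts each one rejected."""
    survivors = dict(hosts)
    report = []
    for f in filters:
        rejected = sorted(
            name for name, host in survivors.items() if not f(host, request)
        )
        for name in rejected:
            del survivors[name]
        report.append(
            f"{f.__name__}: rejected {rejected or 'none'}, "
            f"{len(survivors)} host(s) left"
        )
    if not survivors:
        # Only pay the logging cost when scheduling actually failed.
        for line in report:
            LOG.warning(line)
    return sorted(survivors), report


hosts = {
    "node1": {"free_ram_mb": 512, "free_vcpus": 8},
    "node2": {"free_ram_mb": 4096, "free_vcpus": 1},
}
selected, report = filter_hosts(
    hosts, [ram_filter, vcpu_filter], {"ram_mb": 2048, "vcpus": 2}
)
```

With the sample data above, no host is selected and the report shows that
ram_filter dropped node1 and vcpu_filter dropped node2, which is the kind of
per-filter breakdown you want for post-mortem debugging.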

Chris


[1] https://blueprints.launchpad.net/nova/+spec/improve-sched-logging


