[openstack-dev] [nova] How to debug no valid host failures with placement

Joshua Harlow harlowja at fastmail.com
Thu Aug 2 05:34:45 UTC 2018


If I could, I would have something *like* the EXPLAIN syntax for looking
at a SQL query, but instead of telling me the query plan for a SQL
query, it would tell me the decisions (placement plan?) that resulted in
a given resource being placed at a certain location.

And I would be able to request the explanation for a given request id
(even a historical one), so that analysis could be done pre-change and
post-change (say I update the selection algorithm) and the effects of
alterations to said decisions could be determined.
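
To make that a bit more concrete, here is a rough sketch of what I mean.
Everything in it is hypothetical (the in-memory store and the names
record_step, explain and changed_steps are made up for illustration, none
of this exists in nova or placement): each decision step is recorded
against a request id, and two requests can then be compared step by step.

```python
from collections import defaultdict

# Hypothetical in-memory store: request id -> ordered list of (step, hosts).
# A real service would need durable storage to answer historical queries.
_TRACES = defaultdict(list)


def record_step(request_id, step, hosts):
    """Remember which hosts were still candidates after one decision step."""
    _TRACES[request_id].append((step, sorted(hosts)))


def explain(request_id):
    """Render the recorded steps for a request, one line per step."""
    return [
        f"{i}. {step}: {len(hosts)} host(s) left {hosts}"
        for i, (step, hosts) in enumerate(_TRACES[request_id], start=1)
    ]


def changed_steps(before_id, after_id):
    """List the steps whose surviving hosts differ between two requests."""
    before = dict(_TRACES[before_id])
    after = dict(_TRACES[after_id])
    return [step for step in before if before[step] != after.get(step)]
```

Replaying the same request before and after an algorithm change under two
ids and calling changed_steps() on them would point at the first place the
decisions diverge.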

If it could also have a front-end like what is at http://sorting.at/
(press the play button) that'd be super sweet too (not for sorting, but
for placement; if you squint at that webpage, something similar could
be built).

My 3 cents, ha

-Josh

Chris Friesen wrote:
> On 08/01/2018 11:32 AM, melanie witt wrote:
>
>> I think it's definitely a significant issue that troubleshooting "No
>> allocation candidates returned" from placement is so difficult. However,
>> it's not straightforward to log detail in placement when the request for
>> allocation candidates is essentially "SELECT * FROM nodes WHERE cpu usage <
>> needed and disk usage < needed and memory usage < needed" and the result
>> is returned from the API.
>
> I think the only way to get useful info on a failure would be to break
> down the huge SQL statement into subclauses and store the results of the
> intermediate queries. So then if it failed, placement could log something
> like:
>
> hosts with enough CPU: <list1>
> hosts that also have enough disk: <list2>
> hosts that also have enough memory: <list3>
> hosts that also meet extra spec host aggregate keys: <list4>
> hosts that also meet image properties host aggregate keys: <list5>
> hosts that also have requested PCI devices: <list6>
>
> And maybe we could optimize the above by only emitting logs where the
> list has a length less than X (to avoid flooding the logs with hostnames
> in large clusters).
>
> This would let you zero in on the things that finally caused the list to
> be whittled down to nothing.
>
> Chris
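
The staged breakdown described above could look roughly like the sketch
below. It is only an illustration done in plain Python rather than SQL,
and it is not how placement is implemented: the host dictionaries, their
field names, the STAGES list and the MAX_LOGGED_HOSTS cap (the "X" above)
are all assumptions.

```python
import logging

LOG = logging.getLogger(__name__)

# Assumed cap: only list hostnames when fewer than this many remain.
MAX_LOGGED_HOSTS = 20


def filter_in_stages(hosts, request, stages, max_logged=MAX_LOGGED_HOSTS):
    """Apply each (label, predicate) stage in turn, logging the survivors.

    Returns (surviving hosts, label of the stage that emptied the list),
    with None as the label when at least one host survives every stage.
    """
    survivors = list(hosts)
    for label, predicate in stages:
        survivors = [h for h in survivors if predicate(h, request)]
        if len(survivors) < max_logged:
            LOG.info("hosts %s: %s", label, [h["name"] for h in survivors])
        if not survivors:
            return [], label
    return survivors, None


# Hypothetical host fields and stages, for illustration only.
STAGES = [
    ("with enough CPU",
     lambda h, r: h["free_vcpus"] >= r["vcpus"]),
    ("that also have enough disk",
     lambda h, r: h["free_disk_gb"] >= r["disk_gb"]),
    ("that also have enough memory",
     lambda h, r: h["free_ram_mb"] >= r["ram_mb"]),
]
```

The second return value names the stage that whittled the list down to
nothing, which is the piece of information that is missing today.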