On Tue, 2019-11-12 at 20:30 +0000, Albert Braden wrote:
I'm running Rocky and trying to figure out filter order. I'm reading this doc: https://docs.openstack.org/nova/rocky/user/filter-scheduler.html
It says:
Each filter selects hosts in a different way and has different costs. The order of filter_scheduler.enabled_filters affects scheduling performance. The general suggestion is to filter out invalid hosts as soon as possible to avoid unnecessary costs. We can sort filter_scheduler.enabled_filters items by their costs in reverse order. For example, ComputeFilter is better before any resource calculating filters like RamFilter, CoreFilter.
Is there a document that specifies filter costs, or ranks filters by cost? Is there a well-known process for determining the optimal filter order?
I'm not aware of a specific document that covers it, and it will vary based on deployment. As a general guideline you should order your filters by which ones eliminate the most hosts, so the AvailabilityZoneFilter should generally be first (in older releases the RetryFilter should go first). The NUMATopologyFilter and PciPassthroughFilter are fairly expensive, so they are better placed near the end.

So I would start with the Aggregate* filters, followed by "cheap" filters that don't have any complex boolean logic, such as SameHostFilter, DifferentHostFilter, IoOpsFilter and NumInstancesFilter (there are a few others), and then the more complex filters like NUMATopologyFilter, PciPassthroughFilter, ComputeCapabilitiesFilter and JsonFilter. Effectively what you want to do is maximise the information gain at each filtering step while minimising the cost, i.e. reduce the set of possible hosts with as few CPU cycles as possible. It's also important to only enable the filters that matter to your deployment. If we had a perfect costing for each filter, you could follow the ID3 algorithm to get an optimal ordering: https://en.wikipedia.org/wiki/ID3_algorithm

I have wanted to experiment with tracing the boot requests on a large public cloud and modelling this for some time, but I always end up finding other things to tinker with instead. Even without that data to work with, I think you could do some interesting things with code complexity metrics as a proxy to try to auto-sort them. Perhaps some of the operators can share what they do; I know CERN, before Placement existed, used to map tenants to cells as their first filtering step, which significantly helped them with scale.

If the goal is speed, then you need each step to give you the maximum information gain for the minimum additional cost. That is why the aggregate filters and multi-host filters like the affinity filters tend to be better at the start of the list, and very detailed filters like the NUMATopologyFilter tend to be better at the end.
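To make the ordering concrete, here is a sketch of what that might look like in nova.conf. The [filter_scheduler]/enabled_filters option is the real one from the doc you linked; the particular selection and order below is just an illustration of the guideline, and you should trim it to the filters your deployment actually uses:

    [filter_scheduler]
    enabled_filters = AvailabilityZoneFilter,AggregateInstanceExtraSpecsFilter,ComputeFilter,SameHostFilter,DifferentHostFilter,IoOpsFilter,NumInstancesFilter,ComputeCapabilitiesFilter,NUMATopologyFilter,PciPassthroughFilter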
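On the "maximum information gain for minimum cost" point: if you model each filter as having a fixed per-host cost and an independent pass rate, the expected work of a given ordering is easy to compute, and for that simple model sorting ascending by cost / (1 - pass_rate) is optimal (a cheap filter that eliminates many hosts should run first). A minimal Python sketch; the costs and pass rates here are made up for illustration, not measured:

    # Sketch: pick a filter order that minimises expected per-request work,
    # assuming each filter has a fixed per-host cost and an independent
    # pass rate. All figures below are illustrative, not measured.

    FILTERS = [
        # (name, cost per host in microseconds, fraction of hosts that pass)
        ("AvailabilityZoneFilter",  5.0, 0.25),
        ("ComputeFilter",           2.0, 0.95),
        ("IoOpsFilter",             3.0, 0.90),
        ("NUMATopologyFilter",     80.0, 0.85),
        ("PciPassthroughFilter",   60.0, 0.80),
    ]

    def expected_cost(order, num_hosts=1000):
        """Expected CPU time for one request: each filter only runs
        against the hosts that survived the filters before it."""
        total, surviving = 0.0, float(num_hosts)
        for _name, cost, pass_rate in order:
            total += cost * surviving
            surviving *= pass_rate
        return total

    # Optimal for this model: ascending cost / (1 - pass_rate). (A filter
    # with pass_rate 1.0 eliminates nothing; drop it rather than divide.)
    ranked = sorted(FILTERS, key=lambda f: f[1] / (1.0 - f[2]))

    print("suggested order:", [name for name, _, _ in ranked])
    print("expected cost as listed: %.0f" % expected_cost(FILTERS))
    print("expected cost re-ranked: %.0f" % expected_cost(ranked))

With real numbers plugged in, that ranking reproduces the advice above: cheap, highly selective filters float to the front and expensive, rarely-eliminating ones like the NUMATopologyFilter sink to the end.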
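And if you wanted real numbers rather than guesses, one low-tech way to get them is to time each filter in a running scheduler. host_passes(host_state, spec_obj) is the method Nova filter classes actually implement; the wrapper below is a hypothetical local patch, not an existing Nova hook:

    import collections
    import functools
    import time

    # Per-filter counters: calls made, hosts passed, total wall-clock time.
    stats = collections.defaultdict(
        lambda: {"calls": 0, "passes": 0, "seconds": 0.0})

    def instrument(filter_obj):
        """Wrap a filter instance's host_passes() to record its cost and
        pass rate. Apply to each enabled filter at scheduler start-up."""
        original = filter_obj.host_passes

        @functools.wraps(original)
        def timed(host_state, spec_obj):
            start = time.monotonic()
            passed = original(host_state, spec_obj)
            entry = stats[type(filter_obj).__name__]
            entry["calls"] += 1
            entry["passes"] += int(bool(passed))
            entry["seconds"] += time.monotonic() - start
            return passed

        filter_obj.host_passes = timed
        return filter_obj

seconds / calls gives you the per-host cost and passes / calls the pass rate, which is exactly the input the ordering model above needs.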