Hi all,
Recently I've been looking into various load optimization tools for OpenStack (like Watcher), and noticed a missing feature that could make such tooling much more robust and useful.
The optimization tooling usually collects some metric about compute hosts and instances that nova may be totally unaware of (e.g. power consumption of a compute node in watts), and then tries to reshuffle the instances between hypervisors to optimize that metric. The problem is that the tool has no way of knowing beforehand whether nova will actually allow moving a given instance to a given hypervisor. I am talking about resource consumption, resource provider traits, image and aggregate metadata matching, NUMA placement, server group affinities, etc. To know all this for sure, the optimization tool would have to duplicate most of the nova scheduler and a good part of placement inside itself, which is not feasible. This is why, AFAIU, all available strategies in Watcher mention that they expect "any VM to be migrateable to any hypervisor", which is far from the case in many, if not most, real-life deployments.
What I suggest is adding a new API to nova that effectively provides a 'migration dry run': it should return the list of target hosts a given instance may be migrated to, taking into account everything nova knows about the instance and the hypervisors. The list could already be sorted by the nova weighers to signal nova's preferred choice.
AFAIU such a new API would not require any drastic changes in nova. Most of the code is already there and is used during migrations; the only difference is that at the last step, instead of choosing the best compute host and actually initiating the migration, the result of filtering and weighing would be returned to the caller in some form.
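To illustrate the idea, here is a toy sketch in plain Python. None of it is nova code: the Host and Instance classes, the single filter, the free-RAM weigher and the migration_dry_run function are all hypothetical stand-ins for the real scheduler filters and weighers, and the returned list of host names is only one possible shape for the result.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical, heavily simplified view of a compute host.
    name: str
    free_vcpus: int
    free_ram_mb: int
    traits: set = field(default_factory=set)


@dataclass
class Instance:
    # Hypothetical, heavily simplified view of an instance's requirements.
    vcpus: int
    ram_mb: int
    required_traits: set = field(default_factory=set)


def passes_filters(host, instance):
    # Stand-in for the scheduler filters: enough free resources
    # and all required traits present on the host.
    return (
        host.free_vcpus >= instance.vcpus
        and host.free_ram_mb >= instance.ram_mb
        and instance.required_traits <= host.traits
    )


def weigh(host):
    # Stand-in for the weighers: prefer hosts with more free RAM.
    return host.free_ram_mb


def migration_dry_run(instance, hosts):
    # Filter and weigh as a real migration would, but return the
    # ranked candidates instead of picking one and migrating.
    candidates = [h for h in hosts if passes_filters(h, instance)]
    candidates.sort(key=weigh, reverse=True)
    return [h.name for h in candidates]


hosts = [
    Host("cmp1", free_vcpus=2, free_ram_mb=4096, traits={"HW_CPU_X86_AVX2"}),
    Host("cmp2", free_vcpus=8, free_ram_mb=16384, traits={"HW_CPU_X86_AVX2"}),
    Host("cmp3", free_vcpus=16, free_ram_mb=32768),
    Host("cmp4", free_vcpus=8, free_ram_mb=65536, traits={"HW_CPU_X86_AVX2"}),
]
instance = Instance(vcpus=4, ram_mb=8192, required_traits={"HW_CPU_X86_AVX2"})

candidates = migration_dry_run(instance, hosts)
print(candidates)  # ['cmp4', 'cmp2']
```

An optimization tool could then restrict its own search (e.g. for the lowest power consumption) to the returned candidates, and treat an empty list as "this instance cannot be moved right now".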
Would there be interest in such an API? If yes, I could start writing and proposing a spec.
Best regards,
-- Dr. Pavlo Shchelokovskyy
Principal Software Engineer
Mirantis Inc