On 7/5/2019 1:45 AM, Massimo Sgaravatto wrote:
I tried to check the allocations on each compute node of an Ocata cloud, using the command:
curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations -H "x-auth-token: $TOKEN" | python -m json.tool
Just FYI, you can use osc-placement (an openstack client plugin) on the command line: https://docs.openstack.org/osc-placement/latest/index.html
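For example, something along these lines (a rough sketch; the provider and consumer UUIDs are placeholders and exact command availability depends on your osc-placement version):

# list resource providers to find the compute node's provider UUID
openstack resource provider list

# show the summed usage against that provider (should match what is running on the host)
openstack resource provider usage show <provider_uuid>

# show the allocations held by a specific consumer (an instance or a migration)
openstack resource provider allocation show <consumer_uuid>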
I found that, on a few compute nodes, there are some instances for which there is no corresponding allocation.
The heal_allocations command [1] might be able to find and fix these up for you. The bad news for you is that heal_allocations wasn't added until Rocky and you're on Ocata. The good news is you should be able to take the current version of the code from master (or stein) and run that in a container or virtual environment against your Ocata cloud (this would be particularly useful if you want to use the --dry-run or --instance options added in Train). You could also potentially backport those changes to your internal branch, or we could start a discussion upstream about backporting that tooling to stable branches - though going back to Ocata might be a bit much at this point given Ocata and Pike are in extended maintenance mode [2].

As for *why* the instances on those nodes are missing allocations, it's hard to say without debugging things. The allocation and resource tracking code has changed quite a bit since Ocata (in Pike the scheduler started creating the allocations, but the resource tracker in the compute service could still overwrite those allocations if you had older nodes during a rolling upgrade). My guess would be that a migration failed, or there was simply a bug in Ocata where we didn't clean up or allocate properly. Again, heal_allocations should add the missing allocations for you if you can set up the environment to run that command.
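If you do set that up, the invocation would look roughly like the following (a sketch based on the nova-manage docs in [1]; as noted above, --dry-run and --instance only exist in the newer code):

# preview what would be healed without writing anything
nova-manage placement heal_allocations --dry-run

# heal the allocations for a single instance
nova-manage placement heal_allocations --instance <instance_uuid>

# or heal in batches, e.g. up to 50 instances per run
nova-manage placement heal_allocations --max-count 50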
On another Rocky cloud, we had the opposite problem: there were also allocations for some instances that no longer existed. This caused problems since we were not able to use all the resources of the relevant compute nodes: we had to manually remove the "wrong" allocations to fix the problem ...
Yup, this could happen for different reasons, usually all due to known bugs for which you don't have the fix yet, e.g. [3][4], or something is failing during a migration and we aren't cleaning up properly (an unreported/not-yet-fixed bug).
I wonder why/how this problem can happen ...
I mentioned some possibilities above - but I'm sure there are other bugs that have been fixed which I've omitted here, or things that aren't fixed yet, especially in failure scenarios (rollback/cleanup handling is hard). Note that your Ocata and Rocky cases could be different. Since Queens (once all compute nodes are >=Queens), during resize, cold migration and live migration the migration record in nova holds the source node allocations for the duration of the migration, so the actual *consumer* of the allocations for a provider in placement might not be an instance (server) record but a migration. If you were looking up an allocation consumer by ID in nova using something like "openstack server show $consumer_id", it might return NotFound because the consumer is actually a migration record whose allocation was leaked.
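So a quick way to tell what a given consumer actually is, assuming admin credentials and direct access to the nova cell database (the SQL is just an illustrative sketch):

# show what the consumer is holding in placement
openstack resource provider allocation show <consumer_uuid>

# if the consumer is an instance this will work; NotFound suggests a migration (or a deleted server)
openstack server show <consumer_uuid>

# migrations have had their own UUID since Queens, so check the migrations table, e.g.:
#   SELECT id, instance_uuid, status, migration_type FROM migrations WHERE uuid = '<consumer_uuid>';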
And how can we fix the issue? Should we manually add the missing allocations / manually remove the wrong ones?
Coincidentally, a thread related to this [5] re-surfaced a couple of weeks ago. I am not sure what Sylvain's progress is on that audit tool, but the linked bug in that email has some other operator scripts you could try for the case where allocations are leaked/orphaned on compute nodes that no longer have the corresponding instances.
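For reference, here is a minimal sketch of that kind of audit, reusing the PLACEMENT_ENDPOINT/TOKEN variables from your curl command above. Keep the migration-consumer caveat from earlier in mind and treat the output as candidates to investigate rather than allocations to delete blindly:

# list the consumers holding allocations against a compute node provider
CONSUMERS=$(curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations \
  -H "x-auth-token: $TOKEN" | python -c \
  'import json,sys; print("\n".join(json.load(sys.stdin)["allocations"].keys()))')

# flag consumers that no longer map to an existing server (run with admin credentials
# so servers in other projects are visible)
for consumer in $CONSUMERS; do
  openstack server show "$consumer" >/dev/null 2>&1 || echo "candidate orphan: $consumer"
done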
Thanks, Massimo
[1] https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement
[2] https://docs.openstack.org/project-team-guide/stable-branches.html
[3] https://bugs.launchpad.net/nova/+bug/1825537
[4] https://bugs.launchpad.net/nova/+bug/1821594
[5] http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007241.html

--

Thanks,

Matt