On 11/11/2019 7:03 AM, Chris Dent wrote:
> Or using separate processes? For the ironic and vsphere contexts, increased CPU usage by the nova-compute process does not impact the workload resources, so parallelization is likely a good option.
I don't know how much it would help (someone would have to actually test it out and get metrics), but one easy win might be to use a thread or process executor pool here [1] so that N compute nodes could be processed through the update_available_resource periodic task concurrently, with N being maybe $ncpu or some factor thereof. Keep it serialized by default for backward compatibility and for non-ironic deployments. Making it too highly concurrent could have negative impacts on other things running on that host, like the neutron agent, or could storm conductor/rabbit with a ton of DB requests from that compute.
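Roughly, the idea could look like the sketch below. This is not nova code: update_all_nodes, update_one, and max_workers are made-up names for illustration, and the real periodic task does more than this. It only shows the shape of "serialized by default, optionally fan out over a pool", using the standard library ThreadPoolExecutor.

```python
from concurrent.futures import ThreadPoolExecutor


def update_all_nodes(nodenames, update_one, max_workers=1):
    """Call update_one(nodename) for every node.

    Hypothetical helper, not part of nova. Returns a dict mapping each
    nodename to None on success or to the exception that was raised.
    max_workers=1 keeps today's serialized behaviour; a larger value
    lets up to that many nodes be processed at the same time.
    """
    def _safe_update(nodename):
        # One failing node should not stop the others from being updated.
        try:
            update_one(nodename)
            return nodename, None
        except Exception as exc:
            return nodename, exc

    if max_workers <= 1:
        # Default path: process nodes one after another, in order.
        return dict(_safe_update(n) for n in nodenames)

    # Opt-in path: fan the per-node work out over a bounded thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(_safe_update, nodenames))
```

The pool size would presumably be a config option that defaults to 1, so nothing changes for deployments that don't opt in, and the bound is what keeps a single compute from flooding conductor with requests.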
That doesn't help with the scenario where the periodic task holds the big COMPUTE_RESOURCE_SEMAPHORE lock while an instance is being spawned, moved, or deleted and that operation also needs the big lock to update the resource tracker. But if we take any steps in this area of the code, I'd recommend baby steps.
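To make the contention concrete, here is a toy illustration. It assumes a plain threading.Lock as a stand-in for the lock named by COMPUTE_RESOURCE_SEMAPHORE; big_lock, instance_claim_can_proceed, and demo are invented names and none of this is how nova actually implements its locking. While the "periodic task" thread holds the lock, an "instance operation" cannot get it, no matter how the per-node work is scheduled.

```python
import threading

# Hypothetical stand-in for the lock named by COMPUTE_RESOURCE_SEMAPHORE.
big_lock = threading.Lock()


def instance_claim_can_proceed():
    """Return True if an instance operation could take the lock right now."""
    got_it = big_lock.acquire(blocking=False)
    if got_it:
        big_lock.release()
    return got_it


def demo():
    """Return (blocked_during_periodic, free_after_periodic)."""
    holding = threading.Event()
    finish = threading.Event()

    def periodic_task():
        # Simulates the periodic task holding the lock for its whole run.
        with big_lock:
            holding.set()
            finish.wait()

    worker = threading.Thread(target=periodic_task)
    worker.start()
    holding.wait()  # the periodic task now owns the lock

    blocked_during = not instance_claim_can_proceed()

    finish.set()    # let the periodic task finish and release the lock
    worker.join()
    free_after = instance_claim_can_proceed()
    return blocked_during, free_after
```

In this toy version demo() reports that the instance operation is blocked while the periodic task runs and free once it ends, which is the wait that a pool around the periodic task would not remove.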
[1] https://github.com/openstack/nova/blob/20.0.0/nova/compute/manager.py#L8629