On Sun, 10 Nov 2019, Matt Riedemann wrote:
On 11/10/2019 10:44 AM, Balázs Gibizer wrote:
On 3500 baremetal nodes _update_available_resource takes 1.5 hour.
Why have a single nova-compute service manage this many nodes? Or even 1000?
Why not try to partition things a bit more reasonably like a normal cell where you might have ~200 nodes per compute service host (I think CERN keeps their cells to around 200 physical compute hosts for scaling)?
Without commenting on the efficacy of doing things this way, I can report that 1000 (or even 3500) instances (not nodes) is something that can happen in some OpenStack + vSphere setups, and it tends to exercise many of the same architectural problems that a lots-of-ironic-nodes setup encounters.
As far as I can tell the root architecture problem is:
a) there are lots of loops
b) there is an expectation that those loops will have a small number of iterations
(b) is generally true for a run-of-the-mill KVM setup, but not otherwise.
(b) not being true in other contexts creates an impedance mismatch that is hard to overcome without doing at least one of the two things suggested elsewhere in this thread:
1. manage fewer pieces per nova-compute (Matt)
2. "algorithmic improvement" (Arne)
On 2, I wonder if there's been any exploration of using something like a circular queue and time-bounding the periodic jobs? Or using separate processes? In the ironic and vsphere contexts, increased CPU usage by the nova-compute process does not impact the workload's resources, so parallelization is likely a good option.
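To make the circular-queue idea concrete, here's a rough sketch (mine, not anything in nova today): keep the node list in a deque that persists across periodic ticks, and on each tick rotate through it until a time budget expires. Unfinished nodes stay at the front, so the next tick resumes where this one stopped rather than restarting from node 0. `refresh_node` is a hypothetical stand-in for one node's share of `_update_available_resource`.

```python
import time
from collections import deque


def refresh_node(node):
    # Hypothetical placeholder for the real per-node refresh work.
    time.sleep(0.001)
    return node


def run_periodic(queue, budget_seconds):
    """Process nodes from a persistent circular queue until the
    time budget expires or one full pass completes.

    The caller keeps `queue` (a deque) alive across periodic runs;
    rotation means each run picks up where the previous one left off.
    Returns the number of nodes processed this run.
    """
    deadline = time.monotonic() + budget_seconds
    processed = 0
    # At most one full pass per run, so nodes aren't refreshed twice
    # in a single tick even when the budget is generous.
    for _ in range(len(queue)):
        if time.monotonic() >= deadline:
            break
        node = queue[0]
        queue.rotate(-1)  # move this node to the back: circular order
        refresh_node(node)
        processed += 1
    return processed
```

For the separate-processes angle, the same `refresh_node` calls could presumably be fanned out with something like `concurrent.futures.ProcessPoolExecutor`, since in the ironic/vsphere cases the extra nova-compute CPU doesn't compete with guest workloads; whether the resource tracker's locking tolerates that is the open question.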