[nova][ironic][ptg] Resource tracker scaling issues
Matt Riedemann mriedemos at gmail.com
Sun Nov 10 21:07:51 UTC 2019

On 11/10/2019 10:44 AM, Balázs Gibizer wrote:
> On 3500 baremetal nodes _update_available_resource takes 1.5 hour.

Why have a single nova-compute service manage this many nodes? Or even 1000?

Why not try to partition things a bit more reasonably, like a normal cell
where you might have ~200 nodes per compute service host (I think CERN
keeps their cells to around 200 physical compute hosts for scaling)?

That way you can also leverage the compute service hashring / failover
feature for HA.

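To make the partitioning idea concrete, here is a rough sketch of how a consistent hash ring could spread 3500 nodes over 18 compute services (about 200 each) and hand off only one service's nodes when it goes away. This is a toy illustration, not the ring code nova or ironic actually uses; the HashRing class, the partition() helper, the node-N / compute-N names, the 18-host count and the 64 virtual points per host are all assumptions made up for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (sha256 used only to spread keys)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hash ring (hypothetical, not nova's implementation)."""

    def __init__(self, hosts, replicas=64):
        # Each host gets `replicas` virtual points so the load evens out.
        self._points = sorted(
            (_hash(f"{host}-{i}"), host)
            for host in hosts
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._points]

    def owner(self, node_id: str) -> str:
        # First ring point clockwise from the node's hash, wrapping at the end.
        idx = bisect.bisect(self._keys, _hash(node_id)) % len(self._keys)
        return self._points[idx][1]


def partition(node_ids, hosts):
    """Return a {node_id: owning_host} mapping for the given hosts."""
    ring = HashRing(hosts)
    return {node: ring.owner(node) for node in node_ids}


# Assumed sizes: 3500 baremetal nodes over 18 compute services.
nodes = [f"node-{n}" for n in range(3500)]
hosts = [f"compute-{h}" for h in range(18)]

before = partition(nodes, hosts)
# Simulate losing the last compute service.
after = partition(nodes, hosts[:-1])
moved = [n for n in nodes if before[n] != after[n]]
```

In this sketch only the nodes that belonged to the removed service change owner; everything else stays where it was, which is the property that makes the failover cheap.
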
I realize the locking stuff is not great, but at what point is it 
unreasonable to expect a single compute service to manage that many 
nodes/instances?
-- 
Thanks,
Matt
    
    