[neutron] api performance at scale
Matt Riedemann
mriedemos at gmail.com
Tue Dec  3 17:44:52 UTC 2019

On 12/3/2019 11:24 AM, Erik Olof Gunnar Andersson wrote:
> For us nova used to be the biggest concern, but a lot of work has been 
> done and nova now performs great. Instead we are having trouble getting 
> Neutron to perform at scale. Obvious calls like security groups are 
> performing really poorly, and the nova-compute defaults for refreshing the 
> network cache on computes cause massive issues with Neutron.
I wonder how much of the performance hit is due to rootwrap usage in 
neutron (nova's conversion to privsep was completed in Train).
Nova might be the bee's knees, but I know there are things in nova we 
could do to be smarter about not hammering the neutron API as much, e.g.:
https://review.opendev.org/#/c/465792/ - make bulk queries to neutron 
when refreshing the instance network info cache
https://review.opendev.org/#/q/I7de14456d04370c842b4c35597dca3a628a826a2 
- be smarter about filtering to avoid expensive joins
https://bugs.launchpad.net/nova/+bug/1567655 - nova's internal network 
info cache only stores information about ports and their related 
networks/subnets/ips but the security group information related to the 
ports attached to a server is fetched directly anytime it's needed, 
including when listing servers with details. So if you're an admin 
listing all servers across all tenants, that could get pretty slow. I've 
long thought we should cache the security group information like we do 
for ports for read-only operations like GET /servers/detail but it's a 
non-trivial amount of work to make that happen and we'd definitely want 
benchmarks and such to justify the change.
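
To make the bulk-query idea from the first link concrete, here is a minimal 
sketch. It is not nova's actual code: FakeNeutron, ports_per_server_naive and 
ports_per_server_bulk are hypothetical names, and the client is an in-memory 
stand-in that only counts round trips. It assumes the port list call can 
filter on several device IDs at once.

```python
from collections import defaultdict


class FakeNeutron:
    """Hypothetical in-memory stand-in for a Neutron client.

    It only counts how many list calls were made; no network is involved.
    """

    def __init__(self, ports):
        self._ports = ports
        self.calls = 0

    def list_ports(self, device_id):
        # Accept one server ID or a list of them (assumed multi-value filter).
        self.calls += 1
        wanted = {device_id} if isinstance(device_id, str) else set(device_id)
        return [p for p in self._ports if p["device_id"] in wanted]


def ports_per_server_naive(client, server_ids):
    # One API round trip per server.
    return {sid: client.list_ports(device_id=sid) for sid in server_ids}


def ports_per_server_bulk(client, server_ids):
    # One API round trip for all servers, then group the result locally.
    grouped = defaultdict(list)
    for port in client.list_ports(device_id=list(server_ids)):
        grouped[port["device_id"]].append(port)
    return {sid: grouped[sid] for sid in server_ids}


ports = [
    {"id": "p1", "device_id": "s1"},
    {"id": "p2", "device_id": "s1"},
    {"id": "p3", "device_id": "s2"},
]
servers = ["s1", "s2", "s3"]

naive_client = FakeNeutron(ports)
bulk_client = FakeNeutron(ports)
naive = ports_per_server_naive(naive_client, servers)
bulk = ports_per_server_bulk(bulk_client, servers)

# Same answer either way, but 3 calls versus 1.
print(naive == bulk, naive_client.calls, bulk_client.calls)
```

Against a real deployment the server IDs would probably need to be sent in 
chunks so the request URL stays within whatever length limit applies, and 
the actual savings would have to be measured rather than assumed.
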
Note ttx has started the Large Scale SIG, so this is probably 
something to discuss there:
https://wiki.openstack.org/wiki/Large_Scale_SIG
-- 
Thanks,
Matt