[neutron] api performance at scale
Matt Riedemann
mriedemos at gmail.com
Tue Dec 3 17:44:52 UTC 2019
On 12/3/2019 11:24 AM, Erik Olof Gunnar Andersson wrote:
> For us nova used to be the biggest concern, but a lot of work has been
> done and nova now performs great. Instead we are having trouble getting
> Neutron to perform at scale. Obvious calls like security groups are
> performing really poorly, and the nova-compute defaults for refreshing the
> network cache on computes cause massive issues with Neutron.
I wonder how much of the performance hit is due to rootwrap usage in
neutron (nova's conversion to privsep was completed in Train).
Nova might be the bee's knees, but I know there are things in nova we
could do to be smarter about not hammering the neutron API as much, e.g.:
https://review.opendev.org/#/c/465792/ - make bulk queries to neutron
when refreshing the instance network info cache
https://review.opendev.org/#/q/I7de14456d04370c842b4c35597dca3a628a826a2
- be smarter about filtering to avoid expensive joins
https://bugs.launchpad.net/nova/+bug/1567655 - nova's internal network
info cache only stores information about ports and their related
networks/subnets/ips, but the security group information related to the
ports attached to a server is fetched directly any time it's needed,
including when listing servers with details. So if you're an admin
listing all servers across all tenants, that could get pretty slow. I've
long thought we should cache the security group information like we do
for ports for read-only operations like GET /servers/detail but it's a
non-trivial amount of work to make that happen and we'd definitely want
benchmarks and such to justify the change.
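To make the first item above (bulk queries when refreshing the instance
network info cache) a bit more concrete, here is a minimal sketch of the
idea in Python. It is not the code from the review linked above and it
does not use the real neutron client: FakeNeutron, its list_ports method,
refresh_per_instance and refresh_bulk are all hypothetical names made up
for illustration, and each list_ports call stands in for one API round
trip.

```python
from collections import defaultdict


class FakeNeutron:
    """Hypothetical in-memory stand-in for a Neutron client.

    It only counts how many simulated API round trips are made.
    """

    def __init__(self, ports):
        self._ports = list(ports)
        self.calls = 0

    def list_ports(self, device_ids):
        # One simulated round trip per call, however many IDs are passed.
        self.calls += 1
        wanted = set(device_ids)
        return [p for p in self._ports if p["device_id"] in wanted]


def refresh_per_instance(client, instance_ids):
    """One query per instance: N instances cost N round trips."""
    return {i: client.list_ports([i]) for i in instance_ids}


def refresh_bulk(client, instance_ids):
    """One query for all instances, grouped by device_id on the caller side."""
    ports_by_instance = defaultdict(list)
    for port in client.list_ports(instance_ids):
        ports_by_instance[port["device_id"]].append(port)
    # Instances without ports still get an (empty) entry.
    return {i: ports_by_instance.get(i, []) for i in instance_ids}
```

Both functions return the same mapping of instance ID to ports; the
difference is that the per-instance version makes one call per instance
while the bulk version makes a single call and does the grouping locally.
Whether that is a net win against a real deployment would still need
benchmarks.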
Note that ttx has started a Large Scale SIG, so this is probably
something to discuss there:
https://wiki.openstack.org/wiki/Large_Scale_SIG
--
Thanks,
Matt