[neutron][ovn] OVN Performance
Sean Mooney
smooney at redhat.com
Thu Aug 27 15:10:30 UTC 2020
On Thu, 2020-08-27 at 14:37 +0000, Apsey, Christopher wrote:
> All,
>
> I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default
> configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and
> linuxbridge on identical hardware over the past year or so in a few different environments[1].
The default backend in the docs is not linuxbridge right now, is it?
I thought it has been ml2/ovs for many years.
> I know that example is unscientific, but similar results have been borne out in many different scenarios from what
> we have observed. There are three main problems from what we see:
>
>
> 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent
> capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN
> does not really horizontally scale by adding additional API endpoints, from what we have observed.
>
> 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500
> instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for
> booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the
> same hardware (and we think that linuxbridge can go further with improved provider network design in that particular
> case).
>
> 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over
> (probably causes 1+2)
>
> It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question
> becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN
> deployment scales 40% as well as their ancient linuxbridge-based one?
>
> If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to
> contribute to doing so?
>
> We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri.
>
> [1] https://pastebin.com/kyyURTJm
> [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb
> [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron
> [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute
>
> Chris Apsey
> GEORGIA CYBER CENTER
>