> the default backend in the docs is not linux bridge right now, is it?
> i thought it has been ml2/ovs for many years.
Nope – still defaults to linuxbridge on master:
https://docs.openstack.org/neutron/latest/install/controller-install-rdo.html
And I don’t think that’s necessarily a bad thing if it’s the simplest option to get working well at the moment, but if the future is OVN, OVN should be at least as good in all respects.
Chris Apsey
GEORGIA CYBER CENTER
From: Sean Mooney <smooney@redhat.com>
Sent: Thursday, August 27, 2020 11:11 AM
To: Apsey, Christopher <CAPSEY@augusta.edu>; openstack-discuss@lists.openstack.org
Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance
On Thu, 2020-08-27 at 14:37 +0000, Apsey, Christopher wrote:
> All,
>
> I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default
> configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and
> linuxbridge on identical hardware over the past year or so in a few different environments[1].
the default backend in the docs is not linux bridge right now, is it?
i thought it has been ml2/ovs for many years.
> I know that example is unscientific, but similar results have been borne out in many different scenarios from what
> we have observed. There are three main problems from what we see:
>
>
> 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent
> capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN
> does not really horizontally scale by adding additional API endpoints, from what we have observed.
>
> 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500
> instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for
> booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the
> same hardware (and we think that linuxbridge can go further with improved provider network design in that particular
> case).
>
> 3. Once the southbound database process hits 100% CPU usage on the leader in the OVN cluster, it’s game over
> (this probably causes problems 1 and 2).
>
> It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question
> becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN
> deployment scales 40% as well as their ancient linuxbridge-based one?
>
> If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to
> contribute to doing so?
>
> We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri.
>
> [1] https://pastebin.com/kyyURTJm
> [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb
> [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron
> [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute
>
> Chris Apsey
> GEORGIA CYBER CENTER
>