On Wed, Aug 18, 2021 at 7:59 AM Sean Mooney <smooney@redhat.com> wrote:
On Tue, Aug 17, 2021 at 3:14 PM Moshe Levi <moshele@nvidia.com> wrote:
-----Original Message----- From: Sean Mooney <smooney@redhat.com> Sent: Tuesday, August 17, 2021 2:32 PM To: Ihar Hrachyshka <ihrachys@redhat.com>; openstack-discuss@lists.openstack.org Subject: Re: [neutron][ovn] support for stateless NAT for floating ip in ml2 ovn
On Mon, 2021-08-16 at 20:20 -0400, Ihar Hrachyshka wrote:
Hi all,
OVN supports stateless NAT operations [1] for the use case of a 1:1 mapping between internal and external IPs, i.e. a dnat_and_snat rule. In OpenStack this is the floating IP use case. Looking at the ml2 ovn support, it seems that it only supports floating IPs with connection tracking. Can ml2 ovn also support the stateless NAT option? Are there concerns with using stateless NAT?
Hi Moshe,
Moshe, out of interest, does hardware-offloaded OVS support hardware-offloaded NAT (stateful or stateless) yet with ConnectX-6 Dx?
So ConnectX-6 Dx can offload NAT (just SNAT or DNAT), but in the case of dnat_and_snat rules offload won't work. We need to optimize OVN to make it work because of its SNAT and DNAT zones. We can offload stateless NAT without changing OVN or the driver, and the performance is better because we don't have the SNAT/DNAT zones in OVN.
I'm not sure how feature-complete the connection tracker support is in the tc flower offload path, so if this provided the ability to do hardware-offloaded floating IP NAT, I think this would be another point in favor of supporting it.
[ML] So the stateless NAT we can do today. (We checked it in an OpenStack env by just adding the stateless config manually in the OVN tables.)
You mean like this? https://review.opendev.org/c/openstack/neutron/+/804807 (This also passed CI.)
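For readers who want to try the manual configuration mentioned above, here is a minimal sketch that only builds (and does not run) an ovn-nbctl invocation for a stateless 1:1 rule. It assumes an OVN release whose ovn-nbctl accepts the --stateless option on lr-nat-add. The router name "lr0" and the addresses are hypothetical placeholders, and this is not necessarily how the test above or the linked review does it.

```python
def stateless_fip_nat_command(router, floating_ip, fixed_ip):
    """Build an ovn-nbctl argv list that adds a stateless dnat_and_snat rule.

    The command is only constructed here, never executed.
    """
    return [
        "ovn-nbctl",
        "--stateless",      # assumption: supported by the installed OVN
        "lr-nat-add",
        router,             # logical router name (hypothetical in the example)
        "dnat_and_snat",    # stateless NAT only applies to this rule type
        floating_ip,        # external IP
        fixed_ip,           # logical (internal) IP
    ]


if __name__ == "__main__":
    # Hypothetical router name and documentation-range addresses.
    print(" ".join(stateless_fip_nat_command("lr0", "203.0.113.10", "10.0.0.5")))
```

The list form can be handed to subprocess.run() on a node with access to the OVN northbound database.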
The DPDK connection tracker is certainly still a bottleneck to DPDK performance, so for ovs-dpdk I would expect to see a nice uplift in performance vs. userspace conntrack-based NAT. That is another reason to support this, in my view.
You are talking about an "option". Do you mean OpenStack would have a new API extension for FIPs to choose it? Or a configuration option?
I was talking about a config option, because I wasn't sure if we want to keep the stateful behavior or whether there are use cases where a customer will want to use stateful NAT.
I just can't figure out which scenario it would be, considering that an admin allocates addresses to FIP pools for monopolistic use by OpenStack, and FIPs are 1:1 mapped to fixed IP addresses. Which scenario do you have in mind?
I understand the initial gut reaction to have it opt-in but IMHO it makes sense only if we can explain why switching to always-stateless won't work.
AFAIU the only limitation for stateless dnat_and_snat rules in OVN is that the mapping must be 1:1, which I think is always the case with OpenStack FIPs (fixed_ip_address attribute is not a list). If so, perhaps always using stateless NAT rules is the way to go (so no api or config option). Am I missing something?
[ML] Maybe; this is why I raised this question here, as I don't know the background of why it was implemented as stateful 😊
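To make the 1:1 constraint discussed above concrete, here is a toy model of stateless translation. It is only an illustration and not OVN's implementation: the class name and addresses are made up, and the uniqueness check is my assumption about what "1:1" means in practice. The point is that each direction is a pure table lookup, which only works if every fixed IP maps to exactly one floating IP.

```python
import ipaddress


class StatelessFipNat:
    """Toy stateless 1:1 NAT: packets are rewritten from a fixed table,
    with no per-connection state kept anywhere."""

    def __init__(self, fip_to_fixed):
        self._to_fixed = {}
        self._to_floating = {}
        for fip, fixed in fip_to_fixed.items():
            fip_addr = ipaddress.ip_address(fip)
            fixed_addr = ipaddress.ip_address(fixed)
            # Without state, the reverse lookup must be unambiguous.
            if fixed_addr in self._to_floating:
                raise ValueError(f"{fixed} is mapped by more than one floating IP")
            self._to_fixed[fip_addr] = fixed_addr
            self._to_floating[fixed_addr] = fip_addr

    def egress(self, src, dst):
        """Outbound: rewrite the source from fixed IP to floating IP."""
        src_addr = ipaddress.ip_address(src)
        return str(self._to_floating.get(src_addr, src_addr)), dst

    def ingress(self, src, dst):
        """Inbound: rewrite the destination from floating IP to fixed IP."""
        dst_addr = ipaddress.ip_address(dst)
        return src, str(self._to_fixed.get(dst_addr, dst_addr))
```

A stateful implementation would instead consult a connection table for return traffic, which is the extra work (and the contention) the stateless variant avoids.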
I am not aware of any concerns using stateless NAT. But to clarify your motivation: do you expect it to perform better cpu/bandwidth wise?
[ML] The motivation is to enable OVS hardware offload to offload the floating IP use case and to improve performance.
I'm pretty sure this has been discussed in the past. When David and I were working on Neutron at Intel on the learn-action firewall and OpenFlow bridge-based routing a few years ago, I'm pretty sure we discussed using NAT for FIPs when conntrack support was first being added to OVS.
It makes perfect sense to me to support FIPs via OVS OpenFlow NAT rules, even on ml2/ovs. I don't think that needs to be restricted to OVN, although I'll admit the iptables NAT entries in the kernel router namespace probably scale better than the OVS implementation based on the connection tracker today.
Stateful vs. stateless NAT is an interesting question. https://patchwork.ozlabs.org/project/openvswitch/cover/1570154179-14525-1-git-send-email-ankur.sharma@nutanix.com/ seems to imply that it must be implemented as a dnat_and_snat rule, and I think that has the implication that, since it's stateless, it will always take effect, even in the same subnet? I don't know whether we can restrict that so that the stateless rule only takes effect if we are leaving the OpenStack env.
I am not sure what you mean by saying "it will always take effect, even in the same subnet". Could you please elaborate?
AFAIU stateless NAT translation flows won't be reached when ports of the same logical switch communicate, same as they are not supposed to be triggered with ct_* rules. (Which is achieved by short-circuiting tables in OVN table architecture.)
On Tue, 2021-08-17 at 18:10 -0400, Ihar Hrachyshka wrote: That is what I was concerned about. I was not sure if in stateless mode that behavior would change. If no NAT happens when talking to servers in the same Neutron networks or in other Neutron networks within the same tenant, then I think we should be good to always enable it.
I'll double check the logic before merging anything.
Looking at the patches, I think that would be the main delta, although after a quick glance I am just guessing that it will be a side effect of the stateless implementation. If I am correct about that, I think we would need to opt into stateless NAT for FIPs to maintain backwards compatibility. This could be done at a port level with a new extension, at a config level (ideally globally), or perhaps we can add an extension to the FIP API to specify that it should be stateful or stateless. The latter is probably the cleanest of the 3, especially if we were to also support this in other ml2 drivers eventually, but I don't think we could just switch to stateless FIPs if it will affect the east-west flows in any way.
If there is no effect on east-west flows and it only affects north-south flows into and out of the OpenStack cluster, then moving to stateless in all cases would likely improve performance, as there will be less contention for the connection tracker.
[ML] In our tests we didn't encounter any effect on the east-west flows.
And based on ^ I think that is an endorsement for always using stateless NAT. However, we need to be able to fall back to stateful NAT if OVN is not new enough to support it, so https://review.opendev.org/c/openstack/neutron/+/804807 feels incomplete to me. https://patchwork.ozlabs.org/project/openvswitch/cover/1570154179-14525-1-gi... this was merged in late 2019, so it's a relatively recent addition. I'm not sure we will want to raise our min OVN version to require stateless support.
Not too hard to fall back; but on this note, do we maintain a minimal OVN version anywhere in Neutron? I remember when I was adding support for allow-stateless ACLs, I was told we don't track it (hence the runtime schema inspection in https://review.opendev.org/c/openstack/neutron/+/789974). Considering potential backports in downstream products, perhaps a runtime schema check is a better approach anyway.
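A rough sketch of what such a runtime check with fallback could look like, working on an already-parsed OVSDB schema (the JSON layout with "tables" and "columns" follows RFC 7047). The helper names are hypothetical. Treating the presence of an "options" column on the NAT table as the signal for stateless support is my assumption, and this is not the code from either review.

```python
def nat_supports_stateless(nb_schema):
    """Return True if the NB schema's NAT table has an 'options' column.

    Assumption: that column's presence is a usable proxy for stateless
    dnat_and_snat support in the running OVN.
    """
    nat_columns = nb_schema.get("tables", {}).get("NAT", {}).get("columns", {})
    return "options" in nat_columns


def fip_nat_options(nb_schema):
    """Pick NAT options for a floating IP, falling back to stateful NAT."""
    if nat_supports_stateless(nb_schema):
        return {"stateless": "true"}
    return {}  # older schema: leave the rule stateful (conntrack-based)


if __name__ == "__main__":
    # Hand-written schema fragments for illustration only.
    new_schema = {"tables": {"NAT": {"columns": {"type": {}, "options": {}}}}}
    old_schema = {"tables": {"NAT": {"columns": {"type": {}}}}}
    print(fip_nat_options(new_schema))
    print(fip_nat_options(old_schema))
```

Checking the schema rather than a version number keeps the behavior correct when a downstream product backports the feature to an older OVN.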
Thanks, Ihar