[neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode
Hi Neutron,
I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer.
I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3.
1. Should OVS support also be added to the legacy router?
And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations?
2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently).
3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)?
We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``.
We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck.
[1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
[2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536
[3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:mer...)
[4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29
[5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17
Thank you!
Best regards,
Igor D.C.
Hi,
Message written by Duarte Cardoso, Igor <igor.duarte.cardoso@intel.com> on 29.01.2019 at 08:25:
Hi Neutron,
I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer.
I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3.
1. Should OVS support also be added to the legacy router? And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations?
IMHO a new config option could be better. Then you can have agent_mode as it is now and a new "switch" to change between the OVS and kernel backends for it. We can of course forbid some combinations at the beginning and add support for them later if necessary.
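The idea of a separate switch with some combinations forbidden at first can be sketched as follows. This is only an illustration under stated assumptions: the option name `agent_backend` and its values `kernel` and `ovs` are hypothetical (no such option exists in l3_agent.ini), the forbidden combinations are made up for the example, and the sketch uses the standard-library `configparser` for brevity, whereas Neutron itself reads its options through oslo.config.

```python
import configparser

# agent_mode values that the L3 agent accepts today.
AGENT_MODES = {"legacy", "dvr", "dvr_snat", "dvr_no_external"}
# Hypothetical values for a separate, new backend option.
AGENT_BACKENDS = {"kernel", "ovs"}
# Hypothetical combinations to reject initially; support could come later.
FORBIDDEN = {("dvr_snat", "ovs"), ("dvr_no_external", "ovs")}


def load_l3_options(ini_text):
    """Parse l3_agent.ini-style text and return (agent_mode, agent_backend)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Fall back to today's behaviour when an option is not set.
    mode = parser.get("DEFAULT", "agent_mode", fallback="legacy")
    backend = parser.get("DEFAULT", "agent_backend", fallback="kernel")
    if mode not in AGENT_MODES:
        raise ValueError(f"unknown agent_mode: {mode}")
    if backend not in AGENT_BACKENDS:
        raise ValueError(f"unknown agent_backend: {backend}")
    if (mode, backend) in FORBIDDEN:
        raise ValueError(f"{mode} is not supported with the {backend} backend yet")
    return mode, backend


print(load_l3_options("[DEFAULT]\nagent_mode = dvr\nagent_backend = ovs\n"))
```

Keeping the two options independent means agent_mode keeps its current set of values, and lifting a restriction later only requires removing an entry from the forbidden set.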
2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently).
I think we should propose a new neutron-tempest-plugin scenario job (probably based on neutron-tempest-plugin-dvr-multinode-scenario) but with DVR configured in this new way. That should be enough to begin with, IMO. Of course, some unit/functional tests should also be added to your patch :)
3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)? We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``.
Please keep in mind that there is a spec about refactoring RouterInfo to make it less coupled with the L3 agent's code. It's in [1]. Maybe you can work on this together :)
We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck.
[1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
[2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536
[3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:mer...)
[4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29
[5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17
Thank you!
Best regards, Igor D.C.
[1] https://review.openstack.org/#/c/625647/
--
Slawek Kaplonski
Senior software engineer
Red Hat
On Tue, 2019-01-29 at 08:52 +0100, Slawomir Kaplonski wrote:
Hi,
Message written by Duarte Cardoso, Igor <igor.duarte.cardoso@intel.com> on 29.01.2019 at 08:25:
Hi Neutron,
I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer.
I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3.
1. Should OVS support also be added to the legacy router? And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations?
IMHO a new config option could be better. Then you can have agent_mode as it is now and a new "switch" to change between the OVS and kernel backends for it. We can of course forbid some combinations at the beginning and add support for them later if necessary.
I would like to see it implemented in the legacy router case too. There will be little extra code required to do so, and it will make testing the shared code simpler. When this feature was first conceived for Icehouse, it was targeting replacing the legacy router and was later extended to be an alternative to DVR. But as suggested above, a new config option sounds like a good way to go.
2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently).
I think we should propose a new neutron-tempest-plugin scenario job (probably based on neutron-tempest-plugin-dvr-multinode-scenario) but with DVR configured in this new way. That should be enough to begin with, IMO. Of course, some unit/functional tests should also be added to your patch :)
When this was proposed a few cycles ago, the expectation for testing was fullstack tests plus unit and functional tests, the intent being to not need another job in the gate for a different routing mode. However, if the Neutron team is open to adding a tempest job for this configuration, then that is obviously better. From my understanding of the feature, this can be tested entirely upstream, but it may be nice to add testing with DPDK via the Intel NFV CI, which I believe still runs on Neutron changes. It should be relatively simple to change the agent mode to dvr_bridge, or whatever the new option is, for the existing job. I am also creating a personal replacement for the NFV CI for Nova that will be doing some OVS-DPDK testing, so I can look into enabling this feature there to get indirect testing, depending on capacity.
3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)? We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``.
Please keep in mind that there is a spec about refactoring RouterInfo to make it less coupled with the L3 agent's code. It's in [1]. Maybe you can work on this together :)
We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck.
[1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
[2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536
[3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:mer...)
[4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29
[5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17
Thank you!
Best regards, Igor D.C.
[1] https://review.openstack.org/#/c/625647/
--
Slawek Kaplonski
Senior software engineer
Red Hat
On 1/29/19 1:25 AM, Duarte Cardoso, Igor wrote:
Hi Neutron,
I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer.
I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3.
1. Should OVS support also be added to the legacy router?
And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations?
Personally, I would like to see all routers implemented completely in the OVS data path. We can't do everything at once, so the DVR-first approach here seems reasonable to me. As to the question of config flags, agent_mode has a specific meaning. It effectively tells the agent what role it's playing (SNAT, SNAT_HA, etc.), not how to do it. dvr_bridge isn't a new mode; it's really a change to the backend implementation of the router (i.e. the "how"). Because of that, I'm partial to an "agent_mode" flag which will toggle the router implementation between OVS and namespace implementations.
2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently).
3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)?
We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``.
We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck.
[1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
[2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536
[3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:mer...)
[4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29
[5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17
Thank you!
Best regards,
Igor D.C.
-Ryan
On 1/31/19 10:02 AM, Ryan Tidwell wrote:
"I'm partial to an "agent_mode" flag which will toggle the router..."
In my previous email I mentioned being in favor of not overloading agent_mode, and I realized I had a typo that might be confusing. I'm partial to introducing something like "agent_backend" for toggling OVS vs. namespace routers, not "agent_mode". Sorry for the typo.
-Ryan
Hi Igor,
Please see my comments in-line below.
On Tue, Jan 29, 2019 at 1:26 AM Duarte Cardoso, Igor <igor.duarte.cardoso@intel.com> wrote:
Hi Neutron,
I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer.
I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3.
1. Should OVS support also be added to the legacy router?
And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations?
I would like to see the legacy router also implemented. And yes, we need to specify a new config option. As it has already been pointed out, we need to separate what the agent does in each host from the backend technology implementing the routers.
2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently).
I agree with Slawek. We would like to see a scenario job.
3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)?
We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``.
I also agree with Slawek here. It would be a good idea to get the two efforts in sync so we can untangle RouterInfo from the agent code.
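To make the untangling goal concrete, here is a minimal sketch of one possible shape, under stated assumptions. None of the names below (`BaseRouterInfo`, `KernelRouterInfo`, `OvsRouterInfo`, `pick_router_info_class`, `L3AgentSketch`) are Neutron classes or taken from the refactor spec or patches; they are hypothetical and only show an agent that depends on a small interface and lets a factory choose the implementation per router, so that one agent can hold both kernel and OVS powered routers.

```python
import abc


class BaseRouterInfo(abc.ABC):
    """Hypothetical minimal interface the agent relies on."""

    def __init__(self, router_id):
        self.router_id = router_id

    @abc.abstractmethod
    def process(self):
        """Apply the router's configuration to the data path."""


class KernelRouterInfo(BaseRouterInfo):
    def process(self):
        # Stand-in for namespace and iptables programming.
        return f"kernel namespace and iptables for {self.router_id}"


class OvsRouterInfo(BaseRouterInfo):
    def process(self):
        # Stand-in for installing OpenFlow rules on the bridge.
        return f"OpenFlow rules for {self.router_id}"


def pick_router_info_class(distributed, backend):
    """Hypothetical policy: only distributed routers use OVS for now."""
    if backend == "ovs" and distributed:
        return OvsRouterInfo
    return KernelRouterInfo


class L3AgentSketch:
    """Agent that knows only BaseRouterInfo, never a concrete backend."""

    def __init__(self, backend, factory=pick_router_info_class):
        self.backend = backend
        self.factory = factory
        self.router_info = {}

    def add_router(self, router):
        cls = self.factory(router.get("distributed", False), self.backend)
        ri = cls(router["id"])
        self.router_info[router["id"]] = ri
        return ri.process()


agent = L3AgentSketch(backend="ovs")
print(agent.add_router({"id": "r1", "distributed": True}))
print(agent.add_router({"id": "r2", "distributed": False}))
```

With this split, adding an OVS-based legacy router later would be a change to the factory policy plus a new or extended implementation class, without touching the agent itself.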
We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck.
[1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
[2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536
[3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:mer...)
[4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29
[5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17
Thank you!
Best regards,
Igor D.C.
Thank you Slawek, Seán, Ryan, Miguel.
We'll get to work on this new refactoring, legacy router implementation and the missing unit/functional tests.
We're setting lower priority to the scenario job but hopefully it can be done in stein-3 as well.
Best regards,
Igor D.C.
From: Miguel Lavalle <miguel@mlavalle.com>
Sent: Friday, February 1, 2019 5:07 PM
To: openstack-discuss@lists.openstack.org
Subject: Re: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode
participants (5)
- Duarte Cardoso, Igor
- Miguel Lavalle
- Ryan Tidwell
- Sean Mooney
- Slawomir Kaplonski