[neutron] bonding sriov nic inside VMs
Folks,

As you know, SR-IOV doesn't support bonding, so the only solution is to implement LACP bonding inside the VM.

I did some tests in the lab: I created two physnets, mapped them to two physical NICs, created VFs, and attached them to a VM. So far all good, but one problem I am seeing is that each neutron port I create has an IP address associated with it, and I can use only one IP on the bond, so the other is just a waste of an IP from the public IP pool.

Is there any way to create an SR-IOV port without an IP address?
On Thu, 2023-03-09 at 16:43 -0500, Satish Patel wrote:
> Are there any way to create sriov port but without IP address?

Technically we now support addressless ports in neutron and nova, so that should be possible. If you tried to do this with hardware-offloaded OVS rather than standard SR-IOV with the sriov-nic-agent, you will likely also need to use the allowed_address_pairs extension to ensure that OVS does not drop the packets based on the IP address. If you are using hierarchical port binding, where your TOR is managed by an ml2 driver, you might also need the allowed_address_pairs extension with the sriov-nic-agent to make sure the packets are not dropped at the switch level.

As you likely already know, we do not support VF bonding in OpenStack, or bonded ports in general in the neutron API. There was an effort a few years ago to create a bond-port extension that mirrored how trunk ports work, i.e. having two neutron subports and a bond port that aggregates them, but we never got that far with the design. That would have enabled bonding to be implemented in the different ml2 drivers (ovs/sriov/ovn etc.) with a consistent, common API.

Some people have used Mellanox's VF LAG functionality; however, that was never actually enabled properly in nova/neutron, so it is not officially supported upstream. That functionality lets you attach only a single VF to the guest from bonded ports on a single card. There is no official support for it in nova/neutron; as I said, it just happens to work unintentionally, so I would not advise using it in production unless you are happy to work through any issues you find yourself.
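For illustration, a minimal, untested sketch of what the addressless-port approach could look like with the openstack CLI; the network name, port names and IP are placeholders, and flags such as --no-fixed-ip and --allowed-address depend on your client and neutron versions:

    # First SR-IOV port carries the bond's IP; the second is created addressless.
    openstack port create --network public --vnic-type direct bond-port-1
    openstack port create --network public --vnic-type direct --no-fixed-ip bond-port-2

    # If ip/mac filtering still drops traffic on the addressless port, permit the
    # bond's address there via allowed_address_pairs (placeholder IP shown).
    openstack port set bond-port-2 --allowed-address ip-address=203.0.113.10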
Thanks Sean,

I don't have a NIC which supports hardware offloading or any feature of that kind. I am using an Intel 82599 NIC just for SR-IOV and looking for bonding support, which is only possible inside the VM. As you know, we already run a large SR-IOV environment with OpenStack, but my biggest issue is upgrading switches without downtime. I want to be more resilient so that I don't have to worry about that.

Do you still think it's dangerous or not a good idea to bond SR-IOV NICs inside the VM? What could go wrong here? Just trying to understand before I go crazy :)
On Fri, 2023-03-10 at 08:30 -0500, Satish Patel wrote:
> Do you still think it's dangerous or not a good idea to bond sriov nic inside VM? what could go wrong here just trying to understand before i go crazy :)

The LACP bond modes generally don't work fully, but you should be able to get basic failover bonding working, and perhaps TCP load balancing, provided it does not require switch cooperation to work from inside the guest.

Just keep in mind that, by definition, if you declare a network as being on a separate physnet from another, then you as the operator are asserting that there is no L2 connectivity between those networks: vlan 100 on physnet_1 is intended to be a separate VLAN from vlan 100 on physnet_2. If you break that and use physnets to select PFs, you are also breaking neutron's multi-tenancy model, meaning it is not safe to allow end users to create VLAN networks; instead you can only use provider-created VLAN networks. So what you want to do is probably achievable, but you mention physnets per PF, and that sounds like you are breaking the rule that physnets are separate, isolated physical networks.
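For context, the physnet-per-PF mapping being discussed lives in the sriov-nic-agent config on the compute node; a rough sketch follows (the config path, interface names and service name are illustrative and depend on your deployment, while the physnet names match the examples used in this thread):

    # Rough sketch: one physnet per PF for the sriov-nic-agent.
    cat >> /etc/neutron/plugins/ml2/sriov_agent.ini <<'EOF'
    [sriov_nic]
    physical_device_mappings = physnet_1:eth1,physnet_2:eth2
    EOF
    systemctl restart neutron-sriov-nic-agent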
Hi Sean,

I have a few questions and they are in-line. This is the reference doc I am trying to follow in my private cloud: https://www.redpill-linpro.com/techblog/2021/01/30/bonding-sriov-nics-with-o...

On Fri, Mar 10, 2023 at 9:02 AM Sean Mooney <smooney@redhat.com> wrote:
> lacp bond modes generally don't work fully, but you should be able to get basic failover bonding working, and perhaps tcp load balancing, provided it does not require switch cooperation to work from inside the guest.

What do you mean by not working fully? Are you talking about active-active vs active-standby?

> just keep in mind that, by definition, if you declare a network as being on a separate physnet from another, then you as the operator are asserting that there is no l2 connectivity between those networks.

This is interesting; why can't both physnets be on the same L2 segment? Are you worried about an STP loop? That is how LACP works: both physical interfaces are on the same segment.

> as vlan 100 on physnet_1 is intended to be a separate vlan from vlan 100 on physnet_2

I did a test in the lab with physnet_1 and physnet_2 both on the same VLAN ID in the same L2 domain and it all works.

> if you break that and use physnets to select PFs you are also breaking neutron's multi-tenancy model, meaning it is not safe to allow end users to create vlan networks; instead you can only use provider-created vlan networks.

This is a private cloud and we don't have any multi-tenancy model. We use all VLAN-based provider networks, and my datacenter core router is the gateway for all of my VLAN provider networks.

> so what you want to do is probably achievable, but you mention physnets per PF, and that sounds like you are breaking the rule that physnets are separate, isolated physical networks.

I can understand that each physnet should be in a different tenant, but in my case it's VLAN-based provider networks and I'm not sure what rules it's going to break.
On Fri, 2023-03-10 at 11:54 -0500, Satish Patel wrote:
> This is the reference doc I am trying to follow in my private cloud -
> https://www.redpill-linpro.com/techblog/2021/01/30/bonding-sriov-nics-with-o...

That approach is only safe in a multi-tenant environment if https://docs.openstack.org/neutron/latest/configuration/ml2-conf.html#ml2.te... does not contain vlan or flat. It is technically breaking the neutron rules for how physnets are meant to be used. In private clouds, where tenant isolation is not required, operators have abused this for years for things like selecting NUMA nodes and many other use cases that are unsafe in a public cloud.
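To make that concrete, here is a rough sketch of ml2 settings along those lines (my reading of the point above, not from the thread; the config path, the choice of vxlan, and the exact option names in your release are assumptions): tenants can only create tunnelled networks, while vlan stays available for operator-created provider networks.

    # Illustrative ml2_conf.ini settings; in practice edit the existing
    # sections in place rather than appending.
    cat >> /etc/neutron/plugins/ml2/ml2_conf.ini <<'EOF'
    [ml2]
    type_drivers = flat,vlan,vxlan
    tenant_network_types = vxlan

    [ml2_type_vlan]
    network_vlan_ranges = physnet_1,physnet_2
    EOF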
> What do you mean by not working fully? Are you talking about active-active vs active-standby?

Some LACP modes require configuration on the switch, others do not, and you can only really do that from the PF, since at the switch level you can bring the port down only for some VLANs in a failover case. See https://docs.rackspace.com/blog/lacp-bonding-and-linux-configuration/. I believe modes 0, 1, 2, 5 and 6 can work without special switch config; modes 3 and 4 I think require switch cooperation. IEEE 802.3ad (mode 4) in particular needs cooperation with the switch: "The link is set up dynamically between two LACP-supporting peers" (https://en.wikipedia.org/wiki/Link_aggregation), and that peering session can only really run on the PFs. balance-tlb (5) and balance-alb (6) should work fine for the VFs in the guest, however.
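For reference, a rough sketch of a switch-independent active-backup bond inside the guest (assuming the two VFs show up as ens4 and ens5 and the neutron-allocated address is 203.0.113.10/24; none of these names come from the thread):

    # Inside the guest: simple active-backup bond across the two VFs.
    # Failover detection relies on the VF driver reporting link state from the PF.
    ip link add bond0 type bond mode active-backup miimon 100
    ip link set ens4 down
    ip link set ens5 down
    ip link set ens4 master bond0
    ip link set ens5 master bond0
    ip link set ens4 up
    ip link set ens5 up
    ip link set bond0 up
    ip addr add 203.0.113.10/24 dev bond0   # the single neutron-allocated IP

balance-alb (mode 6) could be substituted in the first command if load balancing rather than plain failover is wanted.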
> This is interesting; why can't both physnets be on the same L2 segment? Are you worried about an STP loop? That is how LACP works: both physical interfaces are on the same segment.

If they are on the same L2 segment then there is no multi-tenancy when using vlan or flat networks. More on this below.
> I did a test in the lab with physnet_1 and physnet_2 both on the same VLAN ID in the same L2 domain and it all works.

If you create two neutron networks, physnet_1_vlan_100 and physnet_2_vlan_100, map physnet_1 to eth1 and physnet_2 to eth2, plug both into the same TOR with vlan 100 trunked to both, and then boot one VM on physnet_1_vlan_100 and a second on physnet_2_vlan_100, a few things will happen. The VMs will boot fine and both will get IPs. But there will be no isolation between the two networks, so if you use the same subnet on both, they will be able to ping each other directly. It is unsafe to have tenant-creatable VLAN networks in this setup if you have overlapping VLAN ranges between physnet_1 and physnet_2, as there will be no tenant isolation enforced at the network level. From a neutron point of view, physnet_1_vlan_100 and physnet_2_vlan_100 are two entirely different networks, and it is the operator's responsibility to ensure the network fabric keeps the same VLAN on two physnets from communicating.
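For clarity, the scenario described above could be reproduced with something like the following (network names match the example; the subnet range is a placeholder):

    # Two provider networks that neutron treats as distinct, even though the
    # fabric puts them in the same L2 domain (vlan 100 trunked to both PFs).
    openstack network create --provider-network-type vlan \
        --provider-physical-network physnet_1 --provider-segment 100 physnet_1_vlan_100
    openstack network create --provider-network-type vlan \
        --provider-physical-network physnet_2 --provider-segment 100 physnet_2_vlan_100
    openstack subnet create --network physnet_1_vlan_100 --subnet-range 203.0.113.0/24 subnet_1
    openstack subnet create --network physnet_2_vlan_100 --subnet-range 203.0.113.0/24 subnet_2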
> This is a private cloud and we don't have any multi-tenancy model. We use all VLAN-based provider networks, and my datacenter core router is the gateway for all of my VLAN provider networks.

Ack. In that case you can live with the fact that there are no multi-tenancy guarantees, because the rules around physnets have been broken. This is pretty common in telco clouds, by the way, so you would not be the first to do this.
> I can understand that each physnet should be in a different tenant, but in my case it's VLAN-based provider networks and I'm not sure what rules it's going to break.

Each physnet does not need to be a different tenant. The important thing is that neutron expects VLANs on different physnets to be allocatable separately, so the same VLAN on two physnets logically represents two different networks.
Thank you Sean for the detailed explanation. I agree on the LACP mode, and I think active-standby would be a better and safer option for me. Yes, as you said, telcos abuse many neutron rules, and I think I am one of them, because we pretty much run telco applications :) As I said, it's a private cloud, so I can break and bend rules just to keep applications available 24x7. We don't have any multi-tenancy that I need to worry about from a security standpoint.

Last question, related to MAC address changes: neutron doesn't allow changing the MAC address, correct? So I have to set the same MAC address on both SR-IOV ports, as per the reference blog.
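If the blog's approach is followed, a hypothetical sketch of giving the second, addressless port the same MAC as the first might look like this (port and network names reuse earlier examples; whether the duplicate MAC is accepted relies on the two ports being on different neutron networks, since MAC uniqueness is normally enforced per network):

    # Reuse the MAC of the first SR-IOV port when creating the second one on the
    # other physnet's network, so the in-guest bond can present a single MAC.
    MAC=$(openstack port show bond-port-1 -f value -c mac_address)
    openstack port create --network physnet_2_vlan_100 --vnic-type direct \
        --no-fixed-ip --mac-address "$MAC" bond-port-2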
participants (2)
- Satish Patel
- Sean Mooney