[ovn][neutron] sriov port not able to get ip from DHCP
Folks,

I have built a new OpenStack cloud using kolla-ansible 2024.1 with the OVN plugin. So far everything is working well. Today I added a new compute node with an SR-IOV port, and it turns out the VM is not able to obtain an IP from DHCP.

On the same subnet, a regular VM works and obtains an IP, but a VM with an SR-IOV port does not. If I assign a static IP on the VM interface, the VM is able to ping the outside world.

I have deployed many clouds with OVN and SR-IOV and never had an issue, so I never had to dig into any kind of troubleshooting. This is the first time I have seen this issue. I have checked the port mappings etc. and all looks good.

In the SR-IOV case, what is the flow of DHCP? Does the physical interface send the DHCP request on the wire, or is the DHCP request routed via the OVN br-int bridge? Who handles that DHCP request?

I have created the following two SR-IOV ports, but they are not able to obtain an IP address via DHCP:

(ovn-nb-db)[root@os-chn-ctrl03 /]# ovn-nbctl --no-leader-only find Logical_Switch_Port type=external
_uuid               : a274df3b-6967-4f73-91ff-8f6eb46f70e0
addresses           : ["fa:16:3e:03:55:44 10.74.2.35"]
dhcpv4_options      : 7829902c-5cdf-49cf-b478-7b683669c726
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="10.74.2.35/23", "neutron:device_id"="f4b91945-32f9-4961-a2cf-26b408218e35", "neutron:device_owner"="compute:sriov", "neutron:mtu"="", "neutron:network_name"=neutron-87131346-90d6-48ca-a045-214ff02ee9c0, "neutron:port_capabilities"="rx;tx;sg;tso;gso;gro;rxvlan;txvlan;rxhash;txudptnl", "neutron:port_name"=my-sriov-port, "neutron:project_id"="0c4956dce56a4503878007355850a04a", "neutron:revision_number"="32", "neutron:security_group_ids"="0bd505e8-d086-4d72-8e56-b913a8672f08", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=direct}
ha_chassis_group    : 0c41026b-3b01-485e-908a-523202aefcf9
mirror_rules        : []
name                : "0cbe012e-902a-4503-91a2-191b5f4acac5"
options             : {}
parent_name         : []
port_security       : ["fa:16:3e:03:55:44 10.74.2.35"]
tag                 : []
tag_request         : []
type                : external
up                  : true

_uuid               : 1b245b6d-b720-41cd-a09c-79dc7eab3149
addresses           : ["fa:16:3e:d8:4b:ab 10.74.2.181"]
dhcpv4_options      : 7829902c-5cdf-49cf-b478-7b683669c726
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="10.74.2.181/23", "neutron:device_id"="ac767d12-cff0-484e-b014-f3e42160cd8c", "neutron:device_owner"="compute:sriov", "neutron:mtu"="", "neutron:network_name"=neutron-87131346-90d6-48ca-a045-214ff02ee9c0, "neutron:port_capabilities"="rx;tx;sg;tso;gso;gro;rxvlan;txvlan;rxhash;txudptnl", "neutron:port_name"=my-sriov-p1, "neutron:project_id"="0c4956dce56a4503878007355850a04a", "neutron:revision_number"="5", "neutron:security_group_ids"="0bd505e8-d086-4d72-8e56-b913a8672f08", "neutron:subnet_pool_addr_scope4"="", "neutron:subnet_pool_addr_scope6"="", "neutron:vnic_type"=direct}
ha_chassis_group    : 0c41026b-3b01-485e-908a-523202aefcf9
mirror_rules        : []
name                : "bdf9261c-5cfc-4df1-9c11-c8ec26535eef"
options             : {}
parent_name         : []
port_security       : ["fa:16:3e:d8:4b:ab 10.74.2.181"]
tag                 : []
tag_request         : []
type                : external
up                  : true
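For context on the DHCP flow question: as far as OVN's external-port model goes, an SR-IOV VF bypasses br-int on the compute host, so the DHCP request goes out on the wire and is answered by the chassis that is currently active in the port's HA chassis group. Both ports listed above have `dhcpv4_options` and `ha_chassis_group` set. A minimal sketch of checking that programmatically is below; it assumes the default `name : value` list format, and the function names `parse_records` and `incomplete_ports` are hypothetical helpers, not part of any OVN tooling.

```python
# Abbreviated copy of the two records shown above (only the fields we check).
SAMPLE = """\
_uuid               : a274df3b-6967-4f73-91ff-8f6eb46f70e0
addresses           : ["fa:16:3e:03:55:44 10.74.2.35"]
dhcpv4_options      : 7829902c-5cdf-49cf-b478-7b683669c726
ha_chassis_group    : 0c41026b-3b01-485e-908a-523202aefcf9
name                : "0cbe012e-902a-4503-91a2-191b5f4acac5"
type                : external

_uuid               : 1b245b6d-b720-41cd-a09c-79dc7eab3149
addresses           : ["fa:16:3e:d8:4b:ab 10.74.2.181"]
dhcpv4_options      : 7829902c-5cdf-49cf-b478-7b683669c726
ha_chassis_group    : 0c41026b-3b01-485e-908a-523202aefcf9
name                : "bdf9261c-5cfc-4df1-9c11-c8ec26535eef"
type                : external
"""


def parse_records(text):
    """Split list-style output into one dict per blank-line-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                records.append(current)
                current = {}
            continue
        # Column names never contain ':', so the first ':' ends the name.
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def incomplete_ports(records):
    """Return names of ports with empty DHCPv4 options or HA chassis group."""
    required = ("dhcpv4_options", "ha_chassis_group")
    return [
        rec.get("name", "?").strip('"')
        for rec in records
        if any(rec.get(field, "[]") == "[]" for field in required)
    ]


records = parse_records(SAMPLE)
problems = incomplete_ports(records)
print(len(records), "external ports; missing settings on:", problems or "none")
```

An empty result here only says the northbound configuration looks complete; it does not show which chassis actually answers the request.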
(In reply to the message Satish Patel <satish.txt@gmail.com> wrote on Sun, Feb 9, 2025 at 12:11 AM.)

After googling, I found the following statement:

If your deployment has SR-IOV instances, make sure that at least one of the OVN chassis named applications has the ``prefer-chassis-as-gw`` configuration option set to 'true'.

But my older deployments didn't have the above option, and SR-IOV still works fine there and receives a DHCP IP automatically. Is this a new option I need to set?

Currently I have the following options:

(openvswitch-vswitchd)[root@os-chn-ctrl02 /]# ovs-vsctl list open_vswitch
_uuid               : 773e18f8-b646-4cf2-8da3-2f6ab501f0e3
bridges             : [d398d109-3ca2-4ca2-bb09-3235e329260d, eaeb638c-4cf1-44e5-b4fc-9088574542de]
cur_cfg             : 18
datapath_types      : [netdev, system]
datapaths           : {system=ae76447d-f54e-44f0-aefb-775756b67689}
db_version          : []
dpdk_initialized    : false
dpdk_version        : none
external_ids        : {hostname=os-chn-ctrl02, ovn-bridge-mappings="physnet1:br-ex", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.16.0.12", ovn-encap-type=geneve, ovn-monitor-all="false", ovn-openflow-probe-interval="60", ovn-remote="tcp:10.74.0.11:6642,tcp:10.74.0.12:6642,tcp:10.74.0.13:6642", ovn-remote-probe-interval="60000", system-id=os-chn-ctrl02}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, srv6, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 18
other_config        : {ovn-chassis-idx-os-chn-ctrl02="", vlan-limit="0"}
ovs_version         : []
ssl                 : []
statistics          : {}
system_type         : []
system_version      : []
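The quoted statement names a ``prefer-chassis-as-gw`` option that does not appear anywhere in the `ovs-vsctl` output, so it may belong to a different deployment tool. What the output does show is `ovn-cms-options=enable-chassis-as-gw` in `external_ids`. A minimal sketch of extracting that value from the map is below; the helper names `parse_ovs_map` and `cms_options` are hypothetical, and the regular expression is only an approximation of the OVSDB map syntax that happens to cover this output.

```python
import re

# external_ids as printed by `ovs-vsctl list open_vswitch` on os-chn-ctrl02
# (a few keys omitted for brevity).
EXTERNAL_IDS = (
    '{hostname=os-chn-ctrl02, ovn-bridge-mappings="physnet1:br-ex", '
    'ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.16.0.12", '
    'ovn-encap-type=geneve, ovn-monitor-all="false", '
    'ovn-remote="tcp:10.74.0.11:6642,tcp:10.74.0.12:6642,tcp:10.74.0.13:6642", '
    'system-id=os-chn-ctrl02}'
)

# key=value pairs; a value is either a quoted string (which may contain
# commas) or a bare token that ends at a comma, brace, or whitespace.
PAIR = re.compile(r'([\w:-]+)=("[^"]*"|[^,}\s]+)')


def parse_ovs_map(text):
    """Turn an OVSDB map rendering into a dict, dropping surrounding quotes."""
    return {key: value.strip('"') for key, value in PAIR.findall(text)}


def cms_options(external_ids):
    """Return the comma-separated ovn-cms-options entries as a list."""
    raw = external_ids.get("ovn-cms-options", "")
    return [opt for opt in raw.split(",") if opt]


ids = parse_ovs_map(EXTERNAL_IDS)
is_gateway_candidate = "enable-chassis-as-gw" in cms_options(ids)
print(ids["system-id"], "gateway candidate:", is_gateway_candidate)
```

Running the same check against each chassis would show which nodes are marked as gateway candidates; whether that marking is what selects hosts for external ports in this release is an assumption to verify against the Neutron documentation.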
(In reply to the message Satish Patel <satish.txt@gmail.com> wrote on Sun, Feb 9, 2025 at 11:03 AM.)

I figured it out. My problem was that I had an IP address from the same subnet configured on my controller/network node, and that is why the DHCP ARP was not completing.

I have a follow-up question about fixing that issue: how do I change the priority of a chassis so I can move it to another network node?

(ovn-nb-db)[root@os-chn-ctrl02 /]# ovn-nbctl --no-leader-only ha-chassis-group-list
0c41026b-3b01-485e-908a-523202aefcf9 (neutron-87131346-90d6-48ca-a045-214ff02ee9c0)
    784ec6e9-9af8-45c6-86e0-75ae7ccf9278 (os-chn-ctrl03)
    priority 32767
    7d83acec-67fa-4f01-9a42-9a6b5acf03b7 (os-chn-gen-comp-001)
    priority 32764
    c75de7ab-b346-4aa5-886b-31b0fd12401e (os-chn-ctrl01)
    priority 32766
    dd6ce2d4-9746-4429-862a-e78750bb4909 (os-chn-ctrl02)
    priority 32765

I want to give os-chn-gen-comp-001 the highest priority, but I'm not sure what command I should use and where. I don't have any virtual routers etc. I have a VLAN-based network, so my VMs are directly connected to VLANs.

In the OVN docs I didn't find any command that changes the priority. Everyone is talking about changing the priority of a gateway chassis, but in my case I don't have a gateway. All I have are subnets (no routers).
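One possible approach to the priority question, offered as a sketch rather than a confirmed fix: `ovn-nbctl` has a `ha-chassis-group-set-chassis-prio GROUP CHASSIS PRIORITY` command, so the target chassis could swap priorities with the current top entry. The script below only parses the listing and prints the commands; it does not run them. The helper names `parse_priorities` and `promote` are hypothetical, and it is an assumption that Neutron will leave manually changed priorities in place, since its OVN driver manages these groups and may rewrite them.

```python
import re

# Output of `ovn-nbctl ha-chassis-group-list` from the message above.
LISTING = """\
0c41026b-3b01-485e-908a-523202aefcf9 (neutron-87131346-90d6-48ca-a045-214ff02ee9c0)
    784ec6e9-9af8-45c6-86e0-75ae7ccf9278 (os-chn-ctrl03)
    priority 32767
    7d83acec-67fa-4f01-9a42-9a6b5acf03b7 (os-chn-gen-comp-001)
    priority 32764
    c75de7ab-b346-4aa5-886b-31b0fd12401e (os-chn-ctrl01)
    priority 32766
    dd6ce2d4-9746-4429-862a-e78750bb4909 (os-chn-ctrl02)
    priority 32765
"""

# A member is "(chassis-name)" followed by "priority N"; the group line is
# skipped because its "(name)" is not followed by the word "priority".
MEMBER = re.compile(r"\(([^)]+)\)\s+priority\s+(\d+)")


def parse_priorities(listing):
    """Map chassis name to its priority within the HA chassis group."""
    return {name: int(prio) for name, prio in MEMBER.findall(listing)}


def promote(group, priorities, target):
    """Build commands that swap `target` with the highest-priority chassis."""
    top = max(priorities, key=priorities.get)
    if top == target:
        return []  # already the preferred chassis, nothing to change
    cmd = "ovn-nbctl ha-chassis-group-set-chassis-prio {} {} {}"
    return [
        cmd.format(group, target, priorities[top]),
        cmd.format(group, top, priorities[target]),
    ]


GROUP = "0c41026b-3b01-485e-908a-523202aefcf9"
priorities = parse_priorities(LISTING)
commands = promote(GROUP, priorities, "os-chn-gen-comp-001")
for command in commands:
    print(command)
```

Swapping keeps the set of priorities unchanged, so the other two controllers keep their relative order. Between the two commands two members briefly share a priority, which is worth keeping in mind on a live network.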
participants (1)
- Satish Patel