<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.E-MailFormatvorlage17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Does anyone have an idea what could cause the following failure? In summary: guest VMs connected to a tenant network are receiving bogus ARP replies. These replies map unused IP addresses to the MAC addresses of virtual bridge devices that belong to other
ports on the same compute host.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We are using the Kilo openvswitch-agent with the ML2 plugin.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Please have a look at the following example. A VM with the fixed-ip 192.168.1.15 reports the following ARP cache:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> root@michael-test2:~# arp<o:p></o:p></p>
<p class="MsoNormal"> Address HWtype HWaddress Flags Mask Iface<o:p></o:p></p>
<p class="MsoNormal"> <span lang="DE">host-192-168-1-2.openst ether fa:16:3e:de:ab:ea C eth0<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"> </span>192.168.1.13 ether a6:b2:dc:d8:39:c1 C eth0<o:p></o:p></p>
<p class="MsoNormal"> 192.168.1.119 (incomplete) eth0<o:p></o:p></p>
<p class="MsoNormal"> host-192-168-1-20.opens ether fa:16:3e:76:43:ce C eth0<o:p></o:p></p>
<p class="MsoNormal"> host-192-168-1-19.opens ether fa:16:3e:0d:a6:0b C eth0<o:p></o:p></p>
<p class="MsoNormal"> host-192-168-1-1.openst ether fa:16:3e:2a:81:ff C eth0<o:p></o:p></p>
<p class="MsoNormal"> <span lang="DE">192.168.1.14 ether 0e:bf:04:b7:ed:52 C eth0<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"><o:p> </o:p></span></p>
<p class="MsoNormal">Please note that neither 192.168.1.13 nor 192.168.1.14 is in use in this subnet. The displayed MAC addresses a6:b2:dc:d8:39:c1 and 0e:bf:04:b7:ed:52 actually belong to the qbr* and qvb* devices of other instances, living on their respective hypervisor
hosts!<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Looking at 0e:bf:04:b7:ed:52, for example, yields<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> <span lang="DE"># ip link list | grep -C1 -e 0e:bf:04:b7:ed:52<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"> </span>59: qbr9ac24ac1-e1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP mode DEFAULT group default<o:p></o:p></p>
<p class="MsoNormal"> link/ether 0e:bf:04:b7:ed:52 brd ff:ff:ff:ff:ff:ff<o:p></o:p></p>
<p class="MsoNormal"> 60: qvo9ac24ac1-e1: &lt;BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000<o:p></o:p></p>
<p class="MsoNormal"> --<o:p></o:p></p>
<p class="MsoNormal"> 61: qvb9ac24ac1-e1: &lt;BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast master qbr9ac24ac1-e1 state UP mode DEFAULT group default qlen 1000<o:p></o:p></p>
<p class="MsoNormal"> link/ether 0e:bf:04:b7:ed:52 brd ff:ff:ff:ff:ff:ff<o:p></o:p></p>
<p class="MsoNormal"> 62: tap9ac24ac1-e1: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast master qbr9ac24ac1-e1 state UNKNOWN mode DEFAULT group default qlen 500<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">on the compute node. Running tcpdump on qbr9ac24ac1-e1 on the host and triggering a fresh ARP lookup on the guest VM results in<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> # tcpdump -i qbr9ac24ac1-e1 -vv -l | grep ARP<o:p></o:p></p>
<p class="MsoNormal"> tcpdump: WARNING: qbr9ac24ac1-e1: no IPv4 address assigned<o:p></o:p></p>
<p class="MsoNormal"> tcpdump: listening on qbr9ac24ac1-e1, link-type EN10MB (Ethernet), capture size 65535 bytes<o:p></o:p></p>
<p class="MsoNormal"> 14:00:32.089726 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.14 tell 192.168.1.15, length 28<o:p></o:p></p>
<p class="MsoNormal"> 14:00:32.089740 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 0e:bf:04:b7:ed:52 (oui Unknown), length 28<o:p></o:p></p>
<p class="MsoNormal"> 14:00:32.090141 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 7a:a5:71:63:47:94 (oui Unknown), length 28<o:p></o:p></p>
<p class="MsoNormal"> 14:00:32.090160 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 02:f9:33:d5:04:0d (oui Unknown), length 28<o:p></o:p></p>
<p class="MsoNormal"> 14:00:32.090168 ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.14 is-at 9a:a0:46:e4:03:06 (oui Unknown), length 28<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">As you can see, four different devices claim to own the unused IP address! Looking them up in Neutron shows that each of them belongs to a different existing port on the subnet:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> <span lang="DE"># neutron port-list | grep -e 47fbb8b5-55 -e 46647cca-32 -e e9e2d7c3-7e -e 9ac24ac1-e1<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"> | 46647cca-3293-42ea-8ec2-0834e19422fa | | fa:16:3e:7d:9c:45 | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.8"} |<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"> | 47fbb8b5-5549-46e4-850e-bd382375e0f8 | | fa:16:3e:fa:df:32 | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.7"} |<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE"> </span>| 9ac24ac1-e157-484e-b6a2-a1dded4731ac | | fa:16:3e:2a:80:6b | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.15"} |<o:p></o:p></p>
<p class="MsoNormal"> | e9e2d7c3-7e58-4bc2-a25f-d48e658b2d56 | | fa:16:3e:0d:a6:0b | {"subnet_id": "25dbbdc0-f438-4f89-8663-1772f9c7ef36", "ip_address": "192.168.1.19"} |<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
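<p class="MsoNormal">As an aside, duplicate replies like the ones in the tcpdump capture above can be flagged automatically. The following is only a minimal Python sketch, not something we run in production: the function name find_conflicting_replies is made up for illustration, and the regular expression assumes tcpdump's textual "Reply ... is-at ..." line format as it appears in the capture.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<pre>
```python
import re
from collections import defaultdict

# Assumed line format (tcpdump text output, as in the capture above):
#   ... ARP, ..., Reply 192.168.1.14 is-at 0e:bf:04:b7:ed:52 (oui Unknown), length 28
ARP_REPLY = re.compile(
    r"Reply (\d+\.\d+\.\d+\.\d+) is-at ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def find_conflicting_replies(lines):
    """Map each IP that was answered by two or more MACs to those MACs."""
    claims = defaultdict(set)  # ip -> set of MACs seen in ARP replies
    for line in lines:
        match = ARP_REPLY.search(line)
        if match:
            ip, mac = match.groups()
            claims[ip].add(mac.lower())
    # Keep only IPs for which more than one distinct MAC answered.
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}


# Two of the replies from the capture above, used as sample input.
sample = [
    "14:00:32.089740 ARP, Ethernet (len 6), IPv4 (len 4), "
    "Reply 192.168.1.14 is-at 0e:bf:04:b7:ed:52 (oui Unknown), length 28",
    "14:00:32.090141 ARP, Ethernet (len 6), IPv4 (len 4), "
    "Reply 192.168.1.14 is-at 7a:a5:71:63:47:94 (oui Unknown), length 28",
]
for ip, macs in find_conflicting_replies(sample).items():
    print(ip, "is claimed by", ", ".join(macs))
```
</pre>
<p class="MsoNormal">The function accepts any iterable of text lines, so a saved tcpdump log opened as a file could be passed in directly. An IP that only ever gets a single answer is not reported.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>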
<p class="MsoNormal">Impact: Linux guests don't seem to suffer from the bogus ARP entries, so the problem may go unnoticed in a pure Linux environment. Windows guests do, however. They verify IP addresses offered by DHCP against ARP and reject the IP configuration
in case of a conflict. In the example above, any Windows VM offered 192.168.1.13 or 192.168.1.14 will fail to configure its network interface. This is actually how we noticed the issue.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Cheers!<o:p></o:p></p>
<p class="MsoNormal">Michael<o:p></o:p></p>
</div>
</body>
</html>