[Openstack-operators] Networking breaks in CentOS guests but works with Ubuntu guests

Lorin Hochstein lorin at nimbisservices.com
Thu Apr 4 19:55:22 UTC 2013


Yeah, I've only loaded vhost_net on the compute host.

I'm running CentOS 6.3 on my latest test, but I've tried with CentOS 6.4 as
well.

I made some progress today (at least a potential workaround), but permit me
to ramble for a bit. I'm trying to run non-multihost. On each compute node,
eth1 is bridged to br100, and there's no IP address on br100 or eth1.

Packets aren't getting into the VM from outside. If I manually put an IP
address on the guest's eth0 and do an "arping" from the network node, the ARP
request packets appear on vnet1 of the compute host but not on eth0 of the
guest. (Packets do leave, however, so I can do an arping from inside the
guest and the nova-network host will see the request. The same goes for
DHCP. It's like a reverse black hole: things can only go out.)
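
One way to narrow down where the traffic stops is to capture ARP on each hop
of the compute host in turn. This is only a sketch: the helper names
(capture_cmd, trace_arp), the DRY_RUN switch and the five-packet limit are
made up for illustration, and the interface list assumes the
eth1 -> br100 -> vnet1 path described in this message.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the packet limit are illustrative assumptions.
# Interfaces follow the path in this message: physical NIC, bridge, guest tap device.
HOPS="${HOPS:-eth1 br100 vnet1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them (needs root)

# Build the tcpdump command for one interface.
# -n: no name lookups, -e: print MAC addresses, -c 5: stop after five packets
capture_cmd() {
    printf 'tcpdump -n -e -c 5 -i %s arp\n' "$1"
}

# Walk the hops in order so the captures can be compared side by side.
trace_arp() {
    for ifc in $HOPS; do
        cmd=$(capture_cmd "$ifc")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            echo "== $ifc =="
            # Give up after 10 seconds if nothing shows up on this hop.
            timeout 10 $cmd
        fi
    done
}
```

Calling trace_arp with the defaults just prints the three commands; with
DRY_RUN=0 (as root, while the arping runs from the network node) it captures
on each hop in sequence.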

However, if I put an IP address on br100 of the compute host, then the
guest can reach the host on that address.
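
For reference, the bridge test amounts to something like the sketch below.
The address 10.40.0.2/16 is a hypothetical unused address on the instance
subnet, and bridge_ip_cmd and ping_target are illustrative helpers, not
anything nova provides.

```shell
#!/bin/sh
# Sketch only: the address and the helper names are hypothetical.
BRIDGE="${BRIDGE:-br100}"
BRIDGE_ADDR="${BRIDGE_ADDR:-10.40.0.2/16}"   # assumed unused address on the guest subnet

# Print the command that puts the address on the bridge (run the output as root).
bridge_ip_cmd() {
    printf 'ip addr add %s dev %s\n' "$BRIDGE_ADDR" "$BRIDGE"
}

# Strip the prefix length to get the address the guest should ping.
ping_target() {
    printf '%s\n' "${BRIDGE_ADDR%/*}"
}

bridge_ip_cmd
echo "then, inside the guest: ping -c 3 $(ping_target)"
```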

So, it looks like I'm going to have to switch to running multi-host to
resolve this issue, since the VM can communicate directly with a bridge on
the compute host if the bridge has an IP.

Still, it's puzzling to me, and I don't have a sense of how to debug
this further. How do I dig in if the problem is that packets can go from
guest:eth0 to host:vnet1, but they don't go from host:vnet1 to guest:eth0
(when they originate from a different server and travel over layer 2), and
only with a specific image that works for other people?
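
A possible starting point, sketched as a read-only checklist for the host
side of the vnet1 -> guest eth0 hop. None of these is known to be the cause
here; the TAP and BRIDGE defaults and the host_checks helper are assumptions
for illustration.

```shell
#!/bin/sh
# Sketch only: prints inspection commands; run the output as root on the compute host.
# TAP, BRIDGE and the helper name are assumptions taken from this message's setup.
TAP="${TAP:-vnet1}"
BRIDGE="${BRIDGE:-br100}"

# What each printed command shows:
#   ip -s link         per-interface packet, error and drop counters for the tap
#   ethtool -k         offload settings on the tap (checksumming, TSO, GSO, ...)
#   brctl showmacs     MAC addresses the bridge has learned, and on which port
#   ebtables -L --Lc   layer-2 filter rules with per-rule packet counters
#   iptables -S        the current iptables filter rules
host_checks() {
    cat <<EOF
ip -s link show dev $TAP
ethtool -k $TAP
brctl showmacs $BRIDGE
ebtables -t nat -L --Lc
ebtables -t filter -L --Lc
iptables -S
EOF
}

host_checks
```

Re-running the counter commands before and after an arping from the network
node should show whether any drop or rule counter moves with the lost
packets.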

Lorin



On Thu, Apr 4, 2013 at 11:33 AM, Narayan Desai <narayan.desai at gmail.com> wrote:

> iirc, vhost_net is only needed on the host.
>
> We have seen stability issues with 12.04 (only on particular host
> types) when using virtio without vhost_net. Enabling vhost_net on the
> host resolved the issues for us.
>
> Which version of CentOS are you running?
>  -nld
>
> On Wed, Apr 3, 2013 at 3:59 PM, Lorin Hochstein
> <lorin at nimbisservices.com> wrote:
> > That was my instinct, but I've tried it both ways (toggling
> > libvirt_use_virtio_for_bridge, restarting nova-compute, launching a new
> > instance), and vnc'd into the instance to confirm that in one case the
> > virtio_net drivers were loaded and in the other they weren't. The
> > result was the same, so it doesn't seem to be related. It's really
> > baffling.
> >
> > Lorin
> >
> >
> > On Wed, Apr 3, 2013 at 4:47 PM, Joe Topjian <joe.topjian at cybera.ca> wrote:
> >>
> >> That's really bizarre -- especially since it's only CentOS images. Do you
> >> think it might be something with virtio compatibility?
> >>
> >> I'm hesitant to lean on it being a compute/controller issue since other
> >> images work.
> >>
> >>
> >> On Wed, Apr 3, 2013 at 2:41 PM, Lorin Hochstein <lorin at nimbisservices.com> wrote:
> >>>
> >>> I've tested with multiple ones, including the CentOS6 image from that
> >>> page, as well as several we have rolled on our own.
> >>>
> >>> Right now I'm testing by manually putting on the IP by doing:
> >>>
> >>> ip addr add 10.40.0.4/16 broadcast 10.40.255.255 dev eth0
> >>>
> >>> I can't ping out at all. If I try to arping out, and then tcpdump, just
> >>> like in the DHCP case, I can see the ARP requests and replies on vnet0 of
> >>> the host:
> >>>
> >>> root@c220-2:~# tcpdump -i vnet0 arp
> >>> tcpdump: WARNING: vnet0: no IPv4 address assigned
> >>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> >>> 16:34:42.109067 ARP, Request who-has 10.40.0.1 (Broadcast) tell 10.40.0.4, length 28
> >>> 16:34:42.109085 ARP, Request who-has 10.40.0.1 (Broadcast) tell 10.40.0.4, length 28
> >>> 16:34:42.109216 ARP, Reply 10.40.0.1 is-at 54:78:1a:86:50:c9 (oui Unknown), length 46
> >>>
> >>>
> >>> But if I tcpdump on eth0 in the guest, I only see the ARP requests, not
> >>> the replies.
> >>>
> >>>
> >>> Lorin
> >>>
> >>>
> >>> On Wed, Apr 3, 2013 at 4:26 PM, Joe Topjian <joe.topjian at cybera.ca>
> >>> wrote:
> >>>>
> >>>> What CentOS images are you using? These have worked for me:
> >>>>
> >>>> https://github.com/rackerjoe/oz-image-build
> >>>>
> >>>>
> >>>> On Wed, Apr 3, 2013 at 2:13 PM, Lorin Hochstein
> >>>> <lorin at nimbisservices.com> wrote:
> >>>>>
> >>>>> Hi Joe:
> >>>>>
> >>>>> It happens immediately thereafter. CentOS images have never worked on
> >>>>> our setup.
> >>>>>
> >>>>> Lorin
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 3, 2013 at 3:30 PM, Joe Topjian <joe.topjian at cybera.ca>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi Lorin,
> >>>>>>
> >>>>>> Does this happen shortly after the guests were created? Or usually a
> >>>>>> few hours/days later? If the latter, are these guests seeing large
> >>>>>> amounts of bandwidth?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Joe
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Lorin Hochstein
> >>>>>> <lorin at nimbisservices.com> wrote:
> >>>>>>>
> >>>>>>> Hi all:
> >>>>>>>
> >>>>>>> I'm having a strange issue where networking on my CentOS guests
> >>>>>>> isn't working properly, but things are working fine with my Ubuntu
> >>>>>>> guests.
> >>>>>>>
> >>>>>>> I'm running Folsom on Ubuntu 12.04, nova-network, not multi-host.
> >>>>>>>
> >>>>>>> The first symptom is that CentOS instances don't get IP addresses
> >>>>>>> via DHCP. If I trace the DHCP requests and replies using tcpdump, I
> >>>>>>> can see the reply from dnsmasq reach the vnetX interface of the
> >>>>>>> compute host, but it doesn't get to the eth0 interface of the guest.
> >>>>>>> (I'm at a loss here about how to debug something like that.)
> >>>>>>>
> >>>>>>> If I try to statically configure an IP address on the guest
> >>>>>>> instead, networking still doesn't work. I can't ping anything on the
> >>>>>>> subnet, and I don't even see the ICMP traffic on vnetX of the host.
> >>>>>>>
> >>>>>>> I've tried twiddling the following options, with no change in
> >>>>>>> behavior:
> >>>>>>>
> >>>>>>> * Adding the following rule to the nova-network node:
> >>>>>>>   iptables -A POSTROUTING -t mangle -p udp --dport bootpc -j CHECKSUM --checksum-fill
> >>>>>>> * Adding the same rule to the nova-compute node
> >>>>>>> * Setting libvirt_use_virtio_for_bridge to "yes" and "no"
> >>>>>>>   (restarting nova-compute, re-launching instances)
> >>>>>>> * With and without vhost_net loaded on the nova-compute node
> >>>>>>>   (restarting nova-compute, re-launching instances)
> >>>>>>> * Disabling IPv6 inside the CentOS guest
> >>>>>>>
> >>>>>>> Has anybody encountered this before?
> >>>>>>>
> >>>>>>> Lorin
> >>>>>>>
> >>>>>>> --
> >>>>>>> Lorin Hochstein
> >>>>>>> Lead Architect - Cloud Services
> >>>>>>> Nimbis Services, Inc.
> >>>>>>> www.nimbisservices.com
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> OpenStack-operators mailing list
> >>>>>>> OpenStack-operators at lists.openstack.org
> >>>>>>>
> >>>>>>>
> >>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Joe Topjian
> >>>>>> Systems Administrator
> >>>>>> Cybera Inc.
> >>>>>>
> >>>>>> www.cybera.ca
> >>>>>>
> >>>>>> Cybera is a not-for-profit organization that works to spur and support
> >>>>>> innovation, for the economic benefit of Alberta, through the use of
> >>>>>> cyberinfrastructure.
>



-- 
Lorin Hochstein
Lead Architect - Cloud Services
Nimbis Services, Inc.
www.nimbisservices.com