[Openstack-operators] Networking breaks in CentOS guests but works with Ubuntu guests

Narayan Desai narayan.desai at gmail.com
Thu Apr 4 20:27:58 UTC 2013


You might be hitting iptables/ebtables rules.

I don't understand why this would be image specific though.

Can you try generating traffic from the VM (with a static IP, maybe?) and
see which rule counters increment?
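
Something along these lines would show which counters move. This is only a
rough sketch: it assumes root on the compute host and that iptables-save and
ebtables are installed there. The helper names snapshot_rules and
changed_rules are made up for illustration.

```shell
#!/bin/sh
# Rough sketch: find which iptables/ebtables rules the VM's traffic hits.
# snapshot_rules and changed_rules are made-up helper names, not part of
# nova or any package.

# Dump every iptables table plus the ebtables filter and nat tables,
# including per-rule packet/byte counters. Needs root.
snapshot_rules() {
    iptables-save -c
    ebtables -t filter -L --Lc
    ebtables -t nat -L --Lc
}

# Print the lines of the second snapshot that are new or changed
# relative to the first, i.e. the rules whose counters moved.
# Comment lines (iptables-save timestamps) are skipped.
changed_rules() {
    # diff exits non-zero when the files differ; that is expected here.
    { diff "$1" "$2" || true; } | sed -n -e '/^> #/d' -e 's/^> //p'
}

# Intended use on the compute host (left commented out on purpose):
#   snapshot_rules > /tmp/before.txt
#   ... arping or ping from inside the guest ...
#   snapshot_rules > /tmp/after.txt
#   changed_rules /tmp/before.txt /tmp/after.txt
```

Any other traffic through the host bumps counters too, so this is easiest to
read on a quiet box with only the one instance running.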
 -nld

On Thu, Apr 4, 2013 at 2:55 PM, Lorin Hochstein
<lorin at nimbisservices.com> wrote:
> Yeah, I've only loaded vhost_net on the compute host.
>
> I'm running CentOS 6.3 on my latest test, but I've tried with CentOS 6.4 as
> well.
>
> I made some progress today (at least a potential workaround), but permit me
> to ramble for a bit. I'm trying to run non-multihost. eth1 on each compute
> node is bridged to br100, and there's no IP address on br100 or eth1.
>
> Packets aren't getting into the VM from outside. If I manually put an IP
> address on the guest and do an "arping" from the network node, the ARP request
> packets appear on vnet1 of the compute host but not on eth0 of the guest.
> (Packets do leave, however, so I can do an arping from inside the guest and
> the nova-network host will see the request, just as with DHCP. It's like a
> reverse black hole: things can only go out.)
>
> However, if I put an IP address on br100 of the compute host, then the guest
> can reach the host on that address.
>
> So, it looks like I'm going to have to switch to running multi-host to
> resolve this issue, since the VM can communicate directly with a bridge on
> the compute host if it has an IP.
>
> Still, it's puzzling to me, and I don't have a sense about how to debug this
> further. How do I dig in if the problem is that packets can go from
> guest:eth0 to host:vnet1, but they don't go from host:vnet1 to guest:eth0
> (when they originate from a different server and travel over layer 2), and
> only with a specific image that works for other people?
>
> Lorin
>
>
>
> On Thu, Apr 4, 2013 at 11:33 AM, Narayan Desai <narayan.desai at gmail.com>
> wrote:
>>
>> iirc, vhost_net is only needed on the host.
>>
>> We have seen stability issues with 12.04 (only on particular host
>> types) when using virtio without vhost_net. Enabling vhost_net on the
>> host resolved the issues for us.
>>
>> Which version of CentOS are you running?
>>  -nld
>>
>> On Wed, Apr 3, 2013 at 3:59 PM, Lorin Hochstein
>> <lorin at nimbisservices.com> wrote:
>> > That was my instinct, but I've tried it both ways (toggling
>> > libvirt_use_virtio_for_bridge, restarting nova-compute, launching a new
>> > instance), and VNC'd into the instance to confirm that in one case the
>> > virtio_net drivers were loaded and in the other they weren't. The result
>> > was the same either way, so it doesn't seem to be related. It's really
>> > baffling.
>> >
>> > Lorin
>> >
>> >
>> > On Wed, Apr 3, 2013 at 4:47 PM, Joe Topjian <joe.topjian at cybera.ca>
>> > wrote:
>> >>
>> >> That's really bizarre -- especially since it's only CentOS images. Do you
>> >> think it might be something with virtio compatibility?
>> >>
>> >> I'm hesitant to lean on it being a compute/controller issue since other
>> >> images work.
>> >>
>> >>
>> >> On Wed, Apr 3, 2013 at 2:41 PM, Lorin Hochstein
>> >> <lorin at nimbisservices.com>
>> >> wrote:
>> >>>
>> >>> I've tested with multiple ones, including the CentOS6 image from that
>> >>> page, as well as several we have rolled on our own.
>> >>>
>> >>> Right now I'm testing by manually assigning the IP with:
>> >>>
>> >>> ip addr add 10.40.0.4/16 broadcast 10.40.255.255 dev eth0
>> >>>
>> >>> I can't ping out at all. If I try to arping out and then tcpdump, just
>> >>> like in the DHCP case, I can see the ARP requests and replies on vnet0
>> >>> of the host:
>> >>>
>> >>> root@c220-2:~# tcpdump -i vnet0 arp
>> >>> tcpdump: WARNING: vnet0: no IPv4 address assigned
>> >>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> >>> 16:34:42.109067 ARP, Request who-has 10.40.0.1 (Broadcast) tell 10.40.0.4, length 28
>> >>> 16:34:42.109085 ARP, Request who-has 10.40.0.1 (Broadcast) tell 10.40.0.4, length 28
>> >>> 16:34:42.109216 ARP, Reply 10.40.0.1 is-at 54:78:1a:86:50:c9 (oui Unknown), length 46
>> >>>
>> >>>
>> >>> But if I tcpdump on eth0 in the guest, I only see the ARP requests, not
>> >>> the replies.
>> >>>
>> >>>
>> >>> Lorin
>> >>>
>> >>>
>> >>> On Wed, Apr 3, 2013 at 4:26 PM, Joe Topjian <joe.topjian at cybera.ca>
>> >>> wrote:
>> >>>>
>> >>>> What CentOS images are you using? These have worked for me:
>> >>>>
>> >>>> https://github.com/rackerjoe/oz-image-build
>> >>>>
>> >>>>
>> >>>> On Wed, Apr 3, 2013 at 2:13 PM, Lorin Hochstein
>> >>>> <lorin at nimbisservices.com> wrote:
>> >>>>>
>> >>>>> Hi Joe:
>> >>>>>
>> >>>>> It happens immediately thereafter. CentOS images have never worked on
>> >>>>> our setup.
>> >>>>>
>> >>>>> Lorin
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Apr 3, 2013 at 3:30 PM, Joe Topjian <joe.topjian at cybera.ca>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Hi Lorin,
>> >>>>>>
>> >>>>>> Does this happen shortly after the guests were created? Or usually a
>> >>>>>> few hours/days later? If the latter, are these guests seeing large
>> >>>>>> amounts of bandwidth?
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Joe
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Lorin Hochstein
>> >>>>>> <lorin at nimbisservices.com> wrote:
>> >>>>>>>
>> >>>>>>> Hi all:
>> >>>>>>>
>> >>>>>>> I'm having a strange issue where networking on my CentOS guests isn't
>> >>>>>>> working properly, but things are working fine with my Ubuntu guests.
>> >>>>>>>
>> >>>>>>> I'm running Folsom on Ubuntu 12.04, nova-network, not multi-host.
>> >>>>>>>
>> >>>>>>> The first symptom is that CentOS instances don't get IP addresses via
>> >>>>>>> DHCP. If I trace the DHCP requests and replies using tcpdump, I can see
>> >>>>>>> the reply from dnsmasq reach the vnetX interface of the compute host,
>> >>>>>>> but it doesn't get to the eth0 interface of the guest. (I'm at a loss
>> >>>>>>> here about how to debug something like that.)
>> >>>>>>>
>> >>>>>>> If I try to statically configure an IP address on the guest instead,
>> >>>>>>> networking still doesn't work. I can't ping anything on the subnet, and
>> >>>>>>> I don't even see the ICMP traffic on vnetX of the host.
>> >>>>>>>
>> >>>>>>> I've tried twiddling the following options, with no change in
>> >>>>>>> behavior:
>> >>>>>>>
>> >>>>>>> * Adding the following rule to the nova-network node:
>> >>>>>>>   iptables -A POSTROUTING -t mangle -p udp --dport bootpc -j CHECKSUM --checksum-fill
>> >>>>>>> * Adding the same rule to the nova-compute node
>> >>>>>>> * Setting libvirt_use_virtio_for_bridge to "yes" and "no" (restarting
>> >>>>>>>   nova-compute, re-launching instances)
>> >>>>>>> * With and without vhost_net loaded on the nova-compute node (restarting
>> >>>>>>>   nova-compute, re-launching instances)
>> >>>>>>> * Disabling IPv6 inside of the CentOS guest
>> >>>>>>>
>> >>>>>>> Has anybody encountered this before?
>> >>>>>>>
>> >>>>>>> Lorin
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Lorin Hochstein
>> >>>>>>> Lead Architect - Cloud Services
>> >>>>>>> Nimbis Services, Inc.
>> >>>>>>> www.nimbisservices.com
>> >>>>>>>
>> >>>>>>> _______________________________________________
>> >>>>>>> OpenStack-operators mailing list
>> >>>>>>> OpenStack-operators at lists.openstack.org
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Joe Topjian
>> >>>>>> Systems Administrator
>> >>>>>> Cybera Inc.
>> >>>>>>
>> >>>>>> www.cybera.ca
>> >>>>>>
>> >>>>>> Cybera is a not-for-profit organization that works to spur and
>> >>>>>> support
>> >>>>>> innovation, for the economic benefit of Alberta, through the use of
>> >>>>>> cyberinfrastructure.


