[Openstack] VM can't ping self floating IP after a snapshot is taken

Sam Su susltd.su at gmail.com
Fri Aug 24 17:37:17 UTC 2012


Hi,

I also reported this bug:
 https://bugs.launchpad.net/nova/+bug/1040255

If someone can combine your solutions into a complete fix for this bug, that
would be great.

BRs,
Sam

On Thu, Aug 23, 2012 at 9:27 PM, heut2008 <heut2008 at gmail.com> wrote:

> This bug has been filed here: https://bugs.launchpad.net/nova/+bug/1040537
>
> 2012/8/24 Vishvananda Ishaya <vishvananda at gmail.com>:
> > +1 to this. Evan, can you report a bug (if one hasn't been reported yet)
> and
> > propose the fix? Or else I can find someone else to propose it.
> >
> > Vish
> >
> > On Aug 23, 2012, at 1:38 PM, Evan Callicoat <diopter at gmail.com> wrote:
> >
> > Hello all!
> >
> > I'm the original author of the hairpin patch, and things have changed a
> > little bit in Essex and Folsom from the original Diablo target. I believe I
> > can shed some light on what should be done here to solve the issue in
> > either case.
> >
> > ---
> > For Essex (stable/essex), in nova/virt/libvirt/connection.py:
> > ---
> >
> > Currently _enable_hairpin() is only being called from spawn(). However,
> > spawn() is not the only place that vifs (veth#) get added to a bridge
> > (which is when we need to enable hairpin_mode on them). The more relevant
> > function is _create_new_domain(), which is called from spawn() and other
> > places. Without changing the information that gets passed to
> > _create_new_domain() (which is just 'xml' from to_xml()), we can easily
> > rewrite the first 2 lines in _enable_hairpin(), as follows:
> >
> > def _enable_hairpin(self, xml):
> >     interfaces = self.get_interfaces(xml['name'])
> >
> > Then, we can move the self._enable_hairpin(instance) call from spawn() up
> > into _create_new_domain(), and pass it xml as follows:
> >
> > [...]
> > self._enable_hairpin(xml)
> > return domain
> >
> > This will run the hairpin code every time a domain gets created, which is
> > also when the domain's vif(s) get inserted into the bridge with the
> > default of hairpin_mode=0.
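> >
> > (For completeness, here is a rough sketch of the whole method after that
> > change; the loop body is unchanged from the bug-933640 fix, quoted from
> > memory, so double-check it against your tree:)
> >
> >     def _enable_hairpin(self, xml):
> >         # xml['name'] carries the same value the old 'instance' argument
> >         # supplied, i.e. the libvirt domain name
> >         interfaces = self.get_interfaces(xml['name'])
> >         for interface in interfaces:
> >             # write 1 into the bridge port's hairpin_mode for each vif
> >             utils.execute('tee',
> >                           '/sys/class/net/%s/brport/hairpin_mode' % interface,
> >                           process_input='1',
> >                           run_as_root=True,
> >                           check_exit_code=[0, 1])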
> >
> > ---
> > For Folsom (trunk), in nova/virt/libvirt/driver.py:
> > ---
> >
> > There've been a lot more changes made here, but the same strategy as
> > above should work. Here, _create_new_domain() has been split into
> > _create_domain() and _create_domain_and_network(), and _enable_hairpin()
> > was moved from spawn() to _create_domain_and_network(), which seems like
> > it'd be the right thing to do, but doesn't quite cover all of the cases of
> > vif reinsertion, since _create_domain() is the only function which actually
> > creates the domain (_create_domain_and_network() just calls it after doing
> > some pre-work). The solution here is likewise fairly simple; make the same
> > 2 changes to _enable_hairpin():
> >
> > def _enable_hairpin(self, xml):
> >     interfaces = self.get_interfaces(xml['name'])
> >
> > And move it from _create_domain_and_network() to _create_domain(), like
> > before:
> >
> > [...]
> > self._enable_hairpin(xml)
> > return domain
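> >
> > (To put the two pieces side by side, here is a sketch only, with [...]
> > standing in for unchanged code and the method signatures omitted:)
> >
> >     # tail of _create_domain_and_network(): the existing hairpin call
> >     # comes out of here...
> >     [...]
> >     domain = self._create_domain(xml)
> >     self.firewall_driver.apply_instance_filter(instance, network_info)
> >     return domain
> >
> >     # ...and goes into the tail of _create_domain(), right after the guest
> >     # is (re)created, so every caller that re-plugs vifs into the bridge
> >     # gets hairpin_mode set again
> >     [...]
> >     self._enable_hairpin(xml)
> >     return domain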
> >
> > I haven't yet tested this on my Essex clusters and I don't have a Folsom
> > cluster handy at present, but the change is simple and makes sense.
> > Looking at to_xml() and _prepare_xml_info(), it appears that the 'xml'
> > variable _create_[new_]domain() gets is just a Python dictionary, and
> > xml['name'] = instance['name'], exactly what _enable_hairpin() was using
> > the 'instance' variable for previously.
> >
> > Let me know if this works, or doesn't work, or doesn't make sense, or if
> > you need an address to send gifts, etc. Hope it's solved!
> >
> > -Evan
> >
> > On Thu, Aug 23, 2012 at 11:20 AM, Sam Su <susltd.su at gmail.com> wrote:
> >>
> >> Hi Oleg,
> >>
> >> Thank you for your investigation. Good luck!
> >>
> >> Can you let me know if you find out how to fix the bug?
> >>
> >> Thanks,
> >> Sam
> >>
> >> On Wed, Aug 22, 2012 at 12:50 PM, Oleg Gelbukh <ogelbukh at mirantis.com>
> >> wrote:
> >>>
> >>> Hello,
> >>>
> >>> Is it possible that, during snapshotting, libvirt simply tears down the
> >>> virtual interface at some point and then re-creates it, with
> >>> hairpin_mode disabled again?
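> >>>
> >>> (A quick way to check this hypothesis, as a small Python 2 sketch,
> >>> assuming the bridge is br1000 as in Sam's workaround: dump hairpin_mode
> >>> for every port on the bridge before and after the snapshot and see
> >>> whether any port flips back to 0.)
> >>>
> >>> import glob
> >>>
> >>> for path in glob.glob('/sys/class/net/br1000/brif/*/hairpin_mode'):
> >>>     with open(path) as f:
> >>>         print path, f.read().strip()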
> >>> This bugfix [https://bugs.launchpad.net/nova/+bug/933640] implies that
> >>> the fix only runs when an instance is spawned. This means that upon
> >>> resume after a snapshot, hairpin is not restored. Maybe if we insert the
> >>> _enable_hairpin() call into the snapshot procedure, it will help.
> >>> We're currently investigating this issue in one of our environments and
> >>> hope to come up with an answer by tomorrow.
> >>>
> >>> --
> >>> Best regards,
> >>> Oleg
> >>>
> >>> On Wed, Aug 22, 2012 at 11:29 PM, Sam Su <susltd.su at gmail.com> wrote:
> >>>>
> >>>> My friend found a way to make the VM able to ping itself again when
> >>>> this problem happens, but has not found why it happens:
> >>>>
> >>>> echo 1 | sudo tee \
> >>>>     /sys/class/net/br1000/brif/<virtual-interface-name>/hairpin_mode
> >>>>
> >>>> I filed a ticket to report this problem:
> >>>> https://bugs.launchpad.net/nova/+bug/1040255
> >>>>
> >>>> Hopefully someone can find out why this happens and solve it.
> >>>>
> >>>> Thanks,
> >>>> Sam
> >>>>
> >>>>
> >>>> On Fri, Jul 20, 2012 at 3:50 PM, Gabriel Hurley
> >>>> <Gabriel.Hurley at nebula.com> wrote:
> >>>>>
> >>>>> I ran into some similar issues with the _enable_hairpin() call. The
> >>>>> call is allowed to fail silently and (in my case) was failing. I
> >>>>> couldn’t for the life of me figure out why, though, and since I’m
> >>>>> really not a networking person I didn’t trace it along too far.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Just thought I’d share my similar pain.
> >>>>>
> >>>>>
> >>>>>
> >>>>> -          Gabriel
> >>>>>
> >>>>>
> >>>>>
> >>>>> From: openstack-bounces+gabriel.hurley=nebula.com at lists.launchpad.net
> >>>>> [mailto:openstack-bounces+gabriel.hurley=nebula.com at lists.launchpad.net]
> >>>>> On Behalf Of Sam Su
> >>>>> Sent: Thursday, July 19, 2012 11:50 AM
> >>>>> To: Brian Haley
> >>>>> Cc: openstack
> >>>>> Subject: Re: [Openstack] VM can't ping self floating IP after a
> >>>>> snapshot is taken
> >>>>>
> >>>>>
> >>>>>
> >>>>> Thank you for your support.
> >>>>>
> >>>>>
> >>>>>
> >>>>> I checked the file nova/virt/libvirt/connection.py; the call
> >>>>> self._enable_hairpin(instance) has already been added to the function
> >>>>> _hard_reboot().
> >>>>>
> >>>>> It looks like there is some difference between taking a snapshot and
> >>>>> rebooting an instance. I tried to figure out how to fix this bug but
> >>>>> failed.
> >>>>>
> >>>>>
> >>>>>
> >>>>> It would be much appreciated if anyone could give some hints.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Sam
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Jul 19, 2012 at 8:37 AM, Brian Haley <brian.haley at hp.com>
> >>>>> wrote:
> >>>>>
> >>>>> On 07/17/2012 05:56 PM, Sam Su wrote:
> >>>>> > Hi,
> >>>>> >
> >>>>> > This always happens in the Essex release. After I take a snapshot of
> >>>>> > my VM (I tried Ubuntu 12.04 or CentOS 5.8), the VM can't ping its own
> >>>>> > floating IP; before I take the snapshot, though, the VM can ping its
> >>>>> > own floating IP.
> >>>>> >
> >>>>> > This looks closely related to
> >>>>> > https://bugs.launchpad.net/nova/+bug/933640, but is
> >>>>> > still a little different. In 933640, it sounds like the VM can't ping
> >>>>> > its own floating IP regardless of whether a snapshot is taken or not.
> >>>>> >
> >>>>> > Any suggestions for an easy fix? And what is the root cause of the
> >>>>> > problem?
> >>>>>
> >>>>> It might be because there's a missing _enable_hairpin() call in the
> >>>>> reboot() function. Try something like this...
> >>>>>
> >>>>> nova/virt/libvirt/connection.py, _hard_reboot():
> >>>>>
> >>>>>              self._create_new_domain(xml)
> >>>>> +            self._enable_hairpin(instance)
> >>>>>              self.firewall_driver.apply_instance_filter(instance, network_info)
> >>>>>
> >>>>> At least that's what I remember doing myself recently when testing
> >>>>> after a reboot; I don't know about snapshot.
> >>>>>
> >>>>> Folsom has changed enough that something different would need to be
> >>>>> done there.
> >>>>>
> >>>>> -Brian
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack at lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>

