[Openstack-operators] Neutron DVR HA

Pedro Sousa pgsousa at gmail.com
Tue Dec 30 11:35:27 UTC 2014


Hi Assaf,

Following your instructions, I can confirm that I have l2pop disabled.
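
For anyone who wants to double-check the same thing, here is a minimal sketch of how the two settings could be verified with a script. It is not part of my actual setup. The section and option names (`[ml2] mechanism_drivers` on the Neutron server side and `[agent] l2_population` on the OVS agent side) are my assumption of the usual ML2/OVS layout, and the sample texts stand in for the real config files, whose paths vary by distribution.

```python
import configparser

# Hypothetical sample contents standing in for the real config files.
SERVER_SAMPLE = """
[ml2]
mechanism_drivers = openvswitch
"""

AGENT_SAMPLE = """
[agent]
l2_population = False
"""


def l2pop_status(server_conf_text, agent_conf_text):
    """Return (driver_listed, agent_enabled) for the given config texts.

    driver_listed: True if 'l2population' appears in [ml2] mechanism_drivers.
    agent_enabled: True if [agent] l2_population is set to a true value.
    Option names are assumptions, not taken from the thread.
    """
    # strict=False tolerates duplicate options; interpolation=None keeps '%' literal.
    server = configparser.ConfigParser(strict=False, interpolation=None)
    server.read_string(server_conf_text)
    drivers = server.get("ml2", "mechanism_drivers", fallback="")
    driver_listed = "l2population" in [d.strip() for d in drivers.split(",")]

    agent = configparser.ConfigParser(strict=False, interpolation=None)
    agent.read_string(agent_conf_text)
    # A missing option is treated as disabled here (assumed default).
    agent_enabled = agent.getboolean("agent", "l2_population", fallback=False)

    return driver_listed, agent_enabled


if __name__ == "__main__":
    listed, enabled = l2pop_status(SERVER_SAMPLE, AGENT_SAMPLE)
    print("l2population in mechanism_drivers:", listed)
    print("l2_population enabled on agent:", enabled)
```

Both values should be False on every node if l2pop is really off. To use it against a live system, read the server and agent config files into strings and pass them in.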

Meanwhile, I ran another test. When I left the office yesterday it wasn't
working, but when I arrived this morning it was pinging again, and I hadn't
changed or touched anything. So my interpretation is that this is some sort
of timeout issue.

Thanks

On Tue, Dec 30, 2014 at 11:27 AM, Assaf Muller <amuller at redhat.com> wrote:

> Sorry I can't open zip files on this email. You need l2pop to not exist
> in the ML2 mechanism drivers list in neutron.conf where the Neutron server
> is, and you need l2population = False in each OVS agent.
>
> ----- Original Message -----
> >
> > Hi Asaf,
> >
> > I think I disabled it, but maybe you can check my conf files? I've
> > attached the zip.
> >
> > Thanks
> >
> > On Tue, Dec 30, 2014 at 8:27 AM, Assaf Muller < amuller at redhat.com > wrote:
> >
> > ----- Original Message -----
> > > Hi Britt,
> > >
> > > some update on this after running tcpdump:
> > >
> > > > I have the keepalived master running on controller01. If I reboot this
> > > > server, it fails over to controller02, which becomes the keepalived
> > > > master, and I then see ping packets arriving at controller02. This is good.
> > > >
> > > > However, when controller01 comes back online, I see that ping requests stop
> > > > being forwarded to controller02 and start being sent to controller01, which
> > > > is now in backup state, so it stops working.
> > >
> >
> > If traffic is being forwarded to a backup node, that sounds like L2pop is on.
> > Is that true by chance?
> >
> > > Any hint for this?
> > >
> > > Thanks
> > >
> > >
> > >
> > > On Mon, Dec 29, 2014 at 11:06 AM, Pedro Sousa < pgsousa at gmail.com >
> wrote:
> > >
> > >
> > >
> > > Yes,
> > >
> > > I was using l2pop, disabled it, but the issue remains.
> > >
> > > > I also stopped the "bogus VRRP" messages by configuring a user/password
> > > > for keepalived, but when I reboot the servers, I see the keepalived
> > > > process running on them and yet I cannot ping the virtual router IP
> > > > address anymore.
> > > >
> > > > So I rebooted the node that is running keepalived as master, and it starts
> > > > pinging again, but when that node comes back online, everything stops
> > > > working. Has anyone experienced this?
> > >
> > > Thanks
> > >
> > >
> > > On Tue, Dec 23, 2014 at 5:03 PM, David Martin < dmartls1 at gmail.com >
> wrote:
> > >
> > >
> > >
> > > > Are you using l2pop? Until https://bugs.launchpad.net/neutron/+bug/1365476
> > > > is fixed it's pretty broken.
> > >
> > > On Tue, Dec 23, 2014 at 10:48 AM, Britt Houser (bhouser) <
> > > bhouser at cisco.com
> > > > wrote:
> > >
> > >
> > >
> > > > Unfortunately I've not had a chance yet to play with neutron router HA, so
> > > > no hints from me. =( Can you give a little more detail about "it stops
> > > > working"? I.e., do you see packets dropped while controller1 is down? Do
> > > > packets begin flowing before controller1 comes back online? Does
> > > > controller1 come back online successfully? Do packets begin to flow after
> > > > controller1 comes back online? Perhaps that will help.
> > >
> > > Thx,
> > > britt
> > >
> > > From: Pedro Sousa < pgsousa at gmail.com >
> > > Date: Tuesday, December 23, 2014 at 11:14 AM
> > > To: Britt Houser < bhouser at cisco.com >
> > > Cc: " OpenStack-operators at lists.openstack.org " <
> > > OpenStack-operators at lists.openstack.org >
> > > Subject: Re: [Openstack-operators] Neutron DVR HA
> > >
> > > I understand Britt, thanks.
> > >
> > > > So I disabled DVR and tried to test L3_HA, but it's not working properly;
> > > > it seems to be a keepalived issue. I see that it's running on 3 nodes:
> > >
> > > > [root@controller01 keepalived]# neutron l3-agent-list-hosting-router harouter
> > > > +--------------------------------------+--------------+----------------+-------+
> > > > | id                                   | host         | admin_state_up | alive |
> > > > +--------------------------------------+--------------+----------------+-------+
> > > > | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> > > > | 58ff7c42-7e71-4750-9f05-61ad5fbc5776 | compute03    | True           | :-)   |
> > > > | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> > > > +--------------------------------------+--------------+----------------+-------+
> > >
> > > > However, if I reboot one of the l3-agent nodes it stops working. I see this
> > > > in the logs:
> > >
> > > > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: ip address associated with VRID not present in received packet : 172.16.28.20
> > > > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: one or more VIP associated with VRID mismatch actual MASTER advert
> > > > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: bogus VRRP packet received on ha-a509de81-1c !!!
> > > > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: VRRP_Instance(VR_1) ignoring received advertisment...
> > > >
> > > > Dec 23 16:13:10 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
> > > > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: ip address associated with VRID not present in received packet : 172.16.28.20
> > > > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: one or more VIP associated with VRID mismatch actual MASTER advert
> > > > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: bogus VRRP packet received on ha-d5718741-ef !!!
> > > > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
> > >
> > > Any hint?
> > >
> > > Thanks
> > >
> > >
> > >
> > > On Tue, Dec 23, 2014 at 3:17 PM, Britt Houser (bhouser) <
> bhouser at cisco.com
> > > >
> > > wrote:
> > >
> > >
> > >
> > > Currently HA and DVR are mutually exclusive features.
> > >
> > > From: Pedro Sousa < pgsousa at gmail.com >
> > > Date: Tuesday, December 23, 2014 at 9:42 AM
> > > To: " OpenStack-operators at lists.openstack.org " <
> > > OpenStack-operators at lists.openstack.org >
> > > Subject: [Openstack-operators] Neutron DVR HA
> > >
> > > Hi all,
> > >
> > > > I've been trying Neutron DVR with 2 controllers + 2 computes. When I create
> > > > a router I can see that it is running on all the servers:
> > >
> > > > [root@controller01 ~]# neutron l3-agent-list-hosting-router router
> > > > +--------------------------------------+--------------+----------------+-------+
> > > > | id                                   | host         | admin_state_up | alive |
> > > > +--------------------------------------+--------------+----------------+-------+
> > > > | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> > > > | 0ca01d56-b6dd-483d-9c49-cc7209da2a5a | controller02 | True           | :-)   |
> > > > | 52379f0f-9046-4b73-9d87-bab7f96be5e7 | compute01    | True           | :-)   |
> > > > | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> > > > +--------------------------------------+--------------+----------------+-------+
> > >
> > > > However, if the controller01 server dies I cannot ping the external gateway
> > > > IP anymore. Is this the expected behavior? Shouldn't it fail over to the
> > > > other controller node?
> > >
> > > Thanks
> > >
> > >
> > > _______________________________________________
> > > OpenStack-operators mailing list
> > > OpenStack-operators at lists.openstack.org
> > >
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > OpenStack-operators mailing list
> > > OpenStack-operators at lists.openstack.org
> > >
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> > >
> >
> >
>