[Openstack-operators] Neutron DVR HA

Assaf Muller amuller at redhat.com
Tue Dec 30 11:27:17 UTC 2014


Sorry, I can't open zip files in this email client. You need l2pop to be
absent from the ML2 mechanism drivers list in neutron.conf where the
Neutron server runs, and you need l2_population = False in each OVS agent.
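For reference, a sketch of roughly what that looks like. The file paths and
section names here are assumptions based on a typical Juno-era deployment, so
double-check them against your install:

```ini
# /etc/neutron/plugins/ml2/ml2_conf.ini on the Neutron server
# (assumed path; some deployments fold these options into neutron.conf)
[ml2]
# l2population removed from the mechanism drivers list:
mechanism_drivers = openvswitch

# OVS agent configuration on each network/compute node
# (e.g. /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini)
[agent]
l2_population = False
```

Restart neutron-server and each neutron-openvswitch-agent after changing these.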

----- Original Message -----
> 
> [Text File:warning1.txt]
> 
> Hi Assaf,
> 
> I think I disabled it, but maybe you can check my conf files? I've attached
> the zip.
> 
> Thanks
> 
> On Tue, Dec 30, 2014 at 8:27 AM, Assaf Muller < amuller at redhat.com > wrote:
> 
> 
> 
> 
> ----- Original Message -----
> > Hi Britt,
> > 
> > some update on this after running tcpdump:
> > 
> > I have the keepalived master running on controller01. If I reboot this
> > server, it fails over to controller02, which becomes the keepalived master,
> > and then I see ping packets arriving at controller02; this is good.
> > 
> > However, when controller01 comes back online, I see that the ping requests
> > stop being forwarded to controller02 and start being sent to controller01,
> > which is now in the backup state, so it stops working.
> > 
> 
> If traffic is being forwarded to a backup node, that sounds like L2pop is on.
> Is that true by chance?
> 
> > Any hint for this?
> > 
> > Thanks
> > 
> > 
> > 
> > On Mon, Dec 29, 2014 at 11:06 AM, Pedro Sousa < pgsousa at gmail.com > wrote:
> > 
> > 
> > 
> > Yes,
> > 
> > I was using l2pop and disabled it, but the issue remains.
> > 
> > I also stopped the "bogus VRRP" messages by configuring a user/password for
> > keepalived, but when I reboot the servers, I see the keepalived process
> > running on them, yet I cannot ping the virtual router IP address anymore.
> > 
> > So I rebooted the node that was running keepalived as master, and pinging
> > starts again, but when that node comes back online, everything stops
> > working. Has anyone experienced this?
> > 
> > Thanks
> > 
> > 
> > On Tue, Dec 23, 2014 at 5:03 PM, David Martin < dmartls1 at gmail.com > wrote:
> > 
> > 
> > 
> > Are you using l2pop? Until https://bugs.launchpad.net/neutron/+bug/1365476
> > is fixed, it's pretty broken.
> > 
> > On Tue, Dec 23, 2014 at 10:48 AM, Britt Houser (bhouser) <
> > bhouser at cisco.com
> > > wrote:
> > 
> > 
> > 
> > Unfortunately I've not had a chance yet to play with Neutron router HA, so
> > no hints from me. =( Can you give a few more details about "it stops
> > working"? I.e., do you see packets dropped while controller01 is down? Do
> > packets begin flowing before controller01 comes back online? Does
> > controller01 come back online successfully? Do packets begin to flow after
> > controller01 comes back online? Perhaps that will help.
> > 
> > Thx,
> > britt
> > 
> > From: Pedro Sousa < pgsousa at gmail.com >
> > Date: Tuesday, December 23, 2014 at 11:14 AM
> > To: Britt Houser < bhouser at cisco.com >
> > Cc: " OpenStack-operators at lists.openstack.org " <
> > OpenStack-operators at lists.openstack.org >
> > Subject: Re: [Openstack-operators] Neutron DVR HA
> > 
> > I understand Britt, thanks.
> > 
> > So I disabled DVR and tried to test L3 HA, but it's not working properly;
> > it seems to be a keepalived issue. I can see that it's running on 3 nodes:
> > 
> > [root at controller01 keepalived]# neutron l3-agent-list-hosting-router
> > harouter
> > +--------------------------------------+--------------+----------------+-------+
> > | id                                   | host         | admin_state_up | alive |
> > +--------------------------------------+--------------+----------------+-------+
> > | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> > | 58ff7c42-7e71-4750-9f05-61ad5fbc5776 | compute03    | True           | :-)   |
> > | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> > +--------------------------------------+--------------+----------------+-------+
> > 
> > However, if I reboot one of the l3-agent nodes, it stops working. I see
> > this in the logs:
> > 
> > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: ip address associated with VRID not present in received packet : 172.16.28.20
> > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: one or more VIP associated with VRID mismatch actual MASTER advert
> > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: bogus VRRP packet received on ha-a509de81-1c !!!
> > Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: VRRP_Instance(VR_1) ignoring received advertisment...
> > 
> > Dec 23 16:13:10 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
> > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: ip address associated with VRID not present in received packet : 172.16.28.20
> > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: one or more VIP associated with VRID mismatch actual MASTER advert
> > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: bogus VRRP packet received on ha-d5718741-ef !!!
> > Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
> > 
> > Any hint?
> > 
> > Thanks
> > 
> > 
> > 
> > On Tue, Dec 23, 2014 at 3:17 PM, Britt Houser (bhouser) < bhouser at cisco.com
> > >
> > wrote:
> > 
> > 
> > 
> > Currently HA and DVR are mutually exclusive features.
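For reference, the two features are enabled by separate server-side switches,
and at this point only one of them can apply to a given router; a sketch,
assuming Juno option names:

```ini
# /etc/neutron/neutron.conf on the Neutron server: pick one of the two
[DEFAULT]
router_distributed = True   # new routers are created as DVR
# or
l3_ha = True                # new routers are created as VRRP-based HA
```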
> > 
> > From: Pedro Sousa < pgsousa at gmail.com >
> > Date: Tuesday, December 23, 2014 at 9:42 AM
> > To: " OpenStack-operators at lists.openstack.org " <
> > OpenStack-operators at lists.openstack.org >
> > Subject: [Openstack-operators] Neutron DVR HA
> > 
> > Hi all,
> > 
> > I've been trying Neutron DVR with 2 controllers + 2 computes. When I create
> > a router, I can see that it is running on all the servers:
> > 
> > [root at controller01 ~]# neutron l3-agent-list-hosting-router router
> > +--------------------------------------+--------------+----------------+-------+
> > | id                                   | host         | admin_state_up | alive |
> > +--------------------------------------+--------------+----------------+-------+
> > | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> > | 0ca01d56-b6dd-483d-9c49-cc7209da2a5a | controller02 | True           | :-)   |
> > | 52379f0f-9046-4b73-9d87-bab7f96be5e7 | compute01    | True           | :-)   |
> > | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> > +--------------------------------------+--------------+----------------+-------+
> > 
> > However, if the controller01 server dies, I cannot ping the external
> > gateway IP anymore. Is this the expected behavior? Shouldn't it fail over
> > to another controller node?
> > 
> > Thanks
> > 
> > 
> > _______________________________________________
> > OpenStack-operators mailing list
> > OpenStack-operators at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> > 
> > 
> > 
> > 
> > 
> > 
> 
> 
