[openstack][neutron][nova][kolla-ansible] instance cannot ping after live migrate

Satish Patel satish.txt at gmail.com
Mon Jul 31 13:12:43 UTC 2023


Hi Slawek,

Are you suggesting not to use the OVS-based native firewall?

On Mon, Jul 31, 2023 at 3:12 AM Slawek Kaplonski <skaplons at redhat.com>
wrote:

> Hi,
>
> On Sunday, 30 July 2023 17:00:22 CEST, Nguyễn Hữu Khôi wrote:
>
> > Hello.
>
> > Is it OK if we use OVS with the native firewall driver, by which I mean
>
> > without OVN? And how about migration from OVS to OVN?
>
> Regarding migration from the ML2/OVS to the ML2/OVN backend, it's easier to do
> when you are using ML2/OVS with the openvswitch (native) firewall driver, as in
> that case the plugging of the VMs into br-int will be the same before and after
> the migration.
>
> >
> > Nguyen Huu Khoi
> >
> > On Sun, Jul 30, 2023 at 8:26 AM Satish Patel <satish.txt at gmail.com> wrote:
> >
> > > The iptables + linuxbridge integration with OVS is very old, and OVS ACLs
>
> > > were not mature enough in the early days. But nowadays OVN supports
>
> > > OVS-based ACLs, which means it's much more stable.
>
> I'm not sure, but I think a few things are being mixed up here. Generally in
> Neutron we have "backends" like ML2/OVS (neutron-openvswitch-agent) or
> ML2/OVN (with ovn-controller running on compute nodes). There are more
> backends, like ML2/Linuxbridge for example, but let's not include them here
> and focus only on ML2/OVS and ML2/OVN, as those were the ones mentioned.
>
> Now, regarding firewall drivers: in the ML2/OVS backend,
> neutron-openvswitch-agent can use one of the following firewall drivers:
>
> * iptables_hybrid - that's the one mentioned by Satish Patel as the "very
> old" solution. Indeed it uses a linuxbridge between the VM and br-int to
> implement the iptables rules, which are applied on that linuxbridge for the
> instance,
>
> * openvswitch - this is the newer firewall driver, where all SG rules are
> implemented on the host as OpenFlow rules in br-int. In this case the VM is
> plugged directly into br-int. But this isn't related to the OVN ACLs in
> any way; it's all implemented in the neutron-openvswitch-agent code.
> Details about it are in the:
> https://docs.openstack.org/neutron/latest/admin/config-ovsfwdriver.html
>
> In the ML2/OVN backend there is only one implementation of Security Groups,
> and it is based on the OVN ACL mechanism. In that case there is of course
> also no need for any linuxbridge between the VM and br-int, so the VM is
> plugged directly into br-int.
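
For reference, switching an ML2/OVS deployment to the native firewall driver is a one-line change in openvswitch_agent.ini on each compute node (a minimal sketch based on the Neutron admin guide linked above; with kolla-ansible this would typically be applied as a config override):

```ini
# openvswitch_agent.ini on each compute node
[securitygroup]
# OpenFlow-based firewall: SG rules become flows in br-int and the VM is
# plugged directly into br-int (no intermediate linuxbridge).
firewall_driver = openvswitch
```

Note that ports already plugged via iptables_hybrid keep their linuxbridge until the VM is re-plugged (e.g. by a hard reboot or a migration), so restarting neutron-openvswitch-agent alone does not convert existing instances.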
>
> > >
> > > On Sat, Jul 29, 2023 at 10:29 AM Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com> wrote:
> > >
> > >> Hello.
> > >> I just learned about the OVS firewall last week. I am going to compare
> > >> the two.
> > >> Could you share some experience about why to choose the OVS firewall
> > >> driver over iptables?
> > >> Thank you.
> > >> Nguyen Huu Khoi
> > >>
> > >> On Sat, Jul 29, 2023 at 5:55 PM Satish Patel <satish.txt at gmail.com>
> > >> wrote:
> > >>
>
> > >>> Why are you not using the openvswitch flow-based firewall instead of
> > >>> linuxbridge, which adds extra hops to the packet path?
> > >>>
> > >>> Sent from my iPhone
> > >>>
>
> > >>> On Jul 27, 2023, at 12:25 PM, Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com> wrote:
> > >>>
> > >>> Hello.
>
> > >>> I figured out that my RabbitMQ queues were corrupt, so Neutron ports
>
> > >>> could not update their security group rules. I had to delete the queues
>
> > >>> so I could migrate without problems.
>
> > >>>
>
> > >>> Thank you so much for replying to me.
>
> > >>>
>
> > >>> On Thu, Jul 27, 2023, 8:11 AM Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com> wrote:
>
> > >>>
>
> > >>>> Hello.
>
> > >>>>
>
> > >>>> When my instances were migrated to other compute hosts, I checked on
>
> > >>>> the destination host and saw that the rule
>
> > >>>>
>
> > >>>> -A neutron-openvswi-i41ec1d15-e -d x.x.x.x(my instance ip)/32 -p udp -m udp --sport 67 --dport 68 -j RETURN
>
> > >>>>
>
> > >>>> was missing, and my instance could not get an IP. I must restart
>
> > >>>> neutron_openvswitch_agent; then this rule appears and I can reach the
>
> > >>>> instance over the network.
>
> > >>>>
>
> > >>>> I use openvswitch and provider networks. This problem started this
>
> > >>>> week, after the system was upgraded from Xena to Yoga and I enabled
>
> > >>>> quorum queues.
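
For context, quorum queues in an OpenStack deployment are enabled through oslo.messaging's RabbitMQ driver options; a hedged sketch (the option name is taken from newer oslo.messaging releases — verify it exists in your Yoga packages):

```ini
# In the [oslo_messaging_rabbit] section of each service's config
# (neutron.conf, nova.conf, ...), e.g. applied via a kolla-ansible override.
[oslo_messaging_rabbit]
# Declare queues as RabbitMQ quorum queues instead of classic queues.
# Existing classic queues must be deleted so they can be re-declared,
# which matches the corrupt-queue symptom described earlier in the thread.
rabbit_quorum_queue = true
```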
>
> > >>>>
>
> > >>>>
>
> > >>>>
>
> > >>>> Nguyen Huu Khoi
>
> > >>>>
>
> > >>>>
>
> > >>>> On Wed, Jul 26, 2023 at 5:28 PM Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com> wrote:
>
> > >>>>
>
> > >>>>> Because I don't see any error logs, although I set debug logging on.
>
> > >>>>>
>
> > >>>>> Your advice is very helpful to me. I will try to dig deeper. I am
>
> > >>>>> lost, so any suggestions are the best way for me to continue. :)
>
> > >>>>>
>
> > >>>>> On Wed, Jul 26, 2023, 4:39 PM <smooney at redhat.com> wrote:
>
> > >>>>>
>
> > >>>>>> On Wed, 2023-07-26 at 07:49 +0700, Nguyễn Hữu Khôi wrote:
>
> > >>>>>> > Hello guys.
>
> > >>>>>> >
>
> > >>>>>> > I am using openstack yoga with kolla ansible.
>
> > >>>>>> Without logs of some kind I don't think anyone will be able to help
>
> > >>>>>> you with this.
>
> > >>>>>> You have one issue with the config, which I noted inline, but that
>
> > >>>>>> should not break live migration.
>
> > >>>>>> It would, however, allow the migration to proceed when it would
>
> > >>>>>> otherwise have failed, and it would allow this issue to happen
>
> > >>>>>> instead of the VM going to ERROR or the migration
>
> > >>>>>> being aborted in pre-live-migration.
>
> > >>>>>> >
>
> > >>>>>> > When I migrate:
>
> > >>>>>> >
>
> > >>>>>> > instance1 from host A to host B; after that I cannot ping this
>
> > >>>>>> > instance (telnet also fails). I must restart neutron_openvswitch_agent
>
> > >>>>>> > or move this instance back to host A, and then the problem is gone.
>
> > >>>>>> >
>
> > >>>>>> > this is my settings:
>
> > >>>>>> >
>
> > >>>>>> > ----------------- neutron.conf -----------------
>
> > >>>>>> > [nova]
>
> > >>>>>> > live_migration_events = True
>
> > >>>>>> > ------------------------------------------------
>
> > >>>>>> >
>
> > >>>>>> > ----------------- nova.conf -----------------
>
> > >>>>>> > [DEFAULT]
>
> > >>>>>> > vif_plugging_timeout = 600
>
> > >>>>>> > vif_plugging_is_fatal = False
>
> > >>>>>> You should never run with this set to false in production.
>
> > >>>>>> It breaks Nova's ability to detect whether networking is configured
>
> > >>>>>> when booting or migrating a VM.
>
> > >>>>>> We honestly should have removed this option when we removed nova-network.
>
> > >>>>>> > debug = True
>
> > >>>>>> >
>
> > >>>>>> > [compute]
>
> > >>>>>> > live_migration_wait_for_vif_plug = True
>
> > >>>>>> >
>
> > >>>>>> > [workarounds]
>
> > >>>>>> > enable_qemu_monitor_announce_self = True
>
> > >>>>>> >
>
> > >>>>>> > ----------------- openvswitch_agent.ini-----------------
>
> > >>>>>> > [securitygroup]
>
> > >>>>>> > firewall_driver = openvswitch
>
> > >>>>>> > [ovs]
>
> > >>>>>> > openflow_processed_per_port = true
>
> > >>>>>> >
>
> > >>>>>> > I check nova, neutron, ops logs but they are ok.
>
> > >>>>>> >
>
> > >>>>>> > Thank you.
>
> > >>>>>> >
>
> > >>>>>> >
>
> > >>>>>> > Nguyen Huu Khoi
>
> > >>>>>>
>
> > >>>>>>
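
Putting smooney's inline comments together, a production-safe version of the quoted nova.conf fragment might look like this (a sketch, not a verified drop-in; the timeout value is illustrative):

```ini
# nova.conf
[DEFAULT]
# Abort the boot/migration if Neutron never reports the VIF as plugged,
# rather than proceeding with possibly broken networking.
vif_plugging_is_fatal = True
vif_plugging_timeout = 300

[compute]
# Wait for Neutron's vif-plugged event on the destination host before
# the live migration is allowed to proceed.
live_migration_wait_for_vif_plug = True
```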
>
> >
>
>
> --
>
> Slawek Kaplonski
>
> Principal Software Engineer
>
> Red Hat
>