[OCTAVIA][ROCKY] - MASTER & BACKUP instances unexpectedly deleted by octavia

Gaël THEROND gael.therond at gmail.com
Tue Jun 11 12:09:35 UTC 2019


OK, nice. Do you have the commit hash? I would like to look at it and check
that it has been committed to Stein too, so I can bump my service to Stein
using Kolla.

Thanks!

On Tue, Jun 11, 2019 at 12:59 PM Carlos Goncalves <cgoncalves at redhat.com>
wrote:

> On Mon, Jun 10, 2019 at 3:14 PM Gaël THEROND <gael.therond at gmail.com>
> wrote:
> >
> > Hi guys,
> >
> > Just a quick question regarding this bug: someone told me that it has
> > been patched within stable/rocky, but were you talking about the
> > openstack/octavia repository or the openstack/kolla repository?
>
> Octavia.
>
> https://review.opendev.org/#/q/Ief97ddda8261b5bbc54c6824f90ae9c7a2d81701
>
> >
> > Many Thanks!
> >
> > On Tue, Jun 4, 2019 at 3:19 PM Gaël THEROND <gael.therond at gmail.com>
> > wrote:
> >>
> >> Oh, that's perfect then. I'll just update my image and my platform;
> >> we're using kolla-ansible, so that's super easy.
> >>
> >> You guys rock!! (Pun intended ;-)).
> >>
> >> Many, many thanks to all of you; that really reassures me a lot about
> >> Octavia's solidity and Kolla's flexibility ^^.
> >>
> >> On Tue, Jun 4, 2019 at 3:17 PM Carlos Goncalves <cgoncalves at redhat.com>
> >> wrote:
> >>>
> >>> On Tue, Jun 4, 2019 at 3:06 PM Gaël THEROND <gael.therond at gmail.com>
> wrote:
> >>> >
> >>> > Hi Lingxian Kong,
> >>> >
> >>> > That’s actually very interesting as I’ve come to the same conclusion
> this morning during my investigation and was starting to think about a fix,
> which it seems you already made!
> >>> >
> >>> > Is there a reason why it wasn't backported to Rocky?
> >>>
> >>> The patch was merged in the master branch during the Rocky development
> >>> cycle, hence it is included in stable/rocky as well.
> >>>
> >>> >
> >>> > Very helpful, many, many thanks to you; you clearly spared me hours
> >>> > of work! I'll review your patch and test it in our lab.
> >>> >
> >>> > On Tue, Jun 4, 2019 at 11:06 AM Gaël THEROND <gael.therond at gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> Hi Felix,
> >>> >>
> >>> >> « Glad » you had the same issue before, and yes, of course I looked
> >>> >> at the HM logs, which is where I actually found out that this event
> >>> >> was triggered by Octavia (besides the DB data that confirmed it).
> >>> >> Here is my log trace related to this event; it doesn't really show
> >>> >> any major issue IMHO.
> >>> >>
> >>> >> Here is the stack trace that our Octavia service logged on both of
> >>> >> our controller servers, with the initial load balancer creation
> >>> >> trace (Worker.log) and the task triggered on both controllers
> >>> >> (Health-Manager.log).
> >>> >>
> >>> >> http://paste.openstack.org/show/7z5aZYu12Ttoae3AOhwF/
> >>> >>
> >>> >> I may well have missed something in it, but I don't see anything
> >>> >> strange from my point of view.
> >>> >> Feel free to tell me if you spot something weird.
> >>> >>
> >>> >>
> >>> >> On Tue, Jun 4, 2019 at 10:38 AM Felix Hüttner
> >>> >> <felix.huettner at mail.schwarz> wrote:
> >>> >>>
> >>> >>> Hi Gael,
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> we had a similar issue in the past.
> >>> >>>
> >>> >>> You could check the Octavia health manager log (should be on the
> >>> >>> same node where the worker is running).
> >>> >>>
> >>> >>> This component monitors the status of the Amphorae and restarts
> >>> >>> them if they don’t trigger a callback within a specific time. This
> >>> >>> might also happen if there is some connection issue between the two
> >>> >>> components.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> But normally it should at least restart the LB with new Amphorae…
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Hope that helps
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Felix
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> From: Gaël THEROND <gael.therond at gmail.com>
> >>> >>> Sent: Tuesday, June 4, 2019 9:44 AM
> >>> >>> To: Openstack <openstack at lists.openstack.org>
> >>> >>> Subject: [OCTAVIA][ROCKY] - MASTER & BACKUP instances unexpectedly
> deleted by octavia
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Hi guys,
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> I’ve a weird situation here.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> I smoothly operate a large-scale, multi-region Octavia service
> >>> >>> using the default amphora driver, which implies the use of nova
> >>> >>> instances as load balancers.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Everything is running really well and our customers (K8s and
> >>> >>> traditional users) are really happy with the solution so far.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> However, yesterday one of those customers, who uses the load
> >>> >>> balancer in front of their ElasticSearch cluster, poked me because
> >>> >>> this load balancer suddenly went from ONLINE/OK to ONLINE/ERROR,
> >>> >>> meaning the amphorae were no longer available, yet the
> >>> >>> anchor/member/pool and listener settings still existed.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> So I investigated and found out that the load balancer amphorae
> >>> >>> had been destroyed by the octavia user.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> The weird part is that both the master and the backup instance
> >>> >>> were destroyed at the same moment by the octavia service user.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Are there specific circumstances where the Octavia service could
> >>> >>> decide to delete the instances but not the anchor/members/pool?
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> It’s worrying me a bit, as there is no clear way to trace why
> >>> >>> Octavia took this action.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> I dug into the nova and Octavia DBs in order to correlate the
> >>> >>> action, but apart from confirming my investigation it doesn’t
> >>> >>> really help, as there is no clue as to why the Octavia service
> >>> >>> triggered the deletion.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> If someone has any clues or tips to give me, I’ll be more than
> >>> >>> happy to discuss this situation.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Cheers guys!
> >>> >>>
>