[Openstack-operators] A couple of recent bugs that hit us in regions with cells and moderate (to heavy) build/delete activity

Michael Still mikal at stillhq.com
Thu Feb 12 20:53:50 UTC 2015


I just want to note that both of those fixes look to be approved now.

Cheers,
Michael

On Fri, Feb 13, 2015 at 6:14 AM, Matt Van Winkle <mvanwink at rackspace.com> wrote:
> Hey folks,
> Apologies if any of this has been discussed on the list already.  I've tried
> to check everything ahead of time.
>
> We recently had two bugs combine to hit us in some of our regions as we
> rolled out some new code.  The result was rabbit servers not accepting
> connections and/or crashing with OOM errors.  I wanted to pass them along
> because I know from the Large Deployments Team that more and more folks
> are using cells to manage larger regions.  Here are the specific bugs:
>
> Cells doesn't properly track RabbitMQ connection pools:
> https://review.openstack.org/#/c/152667/
>
> Oslo messaging bug in version 1.5.1 that leaks channels:
> Upstream bug: https://bugs.launchpad.net/oslo.messaging/+bug/1406629
> Upstream fix:
> https://review.openstack.org/#/c/145232/9/oslo_messaging/_drivers/impl_rabbit.py
>
>
> We are deploying patches for both in our problem areas now and the rest of
> the fleet in the immediate future, but this gave us quite a run for our
> money last week.  I wanted to share in case anyone else is chasing these
> issues and/or might hit them after an upcoming code update.
>
> Thanks!
> Matt
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



-- 
Rackspace Australia


