[openstack-dev] [infra] [gate] [all] openstack services footprint lead to oom-kill in the gate

Sean Dague sean at dague.net
Thu Feb 2 18:08:06 UTC 2017


On 02/02/2017 12:49 PM, Armando M. wrote:
> 
> 
> On 2 February 2017 at 08:40, Sean Dague <sean at dague.net> wrote:
> 
>     On 02/02/2017 11:16 AM, Matthew Treinish wrote:
>     <snip>
>     > <oops, forgot to finish my thought>
>     >
>     > By doing this we definitely aren't saying that running a single worker is how
>     > we recommend people run OpenStack. But it adds to the differences between the
>     > gate and what we expect things to actually look like.
> 
>     I'm all for actually getting to the bottom of this, but honestly real
>     memory profiling is needed here. The growth across projects probably
>     means that some common libraries are part of this. The ever growing
>     requirements list is evidence of that. Code reuse is good, but if
>     we are importing much of a library to get access to a couple of
>     functions, we're going to take a bunch of memory weight on that
>     (especially if that library has friendly auto imports in top level
>     __init__.py so we can't get only the parts we want).
> 
>     Changing the worker count is just shuffling around deck chairs.
> 
>     I'm not familiar enough with memory profiling tools in python to know
>     the right approach to take to get this down to the individual
>     libraries / objects that are holding all our memory. Is anyone more
>     skilled here able to help lead the way?
> 
> 
> From what I hear, the overall consensus on this matter is to determine
> what actually caused the memory consumption bump and how to address it,
> but that's more of a medium to long term action. In fact, to me this is
> one of the top priority matters we should talk about at the imminent PTG.
> 
> For the time being, and to provide relief to the gate, should we want to
> lock the API_WORKERS to 1? I'll post something for review and see how
> many people shoot it down :)

I don't think we want to do that. It would force the eventlet API
workers down to a single process, and it's not clear that eventlet
copes well with connections backing up on the inbound socket. I'd
honestly expect that to create different, hard-to-debug issues,
especially with high chatter rates between services.
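
On the profiling question quoted above, one possible starting point is
to measure what a single import costs. The following is only a rough
sketch, assuming Python 3.4+ where the standard library tracemalloc
module is available. The helper name import_footprint is made up for
illustration. Note that tracemalloc only sees allocations made through
Python's allocator, so memory malloc'd directly by C extensions is not
counted and the totals will not match process RSS.

```python
import importlib
import sys
import tracemalloc


def import_footprint(module_name, top=5):
    """Import module_name and report the Python-level memory it allocates.

    Returns (total_bytes, biggest), where biggest is a list of
    (filename, bytes) pairs for the files that allocated the most.
    Hypothetical helper for illustration only. If the module is already
    in sys.modules the import is a no-op and the numbers are near zero,
    so run this in a fresh interpreter for meaningful results.
    """
    tracemalloc.start()
    try:
        importlib.import_module(module_name)
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()

    # Group the live allocations by the source file that made them;
    # statistics() returns them sorted from largest to smallest.
    stats = snapshot.statistics("filename")
    total = sum(stat.size for stat in stats)
    biggest = [(stat.traceback[0].filename, stat.size) for stat in stats[:top]]
    return total, biggest


if __name__ == "__main__":
    # Module to measure is taken from the command line, e.g. a common
    # library shared across services; "json" is only a placeholder default.
    name = sys.argv[1] if len(sys.argv) > 1 else "json"
    total, biggest = import_footprint(name)
    print("%s: %.1f KiB allocated on import" % (name, total / 1024.0))
    for filename, size in biggest:
        print("  %8.1f KiB  %s" % (size / 1024.0, filename))
```

Running that against each entry in a service's requirements, one fresh
interpreter per module, would at least show which imports carry the
most weight before any request is served.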

	-Sean

-- 
Sean Dague
http://dague.net


