[openstack-dev] [infra] [gate] [all] openstack services footprint lead to oom-kill in the gate
harlowja at fastmail.com
Fri Feb 3 18:18:59 UTC 2017
An example of what this (dozer) gathers (attached).
Joshua Harlow wrote:
> Has anyone tried Dozer?
> This piece of middleware creates some nice graphs (using PIL) that may
> help identify which areas are using what memory (and/or leaking).
> https://pypi.python.org/pypi/linesman might also be somewhat useful to
> have running.
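> A minimal sketch of wiring it in (assuming the Dozer package from
> PyPI; the service module name is hypothetical):
>
>     # Wrap an existing WSGI application with Dozer's memory-tracking
>     # middleware; it then serves live object-count charts over HTTP.
>     from dozer import Dozer
>
>     from my_service.wsgi import application  # hypothetical WSGI app
>
>     application = Dozer(application)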
> That any process takes more than 100MB here blows my mind (horizon is
> doing nicely, ha); what are people caching in-process to have RSS that
> large (1.95GB, woah)?
> Armando M. wrote:
>> [TL;DR]: OpenStack services have steadily increased their memory
>> footprints. We need a concerted way to address the oom-kills experienced
>> in the openstack gate, as we may have reached a ceiling.
>> Now the longer version:
>> We have been experiencing some instability in the gate lately for a
>> number of reasons. When everything adds up, this means it's rather
>> difficult to merge anything, and since we're in feature freeze, that
>> adds to the stress. One culprit was identified to be [1].
>> We initially tried to increase the swappiness, but that didn't seem to
>> help. Then we looked at the resident memory in use. Going back over
>> the past three releases, we noticed that the aggregated memory
>> footprint of some OpenStack projects has grown steadily. We have the
>> following:
>> * Mitaka
>> o neutron: 1.40GB
>> o nova: 1.70GB
>> o swift: 640MB
>> o cinder: 730MB
>> o keystone: 760MB
>> o horizon: 17MB
>> o glance: 538MB
>> * Newton
>> o neutron: 1.59GB (+13%)
>> o nova: 1.67GB (-1%)
>> o swift: 779MB (+21%)
>> o cinder: 878MB (+20%)
>> o keystone: 919MB (+20%)
>> o horizon: 21MB (+23%)
>> o glance: 721MB (+34%)
>> * Ocata
>> o neutron: 1.75GB (+10%)
>> o nova: 1.95GB (+16%)
>> o swift: 703MB (-9%)
>> o cinder: 920MB (+4%)
>> o keystone: 903MB (-1%)
>> o horizon: 25MB (+20%)
>> o glance: 740MB (+2%)
>> Numbers are approximate and I only took a couple of samples, but in a
>> nutshell, the majority of the services have seen double-digit growth
>> over the past two cycles in terms of the amount of RSS memory they use.
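>> For reference, one way to sample aggregated RSS per project (a rough
>> sketch using psutil; the substring matching on process names is
>> illustrative and approximate):
>>
>>     # Sum the resident set size (RSS) of every process whose name or
>>     # command line mentions the given service.
>>     import psutil
>>
>>     def aggregated_rss(service):
>>         total = 0
>>         for proc in psutil.process_iter(['name', 'cmdline', 'memory_info']):
>>             cmd = ' '.join(proc.info['cmdline'] or [])
>>             mem = proc.info['memory_info']
>>             if mem and (service in (proc.info['name'] or '') or service in cmd):
>>                 total += mem.rss
>>         return total
>>
>>     for svc in ('neutron', 'nova', 'swift', 'cinder', 'keystone', 'glance'):
>>         print('%s: %.2f MB' % (svc, aggregated_rss(svc) / (1024.0 * 1024)))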
>> Since [1] has been observed only since Ocata, I imagine it's pretty
>> reasonable to assume that the memory increase may well be a determining
>> factor in the oom-kills we see in the gate.
>> Profiling and surgically reducing the memory used by each component in
>> each service is a lengthy process, but I'd rather see some gate relief
>> right away. Reducing the number of API workers helps bring the RSS
>> memory back down to Mitaka levels:
>> * neutron: 1.54GB
>> * nova: 1.24GB
>> * swift: 694MB
>> * cinder: 778MB
>> * keystone: 891MB
>> * horizon: 24MB
>> * glance: 490MB
>> However, it may have other side effects, like longer execution times or
>> an increase in timeouts.
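>> For concreteness, the stop-gap boils down to per-service worker
>> options along these lines (a sketch; the values are illustrative,
>> not a recommendation):
>>
>>     # neutron.conf
>>     [DEFAULT]
>>     api_workers = 2
>>     rpc_workers = 1
>>
>>     # nova.conf
>>     [DEFAULT]
>>     osapi_compute_workers = 2
>>     metadata_workers = 2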
>> Where do we go from here? I am not particularly fond of stop-gap [2],
>> but it is the one fix that most widely addresses the memory increase we
>> have experienced across the board.
>> [1] https://bugs.launchpad.net/neutron/+bug/1656386
>> [2] https://review.openstack.org/#/c/427921
-------------- next part --------------
A non-text attachment was scrubbed... (Size: 336174 bytes)