[openstack-dev] [infra] [gate] [all] openstack services footprint lead to oom-kill in the gate

Joshua Harlow harlowja at fastmail.com
Fri Feb 3 18:18:59 UTC 2017


An example of what this (dozer) gathers (attached).

-Josh
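
The aggregate figures quoted below come from summing per-process resident set sizes (RSS). For reference, a minimal sketch of how one such sample can be taken on Linux by parsing /proc/&lt;pid&gt;/status; the `rss_kb` helper name is illustrative and not part of dozer or any OpenStack tooling:

```python
# Sketch of per-process RSS sampling, the kind of measurement behind
# the numbers quoted below. Linux-only: parses /proc/<pid>/status.

def rss_kb(pid="self"):
    """Return the resident set size (VmRSS) of a process, in kB."""
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                # The line looks like: "VmRSS:    123456 kB"
                return int(line.split()[1])
    return 0  # kernel threads carry no VmRSS entry

if __name__ == "__main__":
    print("this process uses ~%.1f MB RSS" % (rss_kb() / 1024.0))
```

Summing that value over all worker processes of a service gives the per-service totals discussed in the thread.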

Joshua Harlow wrote:
> Has anyone tried:
>
> https://github.com/mgedmin/dozer/blob/master/dozer/leak.py#L72
>
> This piece of middleware creates some nice graphs (using PIL) that may
> help identify which areas are using what memory (and/or leaking).
>
> https://pypi.python.org/pypi/linesman might also be somewhat useful to
> have running.
>
> How any process takes more than 100MB here blows my mind (horizon is
> doing nicely, ha); what are people caching in-process to have an RSS
> that large (1.95 GB, woah)?
>
> Armando M. wrote:
>> Hi,
>>
>> [TL;DR]: OpenStack services have steadily increased their memory
>> footprints. We need a concerted way to address the oom-kills experienced
>> in the openstack gate, as we may have reached a ceiling.
>>
>> Now the longer version:
>> --------------------------------
>>
>> We have been experiencing some instability in the gate lately for a
>> number of reasons. When everything adds up, it becomes rather
>> difficult to merge anything, and since we're in feature freeze, that
>> adds to the stress. One culprit was identified to be [1].
>>
>> We initially tried to increase the swappiness, but that didn't seem to
>> help. We then looked at the resident memory in use. Going back over
>> the past three releases, we noticed that the aggregated memory
>> footprint of some openstack projects has grown steadily. We have the
>> following:
>>
>> * Mitaka
>>   o neutron: 1.40GB
>>   o nova: 1.70GB
>>   o swift: 640MB
>>   o cinder: 730MB
>>   o keystone: 760MB
>>   o horizon: 17MB
>>   o glance: 538MB
>> * Newton
>>   o neutron: 1.59GB (+13%)
>>   o nova: 1.67GB (-1%)
>>   o swift: 779MB (+21%)
>>   o cinder: 878MB (+20%)
>>   o keystone: 919MB (+20%)
>>   o horizon: 21MB (+23%)
>>   o glance: 721MB (+34%)
>> * Ocata
>>   o neutron: 1.75GB (+10%)
>>   o nova: 1.95GB (+16%)
>>   o swift: 703MB (-9%)
>>   o cinder: 920MB (+4%)
>>   o keystone: 903MB (-1%)
>>   o horizon: 25MB (+20%)
>>   o glance: 740MB (+2%)
>>
>> Numbers are approximate and I only took a couple of samples, but in a
>> nutshell, the majority of the services have seen double-digit growth
>> over the past two cycles in the amount of RSS memory they use.
>>
>> Since [1] has been observed only since ocata [2], I think it's
>> reasonable to assume that the memory increase may well be a determining
>> factor in the oom-kills we see in the gate.
>>
>> Profiling and surgically reducing the memory used by each component in
>> each service is a lengthy process, but I'd rather see some gate relief
>> right away. Reducing the number of API workers helps bring the RSS
>> memory down back to mitaka levels:
>>
>> * neutron: 1.54GB
>> * nova: 1.24GB
>> * swift: 694MB
>> * cinder: 778MB
>> * keystone: 891MB
>> * horizon: 24MB
>> * glance: 490MB
>>
>> However, it may have other side effects, like longer execution times
>> or an increase in timeouts.
>>
>> Where do we go from here? I am not particularly fond of the stop-gap
>> [4], but it is the one fix that most widely addresses the memory
>> increase we have experienced across the board.
>>
>> Thanks,
>> Armando
>>
>> [1] https://bugs.launchpad.net/neutron/+bug/1656386
>> [2]
>> http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22oom-killer%5C%22%20AND%20tags:syslog
>>
>> [3]
>> http://logs.openstack.org/21/427921/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/82084c2/
>>
>> [4] https://review.openstack.org/#/c/427921
>>
>> __________________________________________________________________________
>>
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dozer_test.png
Type: image/png
Size: 336174 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20170203/7ee0310f/attachment-0001.png>
