[Openstack-operators] Instance memory overhead

Mike Leong leongmzlist at gmail.com
Tue Jun 23 19:08:31 UTC 2015


Kris,

Sorry for the confusion: when I refer to "free mem", I mean the "free" column
of the "-/+ buffers/cache" row, e.g.:

root at node-2:/etc/libvirt/qemu# free -g
             total       used       free     shared    buffers     cached
Mem:           251        250          1          0          0          1
-/+ buffers/cache:        248          2  <------------ this number
Swap:           82         25         56
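
For scripting the same check, something like this pulls that number directly
(a rough one-liner that assumes the older free layout shown above, where the
-/+ buffers/cache row ends with the effective free figure):

free -m | awk '/buffers\/cache/ {print $NF}'   # effective free memory, in MB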

RSS sum of all the qemu processes:
root at node-2:/etc/libvirt/qemu# ps -eo rss,cmd|grep qemu|awk '{ sum+=$1} END {print sum}'
204191112

RSS sum of the non qemu processes:
root at node-2:/etc/libvirt/qemu# ps -eo rss,cmd|grep -v qemu|awk '{ sum+=$1} END {print sum}'
2017328

As you can see, the RSS total is only about 195G.
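
Note that ps reports rss in KiB, so the sum can be converted to GiB directly
with something like (the '[q]emu' pattern just keeps the grep process itself
out of the match):

ps -eo rss,cmd|grep '[q]emu'|awk '{ sum+=$1 } END { printf "%.1f GiB\n", sum/1024/1024 }'

For the 204191112 KiB above, that prints a little under 195 GiB.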

slabtop usage: (about 10G used)
 Active / Total Objects (% used)    : 473924562 / 480448557 (98.6%)
 Active / Total Slabs (% used)      : 19393475 / 19393475 (100.0%)
 Active / Total Caches (% used)     : 87 / 127 (68.5%)
 Active / Total Size (% used)       : 10482413.81K / 11121675.57K (94.3%)
 Minimum / Average / Maximum Object : 0.01K / 0.02K / 15.69K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
420153856 420153856   7%    0.02K 18418442      256  73673768K kmalloc-16
55345344 49927985  12%    0.06K 864771       64   3459084K kmalloc-64
593551 238401  40%    0.55K  22516       28    360256K radix_tree_node
1121400 1117631  99%    0.19K  26700       42    213600K dentry
680784 320298  47%    0.10K  17456       39     69824K buffer_head
 10390   9998  96%    5.86K   2078        5     66496K task_struct
1103385 901181  81%    0.05K  12981       85     51924K shared_policy_node
 48992  48377  98%    1.00K   1531       32     48992K ext4_inode_cache
  4856   4832  99%    8.00K   1214        4     38848K kmalloc-8192
 58336  33664  57%    0.50K   1823       32     29168K kmalloc-512
 13552  11480  84%    2.00K    847       16     27104K kmalloc-2048
146256  81149  55%    0.18K   3324       44     26592K vm_area_struct
113424 109581  96%    0.16K   2667       48     21336K kvm_mmu_page_header
 18447  13104  71%    0.81K    473       39     15136K task_xstate
 26124  26032  99%    0.56K    933       28     14928K inode_cache
  3096   3011  97%    4.00K    387        8     12384K kmalloc-4096
106416 102320  96%    0.11K   2956       36     11824K sysfs_dir_cache
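
As a cross-check on the slab numbers, /proc/meminfo reports the kernel's own
totals (values in kB); SReclaimable is the portion the kernel can give back
under memory pressure, SUnreclaim the portion it cannot:

grep -E '^(Slab|SReclaimable|SUnreclaim)' /proc/meminfo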


nova-compute still thinks there's memory available:
root at node-2:/etc/libvirt/qemu# grep 'Free ram' /var/log/nova/nova-compute.log|tail -1
2015-06-23 18:34:05.382 36476 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 1280


According to virsh dommemstat, I'm only using 194GB as well:

rss:
root at node-2:/etc/libvirt/qemu# for i in instance-0000*.xml; do inst=$(echo $i|sed s,\.xml,,); virsh dommemstat $inst; done|awk '/rss/ { sum+=$2} END {print sum}'
204193676

allocated:
root at node-2:/etc/libvirt/qemu# for i in instance-0000*.xml; do inst=$(echo $i|sed s,\.xml,,); virsh dommemstat $inst; done|awk '/actual/ { sum+=$2} END {print sum}'
229111808
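
virsh dommemstat reports both of those in KiB, so as a quick sanity check the
sums convert to:

awk 'BEGIN { printf "rss:    %.1f GiB\n", 204193676/1024/1024 }'
awk 'BEGIN { printf "actual: %.1f GiB\n", 229111808/1024/1024 }'

i.e. roughly 195 GiB resident vs roughly 218 GiB allocated.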

nova diagnostics had pretty much the same result as virsh dommemstat:
nova hypervisor-servers node-2|grep -w node-2|awk '{print $2}' > uuids.txt
for i in $(cat uuids.txt ); do nova diagnostics $i | tee /tmp/nova-diag/$i.txt; done

rss:
for i in /tmp/nova-diag/*; do cat $i | awk '/memory-rss/ {print $4}'; done| awk '{sum +=$1} END {print sum}'
204196168

allocated:
for i in /tmp/nova-diag/*; do cat $i | awk '/memory-actual/ {print $4}'; done| awk '{sum +=$1} END {print sum}'
229111808


Basically, the math doesn't add up.  The qemu processes are using less than
what's allocated to them.  In the example above, node-2 has 250G, with 2G
free.
qemu has been allocated 218G, w/ 194G used in RSS.  That means 24G is not
used yet (218 - 194) and I only have 2G free.  You can guess what would
happen if the instances decided to use that 24G...
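
As a back-of-envelope check, using the virsh sums from above (KiB, integer GiB):

allocated=$(( 229111808 / 1024 / 1024 ))   # ~218 GiB handed out to the guests
resident=$(( 204193676 / 1024 / 1024 ))    # ~194 GiB actually resident
echo $(( allocated - resident ))           # ~24 GiB the guests can still claim

and the host only has ~2G free (plus swap) to absorb that.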

thx



On Tue, Jun 23, 2015 at 9:58 AM, Joe Topjian <joe at topjian.net> wrote:

> In addition to what Kris said, here are two other ways to see memory usage
> of qemu processes:
>
> The first is with "nova diagnostics <uuid>". By default this is an
> admin-only command.
>
> The second is by running "virsh dommemstat <instance-id>" directly on the
> compute node.
>
> Note that it's possible for the used memory (rss) to be greater than the
> available memory. When this happens, I believe it is due to the qemu
> process consuming more memory than the actual vm itself -- so the instance
> has consumed all available memory, plus qemu itself needs some to function
> properly. Someone please correct me if I'm wrong.
>
> Hope that helps,
> Joe
>
> On Tue, Jun 23, 2015 at 10:12 AM, Kris G. Lindgren <klindgren at godaddy.com>
> wrote:
>
>>   Not totally sure I am following - the output of free would help a lot.
>>
>>  However, the number you should be caring about is free + buffers/cache.
>> The reason for your discrepancy is that you are including the in-memory
>> filesystem content that linux caches in order to improve performance. On
>> boxes with enough ram this can easily be 60+ GB.  When the system comes
>> under memory pressure (from applications or the kernel wanting more memory)
>> the kernel will remove any cached filesystem items to free up memory for
>> processes.  This link [1] has a pretty good description of what I am
>> talking about.
>>
>>  Either way, if you want to test to make sure this is a case of
>> filesystem caching you can run:
>>
>> echo 3 > /proc/sys/vm/drop_caches
>>
>>  This tells linux to drop all filesystem cache from memory, and I bet a
>> ton of your memory will show up.  Note: in doing so you will affect the
>> performance of the box, since what used to be an in-memory lookup will
>> now have to go to the filesystem.  However, over time the cache will
>> re-establish itself.  You can find more examples of how caching interacts
>> with other parts of the linux memory system here: [2]
>>
>>  To your question about the qemu processes: if you use ps aux, the
>> columns VSZ and RSS will tell you what you are wanting.  VSZ is the virtual
>> size (how much memory the process has asked the kernel for).  RSS is the
>> resident set size, or the actual amount of non-swapped memory the process
>> is using.
>>
>>  [1] - http://www.linuxatemyram.com/
>>  [2] - http://www.linuxatemyram.com/play.html
>>  ____________________________________________
>>
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy, LLC.
>>
>>
>>   From: Mike Leong <leongmzlist at gmail.com>
>> Date: Tuesday, June 23, 2015 at 9:44 AM
>> To: "openstack-operators at lists.openstack.org" <
>> openstack-operators at lists.openstack.org>
>> Subject: [Openstack-operators] Instance memory overhead
>>
>>   My instances are using much more memory than expected.  The amount of
>> free memory (free + cached) is under 3G on my servers even though the
>> compute nodes are configured to reserve 32G.
>>
>>  Here's my setup:
>> Release: Ice House
>>  Server mem: 256G
>> Qemu version: 2.0.0+dfsg-2ubuntu1.1
>> Networking: Contrail 1.20
>> Block storage: Ceph 0.80.7
>> Hypervisor OS: Ubuntu 12.04
>> memory over-provisioning is disabled
>> kernel version: 3.11.0-26-generic
>>
>>  On nova.conf
>>  reserved_host_memory_mb = 32768
>>
>>  Info on instances:
>> - root volume is file backed (qcow2) on the hypervisor local storage
>> - each instance has a rbd volume mounted from Ceph
>> - no swap file/partition
>>
>>  I've confirmed, via nova-compute.log, that nova is respecting the
>> reserved_host_memory_mb directive and is not over-provisioning.  On some
>> hypervisors, nova-compute says there's 4GB available for use even though
>> the OS has less than 4G left (free + cached)!
>>
>>  I've also summed up the memory from /etc/libvirt/qemu/*.xml files and
>> the total looks good.
>>
>>  Each hypervisor hosts about 45-50 instances.
>>
>>  Is there good way to calculate the actual usage of each QEMU process?
>>
>>  PS: I've tried free, summing up RSS, and smem, but none of them can tell
>> me where the missing mem is.
>>
>>  thx
>> mike
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>