<div dir="ltr">Kris,<div><br></div><div>Sorry for the confusion, when I refer to "free mem", it's from the free column, buffers/cache row. eg:</div><div><br></div><div><div>root@node-2:/etc/libvirt/qemu# free -g</div><div> total used free shared buffers cached</div><div>Mem: 251 250 1 0 0 1</div><div>-/+ buffers/cache: 248 2 <------------ this number</div><div>Swap: 82 25 56<br></div></div><div><br></div><div>RSS sum of all the qemu processes:</div><div><div>root@node-2:/etc/libvirt/qemu# ps -eo rss,cmd|grep qemu|awk '{ sum+=$1} END {print sum}'</div><div>204191112</div></div><div><br></div><div>RSS sum of the non qemu processes:</div><div><div>root@node-2:/etc/libvirt/qemu# ps -eo rss,cmd|grep -v qemu|awk '{ sum+=$1} END {print sum}'</div><div>2017328</div></div><div><br></div><div>As you can see, the RSS total is only 196G.</div><div><br></div><div>slabtop usage: (about 10G used)</div><div><div> Active / Total Objects (% used) : 473924562 / 480448557 (98.6%)</div><div> Active / Total Slabs (% used) : 19393475 / 19393475 (100.0%)</div><div> Active / Total Caches (% used) : 87 / 127 (68.5%)</div><div> Active / Total Size (% used) : 10482413.81K / 11121675.57K (94.3%)</div><div> Minimum / Average / Maximum Object : 0.01K / 0.02K / 15.69K</div><div><br></div><div> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME</div><div>420153856 420153856 7% 0.02K 18418442 256 73673768K kmalloc-16</div><div>55345344 49927985 12% 0.06K 864771 64 3459084K kmalloc-64</div><div>593551 238401 40% 0.55K 22516 28 360256K radix_tree_node</div><div>1121400 1117631 99% 0.19K 26700 42 213600K dentry</div><div>680784 320298 47% 0.10K 17456 39 69824K buffer_head</div><div> 10390 9998 96% 5.86K 2078 5 66496K task_struct</div><div>1103385 901181 81% 0.05K 12981 85 51924K shared_policy_node</div><div> 48992 48377 98% 1.00K 1531 32 48992K ext4_inode_cache</div><div> 4856 4832 99% 8.00K 1214 4 38848K kmalloc-8192</div><div> 58336 33664 57% 0.50K 1823 32 29168K kmalloc-512</div><div> 13552 
11480 84% 2.00K 847 16 27104K kmalloc-2048</div><div>146256 81149 55% 0.18K 3324 44 26592K vm_area_struct</div><div>113424 109581 96% 0.16K 2667 48 21336K kvm_mmu_page_header</div><div> 18447 13104 71% 0.81K 473 39 15136K task_xstate</div><div> 26124 26032 99% 0.56K 933 28 14928K inode_cache</div><div> 3096 3011 97% 4.00K 387 8 12384K kmalloc-4096</div><div>106416 102320 96% 0.11K 2956 36 11824K sysfs_dir_cache</div></div><div><br></div><div><br></div><div>nova-compute still thinks there's memory available:</div><div><div>root@node-2:/etc/libvirt/qemu# grep 'Free ram' /var/log/nova/nova-compute.log|tail -1</div><div>2015-06-23 18:34:05.382 36476 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 1280</div></div><div><br></div><div><br></div><div>Using virsh dommemstat, I'm only using 194GB as well:</div><div><br></div><div>rss:</div><div><div>root@node-2:/etc/libvirt/qemu# for i in instance-0000*.xml; do inst=$(echo $i|sed s,\.xml,,); virsh dommemstat $inst; done|awk '/rss/ { sum+=$2} END {print sum}'</div><div>204193676</div></div><div><br></div><div>allocated:</div><div><div>root@node-2:/etc/libvirt/qemu# for i in instance-0000*.xml; do inst=$(echo $i|sed s,\.xml,,); virsh dommemstat $inst; done|awk '/actual/ { sum+=$2} END {print sum}'</div><div>229111808</div></div><div><br></div><div>nova diagnostics had pretty much the same result as virsh dommemstat:</div><div><div>nova hypervisor-servers node-2|grep -w node-2|awk '{print $2}' > uuids.txt</div></div><div>for i in $(cat uuids.txt ); do nova diagnostics $i | tee /tmp/nova-diag/$i.txt; done<br></div><div><div><br></div><div>rss:</div><div>for i in /tmp/nova-diag/*; do cat $i | awk '/memory-rss/ {print $4}'; done| awk '{sum +=$1} END {print sum}'</div><div>204196168</div></div><div><br></div><div>allocated:</div><div><div>for i in /tmp/nova-diag/*; do cat $i | awk '/memory-actual/ {print $4}'; done| awk '{sum +=$1} END {print sum}'</div><div>229111808</div></div><div><br></div><div><br></div><div>Basically, 
the math doesn't add up. The qemu processes are using less than what's allocated to them. In the example above, node-2 has 250G, with 2G free.</div><div>qemu has been allocated 218G, with 194G used in RSS. That means 24G is not used yet (218 - 194) and I only have 2G free. You can guess what would happen if the instances decided to use that 24G...</div><div><br></div><div>thx</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 23, 2015 at 9:58 AM, Joe Topjian <span dir="ltr"><<a href="mailto:joe@topjian.net" target="_blank">joe@topjian.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">In addition to what Kris said, here are two other ways to see memory usage of qemu processes:<div><br></div><div>The first is with "nova diagnostics <uuid>". By default this is an admin-only command.</div><div><br></div><div>The second is by running "virsh dommemstat <instance-id>" directly on the compute node.</div><div><br></div><div>Note that it's possible for the used memory (rss) to be greater than the available memory. When this happens, I believe it is due to the qemu process consuming more memory than the actual vm itself -- so the instance has consumed all available memory, plus qemu itself needs some to function properly. Someone please correct me if I'm wrong.</div><div><br></div><div>Hope that helps,</div><div>Joe</div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Tue, Jun 23, 2015 at 10:12 AM, Kris G. Lindgren <span dir="ltr"><<a href="mailto:klindgren@godaddy.com" target="_blank">klindgren@godaddy.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
<div style="word-wrap:break-word">
<div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
Not totally sure I am following - the output of free would help a lot.</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
However, the number you should care about is free + buffers/cache. The reason for your discrepancy is that you are including the in-memory filesystem cache that Linux keeps in order to improve performance. On boxes with enough RAM this can easily be
60+ GB. When the system comes under memory pressure (from applications or the kernel wanting more memory), the kernel will evict cached filesystem items to free up memory for processes. This link [1] has a pretty good description of what I am talking
about.</div>
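<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
As a rough sketch of that "free + buffers/cache" number, the snippet below sums MemFree, Buffers and Cached from /proc/meminfo-style input. The helper name reclaimable_kb is made up for illustration, and the result is only an approximation of what the kernel can actually hand back to processes.</div>

```shell
# Sketch: sum MemFree + Buffers + Cached (in kB) from meminfo-style input.
# reclaimable_kb is a hypothetical helper name, not part of any tool.
reclaimable_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { print sum + 0 }'
}

# On a live host: reclaimable_kb < /proc/meminfo
```

<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
The function reads from stdin so it can be pointed at a saved copy of /proc/meminfo as well as the live file.</div>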
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
Either way, if you want to test to make sure this is a case of filesystem caching you can run:</div>
<div style="font-family:Calibri,sans-serif;font-size:14px">
<p style="margin:0px;font-size:12px;font-family:Menlo">echo 3 > /proc/sys/vm/drop_caches </p>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
This tells Linux to drop all filesystem cache from memory, and I bet a ton of your memory will show up. Note: in doing so you will affect the performance of the box, since what used to be an in-memory lookup will now have to go to the filesystem. However,
over time the cache will re-establish itself. You can find more examples of how caching interacts with other parts of the Linux memory system here: [2]</div>
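<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
To put a number on how much memory comes back, one option is to snapshot /proc/meminfo before and after dropping the caches and compare MemFree. The sketch below assumes that workflow; memfree_kb, freed_kb and the /tmp paths are hypothetical names used only for illustration.</div>

```shell
# memfree_kb FILE: print the MemFree value (kB) from a meminfo-style file.
memfree_kb() {
    awk '$1 == "MemFree:" { print $2 }' "$1"
}

# freed_kb BEFORE AFTER: print how many kB MemFree grew between two snapshots.
freed_kb() {
    echo $(( $(memfree_kb "$2") - $(memfree_kb "$1") ))
}

# Assumed workflow on a live host, as root:
#   cat /proc/meminfo > /tmp/meminfo.before
#   sync                                  # flush dirty pages so they can be dropped
#   echo 3 > /proc/sys/vm/drop_caches
#   cat /proc/meminfo > /tmp/meminfo.after
#   freed_kb /tmp/meminfo.before /tmp/meminfo.after
```

<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
If the difference is small, filesystem caching is probably not where the memory went.</div>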
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<div><font face="Calibri,sans-serif">To your question about the qemu processes: </font>If you use ps aux, the VSZ and RSS columns will tell you what you want to know. VSZ is the virtual size (how much memory the process has asked the kernel for). RSS is the resident set size,
or the actual amount of non-swapped memory the process is using. </div>
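<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
For a per-host total, something like the sketch below adds up the RSS and VSZ columns for every process whose command name contains "qemu". The helper name sum_qemu_mem is made up, and matching on "qemu" is an assumption: on some distros the process may be named differently (for example kvm).</div>

```shell
# Sketch: total RSS and VSZ (kB) for processes whose command name contains "qemu".
# sum_qemu_mem is a hypothetical helper; it reads "rss vsz comm" lines on stdin.
sum_qemu_mem() {
    awk '$3 ~ /qemu/ { rss += $1; vsz += $2 }
         END { printf "rss_kb=%d vsz_kb=%d\n", rss, vsz }'
}

# On a live host (separate -o options keep the header-less columns portable):
#   ps -e -o rss= -o vsz= -o comm= | sum_qemu_mem
```

<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
Matching on the command-name column rather than grepping the whole ps line keeps the grep process itself out of the total.</div>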
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
[1] - <a href="http://www.linuxatemyram.com/" target="_blank">http://www.linuxatemyram.com/</a> </div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
[2] - <a href="http://www.linuxatemyram.com/play.html" target="_blank">http://www.linuxatemyram.com/play.html</a></div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<div>
<div>____________________________________________</div>
<div> </div>
<div>Kris Lindgren</div>
<div>Senior Linux Systems Engineer</div>
<div>GoDaddy, LLC.</div>
</div>
<div><br>
</div>
</div>
</div>
<div style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<br>
</div>
<span style="font-family:Calibri,sans-serif;font-size:14px;color:rgb(0,0,0)">
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Mike Leong <<a href="mailto:leongmzlist@gmail.com" target="_blank">leongmzlist@gmail.com</a>><br>
<span style="font-weight:bold">Date: </span>Tuesday, June 23, 2015 at 9:44 AM<br>
<span style="font-weight:bold">To: </span>"<a href="mailto:openstack-operators@lists.openstack.org" target="_blank">openstack-operators@lists.openstack.org</a>" <<a href="mailto:openstack-operators@lists.openstack.org" target="_blank">openstack-operators@lists.openstack.org</a>><br>
<span style="font-weight:bold">Subject: </span>[Openstack-operators] Instance memory overhead<br>
</div><div><div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">My instances are using much more memory than expected. The amount of free memory (free + cached) is under 3G on my servers even though the compute nodes are configured to reserve 32G.
<div><br>
</div>
<div>Here's my setup:
<div>Release: Icehouse<br>
</div>
<div>Server mem: 256G</div>
<div>Qemu version: 2.0.0+dfsg-2ubuntu1.1</div>
<div>Networking: Contrail 1.20</div>
<div>Block storage: Ceph 0.80.7</div>
<div>Hypervisor OS: Ubuntu 12.04</div>
<div>memory over-provisioning is disabled</div>
<div>kernel version: 3.11.0-26-generic</div>
<div><br>
</div>
<div>In nova.conf:</div>
<div>
<div>reserved_host_memory_mb = 32768</div>
</div>
<div><br>
</div>
<div>Info on instances:</div>
<div>- root volume is file backed (qcow2) on the hypervisor local storage</div>
<div>- each instance has a rbd volume mounted from Ceph</div>
<div>- no swap file/partition</div>
<div><br>
</div>
<div>I've confirmed, via nova-compute.log, that nova is respecting the reserved_host_memory_mb directive and is not over-provisioning. On some hypervisors, nova-compute says there's 4GB available for use even though the OS has less than 4G left (free + cached)!</div>
<div><br>
</div>
<div>I've also summed up the memory from the /etc/libvirt/qemu/*.xml files and the total looks good.<br>
</div>
<div><br>
</div>
<div>Each hypervisor hosts about 45-50 instances.</div>
<div><br>
</div>
<div>Is there a good way to calculate the actual usage of each QEMU process?</div>
<div><br>
</div>
<div>PS: I've tried free, summing up RSS, and smem, but none of them can tell me where the missing memory is.</div>
<div><br>
</div>
<div>thx</div>
<div>mike</div>
</div>
</div>
</div>
</div>
</div></div></span>
</div>
<br></div></div>_______________________________________________<br>
OpenStack-operators mailing list<br>
<a href="mailto:OpenStack-operators@lists.openstack.org" target="_blank">OpenStack-operators@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>
<br></blockquote></div><br></div>
</blockquote></div><br></div>