[Openstack-operators] Strange memory usage

Mike Wilson geekinutah at gmail.com
Wed Oct 16 23:36:13 UTC 2013


I want to add a little bit to what Ryan Richard is saying. Things swapping
out when there is plenty of free memory isn't a behavior specific to
OpenStack or qemu; it is part of how the Linux kernel behaves. You can
tune overcommit ratios and swappiness, but even with swappiness set to 0,
any memory overcommit greater than 1:1 will _still_ swap things
occasionally. There are three ways I am aware of that will ensure you
_never_ use swap.

1. Use hugepages to back your applications; currently the Linux kernel
doesn't know how to swap hugepages. There are other advantages and
disadvantages to using hugepages, so I don't recommend this as a
one-size-fits-all solution. It definitely merits more understanding and
research before you decide to go this route. Qemu is happy to use
hugepage-backed memory for instances. Good luck deduping those hugepages,
though; last I heard neither ksmd nor anything else was able to do that.
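
For anyone curious, a rough sketch of hugepage backing (page count sized
for a single 4GB guest with 2MB pages; adjust for your hosts):

    # reserve 2048 x 2MB hugepages (4GB total)
    echo 2048 > /proc/sys/vm/nr_hugepages
    # mount hugetlbfs somewhere qemu can reach
    mkdir -p /dev/hugepages
    mount -t hugetlbfs hugetlbfs /dev/hugepages
    # back guest RAM with the hugepage pool
    qemu-system-x86_64 -m 4096 -mem-path /dev/hugepages ...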

2. You can apply patches to qemu and other applications to use the mlock
and mlockall system calls, and the kernel will never swap those memory
regions. If you search around, there are a few patches floating around on
mailing lists and random git repos that attempt to do just this. Again,
your mileage may vary with this approach. I believe we do something close
to this in our production environment, but we weren't able to use what we
found publicly and had to write our own. Sorry, the guy who wrote the
patches is on vacation, so I can't just provide you with a link.
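
Even without a patch you can at least check whether a given qemu process
has anything locked; VmLck in /proc is the thing to look at (the instance
name below is just the example from later in this thread):

    # VmLck is how much of the process's memory is locked (unswappable)
    pid=$(pgrep -f instance-000001b3)
    grep VmLck /proc/$pid/status

If memory serves, qemu 1.5 also grew a -realtime mlock=on switch that
does roughly this without patching.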

3. Turn off swap devices entirely. This is generally not the answer, but
if you never ever see a legitimate case to swap rather than just die,
then it could be :-).
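
If you do go that route, something like this (double-check your fstab
before the next reboot):

    # turn off all swap devices right now
    swapoff -a
    # comment out swap entries so they stay off after a reboot
    sed -i '/\sswap\s/s/^/#/' /etc/fstab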

-Mike Wilson




On Wed, Oct 16, 2013 at 5:12 PM, Ryan Richard <ryan.richard at rackspace.com> wrote:

>  A few thoughts:
>
>    1. Are you actively swapping pages in and out? I'm going to guess not,
>    since you have a lot of RAM available and the swap space hasn't been freed.
>    This would lead me to believe there was an event in the past that forced
>    the kernel to move some pages out to swap space, but they haven't been
>    needed since, so they've been left there. The package 'dstat' is helpful
>    for determining whether you're paging (see the sketch below this list).
>    2. Do you overcommit your RAM? If so, you may have booted a RAM-heavy
>    VM at one point that pushed another process into swap. Perhaps look at
>    lowering the overcommit ratio.
>    3. You can lower how aggressively the kernel moves things in and out
>    of swap space. The file /proc/sys/vm/swappiness controls this within the
>    kernel. Try lowering it so the kernel is less eager to move pages into
>    swap; 0 is the lowest value.
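>
> A quick sketch of checking for active paging and lowering swappiness
> (double-check the exact flags on your distro):
>
>    # si/so columns show KB swapped in and out per second;
>    # sustained nonzero values mean you're actively paging
>    vmstat 1
>    # dstat's equivalent view, if you prefer it
>    dstat --vmstat
>    # lower swappiness at runtime, then persist it across reboots
>    sysctl -w vm.swappiness=10
>    echo 'vm.swappiness = 10' >> /etc/sysctl.conf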
>
> Paging in and out is what kills performance. Stuff sitting in swap doesn't
> generally hurt performance until it needs to be accessed and has to be
> moved back into RAM. Constant paging in and out is a huge problem, though,
> especially on a hypervisor with busy disks.
>
>
>  Ryan Richard | RHCA
>  Rackspace US, Inc.
>
>   From: Joe Topjian <joe.topjian at cybera.ca>
> Date: Wednesday, October 16, 2013 3:46 PM
> To: Samuel Winchenbach <swinchen at gmail.com>
> Cc: "openstack-operators at lists.openstack.org" <
> openstack-operators at lists.openstack.org>
> Subject: Re: [Openstack-operators] Strange memory usage
>
>   Hello,
>
>  I'm not exactly sure what the next steps would be. :(
>
>  If your qemu processes are being reported as the high-swap consumers, it
> might be a more site-specific or instance-specific issue rather than
> something with OpenStack itself.
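>
>  For anyone who wants to reproduce that kind of report, one rough way
> (this assumes a kernel new enough to expose VmSwap in /proc/[pid]/status):
>
>     # list processes by swap usage, in kB, biggest last
>     for f in /proc/[0-9]*/status; do
>         awk '/^Name:/ {n=$2} /^VmSwap:/ {print $2, n}' "$f"
>     done | sort -n | tail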
>
>  From the process string you gave, it looks like a normal running kvm
> instance to me... I can't see anything unusual.
>
>  If these instances aren't doing anything that would cause them to use a
> high amount of memory to the point of swapping, I'd start looking into KVM
> and reasons why it would want to swap.
>
>  Joe
>
>
> On Wed, Oct 16, 2013 at 2:25 PM, Samuel Winchenbach <swinchen at gmail.com> wrote:
>
>>  Hi Joe.  I have deleted that instance (it was no longer needed).  Here
>> is the next one down on the list:
>>
>>  119       6560  3.4  2.6 9247872 1749336 ?     Sl   Oct15  41:33
>> qemu-system-x86_64 -machine accel=kvm:tcg -name instance-000001b3 -S -M
>> pc-i440fx-1.4 -cpu
>> Opteron_G3,+nodeid_msr,+wdt,+skinit,+ibs,+osvw,+3dnowprefetch,+cr8legacy,+extapic,+cmp_legacy,+3dnow,+3dnowext,+pdpe1gb,+fxsr_opt,+mmxext,+ht,+vme
>> -m 4096 -smp 4,sockets=4,cores=1,threads=1 -uuid
>> 9627e75b-8d3b-4482-be43-8aba176a3f1f -smbios type=1,manufacturer=OpenStack
>> Foundation,product=OpenStack
>> Nova,version=2013.1.3,serial=49434d53-0200-9010-2500-109025007800,uuid=9627e75b-8d3b-4482-be43-8aba176a3f1f
>> -no-user-config -nodefaults -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-000001b3.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=utc,driftfix=slew -no-shutdown -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
>> file=/os-grizzly/nova/instances/9627e75b-8d3b-4482-be43-8aba176a3f1f/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -drive
>> file=/os-grizzly/nova/instances/9627e75b-8d3b-4482-be43-8aba176a3f1f/disk.swap,if=none,id=drive-virtio-disk1,format=qcow2,cache=none
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
>> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=31 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:45:63:f0,bus=pci.0,addr=0x3
>> -chardev
>> file,id=charserial0,path=/os-grizzly/nova/instances/9627e75b-8d3b-4482-be43-8aba176a3f1f/console.log
>> -device isa-serial,chardev=charserial0,id=serial0 -chardev
>> pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1
>> -device usb-tablet,id=input0 -vnc 0.0.0.0:6 -k en-us -vga cirrus -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
>>
>>
>> On Wed, Oct 16, 2013 at 4:07 PM, Joe Topjian <joe.topjian at cybera.ca> wrote:
>>
>>> Hello,
>>>
>>>  "ps auxww | grep 19587", or some variation of the command, should give
>>> you the full command. Can you trace it back to an individual instance?
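>>>
>>>  For example, something like this (using the pid from your earlier
>>> output; the -name argument in the command line is the libvirt instance):
>>>
>>>     ps -p 19587 -o args= | tr ' ' '\n' | grep -A1 '^-name'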
>>>
>>>  Thanks,
>>> Joe
>>>
>>>
>>> On Wed, Oct 16, 2013 at 11:50 AM, Samuel Winchenbach <swinchen at gmail.com> wrote:
>>>
>>>>  Joe: http://pastie.org/pastes/8407072/text
>>>>
>>>>  It really looks like qemu-system-x86 is the culprit. mysqld
>>>> (Percona/Galera) is using a fair amount too.
>>>>
>>>
>>>
>>>
>>>  --
>>> Joe Topjian
>>> Systems Architect
>>> Cybera Inc.
>>>
>>>  www.cybera.ca
>>>
>>>  Cybera is a not-for-profit organization that works to spur and support
>>> innovation, for the economic benefit of Alberta, through the use
>>> of cyberinfrastructure.
>>>
>>
>>
>
>
>  --
> Joe Topjian
> Systems Architect
> Cybera Inc.
>
>  www.cybera.ca
>
>  Cybera is a not-for-profit organization that works to spur and support
> innovation, for the economic benefit of Alberta, through the use
> of cyberinfrastructure.
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>