[Openstack] High Latency to VMs

Min Pae sputnik13 at gmail.com
Sat Dec 13 07:39:36 UTC 2014


What kernel version are you running on the host?

On Fri, Dec 12, 2014 at 12:09 PM, André Aranha <andre.f.aranha at gmail.com> wrote:
> Our compute nodes are using vhost_net, and we haven't made any changes to the
> buffer size of our NICs.
> The system is not overloaded; CPU usage isn't higher than 30%.
>
> On 12 December 2014 at 02:35, mad Engineer <themadengin33r at gmail.com> wrote:
>>
>> So it looks like the issue is not with Open vSwitch. A nonzero missed
>> count is quite normal; it is not the reason for the packet loss.
>> Are your guests using vhost_net? Run:
>> ps aux|grep vhost
>> Also, have you made any changes to the buffer size of your NIC?
>> Is the system overloaded? What is the CPU usage?
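A minimal sketch of the vhost check suggested above, for anyone who wants to
script it. It relies on the kernel naming each vhost worker thread
"vhost-<pid of the owning QEMU process>"; the function names are hypothetical,
and the use of `ps -eo pid,comm` instead of `ps aux|grep vhost` is a choice made
for this sketch, not something from the thread.

```python
import re
import subprocess

# Kernel vhost worker threads are named "vhost-<owner pid>".
VHOST_THREAD = re.compile(r"^vhost-(\d+)$")


def find_vhost_owners(ps_lines):
    """Return the sorted PIDs of processes that own a vhost worker thread.

    ps_lines: lines of `ps -eo pid,comm` output (the header line is ignored).
    """
    owners = set()
    for line in ps_lines:
        fields = line.split()
        if len(fields) != 2:
            # Skip blank lines and command names containing spaces.
            continue
        match = VHOST_THREAD.match(fields[1])
        if match:
            owners.add(int(match.group(1)))
    return sorted(owners)


def list_vhost_owners():
    """Run ps on the local Linux host and report the vhost owners."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_vhost_owners(result.stdout.splitlines())
```

An empty list from list_vhost_owners() on a compute node with running guests
would suggest that the guests are not using vhost_net.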
>>
>> On Thu, Dec 11, 2014 at 6:20 PM, André Aranha <andre.f.aranha at gmail.com>
>> wrote:
>> > Thanks for the advice. I've run the command on the network node and on a
>> > compute node; lost is 0, but missed is a high value.
>> >
>> > NetworkNode
>> > system at ovs-system:
>> > lookups: hit:425667155 missed:2962922 lost:0
>> > flows: 27
>> > port 0: ovs-system (internal)
>> > port 1: br-ex (internal)
>> > port 2: br-tun (internal)
>> > port 3: eth1
>> > port 4: br-int (internal)
>> > port 5: tapbdc3d959-d8 (internal)
>> > port 6: gre_system (gre: df_default=false, ttl=0)
>> > port 7: qr-4063db49-6b (internal)
>> > port 8: qg-e427e527-92 (internal)
>> >
>> >
>> > ComputeNode
>> > system at ovs-system:
>> > lookups: hit:28660666 missed:200922 lost:0
>> > flows: 19
>> > port 0: ovs-system (internal)
>> > port 1: br-int (internal)
>> > port 2: br-tun (internal)
>> > port 3: gre_system (gre: df_default=false, ttl=0)
>> > port 4: em1
>> > port 5: br-private (internal)
>> > port 6: qvo9a959049-a0
>> > port 7: qvodd0ef077-e1
>> > port 8: qvoac2b566b-65
>> > port 9: qvo9e4ab149-5c
>> > port 10: qvoc2d2625c-0c
>> > port 11: qvo3069daeb-4a
>> > port 12: qvo7f82a3cf-0c
>> > port 13: qvo83b77d2d-1a
>> > port 14: qvobbadd8c2-30
>> > port 15: qvocfd0b8e8-ad
>> > port 16: qvo714fab88-60
>> > port 17: qvob9ddde49-86
>> > port 18: qvo42ef9f3b-ac
>> > port 19: qvof4ae7868-41
>> > port 20: qvoa4408a18-03
>> > port 22: qvo36c64d52-9b
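To put the "missed" counters quoted above in proportion, here is a rough sketch
that parses the lookups line of `ovs-dpctl show` and reports misses as a
fraction of all lookups. The function name and the definition of the ratio
(missed / (hit + missed)) are assumptions of this sketch, not an OVS tool.

```python
import re

# Matches e.g. "lookups: hit:425667155 missed:2962922 lost:0"
LOOKUPS = re.compile(r"lookups:\s*hit:(\d+)\s+missed:(\d+)\s+lost:(\d+)")


def lookup_stats(dpctl_output):
    """Extract hit/missed/lost counters and compute the miss ratio.

    dpctl_output: text containing a "lookups:" line from `ovs-dpctl show`.
    """
    match = LOOKUPS.search(dpctl_output)
    if match is None:
        raise ValueError("no 'lookups:' line found")
    hit, missed, lost = (int(group) for group in match.groups())
    total = hit + missed
    return {
        "hit": hit,
        "missed": missed,
        "lost": lost,
        # Fraction of lookups that missed the kernel flow table.
        "miss_ratio": missed / total if total else 0.0,
    }


if __name__ == "__main__":
    network = lookup_stats("lookups: hit:425667155 missed:2962922 lost:0")
    compute = lookup_stats("lookups: hit:28660666 missed:200922 lost:0")
    print(f"network node miss ratio: {network['miss_ratio']:.4%}")
    print(f"compute node miss ratio: {compute['miss_ratio']:.4%}")
```

With the two lookups lines above, the miss ratio comes out at roughly 0.7% on
both nodes, and lost is 0 on both.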
>> >
>> > On 11 December 2014 at 06:17, mad Engineer <themadengin33r at gmail.com>
>> > wrote:
>> >>
>> >> Sorry, it's 2.3.0, not 2.1.3.
>> >>
>> >> On Thu, Dec 11, 2014 at 2:43 PM, mad Engineer
>> >> <themadengin33r at gmail.com>
>> >> wrote:
>> >> > Not in OpenStack, but I had a performance issue with OVS and bursty
>> >> > traffic; upgrading to a later version improved the performance. A lot
>> >> > of performance features have been added in 2.1.3.
>> >> >
>> >> > Do you see a large lost: value in the output of the following command?
>> >> > ovs-dpctl show
>> >> >
>> >> >
>> >> > On Thu, Dec 11, 2014 at 2:33 AM, André Aranha
>> >> > <andre.f.aranha at gmail.com>
>> >> > wrote:
>> >> >> Yes, we are using version 2.0.2.
>> >> >> The process uses only about 0.3% CPU on both the network node and the
>> >> >> compute node.
>> >> >> Did you have the same issue?
>> >> >>
>> >> >> On 10 December 2014 at 14:31, mad Engineer
>> >> >> <themadengin33r at gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> Are you using Open vSwitch? Which version?
>> >> >>> If yes, is it consuming a lot of CPU?
>> >> >>>
>> >> >>> On Wed, Dec 10, 2014 at 7:45 PM, André Aranha
>> >> >>> <andre.f.aranha at gmail.com>
>> >> >>> wrote:
>> >> >>> > Well, here we are using Icehouse with Ubuntu 14.04 LTS.
>> >> >>> >
>> >> >>> > We found this thread in the community and applied the changes on
>> >> >>> > the compute nodes (changing VHOST_NET_ENABLED to 1 in
>> >> >>> > /etc/default/qemu-kvm).
>> >> >>> > After doing this, the problem no longer exists on a few instances.
>> >> >>> > This link shows an investigation to find the problem.
>> >> >>> >
>> >> >>> > About the MTU in our cloud (measured with iperf):
>> >> >>> >
>> >> >>> > 1- From any desktop to the network node
>> >> >>> > MSS size 1448 bytes (MTU 1500 bytes, ethernet)
>> >> >>> >
>> >> >>> > 2- From any desktop to an instance
>> >> >>> > MSS size 1348 bytes (MTU 1388 bytes, unknown interface)
>> >> >>> >
>> >> >>> > 3- From any instance to the network node
>> >> >>> > MSS size 1348 bytes (MTU 1388 bytes, unknown interface)
>> >> >>> >
>> >> >>> > 4- From any instance to the desktop
>> >> >>> > MSS size 1348 bytes (MTU 1388 bytes, unknown interface)
>> >> >>> >
>> >> >>> > 5- From the network node to any compute node
>> >> >>> > MSS size 1448 bytes (MTU 1500 bytes, ethernet)
>> >> >>> >
>> >> >>> > 6- From any compute node to the network node
>> >> >>> > MSS size 1448 bytes (MTU 1500 bytes, ethernet)
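As a worked example of how the MSS values above relate to an MTU, here is a
small arithmetic sketch. It assumes IPv4 and TCP headers without extra options
(20 bytes each) and, by default, the TCP timestamps option (12 bytes including
padding). The function name is hypothetical, and how iperf itself derives the
MTU it prints is not something this sketch claims to reproduce.

```python
IPV4_HEADER = 20      # bytes, IPv4 header without options
TCP_HEADER = 20       # bytes, TCP header without options
TCP_TIMESTAMPS = 12   # bytes, timestamps option (10) plus 2 bytes of padding


def mtu_for_mss(mss, tcp_options=TCP_TIMESTAMPS):
    """Return the smallest IP MTU that carries a full segment of `mss` bytes."""
    return mss + IPV4_HEADER + TCP_HEADER + tcp_options


if __name__ == "__main__":
    # Physical network paths (cases 1, 5 and 6).
    print(mtu_for_mss(1448))
    # Paths involving an instance (cases 2, 3 and 4), with timestamps.
    print(mtu_for_mss(1348))
    # Same MSS if no TCP options were in use.
    print(mtu_for_mss(1348, tcp_options=0))
```

Under these assumptions an MSS of 1448 matches the 1500-byte Ethernet MTU, and
an MSS of 1348 corresponds to an MTU of 1400 with timestamps, or to the
reported 1388 if no TCP options are counted.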
>> >> >>> >
>> >> >>> > On 10 December 2014 at 10:31, somshekar kadam
>> >> >>> > <som_kadam at yahoo.co.in>
>> >> >>> > wrote:
>> >> >>> >>
>> >> >>> >> Sorry, I posted this to the wrong mail chain.
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> Regards
>> >> >>> >> Neelu
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> On Wednesday, 10 December 2014 6:59 PM, somshekar kadam
>> >> >>> >> <som_kadam at yahoo.co.in> wrote:
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> Hi All,
>> >> >>> >>
>> >> >>> >> Please recommend a stable host OS to use for the controller and
>> >> >>> >> compute nodes.
>> >> >>> >> I have tried Fedora 20, and it seems a lot of tweaking is
>> >> >>> >> required; correct me if I am wrong.
>> >> >>> >> I see that most testing is done on Ubuntu and CentOS.
>> >> >>> >> I am planning to use the stable Juno release.
>> >> >>> >> Please help with this.
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> Regards
>> >> >>> >> Neelu
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> On Wednesday, 10 December 2014 5:42 PM, Hannah Fordham
>> >> >>> >> <hfordham at radiantworlds.com> wrote:
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> I'm afraid we didn't; we're still struggling with some VMs with
>> >> >>> >> this problem. Sorry!
>> >> >>> >>
>> >> >>> >> On 9 December 2014 14:09:32 GMT+00:00, "André Aranha"
>> >> >>> >> <andre.f.aranha at gmail.com> wrote:
>> >> >>> >>
>> >> >>> >> Hi,
>> >> >>> >>
>> >> >>> >> We have the same issue here, and have already tried some
>> >> >>> >> solutions that didn't work at all. Did you solve this problem?
>> >> >>> >>
>> >> >>> >> Thank you,
>> >> >>> >> Andre Aranha
>> >> >>> >>
>> >> >>> >> On 27 August 2014 at 08:17, Hannah Fordham
>> >> >>> >> <hfordham at radiantworlds.com>
>> >> >>> >> wrote:
>> >> >>> >>
>> >> >>> >> I’ve been trying to figure this one out for a while, so I’ll try
>> >> >>> >> to be as thorough as possible in this post, but apologies if I
>> >> >>> >> leave anything pertinent out.
>> >> >>> >>
>> >> >>> >> First off, I’m running a setup with one control node and 5
>> >> >>> >> compute nodes, all created using the Stackgeek scripts -
>> >> >>> >> http://www.stackgeek.com/guides/gettingstarted.html. The first
>> >> >>> >> two (compute1 and compute2) were created at the same time;
>> >> >>> >> compute3, 4 and 5 were added as needed later. My VMs are
>> >> >>> >> predominantly CentOS, while my OpenStack nodes are Ubuntu
>> >> >>> >> 14.04.1.
>> >> >>> >>
>> >> >>> >> The symptom: irregular high latency/packet loss to VMs on all
>> >> >>> >> compute boxes except compute3. Mostly a pain when trying to do
>> >> >>> >> anything via ssh on a VM because the lag makes it difficult to
>> >> >>> >> do anything, but it shows itself quite nicely through pings as
>> >> >>> >> well:
>> >> >>> >> --- 10.0.102.47 ping statistics ---
>> >> >>> >> 111 packets transmitted, 103 received, 7% packet loss, time 110024ms
>> >> >>> >> rtt min/avg/max/mdev = 0.096/367.220/5593.100/1146.920 ms, pipe 6
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> I have tested these pings:
>> >> >>> >> VM to itself (via its external IP) seems fine
>> >> >>> >> VM to another VM is not fine
>> >> >>> >> Hosting compute node to VM is not fine
>> >> >>> >> My PC to VM is not fine (however the other way round works fine)
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> Top on a (32 core) compute node with laggy VMs:
>> >> >>> >> top - 12:09:20 up 33 days, 21:35,  1 user,  load average: 2.37, 4.95, 6.23
>> >> >>> >> Tasks: 431 total,   2 running, 429 sleeping,   0 stopped,   0 zombie
>> >> >>> >> %Cpu(s):  0.6 us,  3.4 sy,  0.0 ni, 96.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
>> >> >>> >> KiB Mem:  65928256 total, 44210348 used, 21717908 free,   341172 buffers
>> >> >>> >> KiB Swap:  7812092 total,  1887864 used,  5924228 free.  7134740 cached Mem
>> >> >>> >>
>> >> >>> >> And for comparison, on the one compute node that doesn’t seem to
>> >> >>> >> be suffering from this:
>> >> >>> >> top - 12:12:20 up 33 days, 21:38,  1 user,  load average: 0.28, 0.18, 0.15
>> >> >>> >> Tasks: 399 total,   3 running, 396 sleeping,   0 stopped,   0 zombie
>> >> >>> >> %Cpu(s):  0.3 us,  0.1 sy,  0.0 ni, 98.9 id,  0.6 wa,  0.0 hi,  0.0 si,  0.0 st
>> >> >>> >> KiB Mem:  65928256 total, 49986064 used, 15942192 free,   335788 buffers
>> >> >>> >> KiB Swap:  7812092 total,   919392 used,  6892700 free. 39272312 cached Mem
>> >> >>> >>
>> >> >>> >> Top on a laggy VM:
>> >> >>> >> top - 11:02:53 up 27 days, 33 min,  3 users,  load average: 0.00, 0.00, 0.00
>> >> >>> >> Tasks:  91 total,   1 running,  90 sleeping,   0 stopped,   0 zombie
>> >> >>> >> Cpu(s):  0.2%us,  0.1%sy,  0.0%ni, 99.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
>> >> >>> >> Mem:   1020400k total,   881004k used,   139396k free,   162632k buffers
>> >> >>> >> Swap:  1835000k total,    14984k used,  1820016k free,   220644k cached
>> >> >>> >>
>> >> >>> >> http://imgur.com/blULjDa shows the hypervisor panel of Horizon.
>> >> >>> >> As you can see, Compute 3 has fewer resources used, but none of
>> >> >>> >> the compute nodes should be anywhere near overloaded from what I
>> >> >>> >> can tell.
>> >> >>> >>
>> >> >>> >> Any ideas? Let me know if I’m missing anything obvious that
>> >> >>> >> would help with figuring this out!
>> >> >>> >>
>> >> >>> >> Hannah
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> ***********
>> >> >>> >>
>> >> >>> >> Radiant Worlds Limited is registered in England (company no:
>> >> >>> >> 07822337).
>> >> >>> >> This message is intended solely for the addressee and may
>> >> >>> >> contain
>> >> >>> >> confidential information. If you have received this message in
>> >> >>> >> error
>> >> >>> >> please
>> >> >>> >> send it back to us and immediately and permanently delete it
>> >> >>> >> from
>> >> >>> >> your
>> >> >>> >> system. Do not use, copy or disclose the information contained
>> >> >>> >> in
>> >> >>> >> this
>> >> >>> >> message or in any attachment. Please also note that transmission
>> >> >>> >> cannot
>> >> >>> >> be
>> >> >>> >> guaranteed to be secure or error-free.
>> >> >>> >>
>> >> >>> >> _______________________________________________
>> >> >>> >> Mailing list:
>> >> >>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> >> >>> >> Post to     : openstack at lists.openstack.org
>> >> >>> >> Unsubscribe :
>> >> >>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >> --
>> >> >>> >> Sent from my Android device with K-9 Mail. Please excuse my
>> >> >>> >> brevity.
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>
>> >> >>
>> >
>> >
>
>
>



