[Openstack-operators] OpenvSwitch Latency issues

Jacob Godin jacobgodin at gmail.com
Tue Oct 15 19:37:06 UTC 2013


The tenant router has the initial delay to/from both external and internal.


On Tue, Oct 15, 2013 at 4:29 PM, Narayan Desai <narayan.desai at gmail.com> wrote:

> This sounds like a timer in the qrouter path, since you can get to the
> tenant router with predictable, low latency, right?
>
> This is one of the big problems with network datapaths implemented fully
> in software.
>  -nld
>
>
> On Tue, Oct 15, 2013 at 1:06 PM, Jacob Godin <jacobgodin at gmail.com> wrote:
>
>> Hi Jay,
>>
>> If I stop and repeat immediately, ping times are fine. However, if I wait
>> 5+ secs, they spike up during the first packet again.
>>
>> I'm running OVS 1.4.0; someone recommended upgrading to 1.9.x from the
>> Havana repo.
>>
>>
>> On Tue, Oct 15, 2013 at 2:41 PM, Jay Pipes <jaypipes at gmail.com> wrote:
>>
>>> Hi Jacob,
>>>
>>> What you are witnessing, I believe, is OVS "learning" the flows and MAC
>>> addresses of the various compute nodes involved in the communication path
>>> between the source and target interfaces.
>>>
>>> If you repeat the pings, do you see the same latency on the first ping?
>>>
>>> Best,
>>> -jay
>>>
>>>
>>> On 10/15/2013 10:37 AM, Jacob Godin wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I'm experiencing a weird issue with OpenStack Networking + OpenvSwitch.
>>>> My setup consists of several compute nodes, and a networking node (l3,
>>>> OVS, dhcp, etc.). These are connected via a Gigabit switch, and it is
>>>> nowhere near capacity.
>>>>
>>>> It seems that the first packet being sent through a quantum router is
>>>> delayed by several hundred milliseconds. Here is some sample ping
>>>> output:
>>>>
>>>> _VM(comp node 1) -> VM(comp node 2)_
>>>>
>>>>
>>>>     # ping 10.199.0.7
>>>>     PING 10.199.0.7 (10.199.0.7) 56(84) bytes of data.
>>>>     64 bytes from 10.199.0.7: icmp_seq=1 ttl=64 time=3.45 ms
>>>>     64 bytes from 10.199.0.7: icmp_seq=2 ttl=64 time=0.792 ms
>>>>     64 bytes from 10.199.0.7: icmp_seq=3 ttl=64 time=0.837 ms
>>>>     64 bytes from 10.199.0.7: icmp_seq=4 ttl=64 time=0.864 ms
>>>>
>>>> _VM -> qrouter_
>>>>
>>>>
>>>>     # ping 10.199.0.1
>>>>     PING 10.199.0.1 (10.199.0.1) 56(84) bytes of data.
>>>>     64 bytes from 10.199.0.1: icmp_seq=1 ttl=64 time=248 ms
>>>>     64 bytes from 10.199.0.1: icmp_seq=2 ttl=64 time=0.512 ms
>>>>     64 bytes from 10.199.0.1: icmp_seq=3 ttl=64 time=0.553 ms
>>>>     64 bytes from 10.199.0.1: icmp_seq=4 ttl=64 time=0.533 ms
>>>>     64 bytes from 10.199.0.1: icmp_seq=5 ttl=64 time=0.679 ms
>>>>
>>>> _qrouter -> VM_
>>>>
>>>>
>>>>     # ip netns exec qrouter-XXXXX ping 10.199.0.7
>>>>     PING 10.199.0.7 (10.199.0.7) 56(84) bytes of data.
>>>>     64 bytes from 10.199.0.7: icmp_req=1 ttl=64 time=576 ms
>>>>     64 bytes from 10.199.0.7: icmp_req=2 ttl=64 time=0.530 ms
>>>>     64 bytes from 10.199.0.7: icmp_req=3 ttl=64 time=0.597 ms
>>>>     64 bytes from 10.199.0.7: icmp_req=4 ttl=64 time=0.723 ms
>>>>     64 bytes from 10.199.0.7: icmp_req=5 ttl=64 time=0.677 ms
>>>>
>>>> _qrouter -> Internet_
>>>>
>>>>
>>>>     # ip netns exec qrouter-XXXXX ping 8.8.8.8
>>>>     PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
>>>>     64 bytes from 8.8.8.8: icmp_req=1 ttl=43 time=267 ms
>>>>     64 bytes from 8.8.8.8: icmp_req=2 ttl=43 time=37.0 ms
>>>>     64 bytes from 8.8.8.8: icmp_req=3 ttl=43 time=37.2 ms
>>>>     64 bytes from 8.8.8.8: icmp_req=4 ttl=43 time=37.3 ms
>>>>
>>>>
>>>>
>>>> Here's a tcpdump on the qrouter of a ping from a vm on that network. It
>>>> doesn't appear to show the large delay:
>>>> 14:33:38.024040 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 29953, seq 1, length 64
>>>> 14:33:38.024089 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 29953, seq 1, length 64
>>>> 14:33:38.526725 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 29953, seq 2, length 64
>>>> 14:33:38.526781 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 29953, seq 2, length 64
>>>> 14:33:39.526943 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 29953, seq 3, length 64
>>>> 14:33:39.527000 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 29953, seq 3, length 64
>>>> 14:33:39.665664 fa:16:3e:61:ef:25 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 10.199.0.7 tell 10.199.0.9, length 28
>>>> 14:33:40.526963 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 29953, seq 4, length 64
>>>> 14:33:40.527021 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 29953, seq 4, length 64
>>>>
>>>> And a dump from the VM performing the ping:
>>>> 14:34:59.897783 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 38145, seq 1, length 64
>>>> 14:35:00.897569 fa:16:3e:36:8e:f2 > fa:16:3e:99:85:5d, ethertype IPv4 (0x0800), length 98: 10.199.0.4 > 10.199.0.1: ICMP echo request, id 38145, seq 2, length 64
>>>> 14:35:01.260201 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 38145, seq 1, length 64
>>>> 14:35:01.260229 fa:16:3e:99:85:5d > fa:16:3e:36:8e:f2, ethertype IPv4 (0x0800), length 98: 10.199.0.1 > 10.199.0.4: ICMP echo reply, id 38145, seq 2, length 64
>>>>
>>>> So the router sees a sub-millisecond delay, while the VM sees a
>>>> significant delay (almost a second). This only happens during the first
>>>> packet, and then responses are sub 1ms.
>>>>
>>>> It appears to be an issue with the router, as delays are seen with both
>>>> internal and external traffic on the router itself. Any thoughts are
>>>> greatly appreciated!
>>>>
>>>>
>>>> _______________________________________________
>>>> OpenStack-operators mailing list
>>>> OpenStack-operators at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
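
For anyone who wants to put numbers on the two tcpdump captures quoted above, here is a minimal sketch in Python that subtracts each echo request timestamp from its matching reply. The timestamps are copied from the captures; the helper name rtt_ms and the dictionary names are only illustrative. It assumes both packets fall on the same day, since tcpdump's default output carries no date.

```python
from datetime import datetime

# Timestamps copied from the VM-side capture quoted above (ICMP id 38145).
vm_requests = {1: "14:34:59.897783", 2: "14:35:00.897569"}
vm_replies = {1: "14:35:01.260201", 2: "14:35:01.260229"}


def rtt_ms(sent: str, received: str) -> float:
    """Return the gap between two HH:MM:SS.ffffff timestamps in milliseconds.

    Assumes both timestamps belong to the same day.
    """
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(received, fmt) - datetime.strptime(sent, fmt)
    return delta.total_seconds() * 1000.0


# Request-to-reply gap per sequence number, as seen from the VM.
vm_rtts = {seq: rtt_ms(vm_requests[seq], vm_replies[seq]) for seq in vm_requests}

# Same calculation for seq 1 in the qrouter-side capture (ICMP id 29953).
router_rtt = rtt_ms("14:33:38.024040", "14:33:38.024089")

for seq, ms in sorted(vm_rtts.items()):
    print(f"VM side, seq {seq}: {ms:.3f} ms")
print(f"qrouter side, seq 1: {router_rtt:.3f} ms")
```

On the quoted data this gives roughly 1362 ms and 363 ms for the first two packets on the VM side, against about 0.05 ms on the qrouter side. Both VM-side replies arrive within 28 microseconds of each other, which would be consistent with the packets being held somewhere in the path and released together, though the captures alone do not show where.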