[Openstack] VM can receive traffic, but not send it

Sterdnot Shaken sterdnotshaken at gmail.com
Thu Mar 23 18:04:10 UTC 2017


Just to clarify: Version: Mitaka with OVS only. Firewall driver:
Openvswitch, VM OS: Windows 10

Kaustubh: Thanks for your help on the mirroring part. In my reading
yesterday, I came across a thread stating that you can't mirror a patch
interface with OVS. That would explain why I wasn't seeing the expected
traffic on the mirror output ports when mirroring those patch interfaces.
Short of rewriting the flows that Openstack installs in OVS to add an
extra output port and then running tcpdump on that port, how would one
effectively troubleshoot network traffic issues when patch interfaces
are in use?
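
One possibility for the patch-port question, offered as a sketch rather than something tested in this thread: ovs-appctl ofproto/trace asks OVS how it would handle a described packet, and the translation should follow patch ports into the peer bridge. The helper name build_trace and every argument value below are placeholders, and on OVS 2.5 the trace may stop at a ct() action because it cannot know the connection-tracking state.

```shell
# Print an "ovs-appctl ofproto/trace" command for a TCP packet.
# Arguments: bridge, OpenFlow in_port, source IP, destination IP, TCP dst port.
# build_trace is a hypothetical helper name; it only prints, it runs nothing.
build_trace() {
    printf 'ovs-appctl ofproto/trace %s in_port=%s,tcp,nw_src=%s,nw_dst=%s,tp_dst=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example with placeholder values (not taken from this thread):
#   build_trace br-int 5 192.0.2.10 198.51.100.20 8080
# Review the printed command, then paste it into a root shell on the compute node.
```

If the installed flows match on MAC addresses, dl_src and dl_dst would need to be added to the flow description as well.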

Adam: Thanks for chiming in on my issue! I appreciate it. The VMs are
placed directly on a provider network (external, flat) and, as such, have a
public IP assigned to their NICs. So for these VMs, the default gateway
is a physical router outside of Openstack's control.

As a way to further isolate the issue, I moved ALL but one VM off of one
compute node. Several symptoms point to the problem, but (on the
Windows VM) something as simple as a speed test (speedtest.net)
works great on the download and totally fails on the upload. Looking at
all the drop flows on br-int, I noticed that this flow was incrementing
while the upload part of the test was active:

cookie=0xa9964f66f62764ad, duration=1494.495s, table=82, n_packets=5813,
n_bytes=348780, idle_age=4, priority=50,ct_state=+inv+trk actions=drop
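
For anyone repeating this check, a small filter can pick out the drop flows that are actually counting. This is only a sketch: hot_drops is a made-up helper name, and it assumes the dump-flows text format shown above, where each flow carries an n_packets= field and drop flows end in actions=drop.

```shell
# hot_drops (hypothetical helper): read "ovs-ofctl dump-flows" output on
# stdin and print only the drop flows whose n_packets counter is above zero.
hot_drops() {
    awk '/actions=drop/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^n_packets=/) {
                n = $i
                sub(/^n_packets=/, "", n)   # strip the field name
                sub(/,$/, "", n)            # strip the trailing comma
                if (n + 0 > 0) print
            }
        }
    }'
}

# Usage on the compute node (as root):
#   ovs-ofctl dump-flows br-int | hot_drops
```

Running it twice a few seconds apart while the upload test is active shows which of the listed counters are still climbing.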

So I added this flow to redirect what would have been dropped to a dummy
interface (OpenFlow port 2) that I could tcpdump, to see what was actually
being dropped:

ovs-ofctl add-flow br-int \
  table=82,priority=51,ct_state=+inv+trk,actions=output:2
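
Because the temporary flow overrides the drop for those packets, it is worth being able to remove it again. A sketch, under assumptions: the function names and the OFCTL variable are invented here, the match is copied from the flow above, and --strict is used so that only the flow with exactly this match and priority is deleted.

```shell
# OFCTL lets the command be swapped out (e.g. OFCTL=echo for a dry run).
OFCTL=${OFCTL:-ovs-ofctl}
# Same match as the temporary flow: invalid, tracked packets in table 82.
TAP_MATCH='table=82,priority=51,ct_state=+inv+trk'

# tap_invalid_add BRIDGE PORT: send +inv+trk packets to PORT instead of dropping.
tap_invalid_add() {
    "$OFCTL" add-flow "$1" "$TAP_MATCH,actions=output:$2"
}

# tap_invalid_del BRIDGE: remove only that temporary flow again.
tap_invalid_del() {
    "$OFCTL" --strict del-flows "$1" "$TAP_MATCH"
}

# Usage (as root):
#   tap_invalid_add br-int 2
#   tcpdump -nni <dummy interface>
#   tap_invalid_del br-int
```

Setting OFCTL=echo first prints the commands without touching the bridge, which is a cheap way to check them before running for real.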

From the tcpdump, I can see the traffic that the VM is missing, which is
likely causing this whole issue...

Anyone have any thoughts on this?

Thanks!

On Thu, Mar 23, 2017 at 11:49 AM, Adam Lawson <alawson at aqorn.com> wrote:

> For downloads, you're probably using DNAT or SNAT. For *uploads*, I'm
> guessing you're using floating IPs. Do uploads work for other VMs with a
> similar configuration? It's rare for this to occur, so I would presume
> it's firewall related (either a security group via OpenStack or a firewall
> on the VM itself).
>
> More questions: are incoming connections timing out? Is the security
> group allowing connections from everyone or only a subset? I ask because I
> haven't seen the easy questions asked up front.
>
> //adam
>
>
> *Adam Lawson*
>
> Principal Architect
> Office: +1-916-794-5706
>
> On Wed, Mar 22, 2017 at 11:31 AM, Kaustubh Kelkar <
> kaustubh.kelkar at casa-systems.com> wrote:
>
>> The select_all = 1 is supposed to mirror all the packets.
>>
>>
>>
>> Referring to the documentation
>> (http://openvswitch.org/support/dist-docs/ovs-vswitchd.conf.db.5.html),
>>
>> “select_all: boolean
>>        If true, every packet arriving or departing on any port is
>>        selected for mirroring.”
>>
>>
>> And for OVS 2.5,
>>
>> “In Open vSwitch 2.5 and later, mirroring occurs just after a packet
>> first becomes eligible, using the packet as it exists at that point; …
>>
>> in Open vSwitch 2.4, the modifications are never visible to mirrors,
>> whereas in Open vSwitch 2.5 and later modifications made before the
>> first output that makes it eligible for mirroring to a particular
>> destination are visible.”
>>
>> I believe, if the very first flow is dropping unicast packets, you might
>> not be able to mirror them.
>>
>>
>>
>> Maybe you can monitor the flow tables on each OVS bridge while sending
>> traffic and see which flows' counters increase. Something like,
>>
>> watch -n 2 "ovs-ofctl dump-flows <bridge name>"
>>
>>
>>
>> -Kaustubh
>>
>>
>>
>> *From:* Sterdnot Shaken [mailto:sterdnotshaken at gmail.com]
>> *Sent:* Wednesday, March 22, 2017 12:24 PM
>> *To:* Kaustubh Kelkar <kaustubh.kelkar at casa-systems.com>
>> *Subject:* Re: [Openstack] VM can receive traffic, but not send it
>>
>>
>>
>> Here was my first mirror setup:
>>
>> ip link add name dummy3 type dummy
>> ip link set dev dummy3 up
>>
>>
>>
>> ovs-vsctl add-port br-ex3 dummy3
>>
>> ovs-vsctl -- set bridge br-ex3 mirrors=@m \
>> -- --id=@src get port pat-ex3-bss \
>> -- --id=@mir get port dummy3 \
>> -- --id=@m create mirror name=ovs_mirror3 select-dst-port=@src \
>> select-src-port=@src output-port=@mir select-all=true
>>
>>
>>
>> And here's the one I did by copying your example:
>>
>> ip link add name dummy3 type dummy
>> ip link set dev dummy3 up
>>
>>
>>
>> ovs-vsctl add-port br-ex3 dummy3
>>
>> ovs-vsctl -- set Bridge br-ex3 mirrors=@m  \
>> -- --id=@dummy3 get Port dummy3 \
>> -- --id=@pat-ex3-bss get Port pat-ex3-bss \
>> -- --id=@m create Mirror name=mirror0 \
>> select-dst-port=@pat-ex3-bss select-src-port=@pat-ex3-bss \
>> output-port=@dummy3 select_all=1
>>
>>
>>
>> Both yield the same results. When I tcpdump the respective dummy
>> interface attached to br-ex3, I only see broadcast traffic for the VM in
>> question; I never see unicast traffic (case in point: if I ping the
>> broadcast address on the VM, that traffic shows up in the tcpdump). I can
>> do a tcpdump on the external interface and see the unicast traffic, though,
>> but I need to see where it's breaking in the OVS bridges.
>>
>> Is there some trick to mirroring unicast dataplane traffic?
>>
>> Thanks in advance!
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Mar 22, 2017 at 10:07 AM, Kaustubh Kelkar <
>> kaustubh.kelkar at casa-systems.com> wrote:
>>
>>
>>
>> *From:* Sterdnot Shaken [mailto:sterdnotshaken at gmail.com]
>> *Sent:* Tuesday, March 21, 2017 8:54 PM
>> *To:* Kaustubh Kelkar <kaustubh.kelkar at casa-systems.com>
>> *Cc:* Richard Jones <rjones at suse.com>; openstack at lists.openstack.org
>> *Subject:* Re: [Openstack] VM can receive traffic, but not send it
>>
>>
>>
>> Thanks for everyone's kind help!
>>
>> Steve: I will try and turn off the offload features and see if that
>> helps. Thanks!
>>
>> Neil: I will also check and make sure neither RPF nor TTL are posing any
>> issues.
>>
>>
>> Kaustubh: Is there a reason the mirror approach only seems to work on
>> some of the OVS bridges, but not others? If I follow your instructions, I
>> can see traffic when I set up a mirror on some bridges, but not others...
>> Do I need to put these OVS bridges into promiscuous mode before the mirror
>> will work?
>>
>> [Kaustubh] I don’t recall putting the bridge in promiscuous mode, but it
>> has been a while since I looked at this. How are you setting up the
>> mirrors? You would need to mirror a specific port of the bridge, not the
>> bridge itself.
>>
>> Thanks!!
>>
>>
>>
>> On Tue, Mar 21, 2017 at 9:42 AM, Kaustubh Kelkar <
>> kaustubh.kelkar at casa-systems.com> wrote:
>>
>> You can narrow down the point where the packets are being dropped by
>> mirroring and tracing packets on OVS bridge ports. I use a script that does
>> the following (as root):
>>
>>
>>
>> ip link add name sniff0 type dummy
>> ip link set dev sniff0 up
>> ovs-vsctl add-port br1 sniff0
>> ovs-vsctl -- set Bridge br1 mirrors=@m \
>> -- --id=@sniff0 get Port sniff0 \
>> -- --id=@eth0 get Port eth0 \
>> -- --id=@m create Mirror name=mirror0 \
>> select-dst-port=@eth0 select-src-port=@eth0 \
>> output-port=@sniff0 select_all=1
>>
>> and to delete,
>>
>> ovs-vsctl clear Bridge br1 mirrors
>> ovs-vsctl del-port br1 sniff0
>> ip link del dev sniff0
>>
>>
>>
>> where eth0 is the point of packet capture and br1 is the bridge eth0
>> resides in. Then, you can run tcpdump on sniff0.
>>
>> Create such mirror ports on
>>
>> 1) phy-br-ex on external OVS bridge
>>
>> 2) int-br-ex on integration bridge
>>
>> 3) qvo-xxx on integration bridge
>>
>> Also capture packets on qvb-xxx on the linux bridge having the tap
>> interface of the VM. Hopefully, this will provide us more clues.
>>
>>
>>
>> -Kaustubh
>>
>>
>>
>> *From:* Sterdnot Shaken [mailto:sterdnotshaken at gmail.com]
>> *Sent:* Monday, March 20, 2017 9:17 PM
>> *To:* Richard Jones <rjones at suse.com>
>> *Cc:* openstack at lists.openstack.org
>> *Subject:* Re: [Openstack] VM can receive traffic, but not send it
>>
>>
>>
>> Wow! Thanks for answering both of my questions!
>>
>> So, I did some things you suggested, including setting the MSS in iperf
>> to something small (1000 bytes) and tested with no improvement. I then
>> changed the VM running on Openstack to have an MTU of 1000 and retested
>> with no improvement. I noticed that the node I was testing against was
>> reporting back to the VM on Openstack that it had an MSS of 8960, so just
>> for the heck of it, I changed the remote node's (server outside of
>> Openstack) MTU also to 1000 bytes and retested with no improvement. (The
>> effects of all of these tests were also validated by checking mss settings
>> in the tcp header via tcpdump).
>>
>> To simplify the equation, I ditched the iperf for the time being and just
>> did a simple "telnet 'remote server' 8080" test from the remote server to
>> the VM in Openstack, while capturing packets all along the way (4 different
>> points along the network path). Every point saw the same packets, including
>> the VM's tap interface as expected. I then reversed the test by initiating
>> the tcp session on the VM in Openstack to the remote server while running
>> the packet captures at those same points having set the remote server to
>> respond with a TCP Reset. From VM to Remote server traffic looked correct
>> with expected TCP SYN. The TCP Reset that the remote server responded with
>> passed all 4 points of the network, including the external interface on the
>> Compute node where the VM resides, but the TAP interface that connects to
>> the VM NEVER sees the Reset. I can recreate this condition over and over.
>>
>> So, thanks to your ideas Richard, I'm no longer convinced this is an MTU
>> issue. What would prevent a TCP related response from being forwarded from
>> the external interface to the intended VM? The security group we have
>> applied to this VM is wide open, so I can't imagine that is the cause...
>>
>> Here are 2 packet captures where I initiated a telnet to the remote
>> server from the VM in Openstack. As said above, I set the remote server to
>> respond with a reset. The top one is from the physical interface on the
>> Compute node where the VM resides and the other, the tap interface to that
>> VM:
>>
>> [(openstack-mitaka) root at prv-0-18-compute user]# tcpdump -nni eth0 host
>> x.y.120.23 and host x.y.224.45
>> tcpdump: WARNING: eth0: no IPv4 address assigned
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
>> 19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length
>> 0
>> 19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0,
>> ack 3131027442, win 0, length 0
>> 19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length
>> 0
>> 19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0,
>> ack 1, win 0, length 0
>> 19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
>> 19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0,
>> ack 1, win 0, length 0
>>
>>
>> [(openstack-mitaka) root at prv-0-18-compute user]# tcpdump -nni
>> tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
>> tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size
>> 65535 bytes
>> 19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length
>> 0
>> 19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length
>> 0
>> 19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq
>> 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
>>
>> Any ideas? Thanks in advance for your help!!
>>
>> Steve
>>
>>
>>
>> On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones <rjones at suse.com> wrote:
>>
>> You might consider taking a packet trace of the start of an upload to see
>> what the TCP MSS (Maximum Segment Size) options look like and perhaps
>> compare between the different configs.  Also, you could consider either
>> using netperf and having it tweak the MSS to a smaller value (test-specific
>> -G option if I recall correctly), or just try dropping the MTU of your VM
>> before you try the upload.
>>
>> Another way to use netperf to "probe" without tweaking MSS or MTU
>> settings would be to use the TCP_RR test with increasing request/response
>> sizes.  If there is indeed an MTU issue somewhere along the way, as you
>> walk the request/response size up to the local MTU, you should see the test
>> performance drop off a cliff if not go fully to zero.
>>
>> Does the port for the VM have a security group rule permitting ICMP
>> traffic in?  Offhand I wouldn't expect that to be different between the two
>> network setups you've described because I'd not have expected the virtual
>> router to pay attention to an arriving ICMP Destination Unreachable,
>> Datagram Too Big message to have the routed version work, but it seemed a
>> reasonable straw at which to grasp.
>>
>> rick jones
>>
>> PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.
>>
>> >>> Sterdnot Shaken <sterdnotshaken at gmail.com> 03/20/17 3:07 PM >>>
>>
>> Our info:
>>
>> Openstack version: Mitaka (using OVS 2.5)
>> Firewall driver: Openvswitch
>>
>> Anyone know why VMs that are directly on a Flat Provider Network (so the
>> VM would have a public IP directly assigned to it) can download data just
>> fine, but when we try to upload anything (iperf where the VM is the client,
>> or even something like the upload portion of speedtest.net) the VM simply
>> can't get data out to the intended destination? Again, download works
>> great, upload doesn't.
>>
>> If I take that VM and change its interface to be on a tenant network
>> that has an Openstack HA virtual router, everything (upload and download)
>> works perfectly. The problem only seems to be apparent when the VM is
>> directly on the external network.
>>
>> It seems like an MTU issue, but I don't see how... Here are the MTUs of
>> the parts in play:
>>
>> VM: 1500
>> br-int (specific interface connecting to VM) - 9216
>> br-ex - (can't tell what that MTU is set to)
>>
>> Any help would be GREATLY appreciated.
>>
>> Steve
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to     : openstack at lists.openstack.org
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>>
>