[Openstack] VM can receive traffic, but not send it
Kaustubh Kelkar
kaustubh.kelkar at casa-systems.com
Tue Mar 21 15:42:47 UTC 2017
You can narrow down the point where the packets are being dropped by mirroring and tracing packets on OVS bridge ports. I use a script that does the following (as root):
ip link add name sniff0 type dummy
ip link set dev sniff0 up
ovs-vsctl add-port br1 sniff0
ovs-vsctl -- set Bridge br1 mirrors=@m \
    -- --id=@sniff0 get Port sniff0 \
    -- --id=@eth0 get Port eth0 \
    -- --id=@m create Mirror name=mirror0 \
       select-dst-port=@eth0 select-src-port=@eth0 \
       output-port=@sniff0 select_all=1
and to delete,
ovs-vsctl clear Bridge br1 mirrors
ovs-vsctl del-port br1 sniff0
ip link del dev sniff0
where eth0 is the port you want to capture on and br1 is the bridge it resides on. Then, you can run tcpdump on sniff0.
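For example, to watch just the flow in question on the mirror port (host addresses taken from your captures below; -e also prints the Ethernet header, so you can see any VLAN tags the bridge has pushed):
tcpdump -nnei sniff0 host x.y.120.23 and host x.y.224.45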
Create such mirror ports on:
1) phy-br-ex on the external OVS bridge
2) int-br-ex on the integration bridge
3) qvo-xxx on the integration bridge
Also capture packets on qvb-xxx on the Linux bridge that holds the VM's tap interface. Hopefully this will give us more clues.
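Since you are using the openvswitch firewall driver, it may also be worth checking on the compute node whether a firewall flow on br-int is dropping the returning packets, and whether conntrack has an entry for the session. Two quick checks (the second assumes your OVS 2.5 build has conntrack support):
ovs-ofctl dump-flows br-int | grep -i drop
ovs-appctl dpctl/dump-conntrack | grep 8080
And for the br-ex MTU you could not read, the MTU of the bridge's internal port shows up in plain ip link output:
ip link show br-ex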
-Kaustubh
From: Sterdnot Shaken [mailto:sterdnotshaken at gmail.com]
Sent: Monday, March 20, 2017 9:17 PM
To: Richard Jones <rjones at suse.com>
Cc: openstack at lists.openstack.org
Subject: Re: [Openstack] VM can receive traffic, but not send it
Wow! Thanks for answering both of my questions!
So, I did some things you suggested, including setting the MSS in iperf to something small (1000 bytes), and tested with no improvement. I then changed the VM running on Openstack to have an MTU of 1000 and retested with no improvement. I noticed that the node I was testing against was reporting back to the VM on Openstack that it had an MSS of 8960, so just for the heck of it, I changed the MTU of the remote node (the server outside of Openstack) to 1000 bytes as well and retested with no improvement. (The effects of all of these tests were validated by checking the MSS settings in the TCP header via tcpdump.)
To simplify the equation, I ditched iperf for the time being and just did a simple "telnet 'remote server' 8080" test from the remote server to the VM in Openstack, while capturing packets all along the way (4 different points along the network path). Every point saw the same packets, including the VM's tap interface, as expected. I then reversed the test, initiating the TCP session from the VM in Openstack to the remote server (with the remote server set to respond with a TCP Reset) while running packet captures at those same points. From the VM to the remote server, traffic looked correct, with the expected TCP SYN. The TCP Reset the remote server responded with passed all 4 points of the network, including the external interface on the Compute node where the VM resides, but the TAP interface that connects to the VM NEVER sees the Reset. I can recreate this condition over and over.
So, thanks to your ideas, Richard, I'm no longer convinced this is an MTU issue. What would prevent a TCP-related response from being forwarded from the external interface to the intended VM? The security group we have applied to this VM is wide open, so I can't imagine that is the cause...
Here are 2 packet captures where I initiated a telnet to the remote server from the VM in Openstack. As said above, I set the remote server to respond with a reset. The top one is from the physical interface on the Compute node where the VM resides and the other, the tap interface to that VM:
[(openstack-mitaka) root at prv-0-18-compute user]# tcpdump -nni eth0 host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143931 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:13.147951 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 3131027442, win 0, length 0
19:10:16.156520 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.157693 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
19:10:22.157407 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
19:10:22.158682 IP x.y.224.45.8080 > x.y.120.23.53877: Flags [R.], seq 0, ack 1, win 0, length 0
[(openstack-mitaka) root at prv-0-18-compute user]# tcpdump -nni tap3bbe0f9d-6b host x.y.120.23 and host x.y.224.45
tcpdump: WARNING: tap3bbe0f9d-6b: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap3bbe0f9d-6b, link-type EN10MB (Ethernet), capture size 65535 bytes
19:10:13.143739 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:16.156499 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,wscale 8,nop,nop,sackOK], length 0
19:10:22.157384 IP x.y.120.23.53877 > x.y.224.45.8080: Flags [S], seq 3131027441, win 8192, options [mss 960,nop,nop,sackOK], length 0
Any ideas? Thanks in advance for your help!!
Steve
On Mon, Mar 20, 2017 at 4:17 PM, Richard Jones <rjones at suse.com> wrote:
You might consider taking a packet trace of the start of an upload to see what the TCP MSS (Maximum Segment Size) options look like and perhaps compare between the different configs. Also, you could consider either using netperf and having it tweak the MSS to a smaller value (test-specific -G option if I recall correctly), or just try dropping the MTU of your VM before you try the upload.
Another way to use netperf to "probe" without tweaking MSS or MTU settings would be to use the TCP_RR test with increasing request/response sizes. If there is indeed an MTU issue somewhere along the way, as you walk the request/response size up to the local MTU, you should see the test performance drop off a cliff if not go fully to zero.
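For instance, a sketch of such a walk (replace <remote_host> with the far end; the sizes are just ones that bracket a 1500-byte MTU):
for s in 256 512 1024 1400 1472 2048 4096; do
    netperf -H <remote_host> -t TCP_RR -- -r ${s},${s}
done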
Does the port for the VM have a security group rule permitting ICMP traffic in? Offhand I wouldn't expect that to be different between the two network setups you've described because I'd not have expected the virtual router to pay attention to an arriving ICMP Destination Unreachable, Datagram Too Big message to have the routed version work, but it seemed a reasonable straw at which to grasp.
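If it turns out such a rule is missing, adding one should be something along the lines of (Mitaka-era neutron CLI, from memory):
neutron security-group-rule-create --direction ingress --protocol icmp <security-group>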
rick jones
PS perhaps iperf has a similar option to set the TCP MSS, I've not looked.
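If it follows the usual pattern it would be something like (assuming an -M/--mss option; worth verifying against your iperf's man page):
iperf -c <server> -M 1000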
>>> Sterdnot Shaken <sterdnotshaken at gmail.com> 03/20/17 3:07 PM >>>
Our info:
Openstack version: Mitaka (using OVS 2.5)
Firewall driver: Openvswitch
Anyone know why VMs that are directly on a Flat Provider Network (so the VM has a public IP directly assigned to it) can download data just fine, but when we try to upload anything (iperf where the VM is the client, or even something like the upload portion of speedtest.net), the VM simply can't get data out to the intended destination? Again, download works great, upload doesn't.
If I take that VM and change its interface to be a tenant network one that has an Openstack HA virtual router, everything (upload and download) works perfectly. The problem only seems to be apparent when the VM is directly on the external network.
It seems like an MTU issue, but I don't see how... Here are the MTUs of the parts at play:
VM: 1500
br-int (specific interface connecting to the VM): 9216
br-ex: (can't tell what that MTU is set to)
Any help would be GREATLY appreciated.
Steve