Network policer behavior

Vladimir Prokofev v at prokofev.me
Fri Feb 19 11:52:11 UTC 2021


I've encountered some really bizarre behaviour today and want to discuss it
for a sanity check.
This is probably more appropriate for a libvirt mailing list, but I'd like
a second opinion first.

Openstack Queens
libvirtd (libvirt) 4.0.0
qemu-x86_64 version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.15)

Two instances were launched a while ago with Zabbix Appliance 5.0.7 LTS
inside.
Both of them were "hacked" shortly after and became part of a botnet that
participated in DDoS attacks via SYN flood.
Sad, but nothing out of the ordinary so far.

Now to the bizarre part.
Both of the instances had QoS configured via instance metadata that limited
them to 100Mb/s in/out[1]. I checked on the compute side - it was correctly
applied there too[2]. This metadata was tested years ago with the usual
iperf TCP/UDP tests with 1-10 flows, and it worked perfectly.
Both of the instances landed on the same compute node.
And both of them were ignoring the network policer, each sending about
400Mb/s of SYN-flood traffic on its respective tap interface, so about
800Mb/s were flowing out of the compute node's switch port.
So I shut down one instance - the second one's traffic rose to about
600Mb/s - OK, they were probably contending for some resource.
Then I applied a qos-policy to the port of the remaining instance[3] - that
did the trick: I can see on the switch port that the compute node's traffic
went down to the expected level, but CPU context switches on the compute
node increased almost threefold, and traffic on the tap interface rose to
1.6Gb/s!
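
For reference, this is roughly how I've been inspecting what was actually
programmed on the tap device (device name taken from [2]; as far as I
understand, libvirt turns the inbound limit into an htb class on the device
and the outbound limit into an ingress police filter, so that's where I
looked):

compute2:~$ tc -s qdisc show dev tap834d76e9-5f
compute2:~$ tc -s class show dev tap834d76e9-5f
compute2:~$ tc -s filter show dev tap834d76e9-5f parent ffff:

The last command should show the ingress policer and its drop counters.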

What I can't understand is why the libvirt network policer does not handle
this case. And why does applying a qos-policy actually increase traffic on
the tap interface?
I can't say for sure whether it's tc that is responsible for the increase
in context switches, or the traffic generated by the instance.
At first I thought my Zabbix had gone crazy, so I double-checked. It seems
it takes its data for the net.if.in key from /proc/net/dev, and the numbers
appear to be correct.
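
For what it's worth, I cross-checked the context-switch numbers with plain
sysstat/procps tools, not just Zabbix, along these lines:

compute2:~$ vmstat 1       # system-wide context switches, "cs" column
compute2:~$ pidstat -w 1   # per-process switches, e.g. to spot busy vhost-* threads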

Any ideas appreciated.

[1]
quota:vif_inbound_average='12500', quota:vif_inbound_burst='3125',
quota:vif_inbound_peak='12500', quota:vif_outbound_average='12500',
quota:vif_outbound_burst='3125', quota:vif_outbound_peak='12500'

[2]
compute2:~$ virsh domiftune instance-0000141d tap834d76e9-5f
inbound.average: 12500
inbound.peak   : 12500
inbound.burst  : 3125
inbound.floor  : 0
outbound.average: 12500
outbound.peak  : 12500
outbound.burst : 3125

[3] openstack port set --qos-policy 100Mbps 834d16e9-5f85-4e82-a834-bdae613cfc23
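
(The 100Mbps policy itself is a standard Neutron bandwidth-limit policy,
created roughly like this:

openstack network qos policy create 100Mbps
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 100000 --max-burst-kbits 100000 100Mbps
)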