<div dir="ltr">Assigning a distinct ct zone to each port sounds more scalable. This should keep the number of zones per host contained.<div><br></div><div>What should the workflow when rules are updated or deleted be?</div><div>1) From the rule security group find ports on host where it's applied</div><div>2) kill all matching connections for those ports</div><div><br></div><div>I'm just thinking aloud here, but can #1 be achieved without doing a call from the agent to the server?</div><div>Otherwise one could pack the set of affect ports in messages for security group updates.</div><div><br></div><div>Once we identify the ports, and therefore the ct zones, then we'd still need to find the connections matching the rules which were removed. This does not sound like being too difficult, but it can result in searches over long lists - think about an instance hosting a DB or web server.</div><div><br></div><div>The above two considerations made me suggest the idea of associating ct zones with rules, but it is probably true that this can cause us to go beyond the 2^16 limit.</div><div><br></div><div>Salvatore</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 24 October 2014 11:16, Miguel Angel Ajo Pelayo <span dir="ltr"><<a href="mailto:mangelajo@redhat.com" target="_blank">mangelajo@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">sorry: when I said boot, I mean "openvswitch agent restart".<br>
<div class="HOEnZb"><div class="h5"><br>
----- Original Message -----<br>
><br>
> Kevin, I agree with you, 1 zone per port should be reasonable.<br>
><br>
> The 2^16 zone limit will force us into keeping state (to tie<br>
> ports to zones across reboots). Maybe this state can simply be<br>
> recovered by reading the iptables rules at boot and reconstructing<br>
> the current openvswitch-agent local port/zone association.<br>
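><br>
> For instance, if each port's zone ends up being set with a CT rule in the raw table, something like this (device names and zone numbers purely illustrative) would be enough to rebuild the mapping at restart:<br>
><br>
> # iptables -t raw -S PREROUTING | grep -- '--zone'<br>
> -A PREROUTING -i tap830fa99f-3 -j CT --zone 4097<br>
> -A PREROUTING -i tapd7ab2a99-f6 -j CT --zone 4098<br>
><br>
> Parsing the device/zone pairs out of that output would avoid keeping any extra state on disk.<br>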
><br>
> Best,<br>
> Miguel Ángel.<br>
><br>
> ----- Original Message -----<br>
><br>
> > While a zone per rule would be nice because we can easily delete connection<br>
> > state by only referencing a zone, that's probably overkill. We only need<br>
> > enough to disambiguate between overlapping IPs so we can then delete<br>
> > connection state by matching standard L3/4 headers again, right?<br>
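> ><br>
> > (e.g. with a per-port zone, removing the tcp/22 rule could roughly translate to: conntrack -D -w 4097 -p tcp --dport 22, where 4097 is just whatever zone number that port happened to get.)<br>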
><br>
> > I think a conntrack zone per port would be the easiest from an accounting<br>
> > perspective. We already set up an iptables chain per port so the grouping is<br>
> > already there (/me sweeps the complexity of choosing zone numbers under the<br>
> > rug).<br>
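> ><br>
> > The zone assignment itself could presumably live next to those chains as one raw table rule per port, along the lines of (device name and zone number made up):<br>
> ><br>
> > iptables -t raw -A PREROUTING -i tap830fa99f-3 -j CT --zone 4097<br>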
><br>
> > On Fri, Oct 24, 2014 at 2:25 AM, Salvatore Orlando < <a href="mailto:sorlando@nicira.com">sorlando@nicira.com</a> ><br>
> > wrote:<br>
><br>
> > > Just like Kevin I was considering using conntrack zones to segregate<br>
> > > connections.<br>
> ><br>
> > > However, I don't know whether this would be feasible, as I've never used<br>
> > > the iptables CT target in real applications.<br>
> ><br>
><br>
> > > Segregation should probably happen at the security group level - or even<br>
> > > at<br>
> > > the rule level - rather than the tenant level.<br>
> ><br>
> > > Indeed the same situation could occur even with two security groups<br>
> > > belonging<br>
> > > to the same tenant.<br>
> ><br>
><br>
> > > Probably each rule can be associated with a different conntrack zone, so that when it is matched, the corresponding conntrack entries are added to the appropriate zone. Then, when a rule is removed, the corresponding connections to kill can be filtered by zone, as explained by Kevin.<br>
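> > ><br>
> > > (As a sketch, with a made-up zone number, that would mean one raw table entry per security group rule, something like: iptables -t raw -A PREROUTING -i tap830fa99f-3 -p tcp --dport 22 -j CT --zone 5001.)<br>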
> ><br>
><br>
> > > This approach will add a good number of rules to the RAW table however, so its impact on control/data plane scalability should be assessed, as it might turn out to be as bad as the solution where connections were explicitly dropped with an ad-hoc iptables rule.<br>
> ><br>
><br>
> > > Salvatore<br>
> ><br>
><br>
> > > On 24 October 2014 09:32, Kevin Benton < <a href="mailto:blak111@gmail.com">blak111@gmail.com</a> > wrote:<br>
> ><br>
><br>
> > > > I think the root cause of the problem here is that we are losing segregation between tenants at the conntrack level. The compute side plugs everything into the same namespace and we have no guarantees about uniqueness of any other fields kept by conntrack.<br>
> > ><br>
> ><br>
><br>
> > > > Because of this loss of uniqueness, I think there may be another lurking bug here as well. One tenant establishing connections between IPs that overlap with another tenant's will create the possibility that a connection the other tenant attempts will match the conntrack entry from the original connection. Then whichever tenant closes the connection first will cause the conntrack entry to be removed and the return traffic from the remaining connection to be dropped.<br>
> > ><br>
> ><br>
><br>
> > > > I think the correct way forward here is to isolate each tenant (or even<br>
> > > > compute interface) into its own conntrack zone.[1] This will provide<br>
> > > > isolation against that imaginary unlikely scenario I just presented.<br>
> > > > :-)<br>
> > ><br>
> ><br>
> > > > More importantly, it will allow us to clear connections for a specific tenant (or compute interface) without interfering with others because conntrack can delete by zone.[2]<br>
> > ><br>
> ><br>
><br>
> > > > 1.<br>
> > > > <a href="https://github.com/torvalds/linux/commit/5d0aa2ccd4699a01cfdf14886191c249d7b45a01" target="_blank">https://github.com/torvalds/linux/commit/5d0aa2ccd4699a01cfdf14886191c249d7b45a01</a><br>
> > ><br>
> ><br>
> > > > 2. see the -w option.<br>
> > > > <a href="http://manpages.ubuntu.com/manpages/raring/man8/conntrack.8.html" target="_blank">http://manpages.ubuntu.com/manpages/raring/man8/conntrack.8.html</a><br>
> > ><br>
> ><br>
><br>
> > > > On Thu, Oct 23, 2014 at 3:22 AM, Elena Ezhova < <a href="mailto:eezhova@mirantis.com">eezhova@mirantis.com</a> ><br>
> > > > wrote:<br>
> > ><br>
> ><br>
><br>
> > > > > Hi!<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > I am working on a bug "ping still working once connected even after related security group rule is deleted" (<a href="https://bugs.launchpad.net/neutron/+bug/1335375" target="_blank">https://bugs.launchpad.net/neutron/+bug/1335375</a>). The gist of the problem is the following: when we delete a security group rule, the corresponding rule in iptables is also deleted, but the connection that was allowed by that rule is not destroyed.<br>
> > > ><br>
> > ><br>
> ><br>
> > > > > The reason for such behavior is that in iptables we have the following structure for the chain that filters input packets for an interface of an instance:<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > Chain neutron-openvswi-i830fa99f-3 (1 references)<br>
> > > > > pkts bytes target prot opt in out source destination<br>
> > > > > 0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID /* Drop packets that are not associated with a state. */<br>
> > > > > 0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */<br>
> > > > > 0 0 RETURN udp -- * * 10.0.0.3 0.0.0.0/0 udp spt:67 dpt:68<br>
> > > > > 0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 match-set IPv43a0d3610-8b38-43f2-8 src<br>
> > > > > 0 0 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:22 <---- rule that allows ssh on port 22<br>
> > > > > 1 84 RETURN icmp -- * * 0.0.0.0/0 0.0.0.0/0<br>
> > > > > 0 0 neutron-openvswi-sg-fallback all -- * * 0.0.0.0/0 0.0.0.0/0 /* Send unmatched traffic to the fallback chain. */<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > So, if we delete the rule that allows tcp on port 22, all connections that are already established won't be closed, because all their packets will satisfy the rule:<br>
> > > ><br>
> > ><br>
> ><br>
> > > > > 0 0 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > I am seeking advice on how to deal with this problem. There are a couple of ideas for how to do it (some more realistic than others):<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > * Kill the connection using conntrack<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > The problem here is that it is sometimes impossible to tell which connection should be killed. For example, there may be two instances running in different namespaces that have the same IP addresses. As the compute node doesn't know anything about namespaces, it cannot distinguish between the two seemingly identical connections:<br>
> > > ><br>
> > ><br>
> ><br>
> > > > > $ sudo conntrack -L | grep "10.0.0.5"<br>
> > > ><br>
> > ><br>
> ><br>
> > > > > tcp 6 431954 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60723 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60723 [ASSURED] mark=0 use=1<br>
> > > > > tcp 6 431976 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60729 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60729 [ASSURED] mark=0 use=1<br>
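> > > > ><br>
> > > > > (And a delete filtered only on L3/4 fields, e.g. conntrack -D -p tcp -s 10.0.0.3 -d 10.0.0.5 --dport 22, would kill both of these entries at once.)<br>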
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > I wonder whether there is any way to search for a connection by<br>
> > > > > destination<br>
> > > > > MAC?<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > * Delete the iptables rule that directs packets associated with a known session to the RETURN chain<br>
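> > > > ><br>
> > > > > (Roughly something like: iptables -D neutron-openvswi-i830fa99f-3 -m state --state RELATED,ESTABLISHED -j RETURN, although the exact rule spec would have to match what the agent installed, comment and all.)<br>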
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > It will force all packets to go through the full chain each time, which will definitely make the connection close, but it will also strongly affect performance. A timeout could probably be introduced after which this rule would be restored, but it is unclear how long it should be.<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > Please share your thoughts on how best to handle this.<br>
> > > ><br>
> > ><br>
> ><br>
><br>
> > > > > Thanks in advance,<br>
> > > ><br>
> > ><br>
> ><br>
> > > > > Elena<br>
> > > ><br>
> > ><br>
> ><br>
><br>
><br>
> > > > --<br>
> > ><br>
> ><br>
> > > > Kevin Benton<br>
> > ><br>
> ><br>
><br>
> ><br>
><br>
><br>
> > --<br>
> > Kevin Benton<br>
><br>
><br>
<br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</div></div></blockquote></div><br></div>