[Openstack] Security Group Rule Refresh

Soren Hansen soren at linux2go.dk
Thu Feb 23 23:03:50 UTC 2012


2012/2/23 Day, Phil <philip.day at hp.com>:
>> 1 deal with the situation where a refresh call to one of the compute
>>   nodes got lost. If that happened, at least it would all get sorted
>>   out on the next refresh.
> Can see the advantage of this, but on an active system this can be
> quite an overhead compared to a periodic refresh.

Well, a periodic refresh will likely happen more often than the
refreshes triggered by changes, don't you think? And periodic refreshes
will inevitably have to refresh everything (otherwise they seem
pointless).

>> 2 the routine that turned the rules from the database into iptables
>>   rules was complex enough as it was. Making it remove only rules for a
>>   single security group or a single instance or whatever would make it
>>   even worse.
> I wonder if we're talking about the same driver - the code we're
> looking at is in the IptablesFirewallDriver in libvirt/firewall.py
> (which I think has moved up to virt/firewall.py in Essex).  That seems
> to create a chain per instance and do the update on a per-instance
> basis, so I'm not quite sure I understand your point?

Sorry, I was basing this all on memory. The point is simply that if the
routine that does all of this had to reliably leave everything else
alone and only touch the rules pertaining to a particular instance, the
logic would be even more complicated than it already is.
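
As a rough sketch of why the full rebuild is the simpler option (the
function name, the "inst-" chain prefix and the input shape are all
hypothetical, not nova's actual code): the whole rule set is rendered
from the desired state every time, so nothing has to work out which
existing rules to leave alone.

```python
def build_ruleset(instances):
    """Render a complete rule set from scratch.

    `instances` maps a (hypothetical) instance id to a list of rule
    fragments. The output loosely follows the iptables-restore text
    format; a real file would also declare the built-in chains.
    """
    lines = ["*filter"]
    # Declare one chain per instance first ...
    for instance_id in sorted(instances):
        lines.append(":inst-%s - [0:0]" % instance_id)
    # ... then append every rule to its instance's chain.
    for instance_id in sorted(instances):
        for rule in instances[instance_id]:
            lines.append("-A inst-%s %s" % (instance_id, rule))
    lines.append("COMMIT")
    return lines
```

A selective version would additionally have to read the current rules,
work out which ones belong to the changed instance, and splice the new
ones in without disturbing the rest.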

>> 3 The difference in terms of efficiency is minuscule. iptables
>>   replaces full tables at a time anyway, and while the relative
>>   amount of data needed to be fetched from the database might be much
>>   larger than with a more selective refresh, the absolute amount of
>>   data is still pretty small.
> It may be that we're hitting a particular case - but we have a test
> system with tens of VMs per host, on not many hosts, and some groups
> with 70+ VMs and a rule set that references the security group itself.
> So every VM in that group that gets refreshed (and there are many on
> each host) has to rebuild rules for each VM in the group.

That's a bug. It's supposed to refresh only once, regardless of how many
affected VMs there are.
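
To illustrate the intended behaviour (a minimal sketch with made-up
names, not the driver's real interface): a change to one group should
result in a single pass over the affected local instances, each rebuilt
exactly once, rather than one full pass per group member.

```python
def instances_to_refresh(changed_group, local_instances):
    """Return each affected local instance id exactly once.

    `local_instances` maps an instance id to the set of security group
    names it belongs to (a hypothetical shape chosen for this sketch).
    """
    return sorted(
        instance_id
        for instance_id, groups in local_instances.items()
        if changed_group in groups
    )


def refresh_once(changed_group, local_instances, rebuild):
    """Call rebuild(instance_id) once per affected instance.

    Returns the number of instances that were rebuilt.
    """
    affected = instances_to_refresh(changed_group, local_instances)
    for instance_id in affected:
        rebuild(instance_id)
    return len(affected)
```

With 70 VMs in a self-referencing group, this does 70 rebuilds for one
change, where the behaviour Phil describes sounds closer to 70 x 70.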

> The impact of this overhead on every VM create and delete in
> unrelated groups is killing the system - esp. as the update code
> doesn't yield, so other tasks on the compute node (such as the create
> itself) are blocked.

Have you been able to profile this at all? Is it the DB query that takes
a long time or is it something else? Anyways, I don't fully understand
why any part of the process would make anything hang. Both the
communication with the DB as well as calling out to iptables-restore
should yield control over to the eventlet main loop and let other things
run. I wonder why this isn't happening.
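
For the profiling question, something as crude as the following would
already say which phase dominates. This is only a sketch using the
standard library; the labels and the places to wrap are assumptions, and
it measures wall-clock time, so it will not by itself show whether a
phase yields to other green threads.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per label.
timings = {}


@contextmanager
def timed(label):
    """Add the time spent inside the `with` block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed
```

Wrapping the DB fetch in `with timed("db_query"):` and the call out to
iptables-restore in `with timed("iptables_restore"):` (both labels are
arbitrary) and dumping `timings` after a refresh would separate the two.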

>> Point 2 should be more palatable now that the simpler implementation
>> has proven itself.
> Could you clarify which simpler implementation you're referring to

It's probably a poor choice of words :) The "simpler" implementation is
the current one. The more complicated one would be one that reliably
touches only the rules pertaining to the instances or security groups
that are actually being changed.

> - I've seen the NWFilterFirewall class and its associated comment
> block, but it wasn't clear to me under what circumstances it would be
> worth switching to this?

None, at the moment, due to this bug:

   https://bugzilla.redhat.com/show_bug.cgi?id=642171

-- 
Soren Hansen             | http://linux2go.dk/
Senior Software Engineer | http://www.cisco.com/
Ubuntu Developer         | http://www.ubuntu.com/
OpenStack Developer      | http://www.openstack.org/



