[Openstack] nova-network iptables and dhcp issue

Joe Topjian joe at topjian.net
Wed Jan 15 16:05:32 UTC 2014


Hello,

I ran into a very odd issue today when setting up a new OpenStack cloud.
Instances that were migrated to another compute node lost communication
with the DHCP server once their lease was up.

The cloud is configured with nova-network, FlatDHCPManager, and uses
multi-host. Shared storage is not being used, so we were migrating with
--block-migrate.

We narrowed the issue down to iptables: the per-instance rules are not
updated correctly when the instance moves to the new compute node.

On the source compute node (192.168.1.12), before migrating:

:nova-compute-inst-49 - [0:0]
-A nova-compute-inst-49 -m state --state INVALID -j DROP
-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-provider
-A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
-A nova-compute-inst-49 -p icmp -j ACCEPT
-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-sg-fallback

On the destination compute node (192.168.1.11), after migrating:

:nova-compute-inst-49 - [0:0]
-A nova-compute-inst-49 -m state --state INVALID -j DROP
-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-provider
-A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
-A nova-compute-inst-49 -p icmp -j ACCEPT
-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-sg-fallback

Note how the 192.168.1.12 rule was copied over verbatim. On the destination
node this is now an invalid rule: the old compute node no longer accepts the
instance's lease request and responds with a DHCP NAK.
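
To make the stale rule easier to spot, here is a minimal Python sketch that
scans iptables-save style lines for DHCP accept rules whose source address is
not the local node's DHCP server. The function name stale_dhcp_rules and the
regular expression are my own and only match the rule shape shown in this
message; this is not part of nova.

```python
import re

# Matches the DHCP reply rule shape shown in this message, for example:
# -A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
DHCP_RULE = re.compile(
    r"^-A (?P<chain>\S+) -s (?P<src>[\d.]+)/32 -p udp -m udp "
    r"--sport 67 --dport 68 -j ACCEPT$"
)


def stale_dhcp_rules(rules, local_dhcp_ip):
    """Return (chain, source_ip) for DHCP accept rules not pointing at local_dhcp_ip."""
    stale = []
    for line in rules:
        match = DHCP_RULE.match(line.strip())
        if match and match.group("src") != local_dhcp_ip:
            stale.append((match.group("chain"), match.group("src")))
    return stale


# Rules as seen on the destination node (192.168.1.11) right after migrating.
after_migration = [
    "-A nova-compute-inst-49 -m state --state INVALID -j DROP",
    "-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT",
    "-A nova-compute-inst-49 -j nova-compute-provider",
    "-A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT",
    "-A nova-compute-inst-49 -p icmp -j ACCEPT",
    "-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT",
    "-A nova-compute-inst-49 -j nova-compute-sg-fallback",
]

print(stale_dhcp_rules(after_migration, "192.168.1.11"))
# → [('nova-compute-inst-49', '192.168.1.12')]
```

The input is assumed to be one rule per line, so wrapped lines from an email
need to be rejoined before they are passed in.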

After 60 seconds, the instance loses its DHCP lease and becomes unreachable.

On the destination compute node after hard rebooting the instance:

:nova-compute-inst-49 - [0:0]
-A nova-compute-inst-49 -m state --state INVALID -j DROP
-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-provider
-A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
-A nova-compute-inst-49 -p icmp -j ACCEPT
-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-sg-fallback
-A nova-compute-inst-49 -s 192.168.1.11/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT

Note how a rule for 192.168.1.11 has been added to the ruleset, but it comes
after the fallback jump. The fallback chain simply drops the packet, so the
new rule is never reached.
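
The ordering problem can be checked mechanically. The following is a minimal
Python sketch that lists ACCEPT rules appended to a chain after its jump to
nova-compute-sg-fallback. It assumes, as described above, that the fallback
chain drops everything it sees, so any rule listed after the jump never
matches. The function name accepts_after_fallback is hypothetical and the
check is a plain string comparison on iptables-save style lines.

```python
def accepts_after_fallback(rules, chain, fallback="nova-compute-sg-fallback"):
    """Return ACCEPT rules in `chain` that appear after the jump to `fallback`."""
    prefix = "-A %s " % chain
    jump = "-j " + fallback
    seen_fallback = False
    shadowed = []
    for line in rules:
        line = line.strip()
        if not line.startswith(prefix):
            continue  # rule belongs to a different chain
        if line.endswith(jump):
            seen_fallback = True
        elif seen_fallback and line.endswith("-j ACCEPT"):
            shadowed.append(line)
    return shadowed


# Rules as seen on the destination node after hard rebooting the instance.
after_reboot = [
    "-A nova-compute-inst-49 -m state --state INVALID -j DROP",
    "-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT",
    "-A nova-compute-inst-49 -j nova-compute-provider",
    "-A nova-compute-inst-49 -s 192.168.1.12/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT",
    "-A nova-compute-inst-49 -p icmp -j ACCEPT",
    "-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT",
    "-A nova-compute-inst-49 -j nova-compute-sg-fallback",
    "-A nova-compute-inst-49 -s 192.168.1.11/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT",
]

for rule in accepts_after_fallback(after_reboot, "nova-compute-inst-49"):
    print(rule)
```

Run against the ruleset above, this prints only the 192.168.1.11 DHCP rule,
which is the one that ended up behind the fallback jump.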

So we were scratching our heads on what to do. The first thing we tried was
to delete the fallback jump. That worked. But when we rebooted the node,
the rule was, of course, reinjected.

Our next thought was to add a security group rule allowing DHCP. We did
that and saw that any edit to the security group fixed the whole issue!

Here is the chain after adding a port 80 rule to the security group. Note how
the DHCP rule is now for the right server as well as in the right location:

:nova-compute-inst-49 - [0:0]
-A nova-compute-inst-49 -m state --state INVALID -j DROP
-A nova-compute-inst-49 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-provider
-A nova-compute-inst-49 -s 192.168.1.11/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
-A nova-compute-inst-49 -p icmp -j ACCEPT
-A nova-compute-inst-49 -p tcp -m tcp --dport 22 -j ACCEPT
-A nova-compute-inst-49 -p tcp -m tcp --dport 80 -j ACCEPT
-A nova-compute-inst-49 -j nova-compute-sg-fallback


Does anyone know what's going on here?

Thanks,
Joe