[Openstack-operators] RHEL 7 / CentOS 7 instances losing their network gateway

Joe Topjian joe at topjian.net
Wed Jan 28 20:54:25 UTC 2015


I'm pretty sure I've resolved this issue. Since it seems to happen
randomly, it might just be a coincidence, but this is by far the longest
streak without it happening. :)

I noticed that CentOS 7 and RHEL 7 are setting a `valid_lft` and
`preferred_lft` timeout on the IPv4 address. You can see this by doing an
"ip a" on CentOS7/RHEL7 and comparing with either CentOS6 or Ubuntu. This
is the first time I've seen this used on IPv4. It's usually used for IPv6
privacy addresses. The timeout is set to something larger than the lease
renewal time.
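
For anyone who wants to check their own instances, here is a minimal Python
sketch that pulls the IPv4 lifetimes out of `ip addr` output. The function
names and the sample text (interface, address, lifetimes) are made up for
illustration; only the `valid_lft ... preferred_lft ...` line format comes
from iproute2.

```python
import re

# Illustrative output only: the interface, address and lifetimes are made up.
SAMPLE_DYNAMIC = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 10.0.0.5/24 brd 10.0.0.255 scope global dynamic eth0
       valid_lft 86258sec preferred_lft 86258sec
    inet6 fe80::f816:3eff:fe00:1/64 scope link
       valid_lft forever preferred_lft forever
"""


def _to_seconds(value):
    """Convert '86258sec' to 86258; return None for 'forever'."""
    if value == "forever":
        return None
    if value.endswith("sec"):
        return int(value[:-3])
    raise ValueError("unexpected lifetime: %r" % value)


def ipv4_lifetimes(ip_addr_output):
    """Return [(address, valid_lft, preferred_lft)] for IPv4 addresses.

    A lifetime of None means the address is set to "forever".
    """
    results = []
    current = None
    for line in ip_addr_output.splitlines():
        m = re.match(r"\s+(inet6?) (\S+)", line)
        if m:
            # Track IPv4 addresses only; forget the address on inet6 lines.
            current = m.group(2) if m.group(1) == "inet" else None
            continue
        m = re.match(r"\s+valid_lft (\S+) preferred_lft (\S+)", line)
        if m and current is not None:
            results.append(
                (current, _to_seconds(m.group(1)), _to_seconds(m.group(2)))
            )
            current = None
    return results
```

To run it against a live system, feed it the output of `ip addr show`, for
example via `subprocess.check_output(["ip", "addr", "show"], text=True)`.
An address that reports a number rather than None is one that can expire.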

What happens, though, is that it is occasionally taking a little longer to
receive the DHCP renewal. Then the `valid_lft` hits zero and the IP is
removed from the interface. When this happens, the kernel will clean up any
routes used by the removed IP (in this case, the default gateway).
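
The race described above can be modelled in a few lines of Python. This is
only a toy model of the sequence of events, not anything taken from
dhclient or the kernel; the function name `replay` and the numbers in the
usage note are hypothetical.

```python
def replay(valid_lft, renewal_arrives_at):
    """Return the ordered events for one renewal cycle.

    Both arguments are seconds counted from the moment the address
    lifetime was last set. The model assumes the address is removed
    exactly when valid_lft runs out.
    """
    if renewal_arrives_at < valid_lft:
        # Normal case: the renewal lands while the address is still valid.
        return [(renewal_arrives_at, "renewal received, lifetime reset")]
    # Late case: the lifetime expires first and the routes go with it.
    return [
        (valid_lft, "valid_lft hit zero, address removed"),
        (valid_lft, "routes using the address removed, default gateway lost"),
        (renewal_arrives_at, "late renewal received, address re-added"),
    ]
```

For example, `replay(120, 100)` yields a single "renewal received" event,
while `replay(120, 125)` yields the three-step failure sequence, ending
with the address back but nothing that restores the gateway.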

A few seconds later, the late DHCP renewal is finally received and the IP
is added back to the interface. But due to how CentOS/RHEL7 is handling the
renewal in /usr/sbin/dhclient-script, the gateway is never re-added.
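
A quick way to spot an instance in this state is to look for a missing
default route. A minimal Python sketch, assuming the usual
`default via <gateway> dev <interface>` line from `ip route`; the function
name and the sample routes are made up for illustration.

```python
# Illustrative routing tables only: the addresses are made up.
HEALTHY = """\
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""

BROKEN = """\
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.5
"""


def default_gateway(ip_route_output):
    """Return the default gateway address, or None if there is none."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "default" and fields[1] == "via":
            return fields[2]
    return None
```

On a live instance the input would come from `ip -4 route show`. An
instance with an address on the interface and a None result here matches
the symptom: the subnet route is present but the gateway is gone.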

My guess as to why a newer version of dnsmasq does not exhibit this issue
is that it advertises renewals a little differently: enough to trigger
the part of dhclient-script that re-adds the gateway. I have not verified
this theory, though.

What I've done for now is modify dhclient-script to remove every portion
that sets valid_lft and preferred_lft, so they are now set to "forever",
just like on other distros.

And so far, so good (crossing fingers).

Thanks,
Joe

On Tue, Jan 27, 2015 at 1:53 PM, Joe Topjian <joe at topjian.net> wrote:

> Hi George,
>
> All instances have only a single interface.
>
> Thanks,
> Joe
>
> On Tue, Jan 27, 2015 at 1:38 PM, George Shuklin <george.shuklin at gmail.com>
> wrote:
>
>>  How many network interfaces does your instance have? If more than one,
>> check the settings for the second network (subnet). It can have its own
>> DHCP settings, which may mess with the routes for the main network.
>>
>>
>> On 01/27/2015 06:08 PM, Joe Topjian wrote:
>>
>> Hello,
>>
>>  I have run into two different OpenStack clouds where instances running
>> either RHEL 7 or CentOS 7 images are randomly losing their network gateway.
>>
>>  There's nothing in the logs that gives any indication of why. There's no
>> DHCP hiccup or anything like that. The gateway has just disappeared.
>>
>>  If I log into the instance via another instance (so on the same subnet
>> since there's no gateway), I can manually re-add the gateway and everything
>> works... until it loses it again.
>>
>>  One cloud is running Havana and the other is running Icehouse. Both are
>> using nova-network and both are Ubuntu 12.04.
>>
>>  On the Havana cloud, we decided to install the dnsmasq package from
>> Ubuntu 14.04. This looks to have resolved the issue as this was back in
>> November and I haven't heard an update since.
>>
>>  However, we don't want to do that just yet on the Icehouse cloud. We'd
>> like to understand exactly why this is happening and why updating dnsmasq
>> resolves an issue that only one specific type of image is having.
>>
>>  I can make my way around CentOS, but I'm not as familiar with it as I
>> am with Ubuntu (especially CentOS 7). Does anyone know what change in
>> RHEL7/CentOS7 might be causing this? Or does anyone have any other ideas on
>> how to troubleshoot the issue?
>>
>>  I currently have access to two instances in this state, so I'd be happy
>> to act as remote hands and eyes. :)
>>
>>  Thanks,
>> Joe
>>
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>