[openstack-dev] [TripleO] Tis the season...for a cloud reboot

Ben Nemec openstack at nemebean.com
Tue Dec 19 21:00:08 UTC 2017



On 12/19/2017 02:43 PM, Brian Haley wrote:
> On 12/19/2017 11:53 AM, Ben Nemec wrote:
>> The reboot is done (mostly...see below).
>>
>> On 12/18/2017 05:11 PM, Joe Talerico wrote:
>>> Ben - Can you provide some links to the ovs port exhaustion issue for
>>> some background?
>>
>> I don't know if we ever had a bug opened, but there's some discussion
>> of it in
>> http://lists.openstack.org/pipermail/openstack-dev/2016-December/109182.html
>> I've also copied Derek since I believe he was the one who found it
>> originally.
>>
>> The gist is that after about 3 months of tripleo-ci running in this 
>> cloud we start to hit errors creating instances because of problems 
>> creating OVS ports on the compute nodes.  Sometimes we see a huge 
>> number of ports in general, other times we see a lot of ports that 
>> look like this:
>>
>> Port "qvod2cade14-7c"
>>              tag: 4095
>>              Interface "qvod2cade14-7c"
>>
>> Notably they all have a tag of 4095, which seems suspicious to me.  I 
>> don't know whether it's actually an issue though.
> 
> Tag 4095 is for "dead" OVS ports; it's an unused VLAN tag in the agent.
> 
> The 'qvo' here shows it's part of the VETH pair that os-vif created when 
> it plugged in the VM (the other half is 'qvb'), and they're created so 
> that iptables rules can be applied by neutron.  It's part of the "old" 
> way to do security groups with the OVSHybridIptablesFirewallDriver, and 
> can eventually go away once the OVSFirewallDriver can be used everywhere 
> (requires newer OVS and agent).
> 
> I wonder if you can run the ovs_cleanup utility to clean some of these up?

As in neutron-ovs-cleanup?  Doesn't that wipe out everything, including 
any ports that are still in use?  Or is there a different tool I'm not 
aware of that can do more targeted cleanup?
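
For illustration only, a more targeted pass could start by listing just the ports that carry the 4095 tag. The sketch below is an untested assumption, not an existing Neutron tool: it parses text in the layout that `ovs-vsctl show` prints (as in the port excerpt quoted above) and returns the matching port names. The function name `dead_ports`, the bridge name `br-int`, and the second port in the sample are hypothetical.

```python
import re

# Tag seen on the suspicious ports quoted in this thread.
DEAD_VLAN_TAG = 4095


def dead_ports(show_output, tag=DEAD_VLAN_TAG):
    """Return the names of ports whose 'tag:' line equals `tag`.

    `show_output` is expected to look like the text printed by
    `ovs-vsctl show`: a 'Port' line followed by its attributes.
    """
    ports = []
    current = None  # name of the Port block currently being read
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Bridge "):
            current = None  # a new bridge starts; forget the last port
            continue
        port = re.match(r'Port "?([^"]+)"?$', stripped)
        if port:
            current = port.group(1)
            continue
        tagged = re.match(r"tag: (\d+)$", stripped)
        if tagged and current is not None and int(tagged.group(1)) == tag:
            ports.append(current)
    return ports


# Sample input: the first port is the one quoted in this thread,
# the bridge name and the second port are made up for illustration.
SAMPLE = '''
    Bridge br-int
        Port "qvod2cade14-7c"
            tag: 4095
            Interface "qvod2cade14-7c"
        Port "qvoaaaaaaaa-bb"
            tag: 1
            Interface "qvoaaaaaaaa-bb"
'''

print(dead_ports(SAMPLE))
```

This only reports candidates. Whether any of them is safe to remove (for example with `ovs-vsctl del-port`) would still need to be checked against the instances that are actually running on the node.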

Oh, also worth noting that I don't think we have os-vif in this cloud 
because it's so old.  There's no os-vif package installed anyway.

> 
> -Brian
> 
>> I've had some offline discussions about getting someone on this cloud 
>> to debug the problem.  Originally we decided not to pursue it since 
>> it's not hard to work around and we didn't want to disrupt the 
>> environment by trying to move to later OpenStack code (we're still 
>> back on Mitaka), but it was pointed out to me this time around that 
>> from a downstream perspective we have users on older code as well and 
>> it may be worth debugging to make sure they don't hit similar problems.
>>
>> To that end, I've left one compute node un-rebooted for debugging 
>> purposes.  The downstream discussion is ongoing, but I'll update here 
>> if we find anything.
>>
>>>
>>> Thanks,
>>> Joe
>>>
>>> On Mon, Dec 18, 2017 at 10:43 AM, Ben Nemec <openstack at nemebean.com> 
>>> wrote:
>>>> Hi,
>>>>
>>>> It's that magical time again.  You know the one, when we reboot rh1
>>>> to avoid OVS port exhaustion. :-)
>>>>
>>>> If all goes well you won't even notice that this is happening, but
>>>> there is the possibility that a few jobs will fail while the te-broker
>>>> host is rebooted so I wanted to let everyone know.  If you notice
>>>> anything else hosted in rh1 is down (tripleo.org, zuul-status, etc.)
>>>> let me know.  I have been known to forget to restart services after
>>>> the reboot.
>>>>
>>>> I'll send a followup when I'm done.
>>>>
>>>> -Ben
>>>>
>>>> __________________________________________________________________________ 
>>>>
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: 
>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
> 
> 


