[openstack-dev] [nova][ceilometer] FloatingIp pollster spamming n-api logs (bug 1328694)
Matt Riedemann
mriedem at linux.vnet.ibm.com
Fri Jun 13 13:32:52 UTC 2014
On 6/12/2014 10:31 AM, John Garbutt wrote:
> On 11 June 2014 20:07, Joe Gordon <joe.gordon0 at gmail.com> wrote:
>> On Wed, Jun 11, 2014 at 11:38 AM, Matt Riedemann
>> <mriedem at linux.vnet.ibm.com> wrote:
>>> On 6/11/2014 10:01 AM, Eoghan Glynn wrote:
>>>> Thanks for bringing this to the list Matt, comments inline ...
>>>>
>>>>> tl;dr: some pervasive changes were made to nova to enable polling in
>>>>> ceilometer which broke some things and in my opinion shouldn't have been
>>>>> merged as a bug fix but rather should have been a blueprint.
>>>>>
>>>>> ===
>>>>>
>>>>> The detailed version:
>>>>>
>>>>> I opened bug 1328694 [1] yesterday and found that came back to some
>>>>> changes made in ceilometer for bug 1262124 [2].
>>>>>
>>>>> Upon further inspection, the original ceilometer bug 1262124 made some
>>>>> changes to the nova os-floating-ips API extension and the database API
>>>>> [3], and changes to python-novaclient [4] to enable ceilometer to use
>>>>> the new API changes (basically pass --all-tenants when listing floating
>>>>> IPs).
>>>>>
>>>>> The original nova change introduced bug 1328694 which spams the nova-api
>>>>> logs due to the ceilometer change [5] which does the polling, and right
>>>>> now in the gate ceilometer is polling every 15 seconds.
>>>>
>>>>
>>>> IIUC that polling cadence in the gate is in the process of being reverted
>>>> to the out-of-the-box default of 600s.
>>>>
>>>>> I pushed a revert in ceilometer to fix the spam bug and a separate patch
>>>>> was pushed to nova to fix the problem in the network API.
>>>>
>>>>
>>>> Thank you for that. The revert is just now approved on the ceilometer
>>>> side,
>>>> and is wending its merry way through the gate.
>>>>
>>>>> The bigger problem I see here is that these changes were all made under
>>>>> the guise of a bug when I think this is actually a blueprint. We have
>>>>> changes to the nova API, changes to the nova database API, CLI changes,
>>>>> potential performance impacts (ceilometer can be hitting the nova
>>>>> database a lot when polling here), security impacts (ceilometer needs
>>>>> admin access to the nova API to list floating IPs for all tenants),
>>>>> documentation impacts (the API and CLI changes are not documented), etc.
>>>>>
>>>>> So right now we're left with, in my mind, two questions:
>>>>>
>>>>> 1. Do we just fix the spam bug 1328694 and move on, or
>>>>> 2. Do we revert the nova API/CLI changes and require this goes through
>>>>> the nova-spec blueprint review process, which should have happened in
>>>>> the first place.
>>>>
>>>>
>>>> So just to repeat the points I made on the unlogged #os-nova IRC channel
>>>> earlier, for posterity here ...
>>>>
>>>> Nova already exposed an all_tenants flag in multiple APIs (servers,
>>>> volumes,
>>>> security-groups etc.) and these would have:
>>>>
>>>> (a) generally pre-existed ceilometer's usage of the corresponding APIs
>>>>
>>>> and:
>>>>
>>>> (b) been tracked and proposed at the time via straight-forward LP
>>>> bugs,
>>>> as opposed to being considered blueprint material
>>>>
>>>> So the manner of the addition of the all_tenants flag to the floating_ips
>>>> API looks like it just followed existing custom & practice.
>>>>
>>>> Though that said, the blueprint process and in particular the nova-specs
>>>> aspect, has been tightened up since then.
>>>>
>>>> My preference would be to fix the issue in the underlying API, but to use
>>>> this as "a teachable moment" ... i.e. to require more oversight (in the
>>>> form of a reviewed & approved BP spec) when such API changes are proposed
>>>> in the future.
>>>>
>>>> Cheers,
>>>> Eoghan
>>>>
>>>>> Are there other concerns here? If there are no major objections to the
>>>>> code that's already merged, then #2 might be excessive but we'd still
>>>>> need docs changes.
>>>>>
>>>>> I've already put this on the nova meeting agenda for tomorrow.
>>>>>
>>>>> [1] https://bugs.launchpad.net/ceilometer/+bug/1328694
>>>>> [2] https://bugs.launchpad.net/nova/+bug/1262124
>>>>> [3] https://review.openstack.org/#/c/81429/
>>>>> [4] https://review.openstack.org/#/c/83660/
>>>>> [5] https://review.openstack.org/#/c/83676/
>>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt Riedemann
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> OpenStack-dev mailing list
>>>>> OpenStack-dev at lists.openstack.org
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>
>>> While there is precedent for --all-tenants with some of the other APIs,
>>> I'm concerned about where this stops. When ceilometer wants polling on some
>>> other resources that the nova API exposes, will it need the same thing?
>>> Doing all of this polling for resources in all tenants in nova puts an undue
>>> burden on the nova API and the database.
>>>
>>> Can we do something with notifications here instead? That's where the
>>> nova-spec process would have probably caught this.
>>
>> ++ to notifications and not polling.
>
> Yeah, I think we need to revert this, and go through the specs
> process. Its been released in Juno-1 now, so this revert feels bad,
> but perhaps its the best of a bad situation?
>
> Word of caution, we need to get notifications versioned correctly if
> we want this as a more formal external "API". I think Heat have
> similar issues in this area, efficiently knowing about something
> happening in Nova. So we do need to "solve this".
>
> Some kind of callback to ceilometer's REST API sounds attractive, but
> messes up dependencies. Maybe implemented as a plugin to a Nova
> "notification system", could work better somehow? A versioned
> interface that issues "server life cycle" events (some versioned
> subset of notifications), and allow services to implement their own
> plugins to that interface. But maybe we should probably just version
> the current notifications, rather than re-invent another interface.
>
> Thanks,
> John
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
The revert patch is here:
https://review.openstack.org/#/c/99938/
--
Thanks,
Matt Riedemann
More information about the OpenStack-dev
mailing list