[openstack-dev] [nova][ceilometer] FloatingIp pollster spamming n-api logs (bug 1328694)

Joe Gordon joe.gordon0 at gmail.com
Mon Jun 16 19:50:20 UTC 2014


On Sat, Jun 14, 2014 at 7:33 AM, Eoghan Glynn <eglynn at redhat.com> wrote:

>
>
> ----- Original Message -----
> > On 11 June 2014 20:07, Joe Gordon <joe.gordon0 at gmail.com> wrote:
> > > On Wed, Jun 11, 2014 at 11:38 AM, Matt Riedemann
> > > <mriedem at linux.vnet.ibm.com> wrote:
> > >> On 6/11/2014 10:01 AM, Eoghan Glynn wrote:
> > >>> Thanks for bringing this to the list Matt, comments inline ...
> > >>>
> > >>>> tl;dr: some pervasive changes were made to nova to enable polling in
> > >>>> ceilometer which broke some things and in my opinion shouldn't have
> been
> > >>>> merged as a bug fix but rather should have been a blueprint.
> > >>>>
> > >>>> ===
> > >>>>
> > >>>> The detailed version:
> > >>>>
> > >>>> I opened bug 1328694 [1] yesterday and found that came back to some
> > >>>> changes made in ceilometer for bug 1262124 [2].
> > >>>>
> > >>>> Upon further inspection, the original ceilometer bug 1262124 made
> some
> > >>>> changes to the nova os-floating-ips API extension and the database
> API
> > >>>> [3], and changes to python-novaclient [4] to enable ceilometer to
> use
> > >>>> the new API changes (basically pass --all-tenants when listing
> floating
> > >>>> IPs).
> > >>>>
> > >>>> The original nova change introduced bug 1328694 which spams the
> nova-api
> > >>>> logs due to the ceilometer change [5] which does the polling, and
> right
> > >>>> now in the gate ceilometer is polling every 15 seconds.
> > >>>
> > >>>
> > >>> IIUC that polling cadence in the gate is in the process of being
> reverted
> > >>> to the out-of-the-box default of 600s.
> > >>>
> > >>>> I pushed a revert in ceilometer to fix the spam bug and a separate
> patch
> > >>>> was pushed to nova to fix the problem in the network API.
> > >>>
> > >>>
> > >>> Thank you for that. The revert is just now approved on the ceilometer
> > >>> side,
> > >>> and is wending its merry way through the gate.
> > >>>
> > >>>> The bigger problem I see here is that these changes were all made
> under
> > >>>> the guise of a bug when I think this is actually a blueprint.  We
> have
> > >>>> changes to the nova API, changes to the nova database API, CLI
> changes,
> > >>>> potential performance impacts (ceilometer can be hitting the nova
> > >>>> database a lot when polling here), security impacts (ceilometer
> needs
> > >>>> admin access to the nova API to list floating IPs for all tenants),
> > >>>> documentation impacts (the API and CLI changes are not documented),
> etc.
> > >>>>
> > >>>> So right now we're left with, in my mind, two questions:
> > >>>>
> > >>>> 1. Do we just fix the spam bug 1328694 and move on, or
> > >>>> 2. Do we revert the nova API/CLI changes and require this goes
> through
> > >>>> the nova-spec blueprint review process, which should have happened
> in
> > >>>> the first place.
> > >>>
> > >>>
> > >>> So just to repeat the points I made on the unlogged #os-nova IRC
> channel
> > >>> earlier, for posterity here ...
> > >>>
> > >>> Nova already exposed an all_tenants flag in multiple APIs (servers,
> > >>> volumes,
> > >>> security-groups etc.) and these would have:
> > >>>
> > >>>    (a) generally pre-existed ceilometer's usage of the corresponding
> APIs
> > >>>
> > >>> and:
> > >>>
> > >>>    (b) been tracked and proposed at the time via straight-forward LP
> > >>> bugs,
> > >>>        as  opposed to being considered blueprint material
> > >>>
> > >>> So the manner of the addition of the all_tenants flag to the
> floating_ips
> > >>> API looks like it just followed existing custom & practice.
> > >>>
> > >>> Though that said, the blueprint process and in particular the
> nova-specs
> > >>> aspect, has been tightened up since then.
> > >>>
> > >>> My preference would be to fix the issue in the underlying API, but
> to use
> > >>> this as "a teachable moment" ... i.e. to require more oversight (in
> the
> > >>> form of a reviewed & approved BP spec) when such API changes are
> proposed
> > >>> in the future.
> > >>>
> > >>> Cheers,
> > >>> Eoghan
> > >>>
> > >>>> Are there other concerns here?  If there are no major objections to
> the
> > >>>> code that's already merged, then #2 might be excessive but we'd
> still
> > >>>> need docs changes.
> > >>>>
> > >>>> I've already put this on the nova meeting agenda for tomorrow.
> > >>>>
> > >>>> [1] https://bugs.launchpad.net/ceilometer/+bug/1328694
> > >>>> [2] https://bugs.launchpad.net/nova/+bug/1262124
> > >>>> [3] https://review.openstack.org/#/c/81429/
> > >>>> [4] https://review.openstack.org/#/c/83660/
> > >>>> [5] https://review.openstack.org/#/c/83676/
> > >>>>
> > >>>> --
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Matt Riedemann
> > >>>>
> > >>>>
> > >>>> _______________________________________________
> > >>>> OpenStack-dev mailing list
> > >>>> OpenStack-dev at lists.openstack.org
> > >>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > >>>>
> > >>>
> > >>> _______________________________________________
> > >>> OpenStack-dev mailing list
> > >>> OpenStack-dev at lists.openstack.org
> > >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > >>>
> > >>
> > >> While there is precedent for --all-tenants with some of the other
> APIs,
> > >> I'm concerned about where this stops.  When ceilometer wants polling
> on
> > >> some
> > >> other resources that the nova API exposes, will it need the same
> thing?
> > >> Doing all of this polling for resources in all tenants in nova puts an
> > >> undue
> > >> burden on the nova API and the database.
> > >>
> > >> Can we do something with notifications here instead?  That's where the
> > >> nova-spec process would have probably caught this.
> > >
> > > ++ to notifications and not polling.
> >
> > Yeah, I think we need to revert this, and go through the specs
> > process. Its been released in Juno-1 now, so this revert feels bad,
> > but perhaps its the best of a bad situation?
> >
> > Word of caution, we need to get notifications versioned correctly if
> > we want this as a more formal external "API". I think Heat have
> > similar issues in this area, efficiently knowing about something
> > happening in Nova. So we do need to "solve this".
> >
> > Some kind of callback to ceilometer's REST API sounds attractive, but
> > messes up dependencies.
>
> Yeah, I don't think that would be a workable approach, as it breaks
> the separation of concerns ... i.e. nova would then need to become
> aware of the format of ceilometer metering samples.
>
> Instead, the established model is that nova (or any other openstack
> service in general) either:
>
>  * emits resource lifecycle data in the form of notifications (which
>    generally contains a super-set of the data that ceilometer is
>    specifically interested in),
>
> or else:
>
>  * allows these data to be surfaced via a RESTful API that ceilometer
>    can call into
>
> > Maybe implemented as a plugin to a Nova
> > "notification system", could work better somehow? A versioned
> > interface that issues "server life cycle" events (some versioned
> > subset of notifications), and allow services to implement their own
> > plugins to that interface. But maybe we should probably just version
> > the current notifications, rather than re-invent another interface.
>
> I don't fully understand the concept of services implementing their
> own plugins to this versioned interface.
>
> Would the implementing service in this case be nova or ceilometer?
>
> If the latter, why re-invent the existing notifications mechanism
> layered over AMQP (i.e. oslo.messaging or the old openstack.common
> nofications, now deprecated)?
>
> In terms of versioning, it would be great to protect ceilometer from
> unilateral changes to or inconsistencies in the notification formats
> used by the services, as this has been found to be brittle in practice.
>

++,  right now nova only produces one stable type of API: REST.
Notifications aren't versioned and we don't really gate on their format (or
document them). Without versioning and proper gating, notifications will
never become a stable contractual API.

While I would like to see a way to protect ceilometer from unilateral
changes, I think we should focus on making sure notifications become a
contractual stable API, just like REST. As that is the more general issue
is at play here.


> Cheers,
> Eoghan
>
> > Thanks,
> > John
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20140616/c2e5f611/attachment.html>


More information about the OpenStack-dev mailing list