[openstack-dev] [nova][ceilometer] FloatingIp pollster spamming n-api logs (bug 1328694)
eglynn at redhat.com
Wed Jun 11 19:22:51 UTC 2014
> > Thanks for bringing this to the list Matt, comments inline ...
> >> tl;dr: some pervasive changes were made to nova to enable polling in
> >> ceilometer which broke some things and in my opinion shouldn't have been
> >> merged as a bug fix but rather should have been a blueprint.
> >> ===
> >> The detailed version:
> >> I opened bug 1328694  yesterday and found that came back to some
> >> changes made in ceilometer for bug 1262124 .
> >> Upon further inspection, the original ceilometer bug 1262124 made some
> >> changes to the nova os-floating-ips API extension and the database API
> >> , and changes to python-novaclient  to enable ceilometer to use
> >> the new API changes (basically pass --all-tenants when listing floating
> >> IPs).
> >> The original nova change introduced bug 1328694 which spams the nova-api
> >> logs due to the ceilometer change  which does the polling, and right
> >> now in the gate ceilometer is polling every 15 seconds.
> > IIUC that polling cadence in the gate is in the process of being reverted
> > to the out-of-the-box default of 600s.
> >> I pushed a revert in ceilometer to fix the spam bug and a separate patch
> >> was pushed to nova to fix the problem in the network API.
> > Thank you for that. The revert is just now approved on the ceilometer side,
> > and is wending its merry way through the gate.
> >> The bigger problem I see here is that these changes were all made under
> >> the guise of a bug when I think this is actually a blueprint. We have
> >> changes to the nova API, changes to the nova database API, CLI changes,
> >> potential performance impacts (ceilometer can be hitting the nova
> >> database a lot when polling here), security impacts (ceilometer needs
> >> admin access to the nova API to list floating IPs for all tenants),
> >> documentation impacts (the API and CLI changes are not documented), etc.
> >> So right now we're left with, in my mind, two questions:
> >> 1. Do we just fix the spam bug 1328694 and move on, or
> >> 2. Do we revert the nova API/CLI changes and require this goes through
> >> the nova-spec blueprint review process, which should have happened in
> >> the first place.
> > So just to repeat the points I made on the unlogged #os-nova IRC channel
> > earlier, for posterity here ...
> > Nova already exposed an all_tenants flag in multiple APIs (servers,
> > volumes,
> > security-groups etc.) and these would have:
> > (a) generally pre-existed ceilometer's usage of the corresponding APIs
> > and:
> > (b) been tracked and proposed at the time via straight-forward LP bugs,
> > as opposed to being considered blueprint material
> > So the manner of the addition of the all_tenants flag to the floating_ips
> > API looks like it just followed existing custom & practice.
> > Though that said, the blueprint process and in particular the nova-specs
> > aspect, has been tightened up since then.
> > My preference would be to fix the issue in the underlying API, but to use
> > this as "a teachable moment" ... i.e. to require more oversight (in the
> > form of a reviewed & approved BP spec) when such API changes are proposed
> > in the future.
> > Cheers,
> > Eoghan
> >> Are there other concerns here? If there are no major objections to the
> >> code that's already merged, then #2 might be excessive but we'd still
> >> need docs changes.
> >> I've already put this on the nova meeting agenda for tomorrow.
> >>  https://bugs.launchpad.net/ceilometer/+bug/1328694
> >>  https://bugs.launchpad.net/nova/+bug/1262124
> >>  https://review.openstack.org/#/c/81429/
> >>  https://review.openstack.org/#/c/83660/
> >>  https://review.openstack.org/#/c/83676/
> >> --
> >> Thanks,
> >> Matt Riedemann
> >> _______________________________________________
> >> OpenStack-dev mailing list
> >> OpenStack-dev at lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> While there is precedent for --all-tenants with some of the other APIs,
> I'm concerned about where this stops. When ceilometer wants polling on
> some other resources that the nova API exposes, will it need the same
> thing? Doing all of this polling for resources in all tenants in nova
> puts an undue burden on the nova API and the database.
Yes, that's a fair point.
The only other existing usage of a nova API all_tenants flag by
ceilometer is on the nova servers API, but that's more self-
limiting by virtue of the fact that the query is also constrained
by host (i.e. to discover details for all instances running on a
single compute node).
Even in that single existing case, we have some ideas on how to
reduce the API polling frequency by detecting when these details
need to be refreshed (e.g. only when a new instance is booted, or
an existing instance is resized or migrated).
> Can we do something with notifications here instead? That's where the
> nova-spec process would have probably caught this.
OK, there is possibly a case for rebasing this aspect of floating IP
discovery on notifications as opposed to polling. I'll discuss further
with the original author.
More information about the OpenStack-dev