[openstack-dev] [qa] Test Ceilometer polling in tempest
Ildikó Váncsa
ildiko.vancsa at ericsson.com
Fri Jul 25 08:26:17 UTC 2014
Hi Matt,
Thanks for the reply, please see my comments inline.
Best Regards,
Ildiko
-----Original Message-----
From: Matthew Treinish [mailto:mtreinish at kortar.org]
Sent: Thursday, July 24, 2014 6:19 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [qa] Test Ceilometer polling in tempest
On Wed, Jul 16, 2014 at 07:44:38PM +0400, Dina Belova wrote:
> Ildiko, thanks for starting this discussion.
>
> This really is quite a painful problem for the Ceilometer and QA teams.
> As far as I know, there is currently a tendency to make Tempest
> integration tests quicker and less resource-consuming, which is
> quite logical IMHO. Polling, as a way of collecting information from
> different services and projects, is quite expensive in terms of load
> on the Nova API, etc. That's why I completely understand the QA
> team's wish to get rid of it. However, polling still does a lot of
> the work inside Ceilometer, so integration testing for this feature is
> really important to me as a Ceilometer contributor: without testing
> the pollsters we have no way to check that they work.
>
> That's why I'll be really glad if Ildiko's solution (or any other)
> that allows polling to be tested in the gate is found and accepted.
>
> The solution described above requires some kind of change in
> what we call "environment preparation" for integration testing,
> and we really need the QA crew's help here. AFAIR, deprecating polling
> was suggested in some of the IRC discussions (using only notifications
> instead), but that's not a solution that can be used right now,
> and we need a way to verify that Ceilometer works right now in order
> to continue improving it.
>
> So any suggestions and comments are welcome here :)
>
> Thanks!
> Dina
>
>
> On Wed, Jul 16, 2014 at 7:06 PM, Ildikó Váncsa
> <ildiko.vancsa at ericsson.com>
> wrote:
>
> > Hi Folks,
> >
> >
> >
> > We’ve faced some problems while running Ceilometer integration
> > tests on the gate. The main issue is that we cannot test the polling
> > mechanism: if we use a small polling interval, like 1 min, then
> > it puts high pressure on the Nova API. If we use a longer interval,
> > like 10 mins, then we will not be able to execute any tests
> > successfully, because they would run too long.
> >
> >
> >
> > The idea to solve this issue is to reconfigure Ceilometer when
> > polling is tested. That would mean changing the polling
> > interval from the default 10 mins to 1 min at the beginning of the
> > test and restarting the service; when the test is finished, the
> > polling interval would be changed back to 10 mins, which requires one
> > more service restart. The downside of this idea is that it needs
> > a service restart today. Supporting dynamic re-configuration of
> > Ceilometer is on the list of plans, which would mean the ability to
> > change the polling interval without restarting the service.
> >
> >
> >
> > I know that this idea isn’t ideal from the PoV that the system
> > configuration is changed while the tests are running, but this is an
> > expected scenario even in a production environment. We would change
> > a parameter that a user can change at any time, in the same way
> > users do. Later on, when we can reconfigure the polling interval
> > without restarting the service, this approach will be even simpler.
So you're saying that you expect users to manually reconfigure Ceilometer on the fly to be able to use polling? That seems far from ideal.
ildikov: Sorry, maybe I wasn't 100% clear in my original mail. Polling works out of the box after you install Ceilometer with the default configuration. But it can happen that someone is not satisfied with the default values, e.g. he/she needs samples from polling every minute instead of every 10 minutes. In this case the polling interval should be modified in one of the configuration files (pipeline.yaml) and then the affected services should be restarted. But if someone is happy with the 10 min polling interval, then he/she does not have to do anything to make it work. Later on we plan to add some automation, so that the polling interval can be configured without restarting the services, which would make the user's life a bit easier.
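As a rough illustration of the customization described above, here is a minimal Python sketch of switching the polling interval from 10 minutes to 1 minute. The dictionary is only a hypothetical in-memory stand-in for the contents of pipeline.yaml; the key names and the `with_interval` helper are assumptions made for this example, not Ceilometer code, and the service restart is not modelled.

```python
import copy

# Hypothetical in-memory form of a pipeline definition with one source.
# The key names are assumptions for illustration; the interval is in seconds.
DEFAULT_PIPELINE = {
    "sources": [
        {
            "name": "meter_source",
            "interval": 600,  # default: poll every 10 minutes
            "meters": ["*"],
            "sinks": ["meter_sink"],
        },
    ],
    "sinks": [
        {"name": "meter_sink", "transformers": [], "publishers": ["rpc://"]},
    ],
}


def with_interval(pipeline, seconds):
    """Return a copy of the pipeline in which every source polls every `seconds`."""
    if seconds <= 0:
        raise ValueError("polling interval must be positive")
    updated = copy.deepcopy(pipeline)
    for source in updated["sources"]:
        source["interval"] = seconds
    return updated


# A user who wants samples every minute would end up with this definition;
# the original default is left untouched so it can be restored afterwards.
one_minute = with_interval(DEFAULT_PIPELINE, 60)
print(one_minute["sources"][0]["interval"])
print(DEFAULT_PIPELINE["sources"][0]["interval"])
```

Returning a copy rather than editing in place mirrors the proposal in the thread, where the default has to be put back once the test is done.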
> >
> >
> >
> > This idea would make it possible to test the polling mechanism of
> > Ceilometer without any radical change in the ordering of test cases
> > or any other things that would be strange in integration tests. We
> > couldn’t find any better way to solve the issue of the load on the APIs caused by polling.
> >
> >
> >
> > What’s your opinion about this scenario? Do you think it could be a
> > viable solution to the above described problem?
> >
> >
> >
Umm, so frankly this approach seems kind of crazy to me. Aside from the project-level implications of saying that a user can't use polling data reliably unless they adjust the polling frequency of Ceilometer, the bigger issue is that you're not necessarily solving the problem. The test will still have an inherent race condition, because there is still no guarantee that polling happens during the test window.
ildikov: As I said above, the re-configuration has nothing to do with using the polling data reliably. It is about customizing the system. What is the size of the test window?
So assume this were implemented and you decrease the polling interval to 1 min and restart Ceilometer during the test setUp(). You'll still be dependent on an internal Ceilometer event occurring during the wait period of the test. There's actually no guarantee that everything will happen within the timeout interval for the test; you're just hoping it will. It's still just a best guess that will probably work in the general case, but it will cause race bugs in the gate when things get slow for random reasons (which increasing the poll rate, even temporarily, will contribute to).
ildikov: By "things get slow" do you mean that we will not be able to restart the system in time? The polling cycle should happen at the configured interval, so IMHO, if it does not, then we have an issue to deal with, but correct me if I'm wrong here. Or did you mean that if we launch an instance that we can collect some information about, then it will not be ready in time?
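The timing concern in this exchange can be made concrete with a small Python sketch. It is a simplified model, not Tempest or Ceilometer code: the function names, the `pipeline_delay` parameter, and the numbers are all assumptions chosen for illustration. It treats the time until a sample becomes visible as the wait for the next poll plus whatever delay the rest of the system adds.

```python
def seconds_until_sample(interval, since_last_poll, pipeline_delay):
    """Time from resource creation until its first sample can be queried.

    interval: polling interval in seconds
    since_last_poll: seconds since the previous poll when the resource
        was created (0 <= since_last_poll < interval)
    pipeline_delay: seconds between the poll firing and the sample being
        queryable (API latency, queueing, storage)
    """
    if not 0 <= since_last_poll < interval:
        raise ValueError("since_last_poll must be in [0, interval)")
    return (interval - since_last_poll) + pipeline_delay


def sample_arrives_in_time(timeout, interval, since_last_poll, pipeline_delay):
    """True if a test waiting `timeout` seconds would see the sample."""
    return seconds_until_sample(interval, since_last_poll, pipeline_delay) <= timeout


# 1 min interval, healthy system: the sample shows up inside a 90 s wait.
print(sample_arrives_in_time(90, 60, 30, 5))
# Same interval, but the system is slow and the last poll just happened.
print(sample_arrives_in_time(90, 60, 1, 45))
# Default 10 min interval: the wait for the next poll alone exceeds 90 s.
print(sample_arrives_in_time(90, 600, 100, 5))
```

In this model the outcome depends on two values the test does not control, the poll phase and the system delay, which is the race being described; a shorter interval only narrows the range.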
The other thing to consider is how this would be implemented: changing a config and restarting a service is *way* outside the scope of Tempest. I'd be -2 on anything proposed to Tempest that mucks around with a project's config file like this, or anything that restarts a service. If it were exposed through a REST API command that'd be a different story, but then you'd still have the race issue I described above.
ildikov: What exactly is out of scope for Tempest? Changing the configuration, or doing it by modifying a config file? So if in the future the polling interval could be set via the REST API without requiring a service restart, would that be an acceptable option?
IMO, a far better approach would be to just implement a flush command, or something of that ilk, in the REST API to force a poll. That way the test can just say "poll now" and then collect the results after the action is complete. I can also see that being useful for admins/operators who want to collect the results periodically, for billing or what have you. They can write a tool that does a flush to get up to the latest data and then periodically collects everything, instead of the situation now, where such a utility is at the whim of whenever the most recent poll occurred.
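To show what a test built on the "poll now" idea might look like, here is a minimal Python sketch. `FakeTelemetry` and its `flush()` method are purely hypothetical: no such command exists in the Ceilometer API according to this thread, and the class is an in-memory stand-in rather than a real client.

```python
class FakeTelemetry:
    """In-memory stand-in for a telemetry service; not Ceilometer code."""

    def __init__(self):
        self._resources = []
        self._samples = []

    def add_resource(self, resource_id):
        """Register a resource that later polling cycles will measure."""
        self._resources.append(resource_id)

    def poll(self):
        """One polling cycle: record one sample per known resource."""
        for resource_id in self._resources:
            self._samples.append({"resource_id": resource_id, "meter": "instance"})

    def flush(self):
        """Hypothetical 'poll now' command; returns the total sample count."""
        self.poll()
        return len(self._samples)

    def samples_for(self, resource_id):
        """Return all stored samples for one resource."""
        return [s for s in self._samples if s["resource_id"] == resource_id]


service = FakeTelemetry()
service.add_resource("vm-1")
# Nothing is stored until a polling cycle runs.
print(len(service.samples_for("vm-1")))
# The test forces the cycle instead of waiting for the timer.
service.flush()
print(len(service.samples_for("vm-1")))
```

The sketch removes the wait entirely, which is the appeal of the proposal; it also bypasses the timer and config loading, which is the objection raised in the reply that follows.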
ildikov: This option was discussed earlier with the Ceilometer core team. Ceilometer continuously observes the system at predefined intervals; in production no one will issue a command like 'do the polling now', and periodic polling is already supported, so it is not an issue on the user's side. As this is not a feature that we would like to provide to the user, we do not want to use it for testing either: it is far from the real-life usage of polling in Ceilometer, so we wouldn't get realistic results and it wouldn't prove that polling is working as it should. The point of testing polling is not only to exercise the polling itself, but also to check that the config file is loaded correctly, that the polling cycle works as it should, that we get some samples, etc.
-Matt Treinish