[openstack-dev] [Ceilometer][QA][Tempest][Infra] Ceilometer tempest testing in gate

Alexei Kornienko alexei.kornienko at gmail.com
Thu Mar 20 22:03:19 UTC 2014


Hello,

We've done some profiling and the results are quite interesting.
Over 1.5 hours, ceilometer inserted 59755 events (59755 calls to
record_metering_data).
These calls resulted in a total of 2591573 SQL queries.

The most interesting part is that 291569 of those queries were ROLLBACKs.
We do around 5 rollbacks to record a single event!
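
As a rough sanity check, the per-event ratios can be derived directly from
the figures above. This is only a minimal sketch: it uses the numbers quoted
in this message, and the variable names are illustrative, not taken from the
ceilometer code.

```python
# Figures quoted in this message; variable names are illustrative only.
events = 59755            # calls to record_metering_data
total_queries = 2591573   # all SQL queries issued during the run
rollbacks = 291569        # queries that were ROLLBACK
duration_s = 1.5 * 3600   # profiling window of 1.5 hours, in seconds

queries_per_event = total_queries / events
rollbacks_per_event = rollbacks / events
rollback_share = rollbacks / total_queries
events_per_second = events / duration_s

print(f"queries per event:    {queries_per_event:.1f}")
print(f"rollbacks per event:  {rollbacks_per_event:.2f}")
print(f"rollback share:       {rollback_share:.1%}")
print(f"events per second:    {events_per_second:.1f}")
```

That works out to roughly 43 queries and just under 5 rollbacks per event,
so about one query in nine is a ROLLBACK.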

I guess this means that the MySQL backend is currently totally unusable in a
production environment.

Please find a full profiling graph attached.

Regards,

On 03/20/2014 10:31 PM, Sean Dague wrote:
> On 03/20/2014 01:01 PM, David Kranz wrote:
>> On 03/20/2014 12:31 PM, Sean Dague wrote:
>>> On 03/20/2014 11:35 AM, David Kranz wrote:
>>>> On 03/20/2014 06:15 AM, Sean Dague wrote:
>>>>> On 03/20/2014 05:49 AM, Nadya Privalova wrote:
>>>>>> Hi all,
>>>>>> First of all, thanks for your suggestions!
>>>>>>
>>>>>> To summarize the discussions here:
>>>>>> 1. We are not going to install Mongo (because "it's wrong"?)
>>>>> We are not going to install Mongo "not from base distribution", because
>>>>> we don't do that for things that aren't python. Our assumption is
>>>>> dependent services come from the base OS.
>>>>>
>>>>> That being said, being an integrated project means you have to be able
>>>>> to function, sanely, on an sqla backend, as that will always be part of
>>>>> your gate.
>>>> This is a claim I think needs a bit more scrutiny if by "sanely" you
>>>> mean "performant". It seems we have an integrated project that no one
>>>> would deploy using the sql db driver we have in the gate. Is anyone
>>>> doing that?  Is having a scalable sql back end a goal of ceilometer?
>>>>
>>>> More generally, if there is functionality that is of great importance to
>>>> any cloud deployment (and we would not integrate it if we didn't think
>>>> it was) that cannot be deployed at scale using sqla, are we really going
>>>> to say it should not be a part of OpenStack because we refuse, for
>>>> whatever reason, to run it in our gate using a driver that would
>>>> actually be used? And if we do demand an sqla backend, how much time
>>>> should we spend trying to optimize it if no one will really use it?
>>>> Though the slow heat job is a little different because the slowness
>>>> comes directly from running real use cases, perhaps we should just set
>>>> up a "slow ceilometer" job if the sql version is too slow for its budget
>>>> in the main job.
>>>>
>>>> It seems like there is a similar thread, at least in part, about this
>>>> around marconi.
>>> We required a non mongo backend to graduate ceilometer. So I don't think
>>> it's too much to ask that it actually works.
>>>
>>> If the answer is that it will never work and it was a checkbox with no
>>> intent to make it work, then it should be deprecated and removed from
>>> the tree in Juno, with a big WARNING that you shouldn't ever use that
>>> backend. Like Nova now does with all the virt drivers that aren't tested
>>> upstream.
>>>
>>> Shipping in tree code that you don't want people to use is bad for
>>> users. Either commit to making it work, or deprecate it and remove it.
>>>
>>> I don't see this as the same issue as the slow heat job. Heat,
>>> architecturally, is going to be slow. It spins up real OSes and does
>>> real things to them. There is no way that's ever going to be fast, and
>>> the dedicated job was a recognition that to support this level of
>>> services in OpenStack we need to give them more breathing room.
>> Peace. I specifically noted that difference in my original comment. And
>> for that reason the heat slow job may not be temporary.
>>> Architecturally Ceilometer should not be this expensive. We've got some
>>> data showing it to be aberrant from where we believe it should be. We
>>> should fix that.
>> There are plenty of cases where we have had code that passes gate tests
>> with acceptable performance but falls over in real deployment. I'm just
>> saying that having a driver that works ok in the gate but does not work
>> for real deployments is of no more value than not having it at all.
>> Maybe less value.
>> How do you propose to solve the problem of getting more ceilometer tests
>> into the gate in the short run? As a practical measure I don't see why
>> it is so bad to have a separate job until the complex issue of whether
>> it is possible to have a real-world performant sqla backend is resolved.
>> Or did I miss something and it has already been determined that sqla
>> could be used for large-scale deployments if we just fixed our code?
> I think right now the ball is back in the ceilometer court to do some
> performance profiling, and let's see what comes of that. I don't think
> we're getting more tests before the release in any real way.
>
>>> Once we get a base OS in the gate that lets us direct install mongo from
>>> base packages, we can also do that. Or someone can 3rd party it today.
>>> Then we'll even have comparative results to understand the differences.
>> Yes. Do you know which base OS's are candidates for that?
> Ubuntu 14.04 will have a sufficient level of Mongo, so some time in the
> Juno cycle we should have it in the gate.
>
> 	-Sean

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ceilometer.dot
Type: application/msword-template
Size: 68531 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20140321/714c3a63/attachment.bin>

