[Openstack-operators] [Ceilometer] Real world experience with Ceilometer deployments - Feedback requested

Kris G. Lindgren klindgren at godaddy.com
Thu Feb 12 16:30:04 UTC 2015


Event-based Monitoring & Billing solution for OpenStack

Unsure what it's checking out for billing though.
____________________________________________
 
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.



On 2/12/15, 9:17 AM, "Matt Joyce" <matt at nycresistor.com> wrote:

>I thought stacktach was more in the vein of diagnostics, not billable
>resources.
>
>On Feb 12, 2015 10:47 AM, Tim Bell <Tim.Bell at cern.ch> wrote:
>>
>> Does anyone have any proposals regarding
>>
>> > - Possible replacements for Ceilometer that you have used instead
>>
>> It seems that many sites have written their own systems. The
>> stacktach/monasca teams are due to demo at the operators meetup in
>> Philadelphia in March.
>>
>> Does anyone have experience to share comparing ceilometer with
>> stacktach?
>>
>> Tim 
>>
>> > -----Original Message-----
>> > From: Daniele Venzano [mailto:daniele.venzano at eurecom.fr]
>> > Sent: 12 February 2015 12:24
>> > To: openstack-operators at lists.openstack.org
>> > Subject: Re: [Openstack-operators] [Ceilometer] Real world experience
>> > with Ceilometer deployments - Feedback requested
>> > 
>> > Unfortunately, I can only confirm the sorry state of Ceilometer.
>> > We tried it on a very small setup (6 compute nodes) and ran into so
>> > many issues that we dropped it and created our own solution based on
>> > a mix of scripts that read from the nova/neutron DB, iptables and
>> > collectd data. No need for more collection agents than what we are
>> > already running for systems monitoring.
>> > 
>> > We tried the version in Havana and, later, in Icehouse. For starters,
>> > the documentation suggested MySQL as the default backend. MySQL will
>> > last just a few days and then break down under the size of the
>> > tables. We tried MongoDB, but were still not satisfied with
>> > performance on such a small cluster.
>> > Then there is the metering agent. It is yet another daemon, not
>> > integrated in Neutron, and there is no documentation about what it is
>> > actually measuring. What if I have multiple routers? Ingress and
>> > egress? From which point of view?
>> > The same applies to Cinder: it requires an external agent (to be run
>> > via cron!).
>> > 
>> > Some metrics were not recorded and we couldn't understand why. Again,
>> > there was no documentation and no tooling to help us understand
>> > whether we were just missing some config options somewhere in
>> > nova-compute or there was some other problem with KVM/libvirt
>> > versions.
>> > And even when we had some data and wanted to generate just a
>> > proof-of-concept report with some information about tenant resource
>> > usage, we found problems with the API. The fact that no one had
>> > bothered to write a simple proof-of-concept script that uses the API
>> > to actually do something useful was really off-putting.
>> > 
>> > We had to dig into libvirt to understand what some of the metrics
>> > actually mean. We found that we could read those same metrics from
>> > our (more efficient, well-known) monitoring system.
>> > 
>> > For some time we ran just the agents and aggregated the data in an
>> > elasticsearch instance through the UDP msgpack pipeline (more bugs:
>> > the message format is inconsistent, different agents generate
>> > different fields, in slightly different formats).
>> > It works. But for our needs it was just too much work. Most of the
>> > data is already available from other sources with well-known APIs.
>> > 
>> > Ah, also there is a long-standing open bug: Sahara and Ceilometer
>> > cannot be used together. And we use Sahara.
>> > 
>> > I opened bugs for some of these issues, but since then I have lost
>> > interest.
>> > 
>> > In the end, I think it really depends on what kind of data you need
>> > and what (developer) resources you can throw at the problem.
>> > Unless things changed dramatically in Juno, Ceilometer will not work
>> > out of the box. You will have to lose time because of the
>> > non-existent documentation, you will have to develop code and scripts
>> > anyway, and finally you will have to create something between your
>> > billing system and the ceilometer API, because to the best of my
>> > knowledge there is nothing that uses it.
>> > 
>> > eBay has the resources to do all that. We don't.
>> > 
>> > 
>> > 
>> > -----Original Message-----
>> > From: George Shuklin [mailto:george.shuklin at gmail.com]
>> > Sent: Thursday 12 February 2015 02:59
>> > To: openstack-operators at lists.openstack.org
>> > Subject: Re: [Openstack-operators] [Ceilometer] Real world experience
>> > with Ceilometer deployments - Feedback requested
>> > 
>> > Ceilometer is in a sad state.
>> > 
>> > 1. The collector leaks memory. We ran it on the same host as mongo,
>> > and it grabbed 29 GB out of 32, leaving mongo with less than a
>> > gigabyte of memory available.
>> > 2. The metering agent causes huge load on neutron-server: O(n) in
>> > metering rules and tenants. A few bugs reported, one bugfix in
>> > review.
>> > 3. The metering agent simply does not work on installations with
>> > multiple network nodes. It expects all routers to be on the same
>> > host. Fixed or not, I don't know; we have our own crude fix.
>> > 4. Many rough edges. Ceilometer is much less tested than nova.
>> > Sometimes it throws a traceback and skips counting. Fresh example:
>> > if metadata has '.' in the name, ceilometer throws a traceback on it
>> > and does not count it in glance usage.
>> > 5. Very slow on reports (using mongo's mapreduce).
>> > 
>> > Overall feeling: barely usable, but given my experience with cloud
>> > billing, not the worst thing I have seen in my life.
>> > 
>> > About load: except for reporting and memory leaks, it uses a rather
>> > small amount of resources.
>> > 
>> > On 02/11/2015 09:37 PM, Maish Saidel-Keesing wrote:
>> > > Is Ceilometer ready for prime time?
>> > > 
>> > > I would be interested in hearing from people who have deployed
>> > > OpenStack clouds with Ceilometer, and their experience. Some of the
>> > > topics I am looking for feedback on are:
>> > > 
>> > > - Database Size
>> > > - MongoDB management, Sharding, replica sets etc.
>> > > - Replication strategies
>> > > - Database backup/restore
>> > > - Overall usability
>> > > - Gripes, pains and problems (things to look out for)
>> > > - Possible replacements for Ceilometer that you have used instead
>> > > 
>> > > 
>> > > If you are willing to share - I am sure it will be beneficial to
>> > > the whole community.
>> > > 
>> > > Thanks in Advance
>> > > 
>> > > 
>> > > With best regards,
>> > > 
>> > > 
>> > > Maish Saidel-Keesing
>> > > Platform Architect
>> > > Cisco 
>> > > 
>> > > 
>> > > 
>> > > 
>> > > _______________________________________________
>> > > OpenStack-operators mailing list
>> > > OpenStack-operators at lists.openstack.org
>> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> > 
>> > 
>>



