Re: [telemetry][ceilometer][gnocchi] How to configure aggregate for cpu_util or calculate from metrics

Bernd Bausch berndbausch at gmail.com
Thu Aug 1 12:20:49 UTC 2019


I have a solution. At least it works for me. Be aware that this is 
Devstack, but I think nothing I did to solve my problem is 
Devstack-specific. Also, I don't know whether there are more efficient 
or canonical ways to reconfigure Ceilometer. But it's good enough for me.


These are my steps - you may not need all of them.

  * in *pipeline.yaml*, set publisher to gnocchi://
  * in *the resource definition file*, define my new archive policy.
    By default, this file resides in the Ceilometer source tree
    .../ceilometer/publisher/data/gnocchi_resources.yaml, but you can
    use config parameter resources_definition_file to change the default
    (I didn't try).
    Example:

         - name: ceilometer-medium-rate
           aggregation_methods:
             - mean
             - rate:mean
           back_window: 0
           definition:
             - granularity: 1 minute
               timespan: 7 days
             - granularity: 1 hour
               timespan: 365 days

  * in the same resource definition file, *adjust the archive policy* of
    rate metrics.
    Example:

        - resource_type: instance
          metrics:
          ...
            cpu:
              archive_policy_name: ceilometer-medium-rate

  * *delete all existing metrics and resources* from Gnocchi
    Probably only necessary if Ceilometer has already been running, and
    not needed if you reconfigure it before its first start.
    This is a drastic measure, but if you do it at the beginning of a
    deployment, it won't cause loss of much data.
    Why is this required? A metric contains an archive policy that can't
    be changed. Thus existing metrics need to be recreated.
    Why remove resources? Because they reference the metrics that I removed.

  * *restart all Ceilometer services*
    This is required for re-reading the pipeline and the resource
    definition files.
    Ceilometer will create resources and metrics as needed when it sends
    its samples to Gnocchi.
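
A quick sanity check on the archive policy defined above: the number of
points Gnocchi keeps per definition is the timespan divided by the
granularity. The following is a minimal Python sketch of that arithmetic,
not part of Ceilometer or Gnocchi; the function name is my own. For the
1 minute / 7 days definition it yields 10080, which agrees with the
"points: 10080" shown by "gnocchi archive-policy show" in the quoted
message further down.

```python
from datetime import timedelta


def points_per_definition(granularity: timedelta, timespan: timedelta) -> int:
    """Return how many aggregated points fit into the timespan.

    Hypothetical helper for illustration only: it just divides the
    timespan by the granularity, as an archive policy definition does.
    """
    return timespan // granularity


# The two definitions of the ceilometer-medium-rate policy above.
definitions = [
    (timedelta(minutes=1), timedelta(days=7)),
    (timedelta(hours=1), timedelta(days=365)),
]

for granularity, timespan in definitions:
    print(granularity, timespan, points_per_definition(granularity, timespan))
```

This helps to estimate how much storage a finer granularity or a longer
timespan will cost before you recreate all metrics.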

I tested this by running a CPU hogging instance and listing its measures 
after a few minutes:


     gnocchi measures show --resource f28f6b78-9dd5-49cc-a6ac-28cb14477bf0 \
                           --aggregation rate:mean cpu

     +---------------------------+-------------+---------------+
     | timestamp                 | granularity |         value |
     +---------------------------+-------------+---------------+
     | 2019-08-01T20:23:00+09:00 |        60.0 |  1810000000.0 |
     | 2019-08-01T20:24:00+09:00 |        60.0 | 39940000000.0 |
     | 2019-08-01T20:25:00+09:00 |        60.0 | 40110000000.0 |


This means that the instance accumulated 39940000000 nanoseconds of CPU 
time in the 60-second interval at 20:24:00. Note that the old /cpu_util/ 
was expressed in percent, so Aodh alarms and Heat autoscaling 
definitions must be adapted to the new unit.
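
To relate the new values to the old percentages, divide the accumulated
CPU nanoseconds by the nanoseconds available in the interval. Below is a
minimal Python sketch of that conversion in both directions. The function
names are my own, and dividing by the number of vCPUs is an assumption
about how you want to normalize; adjust it to your needs.

```python
NS_PER_SECOND = 1_000_000_000


def cpu_ns_to_percent(cpu_ns: float, granularity_s: float, vcpus: int = 1) -> float:
    """Convert CPU nanoseconds accumulated in one interval to percent.

    Assumption: 100% means all vcpus were busy for the whole interval.
    """
    return 100.0 * cpu_ns / (granularity_s * NS_PER_SECOND * vcpus)


def percent_to_cpu_ns(percent: float, granularity_s: float, vcpus: int = 1) -> float:
    """Convert a percent threshold to CPU nanoseconds per interval."""
    return percent / 100.0 * granularity_s * NS_PER_SECOND * vcpus


# The 20:24:00 measure from the table above, one vCPU assumed.
print(round(cpu_ns_to_percent(39940000000.0, 60.0), 2))

# An 80% threshold expressed as nanoseconds per 60-second interval.
print(percent_to_cpu_ns(80.0, 60.0))
```

With one vCPU, the 20:24:00 measure corresponds to roughly 66.57 percent.
The second function gives the kind of threshold value an alarm on
rate:mean with 60-second granularity would need in place of a percentage.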


Good luck. Hire me as Ceilometer consultant if you get stuck :)


Bernd


On 8/1/2019 6:11 PM, Teckelmann, Ralf, NMU-OIP wrote:
>
> Hello Bernd, Hello Lingxian,
>
>
> +1
>
>
> You are not alone in your fruitless endeavor. Sadly, I cannot come up 
> with a solution.
>
> We are stuck at the same point.
>
>
> Maybe some day a dedicated member of the OpenStack community will give 
> the ceilometer guys a push to explain their service.
> For us, also using Stein, it is in the state of "not production ready".
>
> Cheers,
>
> Ralf T.
> ------------------------------------------------------------------------
> *From:* Bernd Bausch <berndbausch at gmail.com>
> *Sent:* Thursday, 1 August 2019 03:16:25
> *To:* Lingxian Kong <anlin.kong at gmail.com>
> *Cc:* openstack-discuss <openstack-discuss at lists.openstack.org>
> *Subject:* Re: [telemetry][ceilometer][gnocchi] How to configure 
> aggregate for cpu_util or calculate from metrics
>
> Lingxian,
>
> Thanks for "bumping" my request and keeping it alive. The reason I 
> need an answer: I am updating courseware to Stein that includes 
> autoscaling based on CPU and disk I/O rates. Looks like I am "cutting 
> edge" :)
>
> I don't think the problem is in the Gnocchi camp, but rather in 
> Ceilometer. To store rates of measures in Gnocchi, the following is needed:
>
>   * A /metric/. Raw measures are sent to the metric.
>   * An /archive policy/. The metric has an archive policy.
>   * The archive policy includes one or more /rate aggregates/
>
> My cloud has archive policies with rate aggregates, but the question 
> is about the first bullet: *How can I configure Ceilometer so that it 
> creates the corresponding metrics and sends measures to them?* In 
> other words, how is Ceilometer's output connected to my archive 
> policy? From my experience, just adding the archive policy to 
> Ceilometer's publishers is not sufficient.
>
> Ceilometer's source code includes 
> /.../publisher/data/gnocchi_resources.yaml/, which might well be the 
> place where this can be configured. I am not sure how to do it though, 
> and this file is not documented. I can read the source, but my 
> developer skills are insufficient for understanding how everything 
> fits together.
>
> Bernd
>
> On 8/1/2019 9:01 AM, Lingxian Kong wrote:
>> Hi Bernd,
>>
>> A lot of people have asked the same question before. Unfortunately, 
>> I don't know the answer either (we are still using an old version of 
>> Ceilometer). The original cpu_util support has been removed from 
>> Ceilometer in favor of Gnocchi, but AFAIK, no doc in Gnocchi explains 
>> how to achieve the same thing, and there is no clear answer from the 
>> Gnocchi maintainers.
>>
>> It'd be much appreciated if you could find the answer in the end, or 
>> if someone who has already solved the issue could share it.
>>
>> Best regards,
>> Lingxian Kong
>> Catalyst Cloud
>>
>>
>> On Wed, Jul 31, 2019 at 1:28 PM Bernd Bausch <berndbausch at gmail.com> 
>> wrote:
>>
>>     The message at the end of this email is some three months old. I
>>     have the same problem. The question is: *How to use the new rate
>>     metrics in Gnocchi?* I am using a Stein Devstack for my tests.
>>
>>     For example, I need the CPU rate, formerly named /cpu_util/. I
>>     created a new archive policy that uses /rate:mean/ aggregation
>>     and has a 1 minute granularity:
>>
>>     $ gnocchi archive-policy show ceilometer-medium-rate
>>     +---------------------+------------------------------------------------------------------+
>>     | Field               | Value                                                            |
>>     +---------------------+------------------------------------------------------------------+
>>     | aggregation_methods | rate:mean, mean                                                  |
>>     | back_window         | 0                                                                |
>>     | definition          | - points: 10080, granularity: 0:01:00, timespan: 7 days, 0:00:00 |
>>     | name                | ceilometer-medium-rate                                           |
>>     +---------------------+------------------------------------------------------------------+
>>
>>     I added the new policy to the publishers in /pipeline.yaml/:
>>
>>     $ tail -n5 /etc/ceilometer/pipeline.yaml
>>     sinks:
>>         - name: meter_sink
>>           publishers:
>>               - gnocchi://?archive_policy=medium&filter_project=gnocchi_swift
>>               - gnocchi://?archive_policy=ceilometer-medium-rate&filter_project=gnocchi_swift
>>
>>     After restarting all of Ceilometer, my hope was that the CPU rate
>>     would magically appear in the metric list. But no: All metrics
>>     are linked to archive policy /medium/, and looking at the details
>>     of an instance, I don't detect anything rate-related:
>>
>>     $ gnocchi resource show ae3659d6-8998-44ae-a494-5248adbebe11
>>     +-----------------------+---------------------------------------------------------------------+
>>     | Field                 | Value                                                               |
>>     +-----------------------+---------------------------------------------------------------------+
>>     ...
>>     | metrics               | compute.instance.booting.time: 76fac1f5-962e-4ff2-8790-1f497c99c17d |
>>     |                       | cpu: af930d9a-a218-4230-b729-fee7e3796944                           |
>>     |                       | disk.ephemeral.size: 0e838da3-f78f-46bf-aefb-aeddf5ff3a80           |
>>     |                       | disk.root.size: 5b971bbf-e0de-4e23-ba50-a4a9bf7dfe6e                |
>>     |                       | memory.resident: 09efd98d-c848-4379-ad89-f46ec526c183               |
>>     |                       | memory.swap.in: 1bb4bb3c-e40a-4810-997a-295b2fe2d5eb                |
>>     |                       | memory.swap.out: 4d012697-1d89-4794-af29-61c01c925bb4               |
>>     |                       | memory.usage: 93eab625-0def-4780-9310-eceff46aab7b                  |
>>     |                       | memory: ea8f2152-09bd-4aac-bea5-fa8d4e72bbb1                        |
>>     |                       | vcpus: e1c5acaf-1b10-4d34-98b5-3ad16de57a98                         |
>>     | original_resource_id  | ae3659d6-8998-44ae-a494-5248adbebe11                                |
>>     ...
>>     | type                  | instance                                                            |
>>     | user_id               | a9c935f52e5540fc9befae7f91b4b3ae                                    |
>>     +-----------------------+---------------------------------------------------------------------+
>>
>>     Obviously, I am missing something. Where is the missing link?
>>     What do I have to do to get CPU usage rates? Do I have to create
>>     metrics? Do I have to ask Ceilometer to create metrics? How?
>>
>>     Right now, no instructions seem to exist at all. If that is
>>     correct, I would be happy to write documentation once I
>>     understand how it works.
>>
>>     Thanks a lot.
>>
>>     Bernd
>>
>>     On 5/10/2019 3:49 PM, info at dantalion.nl wrote:
>>>     Hello,
>>>
>>>     I am working on Watcher and we are currently changing how metrics are
>>>     retrieved from different datasources such as Monasca or Gnocchi. Because
>>>     of this major overhaul I would like to validate that everything is
>>>     working correctly.
>>>
>>>     Almost all of the optimization strategies in Watcher require the cpu
>>>     utilization of an instance as metric but with newer versions of
>>>     Ceilometer this has become unavailable.
>>>
>>>     On IRC I received the information that Gnocchi could be used to
>>>     configure an aggregate and this aggregate would then report cpu
>>>     utilization, however, I have been unable to find documentation on how to
>>>     achieve this.
>>>
>>>     I was also notified that cpu_util is something that could be computed
>>>     from other metrics. When reading
>>>     https://docs.openstack.org/ceilometer/rocky/admin/telemetry-measurements.html#openstack-compute
>>>     the documentation seems to agree on this as it states that cpu_util is
>>>     measured by using a 'rate of change' transformer. But I have not been
>>>     able to find how this can be computed.
>>>
>>>     I was hoping someone could spare the time to provide documentation or
>>>     information on how this currently is best achieved.
>>>
>>>     Kind Regards,
>>>     Corne Lukken (Dantali0n)
>>>