[openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

Won wjstk16 at gmail.com
Wed Oct 31 08:58:11 UTC 2018


Hi,

>
> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW,  we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?
>

Using the original definition, alarms with the same alert name are
recognized as the same alarm in Vitrage, no matter how different the
instances are.
I also installed the Rocky version and the master version on a new server
and retested, but the problem was not solved. The latest bug fix seems
unrelated.
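
For illustration only, one way to make Prometheus raise a distinctly named
alert per target is to split the rule per instance. This is just a sketch,
not necessarily what was done here; the instance selectors are hypothetical
and would need to match the real scrape targets:

------- alert.rules.yml (per-instance sketch)
groups:
- name: alert.rules
  rules:
  - alert: InstanceDownCompute1
    expr: up{instance="compute1:8080"} == 0
    for: 60s
    labels:
      severity: warning
    annotations:
      summary: Instance {{ $labels.instance }} down
  - alert: InstanceDownUbuntu
    expr: up{instance="ubuntu:8080"} == 0
    for: 60s
    labels:
      severity: warning
    annotations:
      summary: Instance {{ $labels.instance }} down
------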

> Does the wrong timestamp appear if you run the 'vitrage alarm list' CLI
> command? Please try running 'vitrage alarm list --debug' and send me the
> output.
>

I have attached 'vitrage-alarm-list.txt.'
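
For reference, the output was captured roughly as follows (a sketch; the
redirection also catches the --debug output in case it is written to
stderr):

vitrage alarm list --debug > vitrage-alarm-list.txt 2>&1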


> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.
>

I have attached the 'vitrage_log_on_compute1.zip' and
'vitrage_log_on_ubuntu.zip' files.
When I created a VM on compute1, a vitrage-collector log entry appeared,
but nothing was logged when the VM was deleted.

Br,
Won



On Tue, Oct 30, 2018 at 1:28 AM, Ifat Afek <ifatafekn at gmail.com> wrote:

> Hi,
>
> On Fri, Oct 26, 2018 at 10:34 AM Won <wjstk16 at gmail.com> wrote:
>
>>
>> I solved the problem of the Prometheus alarms not being updated.
>> Alarms with the same Prometheus alert name are recognized as the same
>> alarm in Vitrage.
>>
>> ------- alert.rules.yml
>> groups:
>> - name: alert.rules
>>   rules:
>>   - alert: InstanceDown
>>     expr: up == 0
>>     for: 60s
>>     labels:
>>       severity: warning
>>     annotations:
>>       description: '{{ $labels.instance }} of job {{ $labels.job }} has
>>         been down for more than 30 seconds.'
>>       summary: Instance {{ $labels.instance }} down
>> ------
>> This is the contents of the alert.rules.yml file before I modified it.
>> It is a rules file that fires an alert when cAdvisor stops (instance
>> down). An alert is raised for whichever instance is down, but all of the
>> alerts share the same name, 'InstanceDown'. Vitrage recognizes all of
>> these alerts as the same alarm. As a result, until every 'InstanceDown'
>> alert was cleared, Vitrage considered the alarm unresolved and it was
>> never removed.
>>
>
> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW,  we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?
>
>
>> Can you please show me where you saw the 2001 timestamp? I didn't find it
>>> in the log.
>>>
>>
>> [image: image.png]
>> The timestamp is recorded correctly in the logs (vitrage-graph,
>> vitrage-collector, etc.), but in the Vitrage dashboard it is shown as
>> 2001-01-01.
>> However, the timestamp seems to be handled correctly internally, because
>> the alarm can be resolved and is logged correctly.
>>
>
> Does the wrong timestamp appear if you run the 'vitrage alarm list' CLI
> command? Please try running 'vitrage alarm list --debug' and send me the
> output.
>
>
>> [image: image.png]
>> The host named 'ubuntu' is my main server. I installed OpenStack
>> all-in-one on it, and I installed a compute node on the host named
>> 'compute1'.
>> When I create a new VM in Nova on compute1, it immediately appears in
>> the entity graph. But it does not disappear from the entity graph when I
>> delete the VM. No matter how long I wait, it does not disappear.
>> After I run the 'vitrage-purge-data' command and reboot OpenStack (run
>> the reboot command on the OpenStack server, host name 'ubuntu'), it
>> disappears. Running 'vitrage-purge-data' alone does not work; a reboot
>> is needed for the VM to disappear.
>> When I create a new VM in Nova on 'ubuntu', there is no problem.
>>
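
A minimal sketch of the cleanup sequence described above, for reference.
'vitrage-purge-data' is quoted from the thread; the systemd unit names are
an assumption about a devstack-style deployment, and restarting only the
Vitrage services (instead of rebooting the whole host) is an untested
alternative:

vitrage-purge-data
sudo systemctl restart devstack@vitrage-graph devstack@vitrage-collector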
> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.
>
>> I implemented a web service with a microservice architecture and
>> applied RCA to it. The attached picture shows the structure of the web
>> service I implemented. I wonder what data I would receive, and what I
>> could do, if I linked Vitrage with Kubernetes.
>> As far as I know, the Vitrage graph does not present information about
>> containers or pods inside a VM. If that is correct, I would like to make
>> pod-level information appear on the entity graph.
>>
>> I followed the steps at
>> https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html.
>> I attached the vitrage.conf file and the kubeconfig file. The contents
>> of the kubeconfig file were copied from the admin.conf file on the
>> master node.
>> I want to check that my settings are correct and that the connection
>> works, but I don't know how. It would be much appreciated if you could
>> let me know how.
>>
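
One simple sanity check, offered as a sketch rather than a documented
procedure: after restarting the Vitrage services, list what actually made
it into the entity graph and look for Kubernetes-related resources, and
check vitrage-collector.log for datasource errors:

vitrage topology show
vitrage resource list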
> Unfortunately, Vitrage does not hold pods and containers information at
> the moment. We discussed the option of adding it in Stein release, but I'm
> not sure we will get to do it.
>
> Br,
> Ifat
>
> [1] https://review.openstack.org/#/c/611258/
>

