[openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

Won wjstk16 at gmail.com
Fri Oct 26 07:33:44 UTC 2018


Hi Ifat,
I'm sorry for the late reply.

I solved the problem of the Prometheus alarm not being updated.
Alarms with the same Prometheus alert name are recognized as the same alarm
in Vitrage.

------- alert.rules.yml
groups:
- name: alert.rules
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 60s
    labels:
      severity: warning
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down
        for more than 30 seconds.'
      summary: Instance {{ $labels.instance }} down
------
This is the content of the alert.rules.yml file before I modified it.
This rule generates an alarm when cAdvisor stops (instance down). An alarm
is triggered for whichever instance is down, but all of the alarms have the
same name, 'InstanceDown'. Vitrage recognizes all of these alarms as the same
alarm. Thus, until every 'InstanceDown' alarm was cleared, the 'InstanceDown'
alarm was considered unresolved and was never removed.


------- alert.rules.yml (modified)
groups:
- name: alert.rules
  rules:
  - alert: InstanceDown on Apigateway
    expr: up{instance="192.168.12.164:31121"} == 0
    for: 5s
    labels:
      severity: warning
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down
        for more than 30 seconds.'
      summary: Instance {{ $labels.instance }} down

  - alert: InstanceDown on Signup
    expr: up{instance="192.168.12.164:31122"} == 0
    for: 5s
    labels:
      severity: warning
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down
        for more than 30 seconds.'
      summary: Instance {{ $labels.instance }} down
.
.
.
---------------
By modifying the rules as above, the problem has been solved.
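One way to sanity-check a modified rule file before reloading Prometheus is
shown below. This is only a sketch: it assumes promtool is available next to
the Prometheus 2.3.2 binary, and that the --web.enable-lifecycle flag is set so
the reload endpoint is reachable.

------- rule validation
# check the rule file syntax (and indentation) before loading it
promtool check rules alert.rules.yml

# ask Prometheus to reload its configuration and rule files
curl -X POST http://localhost:9090/-/reload
-------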



> Can you please show me where you saw the 2001 timestamp? I didn't find it
> in the log.
>



[image: image.png]
The timestamp is recorded correctly in the logs (vitrage-graph, vitrage-collector,
etc.), but in vitrage-dashboard it is shown as 2001-01-01.
However, the timestamp seems to be handled correctly internally, because the
alarm can be resolved and is recorded correctly in the log.


> Let me make sure I understand the problem.
> When you create a new vm in Nova, does it immediately appear in the entity
> graph?
> When you delete a vm, does it remain? Does it remain in a multi-node
> environment and get deleted in a single-node environment?
>

[image: image.png]
The host named 'ubuntu' is my main server. I installed OpenStack all-in-one on
this server, and I installed a compute node on the host named 'compute1'.
When I create a new VM in Nova (compute1), it immediately appears in the
entity graph, but it does not disappear from the entity graph when I delete
the VM. No matter how long I wait, it does not disappear.
After I execute the 'vitrage-purge-data' command and reboot OpenStack (i.e.
run the reboot command on the OpenStack server, host name ubuntu), it
disappears. Executing 'vitrage-purge-data' alone does not work; a reboot is
needed for the entity to disappear.
When I create a new VM in Nova (ubuntu) there is no problem.
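For completeness, my understanding of the recommended purge sequence (from
your earlier reply) is roughly the sketch below. The systemd unit names are
only an assumption based on a devstack-style setup and may differ on other
deployments:

------- purge sequence (assumed unit names)
# stop the Vitrage services before purging
sudo systemctl stop devstack@vitrage-graph devstack@vitrage-collector devstack@vitrage-notifier

# clear the stored entity graph data
vitrage-purge-data

# start the services again so the graph is rebuilt from the datasources
sudo systemctl start devstack@vitrage-graph devstack@vitrage-collector devstack@vitrage-notifier
-------
In my case, even after vitrage-purge-data, the deleted VM entity only
disappears once I reboot the ubuntu host.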




I implemented a web service with a microservice architecture and applied RCA
to it. The attached picture shows the structure of the web service I
implemented. I wonder what data I will receive and what I can do when I link
Vitrage with Kubernetes.
As far as I know, the Vitrage graph does not present information about
containers or pods inside the VMs. If that is correct, I would like to make
pod-level information appear in the entity graph.

I followed the steps at
https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html.
I attached the vitrage.conf file and the kubeconfig file. The contents of the
kubeconfig file are copied from the contents of the admin.conf file on the
master node.
I want to check that my settings are correct and that the connection works,
but I don't know how. It would be much appreciated if you could let me know
how to verify this.
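One check I can already run myself is to confirm that the copied kubeconfig
can reach the cluster, and then look for Kubernetes entities in the topology.
This assumes kubectl is installed on the Vitrage host, and the path
/etc/vitrage/kubeconfig below is only a placeholder for wherever the copied
file actually lives:

------- connectivity check
# verify the copied kubeconfig can authenticate against the Kubernetes API server
kubectl --kubeconfig /etc/vitrage/kubeconfig get nodes

# after restarting vitrage-graph, check whether Kubernetes entities appear
vitrage topology show
-------
But I am not sure whether this is enough to confirm that the Vitrage
Kubernetes datasource itself is configured correctly.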

Br,
Won


On Wed, Oct 10, 2018 at 11:52 PM, Ifat Afek <ifatafekn at gmail.com> wrote:

> Hi Won,
>
> On Wed, Oct 10, 2018 at 11:58 AM Won <wjstk16 at gmail.com> wrote:
>
>>
>> My Prometheus version is 2.3.2 and my Alertmanager version is 0.15.2, and I
>> attached files (vitrage-collector and vitrage-graph logs, the apache log,
>> prometheus.yml, alertmanager.yml, the alarm rule file, etc.).
>> I think the problem of the resolved alarm not disappearing is a timestamp
>> problem of the alarm.
>>
>> -gray alarm info
>> severity:PAGE
>> vitrage id: c6a94386-3879-499e-9da0-2a5b9d3294b8  ,
>> e2c5eae9-dba9-4f64-960b-b964f1c01dfe , 3d3c903e-fe09-4a6f-941f-1a2adb09feca
>> , 8c6e7906-9e66-404f-967f-40037a6afc83 ,
>> e291662b-115d-42b5-8863-da8243dd06b4 , 8abd2a2f-c830-453c-a9d0-55db2bf72d46
>> ----------
>>
>> The alarms marked with the blue circle are already resolved. However, they
>> do not disappear from the entity graph and the alarm list.
>> There were seven more gray alarms in the top screenshot, in the active
>> alarms list as well as in the entity graph. They disappeared only after
>> deleting the gray alarms from the vitrage-alarms table in the DB or changing
>> the end timestamp value to an earlier time than the current time.
>>
>
> I checked the files that you sent, and it appears that the connection
> between Prometheus and Vitrage works well. I see in vitrage-graph log that
> Prometheus notified both on alert firing and on alert resolved statuses.
> I still don't understand why the alarms were not removed from Vitrage,
> though. Can you please send me the output of 'vitrage topology show' CLI
> command?
> Also, did you happen to restart vitrage-graph or vitrage-collector during
> your tests?
>
>
>> In the log, it seems that the first problem is that the timestamp value
>> from Vitrage comes out as 2001-01-01, even though the start value in the
>> Prometheus alarm information has the correct value.
>> When the alarm is resolved, the end timestamp value is not updated, so the
>> alarm does not disappear from the alarm list.
>>
>
> Can you please show me where you saw the 2001 timestamp? I didn't find it
> in the log.
>
>
>> The second problem is that even if the timestamp problem is solved, the
>> entity graph problem will not be solved. The gray alarm information is not
>> in the vitrage-collector log, but it is in the vitrage-graph and apache logs.
>> I want to know how to forcefully delete an entity from the vitrage graph.
>>
>
> You shouldn't do it :-) there is no API for deleting entities, and messing
> with the database may cause unexpected results.
> The only thing that you can safely do is to stop all Vitrage services,
> execute 'vitrage-purge-data' command, and start the services again. This
> will cause rebuilding of the entity graph.
>
>
>> Regarding the multi-node setup, I mean 1 control node (pc1) and 1 compute
>> node (pc2), so a single OpenStack deployment.
>>
>> The test VM in the picture is an instance on the compute node that has
>> already been deleted. I waited for hours and checked nova.conf, but it was
>> not removed.
>> This did not occur in the Queens version; in the Rocky version, in a
>> multi-node environment, there seems to be a bug in VM creation on
>> multi-node setups.
>> The same situation occurred in multi-node environments that were
>> configured with different PCs.
>>
>
> Let me make sure I understand the problem.
> When you create a new vm in Nova, does it immediately appear in the entity
> graph?
> When you delete a vm, does it remain? Does it remain in a multi-node
> environment and get deleted in a single-node environment?
>
> Br,
> Ifat
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 42202 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20181026/f4245b3b/attachment-0003.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 11247 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20181026/f4245b3b/attachment-0004.png>
-------------- next part --------------
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNE1Ea3dPREUzTURneE1Gb1hEVEk0TURrd05URTNNRGd4TUZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTENnCktWR3NLbis5Tlk1WTM0MkhTQzFmYjNsWVVWV1BvYXd4U01oV1FFbktHVFRpbG9XRHRMMVFVUG9VQlRwUGJuTm8KU2IzaTJ3WjlMZDQyTUE5SlNsS1dxTk5HeTZPRUthWHFoRHN6c1RtNU1sckxvKy9tNFZCOWVjY0hhc25pb1Z6QwpLU1BxcndEd2tPaE5CMGRvemQxL09zbXZRNHd2Q3BHWTlsOUJOSFVWbWszUVBTVW9Gc25ncW5LR2pXZFROWFU0Ci9vMnFFWFFEcU15YVI2bnQ0S2JXcFdOMlEwL29MNnk2SGxzQUw4MS9rT2dUWXE2NmhWU1pnTTREWE5UM1gzTlcKV0tELzRJV0VPcDRqL3pCaG41eTNTMWd5SWpUcTVkeTgvMEpDT1R4VHBVV2Zmc3FhemxyWHFYQ0wzeUNmYWZ4dwpJQzVJaFZSNzJGSmpVSkt0OFBrQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFLd3AvQm9VYURjZTk0YmpDZDkyMFQyMDgrODcKZ1hpeTBqaGp1UE5OWFJlcjRyZkcySm1ENXBmQzhFdjJJNG5EQzVnaktJQVhxMjlGSW1xY1pLU1doKzVYZHQvYwpxT1l3SGlBOGtKL1VKeDRDWmQrUmRMZEM5RmZiWmM0akpvWXJIei9xcHRxVEl4aExGMVphc01YNVdZZ0FDV3dvCmtLUzU4NTVBbk55blNMNUZNSHp6VWRoQWNmOGJpbFBSNGlBR1pHMTZiL01CTmFmc1hoSS9rQU9neGgybHNySzYKWWtnMkIxcDdPaG5SUnExamE1c1UvSTQwSTFJeVpEcWJldW91ZUZjS2p5MmRIb2JnOEVoVXNUSWF1eVZnMEhIUApmN1BCVU1BdTMvaDF3ZkFXaDNzL09BVDhSN0tabHJob1ZhMnV3MmhRZTYyYm4vUEZFZWNwb2FDSXF5VT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    server: https://192.168.12.164:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin at kubernetes
current-context: kubernetes-admin at kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJZWJBM0liN1R2Njh3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB4T0RBNU1EZ3hOekE0TVRCYUZ3MHhPVEE1TURneE56QTRNVE5hTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQWxCbkM0dUl6NUV4WUdYdVYKYkQ3TEZxSFBWVzF1UDhMSmR6dG5ETVh5TjRSL1RjUUxGQ0lyclJwTURrSElUb2tHTURKMTZkUm50dzlhbW5qcQpzTFZjbHRRcUNQWHJrSXVQcXo1cHplanBuUU90MzRXdFRKVW1RYUFOTGZLdzMxKzJZVmU2MnZqZm9IRFpWZWZBCmFxVlA4cldQK0piajM4QUsrQlFGR29BbWFxUWJMaHZkL3grbjRMMW84cWJrNXp3d1Z6RnFNTXBTajFBWlp6QSsKelkwWDFEL21MczVtWUZDQUdLUGtYT25aTzZFVTZsQk5KUkg1MmE1K1hiRXphZW1WZXB3b0p5NForUlVDNHFUMgpZbVJsUWNwb2tZMng0bDFIa1hXVU12cVVFdmRIWUhjV01rNTBlMmg0MiswWWQvQk9WbEoycXVNNDVZdzk5K0pRCmZmTkllUUlEQVFBQm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFBekcyaEkxTCt6bG9na1JMN2krOFBhdzk1eWpWdVA0V1l1UwpOekpSQUk5QWlEUlN4ZWp3VHFzU2xwclMvRFFiR0xWMHk0YWRuRUJlLzBVSi9nM3RpTEMvQVJ6V1JINk5NTldqCi9uQUFBbmNYMkRKdGlHb3FCRTVoVXVndTY1cE96MjRpN1VBbWdVbktpTGc1bjdld2tXWnFYbVV3MysvSHR0S2kKenNKQkZteHdLZUJET0dMVWdpdlNacGtEMWs1K3ltMDVuV01FSWJmQndIL1MvOE9kWk1IZ1VwY3Z1OExPODFPTQpPSk1EWjc5RzJJdlFRL1BRTEpoNWNCV3hlcVFWaisxS1EwQU1DbFZDeDB2ekNOYytpYURVallhWVZTT0NmVVdrCjg0emUyUUU1TU9oTldmd0J1Z0k0b3Z4WEJyRUIxbVhkQlM3T3VTa1RERlh2NXViVnl3dz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBbEJuQzR1SXo1RXhZR1h1VmJEN0xGcUhQVlcxdVA4TEpkenRuRE1YeU40Ui9UY1FMCkZDSXJyUnBNRGtISVRva0dNREoxNmRSbnR3OWFtbmpxc0xWY2x0UXFDUFhya0l1UHF6NXB6ZWpwblFPdDM0V3QKVEpVbVFhQU5MZkt3MzErMllWZTYydmpmb0hEWlZlZkFhcVZQOHJXUCtKYmozOEFLK0JRRkdvQW1hcVFiTGh2ZAoveCtuNEwxbzhxYms1end3VnpGcU1NcFNqMUFaWnpBK3pZMFgxRC9tTHM1bVlGQ0FHS1BrWE9uWk82RVU2bEJOCkpSSDUyYTUrWGJFemFlbVZlcHdvSnk0WitSVUM0cVQyWW1SbFFjcG9rWTJ4NGwxSGtYV1VNdnFVRXZkSFlIY1cKTWs1MGUyaDQyKzBZZC9CT1ZsSjJxdU00NVl3OTkrSlFmZk5JZVFJREFRQUJBb0lCQUd5c2dwOHQwVm9pMHpyUAp2cE9SZUVFQk56eStjZm9EbXdZTTVzOHVxVkFudjZwMndwSmhpSjhhL3RndldTYVgwWnlvU25IczFMWTFaQXlaCjBjMGRKL1hkZFlMaHdadHRiVjBCRFc1MURJZVUzWTk1YmZNV050NU03WjdieVFJQUg3cEtQK2pTV25aR21KUTYKM0t6azVVZDZCMDBvbThuaUI2cUdOa0I5N0xLdTN5Z1RTTkdhWjd1RHM0SnI5Z0VXdzZBVjQ2WWFkQkxYRHNKYQpIY2xGK0pQZStnRCtxUVZBNkxyUVJCSnBlTUhvY3R5S28vbzMvWXVTblZ3Y3dKSTlMZGxDZmR3SWFVVERNMVhYCnlRcUFVUHIxK0xiTDNGUm4xYlNjOHRiQmdnT0ErL0wzOHdLMGZqcGU5RmVXb0txOHhHWUxBeXAxR1B2aVAyam0KS1FvUU9Na0NnWUVBd3AxcldMbFlaaE5nZXplUmpsZzZTN3lDMjQ5NEZkZHRnMEN5Ym9iSUt2TlJ1dFdHSmg0MQpFK3p1bDB2ajJON05uUGNVYkNYU1o2T1IxcUErbkw1dTV5TVRtLyttSXg5L3EwQ2NxbkRFREpOUUlnbjJvNEI3CkM3aU1FNnc4Z2g3MjA1UzExb1BQOFhIaTIzNE1wRHF0aEJlUDdoaXV1dURBc1NjS1NDelFNQThDZ1lFQXd0QnoKUnFlU2pIcUdKYkJ3Z09xUW9Tblp4Z1ptaDJNc2hQV0Z5WWVaS3d1cktONUVST0VkczNacExDSDBRZjRBQlZxNAo5dG1rODBlYWhyaUJ4eHRveTBtNUNZbXhPVlVWSjgxcGFVTnE4Zjk1QXFHcjRzTHF1ckNVTkIzVDkxK2lkMVMwCk42RVUvTWNVaUVNVm9ndDRJbldkWnVpNEFOblJodERLL1pneWR2Y0NnWUFoY1dmSEFXSzlkOHIyb1ovenRCbWcKZGk2T2lHTDhiZDYxMVdKVU4va2gyRnBOSHZCRWtLQlNZajdGNVJhc1orMHhjZ3dpWVlWOHBkRWo3cm1UdWUzWQo3bUFxU0k1R0x0MkRra0RaMFRML2Jqa3hBRUZQNjM0NWoyY1M0bUFyaENLcVRUM0tOVENBcnk5cXhJaHJtR0hFCjl6K1dqTXRKOWVGbkQreG1acjBINVFLQmdRQ1hVRDdwTndqZHNlRDE3eWhEQ1czU3IrWGxLRjJFZE9SRVZVdFgKNzhscEpNUUpsekhoYWhTZXFxOGZ4ek9uK2poYjhFNVA5VlpvVzBwTHI0MmxiOFdpZUIyUHFmSU1QT2lVcExobQpPU1ljMXJoUDhmREd6V3h5R3VyUjNBVWlVNWFtSnhWZlMrODRNd3pnbFhKOURYbC9FbWx5WC9sak44dkZjZkRvCnJja3Ntd0tCZ0RnNHhtREQrMTRrbDFVcW03c0tmNVNvRlVmMmNxdnpvcHAzTlR1UXRxVERNQmUzQUZ2R25DUzYKOTU3NU5GNHlNZkNhdDl6bStyK2pKRFdyNzhmR3doaHM4dU1jVUFNdkJJYUlibWtmMXRtYnh0RXQrUHNXRG5Iagp6WVYrRkdqYUtVOWdtaEFNdTdielUvMzA5dGkrc280ZUllOU5Pa3BGNE15ejJXclZYcklwCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
-------------- next part --------------
A non-text attachment was scrubbed...
Name: micro service architecture.png
Type: image/png
Size: 361336 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20181026/f4245b3b/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vitrage.conf
Type: application/octet-stream
Size: 1196 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20181026/f4245b3b/attachment-0001.obj>

