[openstack-dev] [Vitrage] New proposal for analysis.

Afek, Ifat (Nokia - IL/Kfar Sava) ifat.afek at nokia.com
Wed Apr 4 07:21:14 UTC 2018


Hi Minwook,

I discussed this issue with a Mistral contributor.
Mistral has a long list of actions that can be used. Specifically, you can use the std.ssh action to execute shell scripts.

Some useful commands:

mistral action-list
mistral action-get <UUID of the std.ssh action>

I’m not sure about the output of std.ssh, or whether you can retrieve it from the action. I suggest you try it and see how it works.
The action is implemented here: https://github.com/openstack/mistral/blob/master/mistral/actions/std_actions.py
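
As an illustration only (not something tested in this thread), a workflow that wraps std.ssh might be sketched as below. The definition is built as a Python dict and printed as JSON, which a YAML parser also accepts. The workflow name, task name and input names are hypothetical, and the std.ssh parameters (cmd, host, username, private_key_filename) are an assumption that should be confirmed with `mistral action-get` before relying on them.

```python
import json


def build_ssh_check_workflow(workflow_name, cmd):
    """Return a Mistral v2 workflow definition that runs `cmd` via std.ssh.

    The workflow, task and input names are illustrative assumptions,
    not names taken from Mistral or Vitrage.
    """
    return {
        "version": "2.0",
        workflow_name: {
            "type": "direct",
            "input": ["host", "username", "private_key_filename"],
            "tasks": {
                "run_check": {
                    "action": "std.ssh",
                    # YAQL expressions pull the values from the workflow input.
                    "input": {
                        "cmd": cmd,
                        "host": "<% $.host %>",
                        "username": "<% $.username %>",
                        "private_key_filename": "<% $.private_key_filename %>",
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    definition = build_ssh_check_workflow("vm_disk_check", "df -h")
    print(json.dumps(definition, indent=2))
```

How the command output is returned by the action is exactly the open question above, so the sketch does not publish any result.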

If std.ssh does not suit your needs, you also have the option to implement and run your own action in Mistral (either as an ssh action or as Python code).
And BTW, it is not related to your current use case, but we can also add Vitrage actions to Mistral, so the user can access Vitrage information (get topology, get alarms) from Mistral workflows.

Best regards,
Ifat


From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Tuesday, 3 April 2018 at 15:19
To: "'OpenStack Development Mailing List (not for usage questions)'" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat,

Thanks for your reply.

Your comments have been a great help to the proposal.  (sorry, I did not think we could use Mistral).

If we use a Mistral workflow for the proposal, we can get better results, in terms of both performance and code conciseness.

Also, if we use the Mistral workflow, we do not need to write any unnecessary code.

Since I am not yet familiar with Mistral, I think it would be better to work out the most efficient design, including Mistral, once I understand it.

If we run a check through a Mistral workflow, how about providing users with a choice of tools that have the capability to perform checks?

We can get the results of the check through Mistral and the tools, but I think we still need some minimal functionality to manage them. What do you think?

I attached a picture of a simple UI that I actually implemented. I hope it helps you understand. (The parameters and content have no meaning and are just an example.) : )

Thanks.

Best regards,
Minwook.

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.afek at nokia.com]
Sent: Tuesday, April 3, 2018 8:31 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hi Minwook,

Thanks for the explanation, I understand the reasons for not running these checks on a regular basis in Zabbix or other monitoring tools. It makes sense. However, I don’t want to re-invent the wheel and add to Vitrage functionality that already exists in other projects.

How about using Mistral for the purpose of manually running these extra checks? If you prepare the script/agent in advance, as well as the Mistral workflow, I believe that Mistral can successfully execute the check and return the results. I’m not so sure about the UI part, we will have to figure out how and where the user can see the output. But it will save a lot of effort around managing the checks, running a new service, supporting a new API, etc.

What do you think?
Ifat


From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Tuesday, 3 April 2018 at 5:36
To: "'OpenStack Development Mailing List (not for usage questions)'" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat,

I also thought about several scenarios that use monitoring tools like Zabbix, Nagios, and Prometheus.

But there are some limitations, so we have to think about it.

We also need to think about targets, scope, and so on.

The reason I do not think of tools like Zabbix, Nagios, and Prometheus as tools for running checks is that we would need to configure an agent or an exporter.

I think it is not hard to configure an agent for monitoring objects such as a physical host.

But the scope of the idea, as I see it, includes the inside of the VM.

Therefore, configuring the agent automatically inside the VM may not be easy. (although we can use parameters like user-data)

If we exclude VM internal checks from the scope, we can simply perform a check via Zabbix. (For example, with Zabbix's remote commands and history.)

On the other hand, if we include the inside of the VMs in the scope and configure an agent in each of them, we incur a fairly constant overhead.

The check service may incur temporary overhead, but the agent configuration can cause constant overhead.

And Zabbix history can be another task for Vitrage.

If we configure the agents themselves and exclude the VM's internal checks, we can provide functionality with simple code.

What do you think?

Thank you.

Best regards,
Minwook.

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.afek at nokia.com]
Sent: Monday, April 2, 2018 10:22 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hi Minwook,

Thinking about it again, writing a new service for these checks might be an unnecessary overhead. Have you considered using an existing tool, like Zabbix, for running such checks? If you use Zabbix, you can define new triggers that run the new checks, and whenever needed the user can ask to open Zabbix and show the relevant metrics. The format will not be exactly the same as in your example, but it will save a lot of work and spare you the need to write and manage a new service.

Some technical details:


·         The current information that Vitrage stores is not enough for opening the right Zabbix page. We will need to keep a little more data, like the item id, on the alarm vertex. But it can be done easily.

·         A relevant Zabbix API is history.get [1]

·         If you are not using Zabbix, I assume that other monitoring tools have similar capabilities
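
To make the history.get suggestion concrete, here is a minimal sketch of the JSON-RPC request body that could be sent to the Zabbix API, following the layout of the example in [1]. The function name and its arguments are hypothetical, the item id is the extra piece of data that would have to be kept on the alarm vertex, and the request is only built here, not sent.

```python
import json


def build_history_get_request(auth_token, item_id, limit=10, request_id=1):
    """Build a JSON-RPC 2.0 request body for the Zabbix history.get method.

    `auth_token` is assumed to come from a previous user.login call and
    `item_id` from the data stored on the Vitrage alarm vertex.
    """
    return {
        "jsonrpc": "2.0",
        "method": "history.get",
        "params": {
            "output": "extend",
            "history": 0,            # 0 selects numeric float history
            "itemids": [item_id],
            "sortfield": "clock",    # newest values first
            "sortorder": "DESC",
            "limit": limit,
        },
        "auth": auth_token,
        "id": request_id,
    }


if __name__ == "__main__":
    body = build_history_get_request("example-token", "23296")
    print(json.dumps(body, indent=2))
```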

What do you think? Do you think it can work with your scenario?
Or do you see a benefit to the user in viewing the data in the format that you suggested?


[1] https://www.zabbix.com/documentation/3.0/manual/api/reference/history/get

Thanks,
Ifat


From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Monday, 2 April 2018 at 4:51
To: "'OpenStack Development Mailing List (not for usage questions)'" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat,

Thank you for the reply. :)

It is my opinion only, so if I'm wrong, we can change the implementation part at any time. (Even if it differs from my initial intention)

The same security issues arise as you say. But now Vitrage does not call external APIs.

The Vitrage-dashboard uses Vitrageclient libraries for Topology, Alarms, and RCA requests to Vitrage.

So if we add an API, it will have the following flow.

Vitrage-dashboard requests checks using the Vitrageclient library. -> Vitrage receives the API.

-> api / controllers / v1 / checks.py is called. -> checks service is called.

In this flow, the Vitrage API is used only to pass data and to invoke functions.
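
The flow above can be sketched roughly as follows. This is only an illustration under assumptions: all three classes are hypothetical stand-ins (for the Vitrageclient library, for api/controllers/v1/checks.py and for the checks service), and the real pieces would communicate over authenticated HTTP rather than direct method calls.

```python
class ChecksService:
    """Hypothetical checks service that would run the requested check."""

    def run(self, entity_id, check_name):
        # A real service would execute the check; this stub only echoes the request.
        return {"entity_id": entity_id, "check": check_name, "status": "done"}


class ChecksController:
    """Stand-in for api/controllers/v1/checks.py."""

    def __init__(self, service):
        self._service = service

    def post(self, entity_id, check_name):
        # The API layer only passes the data on and calls the service.
        return self._service.run(entity_id, check_name)


class VitrageClientStub:
    """Stand-in for the Vitrageclient library used by Vitrage-dashboard."""

    def __init__(self, controller):
        self._controller = controller

    def run_check(self, entity_id, check_name):
        # In reality this would be an authenticated request to the Vitrage API.
        return self._controller.post(entity_id, check_name)


if __name__ == "__main__":
    client = VitrageClientStub(ChecksController(ChecksService()))
    print(client.run_check("vm-1", "ping"))
```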

I think Vitrage does not need to call external APIs.

If you do not want to go through the Vitrage API, we need to create a function for the check action in the Vitrage-dashboard, and write code to call the function.

If I am wrong, please tell me anytime. :)

Thank you.

Best regards,
Minwook.

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.afek at nokia.com]
Sent: Sunday, April 1, 2018 3:40 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hi Minwook,

I understand your concern about the security issue.
But how would that be different if the API call is passed through Vitrage API? The authentication from vitrage-dashboard to vitrage API will work, but then Vitrage will call an external API and you’ll have the same security issue, right? I don’t understand what the difference is between calling the external component from vitrage-dashboard and calling it from vitrage.

Best regards,
Ifat.

From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Thursday, 29 March 2018 at 14:51
To: "'OpenStack Development Mailing List (not for usage questions)'" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat,

Thanks for your reply.  : )
I wrote my opinion on your comment.

Why do you think the request should pass through the Vitrage API? Why can’t vitrage-dashboard call the check component directly?

Authentication issues:
I think the check component is a separate component based on the API.

In my opinion, if the check component has a separate api address from the vitrage to receive requests from the Vitrage-dashboard,
the Vitrage-dashboard needs to know the api address for the check component.

This can leave the check component's request / response interface open to anyone, regardless of the authentication
that OpenStack provides between the Vitrage-dashboard and Vitrage.

This is possible not only through the Vitrage-dashboard, but also with simple commands such as curl.
(I think it is unnecessary to implement a separate authentication system for the check component.)

This problem can occur whenever someone knows the API address of the check component,
and it could let them execute system commands on the hosts and VMs.

What should happen if the user closes the check window before the checks are over? I assume that the checks will finish, but the user won’t be able to see the results?

If the window is closed before the check is finished, the user cannot see the result.

To solve this problem, I think that temporarily saving a list of recent results is also a solution.

By storing a temporary list (for example, up to 10 results), the user can see the previous results, and I think the user could also be allowed to empty the list.
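
A minimal sketch of such a temporary list, assuming the results are simple Python objects kept in memory: the class name and its methods are hypothetical, and the limit of 10 comes from the example above. A bounded deque drops the oldest result automatically once the limit is reached.

```python
from collections import deque


class RecentCheckResults:
    """Keep only the most recent check results, up to `max_results`."""

    def __init__(self, max_results=10):
        # A deque with maxlen discards the oldest entry when it is full.
        self._results = deque(maxlen=max_results)

    def add(self, result):
        self._results.append(result)

    def items(self):
        """Return the stored results, oldest first."""
        return list(self._results)

    def clear(self):
        """Let the user empty the list."""
        self._results.clear()


if __name__ == "__main__":
    recent = RecentCheckResults()
    for i in range(12):
        recent.add({"check_id": i})
    print(len(recent.items()))  # only the last 10 results are kept
```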

What do you think?

Thank you.

Best Regards,
Minwook.

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.afek at nokia.com]
Sent: Thursday, March 29, 2018 8:07 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hi Minwook,

Why do you think the request should pass through the Vitrage API? Why can’t vitrage-dashboard call the check component directly?

And another question: what should happen if the user closes the check window before the checks are over? I assume that the checks will finish, but the user won’t be able to see the results?

Thanks,
Ifat.

From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Thursday, 29 March 2018 at 10:25
To: "'OpenStack Development Mailing List (not for usage questions)'" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat and Vitrage team.

I would like to explain more about the implementation part of the mail I sent last time.

The flow is as follows.

Vitrage-dashboard (action-list-panel) -> Vitrage-api -> check component

Last time I described it as an api-handler, but it would be better to call the check component directly from Vitrage-api without using a handler.

I hope this helps you understand.

Thank you

Best Regards,
Minwook.

From: MinWookKim [mailto:delightwook at ssu.ac.kr]
Sent: Wednesday, March 28, 2018 11:21 AM
To: 'OpenStack Development Mailing List (not for usage questions)'
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Ifat,

Thanks for your reply. : )

We expect this proposal to be useful from a user perspective.

From a manager's point of view, we need an implementation that minimizes the overhead incurred by the proposal.

The answers to some of your questions are:


·         I assume that these checks will not be implemented in Vitrage, and the results will not be stored in Vitrage, right? Vitrage role is to be a place where it is easy and intuitive for the user to execute external actions/checks.

Yes, that's right. We do not need to save it to Vitrage because we just need to check the results.
However, although the function could be implemented directly in Vitrage-dashboard, separately from Vitrage, like the add-action-list panel,
that does not seem sufficient to implement all the functions.
If you do not mind, we will have the following flow.

1. The user requests the check action from the vitrage-dashboard (add-action-list-panel).
2. Call the check component through the vitrage's API handler.
3. The check component executes the command and returns the result.

Because it is my opinion only, please tell us if there is an unnecessary part. :)


·         Do you expect the user to click an entity, select an action to run (e.g. ‘P2P check’), and wait by the open panel for the results? What if the user switches to another menu before the check is done? What if the user asks to run an additional check in parallel? What if the user wants to see again a previous result?


My idea was to select the task, wait for the results in an open panel, and then see them immediately in the panel.
If we switch to another menu before the check is complete, we will not be able to see the results.
Parallel checks are a real concern. (They can cause excessive overhead.)
For earlier results, it may be okay to keep them temporarily while the panel is open, until we exit the panel. We can then see the previous results from that temporary store.


·         Any thoughts of what component will implement those checks? Or maybe these will be just scripts?

I think I will implement a separate component to handle the requests.


·         It could be nice if, as a result of an action check, a new alarm will be raised in Vitrage. A specific alarm with the additional details that were found. However, it might not be trivial to implement it. We could think about it as phase #2.


I expect that would be really good. It would be very useful if the Entity-Graph raised an alarm based on the check result.
I think we can discuss that part in detail later.
These answers are only my opinions and assumptions.
If you think my implementation is wrong, or an inefficient implementation, please do not hesitate to tell me.

Thanks.

Best Regards,
Minwook.

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.afek at nokia.com]
Sent: Wednesday, March 28, 2018 2:23 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Vitrage] New proposal for analysis.

Hi Minwook,

I think that from a user’s perspective, these are very good ideas.

I have some questions regarding the UX and the implementation, since I’m trying to think what could be the best way to execute such actions from Vitrage.


·         I assume that these checks will not be implemented in Vitrage, and the results will not be stored in Vitrage, right? Vitrage role is to be a place where it is easy and intuitive for the user to execute external actions/checks.

·         Do you expect the user to click an entity, select an action to run (e.g. ‘P2P check’), and wait by the open panel for the results? What if the user switches to another menu before the check is done? What if the user asks to run an additional check in parallel? What if the user wants to see again a previous result?

·         Any thoughts of what component will implement those checks? Or maybe these will be just scripts?

·         It could be nice if, as a result of an action check, a new alarm will be raised in Vitrage. A specific alarm with the additional details that were found. However, it might not be trivial to implement it. We could think about it as phase #2.

Best Regards,
Ifat


From: MinWookKim <delightwook at ssu.ac.kr>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Tuesday, 27 March 2018 at 14:45
To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [Vitrage] New proposal for analysis.

Hello Vitrage team.

I am currently working on the Vitrage-Dashboard proposal for the ‘Add action list panel for entity click action’.
(https://review.openstack.org/#/c/531141/)

I would like to make a new proposal based on the action list panel mentioned above.

The new proposal is to provide multidimensional analysis capabilities for the various entities that make up the infrastructure in the entity graph.

Vitrage's entity-graph allows us to efficiently monitor alarms from various monitoring tools.

In the current state, when there is a problem with a VM or host, or when we want to check its status, we need to access the console of each VM and host individually.

This causes unnecessary work as the number of VMs and hosts increases.

My suggestion is that, when we have a large number of VMs and hosts, we should not need to connect directly to the console of each VM or host to enter system commands.

Instead, with this proposal we can send a system command to the VMs and hosts in the cloud, and then only need to check the results.

I have written some use-cases for an efficient explanation of the function.

From an implementation perspective, the goals of the proposal are:


1. To execute commands without installing any Agent / Client that can cause load on the VMs and hosts.

2. To provide a simple UI so that users or administrators can get the desired information from multiple VMs and hosts.

3. To make the results easy to grasp at a glance.

4. To implement a component that can support many additional scenarios in a plug-in format.
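
To give a rough idea of the plug-in format in goal 4, here is a minimal sketch under assumptions. The registry, the decorator and the two example checks are all hypothetical; each plug-in only builds the system command for a target, and actually running it (for example over ssh, so that no agent is needed) is left out.

```python
CHECKS = {}  # maps a check name to the function that builds its command


def register_check(name):
    """Decorator that adds a command builder to the registry."""
    def decorator(func):
        CHECKS[name] = func
        return func
    return decorator


@register_check("disk_usage")
def disk_usage(target):
    # The same command is used for every target.
    return "df -h"


@register_check("ping")
def ping(target):
    # The target address is part of the command.
    return f"ping -c 3 {target}"


def build_commands(check_name, targets):
    """Return the command to run for each target of the chosen check."""
    builder = CHECKS[check_name]
    return {target: builder(target) for target in targets}


if __name__ == "__main__":
    print(build_commands("ping", ["10.0.0.1", "10.0.0.2"]))
```

A new scenario would then be added by registering one more function, without changing the code that dispatches the checks.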

I would be happy if you could comment on the proposal or ask questions.

Thanks.

Best Regards,
Minwook.