[openstack-dev] [vitrage] [congress] Vitrage-Congress Collaboration

Weyl, Alexey (Nokia - IL) alexey.weyl at nokia.com
Tue May 10 11:00:33 UTC 2016


Hi Masahito,

In addition, I wanted to add that the reason Congress needs to get the data from Vitrage via a push mechanism rather than polling is to avoid a delay between when the event occurs and when Congress receives it. With polling, it can take up to the full polling interval (30 seconds by default) until Congress receives the data.

We need this, of course, to make the whole process much faster and to be consistent with other projects such as OPNFV Doctor (which wants events to be delivered in less than 1 second).
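
As a rough illustration of the polling delay, here is a minimal Python sketch. It is not Vitrage or Congress code: the function name and the model (polls firing at fixed multiples of the interval) are assumptions made only for this example. It computes how long an event waits before the next poll picks it up:

```python
import math


def polling_delay(event_time, poll_interval=30.0):
    """Seconds an event waits until the next poll observes it.

    Assumption (hypothetical model): polls fire at t = 0, poll_interval,
    2 * poll_interval, ... and a poll sees every event raised at or
    before its own timestamp.
    """
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    next_poll = math.ceil(event_time / poll_interval) * poll_interval
    return next_poll - event_time


# Delays for a few sample event times with the 30-second default.
for t in (1.0, 15.0, 29.0, 30.0):
    print(t, polling_delay(t))
```

In this model an event raised just after a poll waits almost the whole interval, far above a sub-second target, while a pushed notification does not wait for any poll at all.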

Alexey

> -----Original Message-----
> From: Weyl, Alexey (Nokia - IL) [mailto:alexey.weyl at nokia.com]
> Sent: Tuesday, May 10, 2016 1:45 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [vitrage] [congress] Vitrage-Congress
> Collaboration
> 
> Hi Masahito,
> 
> Thanks for your question.
> 
> There are two main reasons why we need to get alarms from Vitrage
> initially.
> 
> First, there are alarms that Vitrage generates ("deduced alarms") based
> on its user-defined templates and topology. Also, there are alarms that
> come from external sources outside of OpenStack, which Aodh and other
> projects do not hold. This information could also be valuable for
> Congress regardless of the RCA functionality.
> 
> Second, since Vitrage retrieves alarms from multiple sources, the RCA
> API takes as input the Vitrage-Id of the alarm. To know what that ID
> is, you will need to first get the Alarms from Vitrage.
> 
> Does this make sense? Would there be a different flow you think could
> work?
> 
> Best Regards,
> Alexey
> 
> > -----Original Message-----
> > From: Masahito MUROI [mailto:muroi.masahito at lab.ntt.co.jp]
> > Sent: Tuesday, May 10, 2016 11:00 AM
> > To: openstack-dev at lists.openstack.org
> > Subject: Re: [openstack-dev] [vitrage] [congress] Vitrage-Congress
> > Collaboration
> >
> > Hi Alexey,
> >
> > This use case sounds interesting. To clarify it, I have a
> > question.
> >
> > On 2016/05/10 0:17, Weyl, Alexey (Nokia - IL) wrote:
> > > Hi Tim,
> > >
> > > I agree – creating a datasource from Vitrage to Congress is the
> > > first step, and we should have some concrete use case in mind to
> > > help guide this process.
> > >
> > > > The most straightforward use case I would suggest is when there
> > > > is a problem on an instance that is caused by some problem on
> > > > the physical host. Then:
> > >
> > > ·Vitrage will notify about an alarm on the instance, which Congress
> > > will receive
> > >
> > Why does Congress need to receive the alarm?  DataSource Driver pulls
> > data from Vitrage, so it looks like Congress should only pull the
> > cause of the failure from Vitrage.
> >
> > Best regards,
> > Masahito
> >
> > > > ·Congress can then call the Vitrage RCA API. The response will
> > > > state that the cause of the instance alarm is the host alarm.
> > >
> > > > ·Congress policy can define that in such a case, the instance
> > > > should be migrated to (or healed on) a different physical host
> > >
> > > Does this seem like a good first step for you?
> > >
> > > Thanks,
> > >
> > > Alexey
> > >
> > > *From:*Tim Hinrichs [mailto:tim at styra.com]
> > > *Sent:* Saturday, May 07, 2016 2:43 AM
> > > *To:* OpenStack Development Mailing List (not for usage questions)
> > > > *Subject:* Re: [openstack-dev] [vitrage] [congress]
> > > > Vitrage-Congress Collaboration
> > >
> > > Hi Alexey,
> > >
> > > Thanks for the overview of how you see a Congress-Vitrage
> > > integration being valuable.
> > >
> > > I'd imagine that the right first step in this integration would be
> > > creating a new datasource driver within Congress to pull data from
> > > Vitrage.  It doesn't need to pull all the data in your list to
> > > start, but enough so that we can try writing policy over that data.
> > > It's helpful to have a policy in mind that you want to write and
> > > > then set up the datasource driver to grab enough of the Vitrage
> > > > data to write that policy.  Here are the relevant docs.
> > >
> > > Datasource drivers
> > >
> > > http://docs.openstack.org/developer/congress/cloudservices.html
> > >
> > > Writing policy
> > >
> > > http://docs.openstack.org/developer/congress/policy.html
> > >
> > > Let me know if you have any questions,
> > >
> > > Tim
> > >
> > > > On Wed, May 4, 2016 at 11:51 PM Weyl, Alexey (Nokia - IL)
> > > > <alexey.weyl at nokia.com> wrote:
> > >
> > >     Hi to all Vitrage and Congress contributors,
> > >
> > > >     We had a good introduction meeting in Austin and we (Vitrage)
> > > >     think that we can have a good collaboration between the
> > > >     projects.
> > >
> > > >     Vitrage, as an OpenStack Root Cause Analysis (RCA) Engine,
> > > >     builds a topology graph of all the entities in the system
> > > >     (physical, virtual and application) from different
> > > >     datasources. It thus can enrich Congress by providing more
> > > >     data about what is happening in the system. Additionally, the
> > > >     Vitrage RCA and deduced alarms & states mechanism can enhance
> > > >     the visibility of faults and how they inter-relate.  By using
> > > >     this information Congress could then execute different
> > > >     policies and perform more accurate actions.
> > >
> > > >     Another good property of Vitrage is that it can also receive
> > > >     data from non-OpenStack sources, like Nagios, which monitor
> > > >     the physical resources, including Switches (which are not
> > > >     modeled today in OpenStack).
> > >
> > > >     There are many ways in which a Congress-Vitrage combination
> > > >     would be helpful. To take just one example:
> > > >     a. If a physical Switch is down, Vitrage can raise deduced
> > > >     alarms on the connected hosts and on the virtual machines
> > > >     affected by this change in switch state.
> > > >     b. Congress will then be notified by Vitrage about these
> > > >     alarms, which can set off Congress policies of migration.
> > > >     c. Furthermore, due to the RCA functionality, Congress will be
> > > >     aware that the Switch error is the source of the problem, and
> > > >     can determine the best place to create new instances of the
> > > >     VMs so that this switch fault will not impact the new
> > > >     instances.
> > >
> > > >     As you can see, for each fault, we can use Vitrage to link it
> > > >     to other faults, and create alarms to reflect them. This is
> > > >     all done via Vitrage Templates, so the system is configurable
> > > >     to the needs of the user. Thus many more cases such as the
> > > >     example above could be thought of.
> > >
> > > >     To summarize, Vitrage can enrich Congress with the following
> > > >     four features:
> > > >     a. RCA
> > > >     b. Deduced alarms
> > > >     c. Physical, virtual and application layers
> > > >     d. Graph structure and topology of the system that defines the
> > > >     connections and relationships between all entities on which
> > > >     we can run quick graph algorithms to decide different actions
> > > >     to perform
> > >
> > >     If you can think of additional use cases that can be used here,
> > >     please share ☺
> > >
> > > >     For more data about Vitrage and its insights please take a
> > > >     look here:
> > >     https://wiki.openstack.org/wiki/Vitrage
> > >
> > >     Best Regards,
> > >     Alexey Weyl
> > >
> > >
> >


