[openstack-dev] [Fuel] fuel master monitoring

Przemyslaw Kaminski pkaminski at mirantis.com
Wed Jan 7 08:59:34 UTC 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hello,

An updated version of the monitoring code is available here:

https://review.openstack.org/#/c/137785/

This is based on monit, as was agreed in this thread. The drawback of
monit is that it is a very simple system that doesn't track the state
of its checks, so some Python code is still needed to keep the user
from being spammed with low-disk-space notifications every minute.
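
To illustrate the kind of state tracking meant here, below is a minimal
sketch in Python 3. It is not the code from the review; the function
name, the JSON state file and the check name are all assumptions made
for the example. It reports True only when a check changes state, so a
checker that fires every minute produces one notification per
transition instead of one per run.

```python
import json


def should_notify(state_path, check_name, is_failing):
    """Return True only if the check's state changed since the last run.

    state_path is a hypothetical JSON file that remembers the last
    known state of each check between invocations.
    """
    try:
        with open(state_path) as f:
            states = json.load(f)
    except (OSError, ValueError):
        # No state file yet, or it is unreadable: start from scratch.
        states = {}

    # A check that has never been seen is assumed to have been healthy.
    previous = states.get(check_name, False)
    states[check_name] = is_failing

    with open(state_path, "w") as f:
        json.dump(states, f)

    return previous != is_failing
```

A script run by monit every minute could call this with the result of
its disk check and post a notification only when it returns True, which
covers both the "disk is low" and the "disk recovered" transitions.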

On 01/05/2015 10:40 PM, Andrew Woodward wrote:
> There are two threads here that need to be unraveled from each 
> other.
> 
> 1. We need to prevent fuel from doing anything if the OS is out of
> disk space. It leads to a badly broken database that only a
> developer can reset to a usable state. From this point we need to
> * develop a method for locking down the DB writes so that fuel
>   becomes RO until space is freed

It's true that a full disk combined with DB writes can result in a
fatal database failure. I just don't know whether we can lock the DB
just like that. What if a deployment is in progress?

I think the first way to reduce disk space usage would be to set the
logging level to WARNING instead of DEBUG. DEBUG is good to have
during development, but I don't think it's that good for production.
Besides, from what I observed, it slows down deployment considerably.
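
As a rough illustration of the logging-level change, here is a sketch
using Python's standard logging module. It does not show how Fuel
actually configures its loggers; the logger name "fuel_example" and
the production switch are assumptions made for the example.

```python
import logging


def configure_logging(production=True):
    """Use WARNING in production and DEBUG during development."""
    level = logging.WARNING if production else logging.DEBUG
    # "fuel_example" is a hypothetical logger name used for illustration.
    logger = logging.getLogger("fuel_example")
    logger.setLevel(level)
    return logger
```

With production=True, calls such as logger.debug(...) and
logger.info(...) are dropped before any message is formatted or
written, which is where the disk space saving would come from.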

> * develop a method (or re-use an existing one) to notify the user,
>   in a way that cannot be dismissed, that a serious error state
>   exists on the host

Well, this is already done in the review I linked above. It basically
posts a notification to the UI system. Everything still works as
before, though, until the disk is full. AFAIK the CLI doesn't interact
with notifications in any way, so the warning is not shown there.

> * we need some API that can lock / unlock the DB
> * we need some monitor process that will trigger the lock/unlock

This one can easily be added on top of the code in the above review
request.
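
A minimal sketch of how a monitor process might trigger the
lock/unlock, in Python 3. Nothing here comes from the review: the
WriteLock class is an in-memory stand-in for whatever API would really
switch the DB to read-only, and the two thresholds are assumptions.
Using a higher threshold for unlocking than for locking keeps the lock
from flapping when free space hovers around a single limit.

```python
import shutil


class WriteLock:
    """Hypothetical in-memory stand-in for a real DB read-only switch."""

    def __init__(self):
        self.locked = False


def free_bytes_on(path="/"):
    """Return the free space, in bytes, of the filesystem holding path."""
    return shutil.disk_usage(path).free


def update_lock(lock, free_bytes, lock_below, unlock_above):
    """Lock below lock_below, unlock above unlock_above, else no change."""
    if free_bytes < lock_below:
        lock.locked = True
    elif free_bytes > unlock_above:
        lock.locked = False
    return lock.locked
```

A periodic check could then call
update_lock(lock, free_bytes_on("/"), lock_below, unlock_above) and
let the API refuse writes while lock.locked is True. What to do about
a deployment that is already in progress is left open here.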

> 
> 2. We need monitoring for the master node and fuel components in
> general, as discussed at length above. Unless we intend to use this
> to also monitor the services on deployed nodes (likely bad), what
> we use to do this is irrelevant to getting this started. If we do
> intend to use this to also monitor deployed nodes (again, bad for
> the fuel node to do), then we need to standardize on what we
> monitor the cloud with (Zabbix currently) and offer a single pane
> of glass. Federation in the monitoring becomes a critical
> requirement here, as having more than one pane of glass is an
> operations nightmare.

AFAIK installation of Zabbix is optional. We want mandatory
monitoring of the master, and using Zabbix for that would in effect
force its installation on the cloud nodes.

P.

> 
> Completing #1 is very important in the near term, as I have had to
> un-brick several deployments over it already. Also, in my mind
> these are separate tasks.
> 
> On Thu, Nov 27, 2014 at 1:19 AM, Simon Pasquier 
> <spasquier at mirantis.com> wrote:
>> I've added another option to the Etherpad: collectd can do basic
>> threshold monitoring and run any kind of scripts on alert
>> notifications. The other advantage of collectd would be the RRD
>> graphs for (almost) free. Of course since monit is already
>> supported in Fuel, this is the fastest path to get something
>> done.
>> Simon
>> 
>> On Thu, Nov 27, 2014 at 9:53 AM, Dmitriy Shulyak 
>> <dshulyak at mirantis.com> wrote:
>>> 
>>> Is it possible to send HTTP requests from monit, e.g. for
>>> creating notifications? I scanned through the docs and found
>>> only alerts for sending mail. Also, where will the token
>>> (username/pass) for monit be stored?
>>> 
>>> Or maybe there is another plan, without any API interaction?
>>> 
>>> On Thu, Nov 27, 2014 at 9:39 AM, Przemyslaw Kaminski 
>>> <pkaminski at mirantis.com> wrote:
>>>> 
>>>> I didn't know this. It is in fact true; I checked the
>>>> manifests. Monit is not deployed yet, though, because the
>>>> packages are missing from the Fuel ISO. Anyway, I think the
>>>> argument about using yet another monitoring service is now
>>>> invalid.
>>>> 
>>>> So +1 for monit? :)
>>>> 
>>>> P.
>>>> 
>>>> 
>>>> On 11/26/2014 05:55 PM, Sergii Golovatiuk wrote:
>>>> 
>>>> Monit is easy to use and already controls the state of Compute
>>>> nodes. We can adopt it for the master node.
>>>> 
>>>> -- Best regards, Sergii Golovatiuk, Skype #golserge IRC 
>>>> #holser
>>>> 
>>>> On Wed, Nov 26, 2014 at 4:46 PM, Stanislaw Bogatkin 
>>>> <sbogatkin at mirantis.com> wrote:
>>>>> 
>>>>> As for me, Zabbix is overkill for one node. Zabbix Server
>>>>> + Agent + Frontend + DB + HTTP server, and all of it for
>>>>> one node? Why not use something that was developed for
>>>>> monitoring one node, doesn't have many deps and works out of
>>>>> the box? Not necessarily Monit, but something similar.
>>>>> 
>>>>> On Wed, Nov 26, 2014 at 6:22 PM, Przemyslaw Kaminski 
>>>>> <pkaminski at mirantis.com> wrote:
>>>>>> 
>>>>>> We want to monitor the Fuel master node, while Zabbix is
>>>>>> only on slave nodes and not on the master. The monitoring
>>>>>> service is supposed to be installed on the Fuel master host
>>>>>> (not inside a Docker container) and provide basic info about
>>>>>> free disk space, etc.
>>>>>> 
>>>>>> P.
>>>>>> 
>>>>>> 
>>>>>> On 11/26/2014 02:58 PM, Jay Pipes wrote:
>>>>>>> 
>>>>>>> On 11/26/2014 08:18 AM, Fox, Kevin M wrote:
>>>>>>>> 
>>>>>>>> So then in the end, there will be 3 monitoring 
>>>>>>>> systems to learn, configure, and debug? Monasca for 
>>>>>>>> cloud users, zabbix for most of the physical systems,
>>>>>>>> and sensu or monit "to be small"?
>>>>>>>> 
>>>>>>>> Seems very complicated.
>>>>>>>> 
>>>>>>>> If not just monasca, why not the zabbix that's already
>>>>>>>> being deployed?
>>>>>>> 
>>>>>>> 
>>>>>>> Yes, I had the same thoughts... why not just use zabbix
>>>>>>> since it's used already?
>>>>>>> 
>>>>>>> Best, -jay
>>>>>>> 
>>>>>>> _______________________________________________ 
>>>>>>> OpenStack-dev mailing list 
>>>>>>> OpenStack-dev at lists.openstack.org 
>>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJUrPV0AAoJEDMLDJqAXfZ0BTMP/2zoQmRXvpTPB77k5xiDim7X
a3P5qVXIjeoRlAoCv1VPDZJra+cqKVRyZYTFf8j8VB3l/aKXhU7HxzFOVVgT8KiJ
GnrudIsi9Nir1D+DxFZPWAb2zBPwp/6Wn90CkGwXWiHDzE/E8nSY5lgia2wK0tza
/0dWLa6L6Lj4Vc5LViXXS7Q+7kEa1EZuAdAymEg6uAkEspWFvlUf2BQwxtHg1zbW
9Jd20DAUviR7xeWrbub/yIsTfQp+iMhI9beor4p7tcBtso33uA9H7UpGEbBwsnq9
rF8xjO9cL0qObe0ki0uc7ymBmNKmONJvWz9F2hVUQCNt2085hj3ljMRJ661HYfWh
vckoWRoGGBa9hPwklCSCMTLvtw2nzqXAC73WyVFmAMWPMX4sG9riSUKnXOeW68GM
9iSd5oYYqBeotdgc1daYcoEeX41KY7gcNEtBHt2B+xFFxPF0jeA/hDRir9HWTdWv
/lNg+Bdqw7pHLQN9rlZHO9ggfPoJOR93YUjsUyv0L3ph3pxn55ebsY300lNJEbWk
eR/xbn4yZaz9orApbr28F6CkQ0xzVTJXuN13QdzVivHqkwyXLHUycHfFA5bQlShu
OmUXejfMcVDBlhTf+VGXwFAfSPNl0nyGKtdevJsGB7uqwh4pBvCxu2QH3CJIRjBe
2MvoKX75xKysGQecLX6S
=ofKV
-----END PGP SIGNATURE-----


