[openstack-dev] [Fuel] fuel master monitoring
Anton Zemlyanov
azemlyanov at mirantis.com
Thu Nov 6 08:59:44 UTC 2014
We can add a notification to FuelWeb; no additional software or user
actions are required. I would not overestimate this method though: it is in
no way a robust monitoring system. Forcing the user to do something on a
regular basis is unlikely to work.
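
For illustration, the statvfs-based check proposed in the original post
quoted below could look roughly like this. It is only a minimal sketch,
assuming Python and os.statvfs; the function names, the default path and
the 90% threshold are hypothetical and not part of Fuel or Nailgun.

```python
import os


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    # f_bavail counts blocks available to unprivileged users, so blocks
    # reserved for root are treated as "used" here.
    free = st.f_bavail * st.f_frsize
    if total == 0:
        return 0.0
    return 100.0 * (total - free) / total


def low_disk_warning(path="/", threshold_percent=90.0):
    """Return a warning string if usage is at or above the threshold, else None.

    The threshold is an assumed value chosen for illustration only.
    """
    used = disk_usage_percent(path)
    if used >= threshold_percent:
        return "WARNING: %s is %.1f%% full" % (path, used)
    return None


if __name__ == "__main__":
    message = low_disk_warning("/")
    if message:
        print(message)
```

Something like low_disk_warning() could back either a UI notification or
a message on stderr in the CLI; how it would be wired into Nailgun is left
open here.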
Anton
On Thu, Nov 6, 2014 at 11:55 AM, Przemyslaw Kaminski <pkaminski at mirantis.com
> wrote:
> I think we're missing the point here. What I meant was adding a simple
> monitoring system that informs the user via UI/CLI/email/whatever of low
> resources on the Fuel master node. That's it. HA here is not an option -- if,
> despite the warnings, the user still continues to use Fuel and the disk
> becomes full, it's the user's fault. By adding these warnings we have a way
> of saying "We told you so!" Without warnings we get bugs like [1], which I
> mentioned in the first post.
>
> Of course the user can check disk space by hand, but since we do have a
> full-blown UI, telling the user to periodically log in to the console and
> check disks by hand seems a bit of a burden.
>
> We can even implement such monitoring functionality as a Nailgun plugin --
> installing it would be optional and at the same time we would grow our
> plugin ecosystem.
>
> P.
>
>
> On 11/05/2014 08:42 PM, Dmitry Borodaenko wrote:
>
> Even one additional hardware node required to host the Fuel master is seen
> by many users as excessive. Unless you can come up with an architecture
> that adds HA capability to Fuel without increasing its hardware footprint
> by 2 more nodes, it's just not worth it.
>
> The only operational aspect of the Fuel master node that you don't want to
> lose even for a short while is logging. You'd be better off redirecting
> OpenStack environments' logs to a dedicated highly available logging server
> (which, of course, you already have in your environment), and dealing with
> Fuel master node failures by restoring it from backups.
>
> On Wed, Nov 5, 2014 at 8:26 AM, Anton Zemlyanov <azemlyanov at mirantis.com>
> wrote:
>
>> Monitoring of the Fuel master's disk space is a special case. I
>> really wonder why the Fuel master has no HA option: disk overflow can be
>> predicted, but many other failures cannot. HA is a solution to the 'single
>> point of failure' problem.
>>
>> The current monitoring recommendations (
>> http://docs.openstack.org/openstack-ops/content/logging_monitoring.html)
>> are based on analyzing logs and manual checks, which is a rather reactive
>> way of fixing problems. Zabbix is quite good at preventing failures that
>> are predictable, but abrupt problems it just reports 'post mortem'.
>>
>> The only way to remove the single point of failure is to implement
>> redundancy/HA.
>>
>> Anton
>>
>> On Tue, Nov 4, 2014 at 6:26 PM, Przemyslaw Kaminski <
>> pkaminski at mirantis.com> wrote:
>>
>>> Hello,
>>>
>>> Following up on my comment in this bug [1], I'd like to discuss the
>>> possibility of adding Fuel master node monitoring. As I wrote in the
>>> comment, when the disk is full it might already be too late to perform any
>>> action since, for example, Nailgun could be down because the DB shut itself
>>> down. So we should somehow warn the user that the disk is running low (in
>>> the UI and fuel CLI on stderr, for example) before it actually fills up.
>>>
>>> For now the only meaningful value to monitor would be disk usage -- do
>>> you have other suggestions? If not, then probably a simple API endpoint
>>> with statvfs calls would suffice. If you see other uses for this, then
>>> maybe it would be better to have some daemon collecting the stats we want.
>>>
>>> If we opted for a daemon, then I'm aware that the user can optionally
>>> install Zabbix server, although looking at the blueprints in [2] I don't
>>> see anything about monitoring the Fuel master itself -- is it possible to
>>> do? The installation of Zabbix is not mandatory, though, so it still
>>> doesn't completely solve the problem.
>>>
>>> [1] https://bugs.launchpad.net/fuel/+bug/1371757
>>> [2] https://blueprints.launchpad.net/fuel/+spec/monitoring-system
>>>
>>> Przemek
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>
>
> --
> Dmitry Borodaenko
>