[openstack-dev] [Fuel] fuel master monitoring
Przemyslaw Kaminski
pkaminski at mirantis.com
Thu Nov 6 07:55:11 UTC 2014
I think we're missing the point here. What I meant was adding a simple
monitoring system that informs the user via UI/CLI/email/whatever of
low resources on the Fuel master node. That's it. HA here is not an option
-- if, despite the warnings, the user still continues to use Fuel and
the disk becomes full, it's the user's fault. By adding these warnings we
have a way of saying "We told you so!" Without warnings we get bugs like
[1], which I mentioned in the first post.
Of course the user can check disk space by hand, but since we do have a
full-blown UI, telling the user to periodically log in to the console and
check disks by hand seems a bit of a burden.
We can even implement such monitoring functionality as a Nailgun plugin
-- installing it would be optional and at the same time we would grow
our plugin ecosystem.
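For illustration, here is a minimal sketch of the kind of check I have in
mind, built on statvfs. It is not existing Nailgun code: the names
(DiskStatus, usage_from_counts, check_disk, warning_message) and the 10%
threshold are assumptions made up for this example.

```python
import os
from collections import namedtuple

# Hypothetical result type for this sketch; not part of Nailgun.
DiskStatus = namedtuple(
    "DiskStatus", "path free_bytes total_bytes free_percent low")


def usage_from_counts(path, f_frsize, f_blocks, f_bavail,
                      min_free_percent=10.0):
    """Turn raw statvfs counters into a DiskStatus.

    min_free_percent is an assumed default threshold, not a Fuel setting.
    """
    total = f_frsize * f_blocks
    # f_bavail counts blocks available to unprivileged users.
    free = f_frsize * f_bavail
    free_percent = 100.0 * free / total if total else 0.0
    return DiskStatus(path, free, total, free_percent,
                      free_percent < min_free_percent)


def check_disk(path="/", min_free_percent=10.0):
    """Query the filesystem holding `path` (Unix only)."""
    st = os.statvfs(path)
    return usage_from_counts(path, st.f_frsize, st.f_blocks, st.f_bavail,
                             min_free_percent)


def warning_message(status):
    """Return a warning string if space is low, otherwise None."""
    if not status.low:
        return None
    return "WARNING: only %.1f%% free on %s" % (
        status.free_percent, status.path)
```

An API endpoint could return the DiskStatus fields as JSON, and the CLI
could print warning_message(check_disk("/")) on stderr whenever it is not
None. Which paths to watch and what threshold to use would still need to
be decided.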
P.
On 11/05/2014 08:42 PM, Dmitry Borodaenko wrote:
> Even one additional hardware node required to host the Fuel master is
> seen by many users as excessive. Unless you can come up with an
> architecture that adds HA capability to Fuel without increasing its
> hardware footprint by 2 more nodes, it's just not worth it.
>
> The only operational aspect of the Fuel master node that you don't
> want to lose even for a short while is logging. You'd be better off
> redirecting OpenStack environments' logs to a dedicated highly
> available logging server (which, of course, you already have in your
> environment), and deal with Fuel master node failures by restoring it
> from backups.
>
> On Wed, Nov 5, 2014 at 8:26 AM, Anton Zemlyanov
> <azemlyanov at mirantis.com> wrote:
>
> Monitoring of the Fuel master's disk space is a special case. I
> really wonder why the Fuel master has no HA option: disk overflow can
> be predicted, but many other failures cannot. HA is a solution to
> the 'single point of failure' problem.
>
> The current monitoring recommendations
> (http://docs.openstack.org/openstack-ops/content/logging_monitoring.html)
> are based on analyzing logs and manual checks, which is a rather
> reactive way of fixing problems. Zabbix is quite good at
> preventing failures that are predictable, but abrupt
> problems it just reports 'post mortem'.
>
> The only way to remove the single point of failure is to implement
> redundancy/HA.
>
> Anton
>
> On Tue, Nov 4, 2014 at 6:26 PM, Przemyslaw Kaminski
> <pkaminski at mirantis.com> wrote:
>
> Hello,
>
> Following up on my comment in this bug [1], I'd like to discuss
> the possibility of adding Fuel master node monitoring. As I
> wrote in the comment, by the time the disk is full it might
> already be too late to perform any action since, for example,
> Nailgun could be down because the DB shut itself down. So we
> should somehow warn the user that disk space is running low (in
> the UI and in the fuel CLI on stderr, for example) before the
> disk actually fills up.
>
> For now the only meaningful value to monitor would be disk
> usage -- do you have other suggestions? If not then probably a
> simple API endpoint with statvfs calls would suffice. If you
> see other usages of this then maybe it would be better to have
> some daemon collecting the stats we want.
>
> If we opted for a daemon, then I'm aware that the user can
> optionally install a Zabbix server, although looking at the
> blueprints in [2] I don't see anything about monitoring the Fuel
> master itself -- is it possible to do? The installation
> of Zabbix is not mandatory, though, so it still doesn't
> completely solve the problem.
>
> [1] https://bugs.launchpad.net/fuel/+bug/1371757
> [2] https://blueprints.launchpad.net/fuel/+spec/monitoring-system
>
> Przemek
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> <mailto:OpenStack-dev at lists.openstack.org>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> --
> Dmitry Borodaenko
>