[openstack-dev] [fuel] What to do when a controller runs out of space

Sergii Golovatiuk sgolovatiuk at mirantis.com
Mon Oct 5 14:12:52 UTC 2015


Hi,


On Mon, Oct 5, 2015 at 3:03 PM, Alex Schultz <aschultz at mirantis.com> wrote:

> On Mon, Oct 5, 2015 at 5:56 AM, Eugene Nikanorov
> <enikanorov at mirantis.com> wrote:
> > Ok,
> >
> > Project-wise:
> > 1) Pacemaker is not under our company's control, we can't assure its
> > quality
>

Mirantis controls neither RabbitMQ nor Galera, so it cannot assure their
quality either.

> > 2) it has terrible UX
>

This looks like a personal opinion. I'd like to see surveys or operator
feedback. Also, the statement is not constructive, as it doesn't propose
alternative solutions.


> > 3) it is not reliable
>

I would say OpenStack services themselves are not reliable in HA setups, and
the OCF scripts are the operators' reaction to these problems. Many of these
services have childish issues from release to release, and operators wrote
OCF scripts to work around them. A lot of OpenStack services are stateful, so
they require some kind of stickiness or synchronization. OpenStack services
don't have simple health-check functionality, so it's hard to say whether a
service is running well or not. SIGHUP handling is still a problem for many
OpenStack services, and so on. So, let's be constructive here.
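
To make the health-check point concrete for the disk-space case in this
thread, here is a minimal sketch. It is not something Fuel, Pacemaker, or any
OpenStack service ships. The names check_disk_space and HealthStatus and the
5% threshold are assumptions made up for illustration. The only library call
used is shutil.disk_usage from the Python standard library.

```python
import shutil
from typing import NamedTuple


class HealthStatus(NamedTuple):
    """Hypothetical result type: a flag plus a human-readable detail."""
    healthy: bool
    detail: str


def check_disk_space(path="/", min_free_fraction=0.05, usage=None):
    """Report unhealthy when free space under `path` is below a threshold.

    `usage` may be any object with `total` and `free` attributes (in bytes);
    when omitted, the real filesystem is queried via shutil.disk_usage.
    The 5% default threshold is an arbitrary assumption, not a recommendation.
    """
    if usage is None:
        usage = shutil.disk_usage(path)
    if usage.total == 0:
        return HealthStatus(False, f"{path}: filesystem reports zero size")
    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        return HealthStatus(False, f"{path}: only {free_fraction:.1%} free")
    return HealthStatus(True, f"{path}: {free_fraction:.1%} free")
```

A monitor (an OCF script, a cron job, or the service itself) could call such
a check periodically and stop or fence the service when it reports unhealthy,
instead of discovering the problem from failed requests.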



> >
>
> I disagree with #1 as I do not agree that should be a criterion for an
> open-source project.  Considering pacemaker is at the core of our
> controller setup, I would argue that if these are in fact true we need
> to be using something else.  I would agree that it is a terrible UX
> but all the clustering software I've used falls in this category.  I'd
> like more information on how it is not reliable. Do we have numbers to
> back up these claims?
>
> > (3) is not an evaluation of the project itself, but just a logical
> > consequence of (1) and (2).
> > As a part of the escalation team I can say that it has cost our team
> > thousands of man hours of head-scratching, staring at pacemaker logs
> > whose value is usually slightly below zero.
> >
> > Most openstack services (in fact, ALL api servers) are stateless, they
> > don't require any cluster management (also, they don't need to be moved
> > in case of lack of space).
> > Stateful services like neutron agents have their state being a function
> > of db state and are able to synchronize it with the server without
> > external "help".
> >
>
> So it's not an issue with moving services so much as being able to
> stop the services when a condition is met. Have we tested all OS
> services to ensure they do function 100% when out of disk space?  I
> would assume that glance might have issues with image uploads if there
> is no space to handle a request.
>
> > So now usage of pacemaker can be only justified for cases where service's
> > clustering mechanism requires active monitoring (rabbitmq, galera)
> > But even there, examples when we are better off without pacemaker are all
> > around.
> >
> > Thanks,
> > Eugene.
> >
>
> After I sent this email, I had further discussions around the issues
> that I'm facing and it may not be completely related to disk space. I
> think we might be relying on the expectation that the local rabbitmq
> is always available but I need to look into that. Either way, I
> believe we still should continue to discuss this issue as we are
> managing services in multiple ways on a single host. Additionally I do
> not believe that we really perform quality health checks on our
> services.
>
> Thanks,
> -Alex
>
>
> >
> > On Mon, Oct 5, 2015 at 1:34 PM, Sergey Vasilenko
> > <svasilenko at mirantis.com> wrote:
> >>
> >>
> >> On Mon, Oct 5, 2015 at 12:22 PM, Eugene Nikanorov
> >> <enikanorov at mirantis.com> wrote:
> >>>
> >>> No pacemaker for os services, please.
> >>> We'll be moving neutron agents out of pacemaker control in 8.0; other
> >>> os services don't need it either.
> >>
> >>
> >> Could you please provide your arguments?
> >>
> >>
> >> /sv
> >>
> >>
> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe:
> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >>