[all][tact-sig][infra][qa] Retiring Logstash, Elasticsearch, subunit2sql, and Health

Sylvain Bauza sbauza at redhat.com
Tue May 11 16:02:52 UTC 2021


On Tue, May 11, 2021 at 5:29 PM Clark Boylan <cboylan at sapwetik.org> wrote:

> On Tue, May 11, 2021, at 6:56 AM, Jeremy Stanley wrote:
> > On 2021-05-11 09:47:45 +0200 (+0200), Sylvain Bauza wrote:
> > [...]
> > > Could we be discussing how we could try to find a workaround for
> > > this?
> > [...]
> >
>
> snip. What Fungi said is great. I just wanted to add a bit of detail below.
>
> > Upgrading the existing systems at this point is probably at least
> > the same amount of work, given all the moving parts, the need to
> > completely redo the current configuration management for it, the
> > recent license strangeness with Elasticsearch, the fact that
> Logstash and Kibana are increasingly open-core, fighting to keep
> > useful features exclusively for their paying users... the whole
> > stack needs to be reevaluated, and new components and architecture
> > considered.
>
> To add a bit more concrete info to this the current config management for
> all of this is Puppet. We no longer have the ability to run Puppet in our
> infrastructure on systems beyond Ubuntu Xenial. What we have been doing for
> newer systems is using Ansible (often coupled with docker + docker-compose)
> to deploy services. This means that all of the config management needs to
> be redone.
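As an illustration of that Ansible + docker-compose pattern, here is a hypothetical sketch of what such a deployment boils down to on the target host; the directory and service names are placeholders, not the actual opendev deployment:

```shell
# Hypothetical sketch: an Ansible role typically templates a
# docker-compose.yaml onto the host, then runs the equivalent of the
# commands below. The path and service are made up for illustration.
set -eu

cd /etc/myservice-compose   # hypothetical directory holding docker-compose.yaml
docker-compose pull         # fetch the current images
docker-compose up -d        # (re)create the containers in the background
docker-compose ps           # confirm the service came up
```

The point is that the deployment logic lives in Ansible and the runtime lives in containers, so none of the existing Puppet manifests carry over.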
>
> The next problem you'll face is that Elasticsearch itself needs to be
> upgraded. Historically when we have done this, it has required also
> upgrading Kibana and Logstash due to compatibility problems. When you
> upgrade Kibana you have to sort out all of the data access and
> authorization problems that Elasticsearch presents, because it doesn't
> provide authentication and authorization itself (we cannot allow arbitrary
> writes into the ES cluster, but Kibana assumes it can do this). With
> Logstash you end
> up rewriting all of your rules.
>
> Finally, I don't think we have enough room to do rolling replacements of
> Elasticsearch cluster members as they are so large. We have to delete
> servers to add servers. Typically we would add a server, rotate it in,
> then delete the old one. In this case the idea would probably be to spin
> up an entirely new cluster alongside the old one, check that it is
> functional, then shift the data streaming over to point at it.
> Unfortunately, given the lack of room, that won't be possible.
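For reference, the usual rotate-out step for a single Elasticsearch node relies on shard allocation filtering; a sketch, where the cluster URL and node name are placeholders for illustration only:

```shell
# Hypothetical sketch of draining one Elasticsearch node before deleting it.
# ES_URL and the node name "old-node-01" are placeholders, not the real
# infrastructure.
ES_URL=http://localhost:9200

# 1. Tell the cluster to move shards off the node being retired.
curl -s -X PUT "$ES_URL/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.exclude._name": "old-node-01"}}'

# 2. Watch shard allocation until the node is empty; only then is it safe
#    to delete the server.
curl -s "$ES_URL/_cat/allocation?v"
```

This is exactly the dance that needs enough spare disk across the remaining nodes to absorb the drained shards, which is why it doesn't work when the cluster members are already near capacity.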
>
> > --
> > Jeremy Stanley
>
>
>
First, thanks to both Jeremy and fungi for explaining why we need to stop
providing an ELK environment for our logs. I understand it better now, and
honestly I can't see a way to fix it on my own.
I'm just sad that, for the moment, we can't find a way to keep this
capability unless "someone" steps up to help us :-)

Just a note: I then also guess that
http://status.openstack.org/elastic-recheck/ will stop working as well,
right?

Operators, if you read this and want to make sure that our upstream CI
continues to work when we hit gate issues, please help us! :-)

