[all][infra][qa] Retiring Logstash, Elasticsearch, subunit2sql, and Health
Daniel Pawlik
dpawlik at redhat.com
Thu May 13 14:23:51 UTC 2021
Hello Folks,
Thank you Jeremy and Clark for sharing the issues you are facing. I
understand that the main problem is a lack of time.
The ELK stack requires a lot of resources, but the values you shared can
probably be optimized. Could you share the architecture: how many servers
you have and which Elasticsearch role (master, data node, etc.) each of
them plays?
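For what it's worth, a quick way to collect that overview is the _cat/nodes
API; below is a minimal Python sketch (the URL is a placeholder and it
assumes the cluster API is reachable without authentication):

    # list_nodes.py - print each node's name, role(s) and configured heap
    import requests

    ES_URL = "http://localhost:9200"  # placeholder, adjust to your cluster

    resp = requests.get(
        f"{ES_URL}/_cat/nodes",
        params={"v": "true", "h": "name,node.role,master,heap.max"},
    )
    resp.raise_for_status()
    print(resp.text)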
My team manages the RDO infra, which includes an ELK stack based on
Opendistro for Elasticsearch. We have Ansible playbooks that set up the
Opendistro-based Elasticsearch on just one node. Almost all of the ELK
stack services are located on one server that does not use many resources
(the retention time is set to 10 days, 90 GB of disk is used, 2 GB of RAM
for Elasticsearch, 512 MB for Logstash).
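Retention at that scale is just a matter of deleting daily indices once
they age out; here is a minimal Python sketch of the idea (the URL and the
logstash-YYYY.MM.DD index naming are assumptions on my side, and a tool
like elasticsearch-curator can do the same job):

    # prune_indices.py - delete daily indices older than RETENTION_DAYS
    from datetime import datetime, timedelta, timezone
    import requests

    ES_URL = "http://localhost:9200"   # placeholder
    PREFIX = "logstash-"               # assumed daily index prefix
    RETENTION_DAYS = 10

    cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
    indices = requests.get(f"{ES_URL}/_cat/indices/{PREFIX}*",
                           params={"format": "json", "h": "index"}).json()
    for entry in indices:
        name = entry["index"]
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y.%m.%d")
        except ValueError:
            continue  # skip indices that do not follow the daily pattern
        if day.replace(tzinfo=timezone.utc) < cutoff:
            requests.delete(f"{ES_URL}/{name}").raise_for_status()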
Could you share the retention time currently set in the cluster that
requires 1 TB of disk? Also other statistics, such as how many queries are
made in Kibana and how much disk space is used by the OpenStack project
compared to the other projects hosted in OpenDev?
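Even rough per-index numbers would already help us compare; a sketch like
the one below (Python, URL again a placeholder) lists disk usage per
index, and per-project numbers would additionally need an aggregation on
whatever field carries the project name in your documents:

    # index_sizes.py - report per-index disk usage, largest first
    import requests

    ES_URL = "http://localhost:9200"  # placeholder

    indices = requests.get(
        f"{ES_URL}/_cat/indices",
        params={"format": "json", "bytes": "b",
                "h": "index,store.size,docs.count"},
    ).json()
    for entry in sorted(indices, key=lambda e: int(e["store.size"] or 0),
                        reverse=True):
        size_gb = int(entry["store.size"] or 0) / 1024 ** 3
        print(f"{entry['index']:40s} {size_gb:8.1f} GiB  "
              f"{entry['docs.count']} docs")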
Finally, could you share the Elasticsearch version currently running on
your servers, as well as the -Xmx and -Xms parameters that are set for
Logstash, Elasticsearch, and Kibana?
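For Elasticsearch and Logstash those live in the respective jvm.options
files (for Kibana, which runs on Node.js, the equivalent would be its Node
heap limit rather than JVM flags, if I am not mistaken). The effective
values can also be read from a running cluster, e.g. with a small sketch
like this (URL is a placeholder):

    # jvm_heap.py - print the configured heap of every Elasticsearch node
    import requests

    ES_URL = "http://localhost:9200"  # placeholder

    nodes = requests.get(f"{ES_URL}/_nodes/jvm").json()["nodes"]
    for info in nodes.values():
        mem = info["jvm"]["mem"]
        init_mb = mem["heap_init_in_bytes"] // (1024 * 1024)
        max_mb = mem["heap_max_in_bytes"] // (1024 * 1024)
        print(f"{info['name']}: -Xms ~{init_mb} MB, -Xmx ~{max_mb} MB")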
Thank you for your time and effort in keeping things running smoothly for
OpenDev. We find the OpenDev ELK stack valuable enough to the OpenDev
community that we are willing to take a much larger role in keeping it
running.
If you can think of any additional links or information that might help us
take a larger role here, please do not hesitate to share them.
Dan
On Wed, May 12, 2021 at 3:20 PM Jeremy Stanley <fungi at yuggoth.org> wrote:
> On 2021-05-12 02:05:57 -0700 (-0700), Sorin Sbarnea wrote:
> [...]
> > TripleO health check project relies on being able to query ER from
> > both opendev and rdo in order to ease identification of problems.
>
> Since you say RDO has a similar setup, could they just expand to
> start indexing our logs? As previously stated, doing that doesn't
> require any special access to our infrastructure.
>
> > Maybe instead of dropping we should rethink what it is supposed to
> > index and not, set some hard limits per job and scale down the
> > deployment. IMHO, one of the major issues with it is that it does
> > try to index maybe too much w/o filtering noisy output before
> > indexing.
>
> Reducing how much we index doesn't solve the most pressing problem,
> which is that we need to upgrade the underlying operating system,
> therefore replace the current configuration management, which
> won't work on newer platforms, and also almost certainly upgrade
> versions of the major components in use for it. Nobody has time to
> do that, at least nobody who has heard our previous cries for help.
>
> > If we can delay making a decision a little bit so we can
> > investigate all available options it would really be great.
>
> This thread hasn't set any timeline for stopping the service, not
> yet anyway.
>
> > It is worth noting that I personally do not have a special love for ES
> > but I do value a lot what it does. I am also pragmatic and I would
> > not be very upset to make use of a SaaS service as an alternative,
> > especially as I recognize how costly it is to run and maintain an
> > instance.
> [...]
>
> It's been pointed out that OVH has a similar-sounding service, if
> someone is interested in experimenting with it:
>
> https://www.ovhcloud.com/en-ca/data-platforms/logs/
>
> The case with this, and I think with any SaaS solution, is that
> there would still need to be a separate ingestion mechanism to
> identify when new logs are available, postprocess them to remove
> debug lines, and then feed them to the indexing service at the
> provider... something our current team doesn't have time to design
> and manage.
> --
> Jeremy Stanley
>
--
Regards,
Daniel Pawlik