[openstack-dev] [kolla] Introduction of Heka in Kolla

Clark Boylan cboylan at sapwetik.org
Tue Jan 12 20:53:15 UTC 2016


On Tue, Jan 12, 2016, at 04:19 AM, Kwasniewska, Alicja wrote:
> Unfortunately I do not have any experience working with or testing Heka,
> so it’s hard for me to compare its performance with that of Logstash.
> However, I’ve read that Heka has a lot of advantages over Logstash in
> this area.
> 
> 
> But which version of Logstash did you test? One guy from the Logstash
> community said that: “The next release of logstash (1.2.0 is in beta) has
> a 3.5x improvement in event throughput. For numbers: on my workstation at
> home (6 vcpu on virtualbox, host OS windows, 8 GB ram, host cpu is
> FX-8150) - with logstash 1.1.13, I can process roughly 31,000 events/sec
> parsing apache logs. With logstash 1.2.0.beta1, I can process 102,000
> events/sec.”
In addition to the version of Logstash that is used, the specific grok
rules and file inputs can make a big difference to performance. I
would make sure that your 500 input files are representative of the log
files you will generate running Kolla/OpenStack and that you write grok
rules that would actually be useful. If you need sample log files, you
can always grab them from the CI logs.
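
To illustrate why the rules matter as much as the tool, here is a minimal
sketch of a throughput check over a fixed sample of lines. It uses plain
Python regular expressions rather than grok, and the function name, the
sample line, and both patterns are hypothetical; it only shows the shape of
such a comparison, not how Logstash itself should be benchmarked.

```python
import re
import time


def events_per_second(parse, lines, repeats=5):
    """Run parse() over every line and return the best observed rate."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for line in lines:
            parse(line)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very coarse clock.
    return len(lines) / best if best > 0 else float("inf")


# Hypothetical sample; real input should come from representative log files.
sample = ["2016-01-12 20:53:15.123 1234 INFO nova.compute.manager ok"] * 10000

# Two hypothetical rules for the same line: one anchored, one unanchored.
anchored = re.compile(r"^\d{4}-\d{2}-\d{2} \S+ \d+ (?P<level>[A-Z]+) ")
unanchored = re.compile(r".*(?P<level>INFO|WARNING|ERROR).*")

for name, rule in (("anchored", anchored), ("unanchored", unanchored)):
    print(name, round(events_per_second(rule.match, sample)), "events/sec")
```

The absolute numbers depend entirely on the machine and the sample, so only
the relative difference between rules on the same input is meaningful.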

For the Elasticsearch indexing that we expose at
http://logstash.openstack.org we ended up going with Logstash because
many of the alternate tools (Heka and Fluentd and friends) seemed more
appropriate for moving logs from point A to point B with little to no
modification. With Jenkins build logs we needed to be able to accept
many different log formats (libvirt, oslo, swift, apache, devstack
console logs, and so on) and collapse them into a mostly common event
format. That said, if you can get structured logs and keep the number of
formats to a minimum, using a tool like Heka or Fluentd makes a lot of
sense.
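
As a rough illustration of what collapsing several formats into a mostly
common event format means, here is a minimal Python sketch. It is not the
Infra team's Logstash configuration; the two patterns, the field names, and
the to_event() helper are all hypothetical, and real oslo and apache lines
vary more than this.

```python
import re

# Hypothetical patterns for two log styles (assumptions, not real grok rules).
OSLO_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>\S+) (?P<message>.*)$"
)
APACHE_RE = re.compile(
    r"^(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] "
    r'"(?P<message>[^"]*)" (?P<status>\d{3}) \S+'
)


def to_event(line, source):
    """Return a dict with the same keys whatever the input format was."""
    for fmt, pattern in (("oslo", OSLO_RE), ("apache", APACHE_RE)):
        match = pattern.match(line)
        if match:
            fields = match.groupdict()
            return {
                "format": fmt,
                "source": source,
                "timestamp": fields["timestamp"],
                # Apache access lines carry no level, so this may be None.
                "level": fields.get("level"),
                "message": fields["message"],
            }
    # Unrecognized lines are kept as-is rather than dropped.
    return {
        "format": "unknown",
        "source": source,
        "timestamp": None,
        "level": None,
        "message": line.rstrip("\n"),
    }
```

Every extra format adds another pattern to try per line, which is the cost
that structured logs with few formats avoid.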
> 
> 
> You also said that Heka is a unified data processing tool, but do we
> need this? Heka seems to address stream processing needs, while Logstash
> focuses mainly on processing logs. We want to create a central logging
> service, and Logstash was created especially for it and seems to work
> well for this application.
> 
> 
> One thing that is obvious is that Logstash is better known, more
> popular, and better tested. Maybe it has some performance disadvantages,
> but at least we know what we can expect from it. Also, it has more
> pre-built plugins and a lot of usage examples, while Heka doesn’t have
> many of them yet and is nowhere near the range of plugins and
> integrations provided by Logstash.
Not only that, but the OpenStack Infra team and others have written rules
for handling OpenStack logs. In theory you can just drop them into place
and it will all work.
> 
> 
> In the case of adding plugins, I’ve read that in order to add Go plugins,
> the binary has to be recompiled, which is a little frustrating (plugins
> are statically linked, so wiring in new ones means recompiling). On the
> other hand, Lua plugins do not require it, but the question is whether
> Lua plugins are sufficient. Or maybe adding Go plugins is not so bad?
> 
> 
> You also said that you didn’t test Heka with Docker, right? But do
> you have any experience setting up Heka in a Docker container? I saw
> that Heka 0.8.0 added new Docker features (including Dockerfiles to
> generate Heka Docker containers for both development and deployment),
> but did you test them? If you didn’t, we cannot be sure whether there
> are any issues with them.
> 
> 
> Moreover, you will have to write your own Dockerfile for Heka that
> inherits from the Kolla base image (as we discussed during the last
> meeting, we would like to have our own images); you won’t be able to
> inherit from ianneub/heka:0.10 as specified in the link that you sent
> http://www.ianneubert.com/wp/2015/03/03/how-to-use-heka-docker-and-tutum/.
> 
> 
> There are also some issues with the DockerInput module, which you want
> to use. For example, splitters are not available in DockerInput
> (https://github.com/mozilla-services/heka/issues/1643). I can’t say that
> it will affect us, but we also don’t know which new issues may arise
> during the first tests, as none of us has ever tried Heka in or with
> Docker.
> 
> 
> I am not tied to any specific solution; I am just worried that Heka
> might surprise us with something hard to solve, configure, etc.
> 
> 
> Alicja Kwaśniewska
> 

Happy to answer any other questions about how the Infra team runs
Logstash (we don't centralize it, for example).

Clark


