[openstack-dev] [kolla] Introduction of Heka in Kolla

Eric LEMOINE elemoine at mirantis.com
Wed Jan 13 09:55:24 UTC 2016


Hi Alicja


Thank you for your comments.  Answers and comments below.



On Tue, Jan 12, 2016 at 1:19 PM, Kwasniewska, Alicja
<alicja.kwasniewska at intel.com> wrote:
> Unfortunately I do not have any experience working with or testing Heka, so
> it’s hard for me to compare its performance with that of Logstash. However
> I’ve read that Heka has a lot of advantages over Logstash in this area.
>
>
> But which version of Logstash did you test? One guy from the Logstash
> community said that: “The next release of logstash (1.2.0 is in beta) has a
> 3.5x improvement in event throughput. For numbers: on my workstation at home
> (6 vcpu on virtualbox, host OS windows, 8 GB ram, host cpu is FX-8150) -
> with logstash 1.1.13, I can process roughly 31,000 events/sec parsing apache
> logs. With logstash 1.2.0.beta1, I can process 102,000 events/sec.”


I've used the latest Docker image:
<https://hub.docker.com/r/library/logstash/>.  It uses Logstash 2.1.1,
which is the most recent stable version.



> You also said that Heka is a unified data processing tool, but do we need this?


Heka, as a unified data processing tool, makes it possible to derive
metrics from logs, such as HTTP response times.  Alerts can also be
triggered on specific log patterns.
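
To illustrate the idea, here is a minimal sketch in Python.  It is not
Heka code (Heka filters are written in Lua or Go); the log line format,
the regular expression, and the function name derive_metrics are
assumptions made up for this example.  It only shows how response-time
metrics and an error count could be derived from a stream of log lines.

```python
import re
from statistics import mean

# Hypothetical log format, assumed for this example only:
#   "<METHOD> <path> status: <code> time: <seconds>"
LINE_RE = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>\S+) "
    r"status: (?P<status>\d{3}) time: (?P<time>\d+\.\d+)"
)


def derive_metrics(lines):
    """Derive simple HTTP metrics from an iterable of log lines."""
    times = []
    server_errors = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            # Not an HTTP request line; ignore it.
            continue
        times.append(float(match.group("time")))
        if match.group("status").startswith("5"):
            # A 5xx status is the kind of pattern an alert could key on.
            server_errors += 1
    return {
        "count": len(times),
        "mean_response_time": mean(times) if times else None,
        "server_errors": server_errors,
    }
```

A caller could feed it the lines of one time window and raise an alert
whenever server_errors is non-zero.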


> Heka seems to address stream processing needs, while Logstash focuses mainly
> on processing logs. We want to create a central logging service, and
> Logstash was created especially for it and seems to work well for this
> application.
>
>
> One thing that is obvious is that Logstash is better known, more popular,
> and better tested. Maybe it has some performance disadvantages, but at
> least we know what we can expect from it. Also, it has more pre-built
> plugins and a lot of usage examples, while Heka doesn’t have many of
> them yet and is nowhere near the range of plugins and integrations provided
> by Logstash.


As Simon said, Heka already includes quite a lot of plugins.  See the
Heka documentation [*] for an exhaustive list.  It may indeed be the
case that Logstash includes even more plugins, but Heka has taken us
pretty far already.



> As for adding plugins, I’ve read that in order to add Go plugins,
> the binary has to be recompiled, which is a little bit frustrating (static
> linking - to wire in new plugins, you have to recompile). On the other hand,
> Lua plugins do not require it, but the question is whether Lua plugins are
> sufficient. Or maybe adding Go plugins is not so bad?


See Simon's answer.



> You also said that you didn’t test Heka with Docker, right?


I did test Heka with Docker.  In my performance tests both Heka and
Logstash ran in Docker containers.  What I haven't tested yet is the
Docker Log Input plugin.  We'll do more tests as part of the work on
specifications.



> But do you
> have any experience in setting up Heka in Docker container? I saw that with
> Heka 0.8.0 new Docker features were implemented (included Dockerfiles to
> generate Heka Docker containers for both development and deployment), but
> did you test it? If you didn’t, we could not be sure whether there are any
> issues with it.
>
>
> Moreover you will have to write your own Dockerfile for Heka that inherits
> from Kolla base image (as we discussed during last meeting, we would like to
> have our own images), you won’t be able to inherit from ianneub/heka:0.10 as
> specified in the link that you sent
> http://www.ianneubert.com/wp/2015/03/03/how-to-use-heka-docker-and-tutum/.



As I said in my first email, Heka has no dependencies, so creating a
Dockerfile for Heka is quite easy.  See
<https://github.com/elemoine/heka-logstash-comparison/blob/master/Dockerfile>
for the very simple Dockerfile I've used so far.


> There are also some issues with the DockerInput module which you want to use.
> For example, splitters are not available in DockerInput
> (https://github.com/mozilla-services/heka/issues/1643). I can’t say that it
> will affect us, but we also don’t know which new issues may arise during
> the first tests, as none of us has ever tried Heka in and with Docker.



Yes, we're aware of that limitation.  But we're not sure this is a
problem, as the decoder can be the component that coalesces log lines.
We already have a Lua decoder that does that, accumulating the lines of
Python tracebacks.  I am going to look at this in more detail when
working on the blueprint.
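
To make the coalescing idea concrete, here is a minimal sketch in
Python.  It is not our Lua decoder and does not use the Heka sandbox
API; the function name coalesce_tracebacks is made up for this example,
and it assumes the traceback lines arrive without any log prefix
(timestamp, level, etc.), which is a simplification.

```python
def coalesce_tracebacks(lines):
    """Yield log messages, merging each Python traceback into one message.

    Assumption: a traceback starts with the standard header line and ends
    at the first non-indented line ("ExceptionType: message").
    """
    buffer = []
    for line in lines:
        if buffer:
            # We are inside a traceback: keep accumulating.
            buffer.append(line)
            if not line.startswith(" "):
                # Frame lines are indented; a non-indented line is the
                # final exception line, so the traceback is complete.
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("Traceback (most recent call last):"):
            buffer = [line]
        else:
            yield line
    if buffer:
        # Input ended in the middle of a traceback: emit what we have.
        yield "\n".join(buffer)
```

Each traceback then comes out as a single multi-line message, while
ordinary log lines pass through unchanged.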



> I am not tied to any specific solution; I am just worried that Heka
> may surprise us with something hard to solve, configure, etc.


We chose Heka because it's lightweight and fast, while providing us
with the flexibility we need for processing different types of data
streams.  The distributed architecture we think is necessary for large
environments requires running the log processing component on each
cluster node, and we did not want to run a JVM on each node,
especially on compute nodes where user VMs run.



Thanks,


