[openstack-dev] [kolla] Can Heka solve all the deficiencies in the current rsyslog implementation: was Re: [kolla] Introduction of Heka in Kolla

Fox, Kevin M Kevin.Fox at pnnl.gov
Wed Jan 13 19:57:55 UTC 2016

Some random thoughts...

I've been looking into how to log our production docker containers better, and a couple of things have shown up that may be of interest to you...

1. Docker now has a journald backend, so rather than logging to files, you just set your daemons to log to stdout, and docker+journald takes care of log rolling, log deletion, and log shipping for you. Set up a remote journald to funnel the traffic off to and you're good to go.

2. Filebeat can tail log files and ship them to Logstash. It should be very easy to add a Filebeat Docker container as a sidecar to watch and upload logs rather than trying to cram them through syslog.
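
To make point 1 concrete, here is a minimal sketch of a Python daemon that logs only to stdout, so a container started with something like `docker run --log-driver=journald ...` would hand its output to journald. The helper name `make_stdout_logger`, the logger name, and the line format are made up for illustration and are not anything Kolla ships.

```python
import logging
import sys


def make_stdout_logger(name):
    """Return a logger that writes one line per record to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # attach the stdout handler only once
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical daemon name, for illustration only.
    log = make_stdout_logger("example-daemon")
    log.info("service started")
```

With this shape the service never opens or rotates a log file itself; whatever captures the container's stdout owns retention and shipping.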

And my 2 cents on ELK as an op: I normally have a similar reaction to Java as Steven Dake does. ELK is becoming very common, though, and Elasticsearch does work fine on the fully open source version. We've been running it here for a long time now and it's quite nice. We already have existing ELK stacks, and being able to point Kolla at existing ELK stacks would be quite useful.

From: Steven Dake (stdake) [stdake at cisco.com]
Sent: Wednesday, January 13, 2016 4:27 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [kolla] Can Heka solve all the deficiencies in the current rsyslog implementation: was Re: [kolla] Introduction of Heka in Kolla


Apologies for top post, not really sure where in this thread to post this
list of questions as it's sort of a change in topic so I changed the
subject line :)

Somewhere I read when researching this Heka topic, that Heka cannot log
all details from /dev/log.  Some services like mariadb for example don't
log to stdout as I think Heka requires to operate correctly.  Would you
mind responding on the question "Would Heka be able to effectively log
every piece of information coming off the system related to OpenStack (our
infrastructure services like ceph/mariadb/etc as well as the OpenStack

Also, I want to make sure we can fix up the backtrace deficiency.
Currently rsyslog doesn't log backtraces in python code.  Perhaps Sam or
inc0 know the reason behind it, but I want to make sure we can fix up this
annoyance, because backtraces are mightily important.

Also I want to make sure each node ends up with log files in a data
container (or data volume or whatever we just recently replaced the data
containers with) for all the services for individual node diagnostics.
This helps fill the gap of the Kibana visualization and Elasticsearch
where we may not have a perfect diagnostic solution at the conclusion of
Mitaka and in need of individual node inspection of the logs.  Can Heka be
made to do this?  Our rsyslog implementation does today, and it's a hard
requirement for the moment.  If we need some special software to run in
addition to Heka, I could live with that.


Beyond the scheduling logistics which I split off into a different thread,
are there other hard requirements we need to make Eric aware of up front
where you struggled with rsyslog?


On 1/13/16, 4:11 AM, "Kwasniewska, Alicja" <alicja.kwasniewska at intel.com>

>Eric, Patrick, Simon, Clark thanks for your comments.
>I don't know Heka, so that's why I ask a lot of questions. I hope you are
>fine with that:) I am not against Heka, I was just curious how reliable
>it is  and how much experience you have with setting it up in Docker
>environment in order to know both advantages and disadvantages of this
>@Eric, great that you are going to create a POC, it will explain a lot and
>it will show us possible problems.
>Kind regards,
>Alicja Kwaśniewska
>-----Original Message-----
>From: Eric LEMOINE [mailto:elemoine at mirantis.com]
>Sent: Wednesday, January 13, 2016 10:55 AM
>To: OpenStack Development Mailing List (not for usage questions)
>Subject: Re: [openstack-dev] [kolla] Introduction of Heka in Kolla
>Hi Alicja
>Thank you for your comments.  Answers and comments below.
>On Tue, Jan 12, 2016 at 1:19 PM, Kwasniewska, Alicja
><alicja.kwasniewska at intel.com> wrote:
>> Unfortunately I do not have any experience in working or testing Heka,
>> so it's hard for me to compare its performance vs Logstash
>> performance. However I've read that Heka possesses a lot of advantages over
>>Logstash in this scope.
>> But which version of Logstash did you test? One guy from the Logstash
>> community said that: "The next release of logstash (1.2.0 is in beta)
>> has a 3.5x improvement in event throughput. For numbers: on my
>> workstation at home
>> (6 vcpu on virtualbox, host OS windows, 8 GB ram, host cpu is FX-8150)
>> - with logstash 1.1.13, I can process roughly 31,000 events/sec
>> parsing apache logs. With logstash 1.2.0.beta1, I can process 102,000
>I've used the latest Docker image:
><https://hub.docker.com/r/library/logstash/>.  It uses Logstash 2.1.1,
>which is the most recent stable version.
>> You also said that Heka is a unified data processing, but do we need
>Heka, as a unified data processing tool, makes it possible to derive metrics from logs,
>HTTP response times for example.  Alerts can also be triggered on
>specific log patterns.
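
As a rough illustration of deriving a metric such as HTTP response time from log lines, here is a small Python sketch. It is not Heka code. The log format (`<METHOD> <path> <status> <seconds>`) and the function name `mean_response_times` are assumptions made up for the example; real API logs would need a different pattern.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical access-log format: '<METHOD> <path> <status> <seconds>'
LINE_RE = re.compile(
    r"^(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3}) (?P<seconds>\d+(?:\.\d+)?)$"
)


def mean_response_times(lines):
    """Return the mean response time in seconds per HTTP method."""
    samples = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not match the assumed format are skipped
            samples[match["method"]].append(float(match["seconds"]))
    return {method: mean(values) for method, values in samples.items()}
```

The same matching step could also drive an alert, for example by counting lines whose status falls in the 5xx range.
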
>> Heka seems to address stream processing needs, while Logstash focuses
>> mainly on processing logs. We want to create a central logging
>> service, and Logstash was created especially for it and seems to work
>> well for this application.
>> One thing that is obvious is the fact that the Logstash is better
>> known, more popular and tested. Maybe it has some performance
>> disadvantages, but at least we know what we can expect from it. Also,
>> it has more pre-built plugins and has a lot of examples of usage, while
>> Heka doesn't have many of them yet and is nowhere near the range of
>> plugins and integrations provided by Logstash.
>As Simon said Heka already includes quite a lot of plugins.  See the Heka
>documentation [*] for an exhaustive list.  It may indeed be the case that
>Logstash includes even more plugins, but Heka has taken us pretty far
>> In the case of adding plugins, I've read that in order to add Go
>> plugins, the binary has to be recompiled, which is a little bit
>> frustrating (static linking - to wire in new plugins, have to
>> recompile). On the other hand, the Lua plugins do not require it, but
>> the question is whether Lua plugins are sufficient? Or maybe adding Go
>>plugins is not so bad?
>See Simon's answer.
>> You also said that you didn't test the Heka with Docker, right?
>I did test Heka with Docker.  In my performance tests both Heka and
>Logstash ran in Docker containers.  What I haven't tested yet is the
>Docker Log Input plugin.  We'll do more tests as part of the work on
>> But do you
>> have any experience in setting up Heka in Docker container? I saw that
>> with Heka 0.8.0 new Docker features were implemented (included
>> Dockerfiles to generate Heka Docker containers for both development
>> and deployment), but did you test it? If you didn't, we could not be
>> sure whether there are any issues with it.
>> Moreover you will have to write your own Dockerfile for Heka that
>> inherits from Kolla base image (as we discussed during last meeting,
>> we would like to have our own images), you won't be able to inherit
>> from ianneub/heka:0.10 as specified in the link that you sent
>As I said in my first email Heka has no dependencies, so creating a
>Dockerfile for Heka is quite easy.  See
>for the super-simple Dockerfile I've used so far.
>> There are also some issues with DockerInput Module which you want to
>> For example splitters are not available in DockerInput
>> (https://github.com/mozilla-services/heka/issues/1643). I can't say
>> that it will affect us, but we also don't know which new issues may
>> arise during first tests, as none of us has ever tried Heka in and with
>Yes, we're aware of that limitation.  But, we're not sure this is a
>problem, as the decoder can be the component coalescing log lines.  We
>already have a Lua decoder that does that, accumulating lines of Python
>Tracebacks.  I am going to look at this in more detail when working on
>the blueprint.
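
To illustrate what coalescing log lines means here, below is a rough Python sketch of the idea. It is not the Lua decoder mentioned above. It assumes tracebacks arrive in the plain interpreter format (a `Traceback (most recent call last):` header, indented frame lines, then one unindented exception line), and the function name `coalesce_tracebacks` is made up for the example. Chained exceptions and multi-line exception messages are not handled.

```python
def coalesce_tracebacks(lines):
    """Yield log events, merging each Python traceback into one event."""
    buffer = []
    for line in lines:
        line = line.rstrip("\n")
        if buffer:
            buffer.append(line)
            # Frame and source lines are indented; the first unindented
            # line after the header is the exception line ending the block.
            if not line.startswith(" "):
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("Traceback (most recent call last):"):
            buffer = [line]
        else:
            yield line
    if buffer:  # input ended mid-traceback; emit what was collected
        yield "\n".join(buffer)
```

Fed an ordinary line, a four-line traceback, and another ordinary line, this yields three events, with the traceback kept together as a single multi-line event.
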
>> I am not tied to any specific solution; I am just not sure whether
>> Heka won't surprise us with something hard to solve, configure, etc.
>We chose Heka because it's lightweight and fast, while providing us with
>the flexibility we need for processing different types of data streams.
>The distributed architecture we think is necessary for large environments
>requires running the logs processing component on each cluster node, and
>we did not want to run a JVM on each node, especially on compute nodes
>where user VMs run.
>OpenStack Development Mailing List (not for usage questions)
>Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
