[openstack-dev] [kolla] Introduction of Heka in Kolla

Steven Dake (stdake) stdake at cisco.com
Tue Jan 12 19:26:21 UTC 2016


Thanks for using the mailing list for this discussion.  I like to see more
mailing list conversations on big changes related to kolla, and this is
certainly one of them :)
Responses inline.

Please put document #3 (the design document) in gerrit rather than Google
Docs, as a spec in the main kolla repo using our special snowflake (read:
less cumbersome) spec template.

On 1/11/16, 8:16 AM, "Eric LEMOINE" <elemoine at mirantis.com> wrote:

>As discussed on IRC the other day [1] we want to propose a distributed
>logs processing architecture based on Heka [2], built on Alicja
>Kwasniewska's ELK work (<https://review.openstack.org/#/c/252968/>).
>Please take a look at the
>design document I've started working on [3].  The document is still
>work-in-progress, but the "Problem statement" and "Proposed change"
>sections should provide you with a good overview of the architecture
>we have in mind.
>In the proposed architecture each cluster node runs an instance of
>Heka for collecting and processing logs.  And instead of sending the
>processed logs to a centralized Logstash instance, logs are directly
>sent to Elasticsearch, which itself can be distributed across multiple
>nodes for high-availability and scaling.  The proposed architecture is
>based on Heka, and it doesn't use Logstash.
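For concreteness, the per-node pipeline described above might be wired up with a Heka TOML configuration along these lines.  This is a sketch only: the plugin types (LogstreamerInput, SandboxDecoder, ESJsonEncoder, ElasticSearchOutput) are real Heka plugins, but the paths, the decoder script name, the message matcher, and the Elasticsearch address are illustrative assumptions, not a tested Kolla config:

```toml
# Sketch of a per-node Heka pipeline: tail an OpenStack log file, decode
# it, and ship the result directly to Elasticsearch (no central Logstash).

[nova_api_logs]
type = "LogstreamerInput"
log_directory = "/var/log/kolla/nova"     # hypothetical path
file_match = 'nova\-api\.log'
decoder = "os_decoder"

[os_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/openstack_log.lua"   # hypothetical decoder script

[ESJsonEncoder]
index = "%{Type}-%{%Y.%m.%d}"
es_index_from_timestamp = true

[ElasticSearchOutput]
message_matcher = "Type == 'logfile'"
server = "http://elasticsearch:9200"      # hypothetical address
encoder = "ESJsonEncoder"
flush_interval = 5000
```

Each node would run one such Heka instance, so the fan-in point is Elasticsearch itself rather than a central log processor.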

How are the Elasticsearch network addresses discovered by Heka here?

>That being said, it is important to note that the intent of this
>proposal is not strictly directed at replacing Logstash by Heka.  The
>intent is to propose a distributed architecture with Heka running on
>each cluster node rather than having Logstash run as a centralized
>logs processing component.  For such a distributed architecture we
>think that Heka is more appropriate, with a smaller memory footprint
>and better performance in general.  In addition, Heka is more than a
>logs processing tool: it is designed to process streams of any type of
>data, including events, logs and metrics.

I think the followup was that the intent of this proposal was to replace
both logstash and rsyslog.  Could you comment on that?  If that's the
case, this work may have to be punted to the N cycle - we are super short
on time, and need updates done.

Will you be making it to the Kolla Midcycle Feb 9th and 10th to discuss
this system face to face?

>Some elements of comparison between Heka and Logstash:
>* Logstash was designed for logs processing.  Heka is "unified data
>processing" software, designed to process streams of any type of data.
>So Heka is about running one service on each box instead of many.
>Using a single service for processing different types of data also
>makes it possible to do correlations, and derive metrics from logs and
>events.  See Rob Miller's presentation [4] for more details.
>* The virtual size of the Logstash Docker image is 447 MB, while the
>virtual size of a Heka image built from the same base image
>(debian:jessie) is 177 MB.  For comparison the virtual size of the
>Elasticsearch image is 345 MB.

Just a heads up, I don't think there is much concern over size of images.

>* Heka is written in Go and has no dependencies.  Go programs are
>compiled to native code.  This in contrast to Logstash which uses
>JRuby and as such requires running a Java Virtual Machine.  Besides
>this native versus interpreted code aspect, this also raises the
>question of which JVM to use (Oracle or OpenJDK?) and which version.

This I did not know.  I was aware Kibana (visualization) was implemented
in Java.

I would prefer to avoid any Java dependencies in the Kolla project.  The
reason being there are basically two forks of the virtual machine, the
Oracle version and the OpenJDK version.  This creates licensing problems
for our downstreams.  If Kibana and Elasticsearch are developed in Java,
I guess we will just have to live with that, but the fewer Java
dependencies the better.
>* There are six types of Heka plugins: Inputs, Splitters, Decoders,
>Filters, Encoders, and Outputs.  Heka plugins are written in Go or
>Lua.  When written in Lua their execution is sandboxed, and
>misbehaving plugins may be shut down by Heka.  Lua plugins may also be
>dynamically added to Heka with no config changes or Heka restart.  This
>is an important property in container environments such as Mesos,
>where workloads change dynamically.
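As an illustration of the sandboxed Lua model described above, a minimal SandboxFilter might look like the following.  The field name and payload name are hypothetical; `read_message`, `inject_payload`, and the `process_message`/`timer_event` hooks are supplied by Heka's Lua sandbox, so this script only runs inside Heka, not standalone:

```lua
-- Sketch of a Heka SandboxFilter: count messages flagged as errors and
-- periodically inject the count back into the pipeline as a new message.
-- Runs only inside Heka's Lua sandbox.

local error_count = 0

-- Called by Heka for every message matching this filter's message_matcher.
function process_message()
    -- "Fields[severity_label]" is a hypothetical field set by a decoder.
    if read_message("Fields[severity_label]") == "ERROR" then
        error_count = error_count + 1
    end
    return 0  -- success
end

-- Called by Heka every ticker_interval seconds.
function timer_event(ns)
    inject_payload("txt", "error_count", tostring(error_count))
    error_count = 0
end
```

The script would be registered through a SandboxFilter section in the TOML config pointing at the file; because the sandbox loads such scripts at runtime, a misbehaving one can be terminated without taking down the Heka process.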

For any update to the Kolla environment, I expect a full pull, stop, start
of the service. This preserves immutability which is a magical property of
a container.  For more details on my opinions on this matter, please take
10 minutes and read:


>* To avoid losing logs under high load it is often recommended to use
>Logstash together with Redis [5].  Redis plays the role of a buffer,
>where logs are queued when Logstash or Elasticsearch cannot keep up
>with the load.  Heka, as a "unified data processing" software,
>includes its own resilient message queue, making it unnecessary to use
>an external queue (Redis for example).
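For concreteness, the built-in queue referred to above is Heka's per-output disk buffering.  As I recall the 0.10-era syntax, it is a `buffering` subsection on the output; the values below are illustrative assumptions, not tuned settings:

```toml
# Sketch: disk-backed buffering on the Elasticsearch output, standing in
# for the external Redis queue used in Logstash deployments.

[ElasticSearchOutput]
message_matcher = "Type == 'logfile'"
server = "http://elasticsearch:9200"   # hypothetical address

  [ElasticSearchOutput.buffering]
  max_file_size = 134217728      # 128 MB per buffer file
  max_buffer_size = 1073741824   # up to 1 GB queued on disk
  full_action = "block"          # apply back-pressure when the buffer fills
```

When Elasticsearch cannot keep up, messages queue on local disk instead of being dropped, which is the role Redis plays in the Logstash recipe cited in [5].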

I like this - less dependencies = more goodness.

>* Heka is faster than Logstash for processing logs, and its memory
>footprint is smaller.  I ran tests, where 3,400,000 log messages were
>read from 500 input files and then written to a single output file.
>Heka processed the 3,400,000 log messages in 12 seconds, consuming
>500M of RAM.  Logstash processed them in 1m 35s, consuming 1.1G of
>RAM.  Adding a grok filter to parse and structure logs, Logstash
>processed them in 2m 15s, consuming 1.5G of RAM.  Using an equivalent
>filtering plugin, Heka processed them in 27s, consuming 730M of RAM.
>See my GitHub repo [6] for more information about the tests.

Nice benchmark information.  It appears Heka is faster and has a smaller
memory footprint. That said, efficiency isn't our main concern here.
Resiliency and simplicity are the things I'm after.  Could you address a
comparison of the simplicity and resiliency tradeoffs?

>Also, I want to say that our team has been using Heka in production
>for about a year, in clusters of up to 200 nodes.  Heka has proven to
>be very robust, efficient and flexible enough to address our logs
>processing and monitoring use-cases.  We've also acquired a solid
>experience with it.

This is a fantastic piece of information.  It sounds like you have managed
to make Heka work in Fuel.

I just want to be clear that I believe sunk costs are a fallacy, and the
fact that we have implemented rsyslog already doesn't mean it can't change
to Heka.  I just don't want to change one set of manageable problems for a
different set of unknown and new problems :)

>Any comments are welcome!

One concern I have with Heka is pre-built RPM and DEB files for the
various distributions we support.  I don't want to lock us into source
only for Heka.  Providing a full toolchain for Heka in the binary package
build doesn't seem ideal.  Can you tell me the status of packaging of Heka
for RPM and DEB based systems?  I would be satisfied if we had to pull it
from an upstream repo provided by Mozilla or COPR.  If the status is "it's
not packaged", can you tell me your recommendations for avoiding a full
toolchain in the container for binary based containers?

Please use the OpenStack workflows that folks are familiar with and get
the design doc [3] into a spec in the kolla repo.  I want all kolla
related specs that people put together kept in one repository, the kolla
repo, even if it may seem a bit odd for a spec to live in a repository
unrelated to its implementation.

>[2] <http://hekad.readthedocs.org>
>[4] <http://www.slideshare.net/devopsdays/heka-rob-miller>
>[5] <http://blog.sematext.com/2015/09/28/recipe-rsyslog-redis-logstash/>
>[6] <https://github.com/elemoine/heka-logstash-comparison>