[openstack-dev] [kolla] Introduction of Heka in Kolla

Eric LEMOINE elemoine at mirantis.com
Wed Jan 13 10:20:01 UTC 2016

On Tue, Jan 12, 2016 at 8:26 PM, Steven Dake (stdake) <stdake at cisco.com> wrote:
> Eric,
> Thanks for using the mailing list for this discussion.  I like to see more
> mailing list conversations on big changes related to kolla, which this one
> is :)
> Responses inline.
> Please put document #3 (the design document) in gerrit rather than google
> docs in the main kolla repo as a spec using our special snowflake (read
> less cumbersome) spec template.

Sure.  The Google doc is a temporary thing.

> On 1/11/16, 8:16 AM, "Eric LEMOINE" <elemoine at mirantis.com> wrote:
>>In the proposed architecture each cluster node runs an instance of
>>Heka for collecting and processing logs.  And instead of sending the
>>processed logs to a centralized Logstash instance, logs are directly
>>sent to Elasticsearch, which itself can be distributed across multiple
>>nodes for high-availability and scaling.  The proposed architecture is
>>based on Heka, and it doesn't use Logstash.
> How are the elasticsearch network addresses discovered by Heka here?

Initially one Elasticsearch instance will be used, as in Alicja's work
(<https://review.openstack.org/#/c/252968>).  In the future, HAProxy,
which is already included in Kolla, could be used between Heka and
Elasticsearch.  Another option would be to extend Heka's Elasticsearch
Output plugin to work with a list of Elasticsearch hosts instead of
just one host.
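For reference, pointing Heka's Elasticsearch output at a single endpoint (or at an HAProxy frontend sitting in front of several nodes) is a one-line setting in Heka's TOML configuration.  This is a minimal sketch only; the matcher expression and the address are illustrative:

```toml
# Minimal sketch of a Heka -> Elasticsearch output (illustrative values).
[ESJsonEncoder]
es_index_from_timestamp = true

[ElasticSearchOutput]
message_matcher = "Type == 'log'"
encoder = "ESJsonEncoder"
# Single Elasticsearch node today; this could instead point at an
# HAProxy frontend load-balancing over several Elasticsearch nodes.
server = "http://127.0.0.1:9200"
flush_interval = 5000
```

Switching from a single node to an HAProxy frontend would then only require changing the `server` address, not the Heka pipeline itself.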

>>That being said, it is important to note that the intent of this
>>proposal is not strictly directed at replacing Logstash by Heka.  The
>>intent is to propose a distributed architecture with Heka running on
>>each cluster node rather than having Logstash run as a centralized
>>logs processing component.  For such a distributed architecture we
>>think that Heka is more appropriate, with a smaller memory footprint
>>and better performance in general.  In addition, Heka is also more
>>than a logs processing tool, as it's designed to process streams of
>>any type of data, including events, logs and metrics.
> I think the followup was that the intent of this proposal was to replace
> both logstash and rsyslog.  Could you comment on that?

Yes, it may make sense to remove Rsyslog entirely and only rely on
Heka.  This is something we want/need to assess in the specifications.

> It may be that
> this work has to be punted to the N cycle if that's the case - we are
> super short on time, and need updates done.

Yes, sure, that makes sense.

> Will you be making it to the Kolla Midcycle Feb 9th and 10th to discuss
> this system face to face?

I won't be able to go myself unfortunately.  But there will probably
be someone from Mirantis (from the kolla-mesos team) whom you can talk
to.

>>* Heka is written in Go and has no dependencies.  Go programs are
>>compiled to native code.  This in contrast to Logstash which uses
>>JRuby and as such requires running a Java Virtual Machine.  Besides
>>this native versus interpreted code aspect, this also can raise the
>>question of which JVM to use (Oracle, OpenJDK?) and which version
> This I did not know.  I was aware kibana (visualization) was implemented
> in Java.
> I would prefer to avoid any Java dependencies in the Kolla project.  The
> reason being there is basically a fork of the virtual machines, the Oracle
> version and the openjdk version.  This creates licensing problems for our
> downstreams.  If Kibana and Elasticsearch are developed in Java, I guess
> we will just have to live with that but the less Java dependencies the
> better.

Confirming that Elasticsearch and Logstash run in a JVM.  However,
Kibana is written in JavaScript, and Kibana version 4 includes a
server-side component that runs in NodeJS.

>>* There are six types of Heka plugins: Inputs, Splitters, Decoders,
>>Filters, Encoders, and Outputs.  Heka plugins are written in Go or
>>Lua.  When written in Lua their executions are sandbox'ed, where
>>misbehaving plugins may be shut down by Heka.  Lua plugins may also be
>>dynamically added to Heka with no config changes or Heka restart. This
>>is an important property on container environments such as Mesos,
>>where workloads are changed dynamically.
> For any update to the Kolla environment, I expect a full pull, stop, start
> of the service. This preserves immutability which is a magical property of
> a container.  For more details on my opinions on this matter, please take
> 10 minutes and read:
> http://sdake.io/2015/11/11/the-tldr-on-immutable-infrastructure/

Thanks for the link.
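To illustrate the sandboxed Lua plugins mentioned above: a Heka Lua filter implements `process_message` (and optionally `timer_event`) and runs inside Heka's Lua sandbox, so a misbehaving filter can be shut down without affecting the rest of the pipeline.  This is a minimal sketch of a message-counting filter; the names are illustrative:

```lua
-- Minimal sketch of a Heka Lua sandbox filter: counts the messages
-- matched by the filter's message_matcher and periodically injects
-- the count back into the Heka pipeline.
local count = 0

function process_message()
    count = count + 1
    return 0  -- 0 signals success to Heka
end

-- Called by Heka every ticker_interval; ns is the current time in
-- nanoseconds.
function timer_event(ns)
    inject_payload("txt", "message_count", tostring(count))
end
```

Because such a filter is just a Lua file referenced from the configuration, it can be swapped in or out without rebuilding or restarting Heka.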

>>* Heka is faster than Logstash for processing logs, and its memory
>>footprint is smaller.  I ran tests, where 3,400,000 log messages were
>>read from 500 input files and then written to a single output file.
>>Heka processed the 3,400,000 log messages in 12 seconds, consuming
>>500 MB of RAM.  Logstash processed the 3,400,000 log messages in
>>1 minute 35 seconds, consuming 1.1 GB of RAM.  Adding a grok filter
>>to parse and structure logs, Logstash processed the 3,400,000 log
>>messages in 2 minutes 15 seconds, consuming 1.5 GB of RAM.  Using an
>>equivalent filtering plugin, Heka processed the 3,400,000 log
>>messages in 27 seconds, consuming 730 MB of RAM.  See my GitHub
>>repo [6] for more information about the tests.
> Nice benchmark information.  It appears Heka is faster and has a smaller
> memory footprint. That said, efficiency isn't our main concern here.
> Resiliency and simplicity are the things I'm after.  Could you address a
> comparison of the simplicity and resiliency tradeoffs?

With Heka running on each cluster node and directly sending logs to
Elasticsearch, we'll probably remove both Logstash and Rsyslog.  We
also don't need an external queue such as Redis, which is often
recommended with Logstash for robustness reasons.  So I think it is
fair to say that we end up with a simpler system.  And with fewer
links in the chain we also end up with a more resilient solution.

>>Also, I want to say that our team has been using Heka in production
>>for about a year, in clusters of up to 200 nodes.  Heka has proven to
>>be very robust, efficient and flexible enough to address our logs
>>processing and monitoring use-cases.  We've also acquired a solid
>>experience with it.
> This is a fantastic piece of information.  It sounds like you have managed
> to make Heka work in Fuel.

Yes, although our Heka-based (so-called) "collectors" are quite
independent from Fuel.

> I just want to be clear that I believe sunk costs are a fallacy and the
> fact that we have implemented rsyslog already doesn't mean it can't change
> to Heka.  I just don't want to change one set of manageable problems for a
> different set of unknown and new problems :)

That makes perfect sense.  The specs and proof-of-concepts will help.

> One concern I have with Heka is pre-built RPMs and DEB files for the
> various distributions we implement.  I don't want to lock us in to source
> only for Heka.  Providing a full toolchain for Heka in the binary package
> build doesn't seem ideal.  Can you tell me the status of packaging of Heka
> for RPM and DEB based systems?  I would be satisfied if we had to pull it
> from an upstream repo provided by Mozilla or COPR.  If the status is "its
> not packaged" can you tell me your recommendations on avoiding having a
> full toolchain in the container for binary based containers?

I am with you here.  The Heka developers provide rpm and deb packages
for every release.  They're downloadable from the GitHub releases
page: <https://github.com/mozilla-services/heka/releases>.  And it
looks like there are Debian developers interested in packaging Heka
for Debian.
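Since Kolla builds container images, installing one of those prebuilt packages avoids shipping a Go toolchain in the image.  A sketch of what that could look like in an image build, where the version and file name are purely illustrative (check the GitHub releases page for the actual artifacts):

```dockerfile
# Sketch only: version and package file name are illustrative.
FROM ubuntu:14.04

# Fetch a prebuilt Heka .deb from the upstream GitHub releases page
# and install it; no Go toolchain ends up in the image.
ADD https://github.com/mozilla-services/heka/releases/download/v0.10.0/heka_0.10.0_amd64.deb /tmp/heka.deb
RUN dpkg -i /tmp/heka.deb && rm /tmp/heka.deb
```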
> Please use the openstack workflows that folks are familiar with and get
> the design doc [3] into a spec in the kolla repo.  I want all kolla
> related specs in the kolla repo that people put together in one repository
> even though it may seem a bit odd that it is in a repository unrelated to


