[OpenStack-Infra] Adding index and views/dashboards for Kata to ELK stack

Whaley, Graham graham.whaley at intel.com
Tue Nov 27 18:53:16 UTC 2018


(Back to an old thread... this has risen near the top of my pile again.)

> -----Original Message-----
> From: Clark Boylan [mailto:cboylan at sapwetik.org]
> Sent: Tuesday, October 23, 2018 6:03 PM
> To: Whaley, Graham <graham.whaley at intel.com>; openstack-
> infra at lists.openstack.org; thierry at openstack.org
> Cc: Ernst, Eric <eric.ernst at intel.com>; fungi at yuggoth.org
> Subject: Re: Adding index and views/dashboards for Kata to ELK stack
[snip]
> > I don't think the Zuul Ansible role will be applicable - the metrics run
> > on bare metal machines running Jenkins, and export their JSON results
> > via a filebeat socket. My theory was we'd then add the socket input to
> > the logstash server to receive from that filebeat - as in my gist at
> > https://gist.github.com/grahamwhaley/aa730e6bbd6a8ceab82129042b186467
> 
> I don't think we would want to expose write access to the unauthenticated
> logstash and elasticsearch system to external systems. The thing that makes this
> secure today is we (community infrastructure team) control the existing writers.
> The existing writers are available for your use (see below) should you decide to
> use them.

My theory was that we'd at least secure the connection using the Logstash/Beats SSL support, and that only we and the infra group would have access to the keys:
https://www.elastic.co/guide/en/beats/filebeat/current/configuring-ssl-logstash.html

The machines themselves are accessible only to the CNCF CIL owners and to nominated Kata engineers who hold the keys.
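
For illustration, here is a minimal Python sketch of the client side of a key-restricted TLS link of this kind. It is not the Filebeat/Beats protocol itself: it assumes a hypothetical Logstash `tcp` input that accepts newline-delimited JSON over TLS. The function names, host, port and file paths are all assumptions made for the example, not part of the setup described above.

```python
import json
import socket
import ssl
from typing import Optional


def build_client_context(ca_file: Optional[str] = None,
                         cert_file: Optional[str] = None,
                         key_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a TLS client context that verifies the server certificate."""
    # With ca_file=None the system default CA bundle is used.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        # Present a client certificate so the server can restrict who may write.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def encode_event(result: dict) -> bytes:
    """Serialize one metrics result as a single newline-terminated JSON line."""
    return (json.dumps(result, sort_keys=True) + "\n").encode("utf-8")


def send_event(host: str, port: int, context: ssl.SSLContext, result: dict) -> None:
    """Open a TLS connection to the (hypothetical) receiver and write one event."""
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(encode_event(result))
```

The point of the sketch is only that the writer needs both the CA certificate and a client key pair, so whoever controls those files controls who can inject data.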
> 
> >
> > One crux here is that the metrics have to run on a machine with
> > guaranteed performance (so not a shared/virtual cloud instance), and
> > hence currently run under Jenkins and not on the OSF/Zuul CI infra.
> 
> Zuul (by way of Nodepool) can speak to arbitrary machines as long as they speak
> an ansible connection protocol. In this case the default of ssh would probably
> work when tied to nodepool's static instance driver. The community
> infrastructure happens to only talk to cloud VMs today because that is what we
> have been given access to, but should be able to talk to other resources if
> people show up with them.

Let's set aside for a moment the fact that all current Kata CI runs on Jenkins and that, afaik, we are not presently transitioning to Zuul.
Even if we did integrate the bare metal CNCF CIL packet.net machines via Ansible/SSH/Nodepool/Zuul, afaict you'd still be running the same CI tasks on the same machines and injecting the data through the same SSL socket/tunnel into Elastic.
I know you'd like to keep as much of the infra as possible under your control, but I think the only bit that would differ is the Jenkins master. The Jenkins job running on the slave only executes master branch merges, which have undergone peer review (and these would be the same jobs that Zuul would run), so I'm not sure there is any real security difference between having the Kata Jenkins master or Zuul drive the slaves.

> 
> >
> > Let me know if you see any issues with that Jenkins/filebeat/socket/JSON flow.
> >
> > I need to deploy a new machine to process master branch merges to
> > generate the data (currently we have a machine that is processing PRs at
> > submission, not merge, which is not the data we want to track long
> > term). I'll let you know when I have that up and running. If we wanted
> > to move on this earlier, then I could inject data to a test index from
> > my local test setup - all it would need I believe is the valid keys for
> > the filebeat->logstash connection.

Oh, I've deployed a Jenkins slave and job to test out the first stage of the flow btw:
http://jenkins.katacontainers.io/job/kata-metrics-runtime-ubuntu-16-04-master/

> >
> > > Clark
> > Thanks!
> >   Graham (now on copy ;-)
> 
> Ideally we'd make use of the existing community infrastructure as much as
> possible to make this sustainable and secure. We are happy to modify our
> existing tooling as necessary to do this. Update the logstash configuration, add
> Nodepool resources, have grafana talk to elasticsearch, and so on.

I think the one key decision is whether we can use the packet.net slaves as driven by the Kata Jenkins master, or whether we have to move their management into Zuul.
For expediency and consistency with the rest of the Kata CI, I obviously lean heavily towards Jenkins.
If we do have to go with Zuul, then I think we'll have to work out who has access to the Zuul job configs for Kata and how they can modify them.

(Adding Salvador to CC, as he is the main Kata Jenkins owner and has also worked on the Zuul PoC for Kata before.)

 Graham (hoping we can come to some agreement :-) )

