[OpenStack-Infra] Adding index and views/dashboards for Kata to ELK stack

Paul Belanger pabelanger at redhat.com
Tue Nov 27 22:00:20 UTC 2018


On Tue, Nov 27, 2018 at 06:53:16PM +0000, Whaley, Graham wrote:
> (back to an old thread... this has rippled near the top of my pile again)
> 
> > -----Original Message-----
> > From: Clark Boylan [mailto:cboylan at sapwetik.org]
> > Sent: Tuesday, October 23, 2018 6:03 PM
> > To: Whaley, Graham <graham.whaley at intel.com>; openstack-
> > infra at lists.openstack.org; thierry at openstack.org
> > Cc: Ernst, Eric <eric.ernst at intel.com>; fungi at yuggoth.org
> > Subject: Re: Adding index and views/dashboards for Kata to ELK stack
> [snip]
> > > I don't think the Zuul Ansible role will be applicable - the metrics run
> > > on bare metal machines running Jenkins, and export their JSON results
> > > via a filebeat socket. My theory was we'd then add the socket input to
> > > the logstash server to receive from that filebeat - as in my gist at
> > >
> > https://gist.github.com/grahamwhaley/aa730e6bbd6a8ceab82129042b186467
> > 
> > I don't think we would want to expose write access to the unauthenticated
> > logstash and elasticsearch system to external systems. The thing that makes this
> > secure today is we (community infrastructure team) control the existing writers.
> > The existing writers are available for your use (see below) should you decide to
> > use them.
> 
> My theory was we'd secure the connection at least using the logstash/beat SSL connection, and only we/the infra group would have access to the keys:
> https://www.elastic.co/guide/en/beats/filebeat/current/configuring-ssl-logstash.html
> 
> The machines themselves are only accessible by the CNCF CIL owners and nominated Kata engineers with the keys.
>
> > 
> > >
> > > One crux here is that the metrics have to run on a machine with
> > > guaranteed performance (so not a shared/virtual cloud instance), and
> > > hence currently run under Jenkins and not on the OSF/Zuul CI infra.
> > 
> > Zuul (by way of Nodepool) can speak to arbitrary machines as long as they speak
> > an ansible connection protocol. In this case the default of ssh would probably
> > work when tied to nodepool's static instance driver. The community
> > infrastructure happens to only talk to cloud VMs today because that is what we
> > have been given access to, but should be able to talk to other resources if
> > people show up with them.
> 
> If we ignore the fact that all current Kata CI is running on Jenkins, and we are not presently transitioning to Zuul afaik, then....
> Even if we did integrate the bare metal CNCF CIL packet.net machines via ansible/SSH/nodepool/Zuul, then afaict you'd still be running the same CI tasks on the same machines and injecting the Elastic data through the same SSL socket/tunnel into Elastic.
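
For reference, the "JSON results over an SSL socket, with only key holders able to write" flow described above can be sketched roughly as follows. This is only an illustration under assumptions, not the Kata setup: it does not speak the Beats protocol that filebeat uses, and instead assumes a logstash `tcp` input with the `json_lines` codec and SSL client verification enabled. The function names, the `index` field and all file paths are hypothetical.

```python
import json
import socket
import ssl


def encode_event(result: dict, index: str) -> bytes:
    """Serialize one metrics result as a newline-terminated JSON line.

    The "index" field name is a hypothetical convention that the
    receiving logstash pipeline could use to route the event.
    """
    event = dict(result)  # copy so the caller's dict is not modified
    event["index"] = index
    return json.dumps(event, sort_keys=True).encode("utf-8") + b"\n"


def make_tls_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Build a client-side TLS context that presents a client certificate.

    Only hosts holding the key file can authenticate, provided the server
    is configured to require client certificates.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def send_events(host: str, port: int, ctx: ssl.SSLContext, events: list) -> None:
    """Send already-encoded events over a single TLS connection."""
    with socket.create_connection((host, port), timeout=10) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            for event in events:
                tls.sendall(event)
```

Whichever system drives the job (Jenkins or Zuul), this writer side would look the same; the difference is only in who holds the client key and who controls the host running it.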

John Studarus gave a talk at the OpenStack Summit about using Zuul with
packet.net. During the talk he mentioned starting work on a nodepool
driver for packet.net bare metal servers. I believe the plan is to
upstream it, which would then allow both the static provider and a
dynamic packet.net provider.

> I know you'd like to keep as much of the infra under your control, but the only bit I think that would be different is the Jenkins Master. Given the Jenkins job running the slave only executes master branch merges, which have undergone peer review (which would be the same jobs that Zuul would run), then I'm not sure there is any security difference here in reality between having the Kata Jenkins master or Zuul drive the slaves.
> 
> > 
> > >
> > > Let me know if you see any issues with that Jenkins/filebeat/socket/JSON flow.
> > >
> > > I need to deploy a new machine to process master branch merges to
> > > generate the data (currently we have a machine that is processing PRs at
> > > submission, not merge, which is not the data we want to track long
> > > term). I'll let you know when I have that up and running. If we wanted
> > > to move on this earlier, then I could inject data to a test index from
> > > my local test setup - all it would need I believe is the valid keys for
> > > the filebeat->logstash connection.
> 
> Oh, I've deployed a Jenkins slave and job to test out the first stage of the flow btw:
> http://jenkins.katacontainers.io/job/kata-metrics-runtime-ubuntu-16-04-master/
> 
> > >
> > > > Clark
> > > Thanks!
> > >   Graham (now on copy ;-)
> > 
> > Ideally we'd make use of the existing community infrastructure as much as
> > possible to make this sustainable and secure. We are happy to modify our
> > existing tooling as necessary to do this. Update the logstash configuration, add
> > Nodepool resources, have grafana talk to elasticsearch, and so on.
> 
> I think the only key decision is if we can use the packet.net slaves as driven by the kata Jenkins master, or if we have to move the management of those into Zuul.
> For expediency and consistency with the rest of the Kata CI, obviously I lean heavily towards Jenkins.
> If we do have to go with Zuul, then I think we'll have to work out who has access to and how they can modify the Zuul job configs for Kata.
> 
> (adding Salvador to CC, as he is the Kata Jenkins owner mostly, and has also worked on the Zuul PoC for Kata before).
> 
>  Graham (hoping we can come to some agreement :-) )



