From cboylan at sapwetik.org  Mon Jun 3 19:09:04 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 03 Jun 2019 12:09:04 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 04, 2019
Message-ID: <27c3b6ae-fa8e-4af5-a35d-4146db0804a0@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190604)
** https mirror update (clarkb 20190604)
*** Ready to deploy across all regions?
** ARM64 Nodepool builder status (clarkb 20190604)
* Open discussion

From fungi at yuggoth.org  Fri Jun 7 21:44:46 2019
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Fri, 7 Jun 2019 21:44:46 +0000
Subject: [OpenStack-Infra] SPF enforcement on mailing lists
Message-ID: <20190607214446.y57ptqc52eshphmh@yuggoth.org>

This is just a quick heads up that, in order to deal with recent spam flooding from spoofed E-mail addresses claiming to originate from domains which provide strict IETF RFC 7208 SPF records[*], we have instituted a change to reject messages failing SPF "-all" policies for their domains (not those with "?all" or "~all") at time of receipt to our listservs. If anyone encounters issues which look like a rejection resulting from this change in behavior, please reach out to us in the #openstack-infra channel on the Freenode IRC network.

[*] https://tools.ietf.org/html/rfc7208
-- 
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From cboylan at sapwetik.org  Fri Jun 7 21:52:35 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Fri, 07 Jun 2019 14:52:35 -0700
Subject: [OpenStack-Infra] SPF enforcement on mailing lists
In-Reply-To: <20190607214446.y57ptqc52eshphmh@yuggoth.org>
References: <20190607214446.y57ptqc52eshphmh@yuggoth.org>
Message-ID: 

On Fri, Jun 7, 2019, at 2:51 PM, Jeremy Stanley wrote:
> This is just a quick heads up that, in order to deal with recent
> spam flooding from spoofed E-mail addresses claiming to originate
> from domains which provide strict IETF RFC 7208 SPF records[*], we
> have instituted a change to reject messages failing SPF "-all"
> policies for their domains (not those with "?all" or "~all") at time
> of receipt to our listservs. If anyone encounters issues which look
> like a rejection resulting from this change in behavior, please
> reach out to us in the #openstack-infra channel on the Freenode IRC
> network.
>
> [*] https://tools.ietf.org/html/rfc7208
> -- 
> Jeremy Stanley

And to confirm that ?all isn't rejected, here is an email from a domain with an ?all in its spf record.
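For reference, the three qualifier flavours look roughly like this (illustrative records only, not any real domain's DNS):

    example.org.  IN TXT  "v=spf1 mx -all"   ; hard fail: forged mail now rejected
    example.org.  IN TXT  "v=spf1 mx ~all"   ; soft fail: still accepted
    example.org.  IN TXT  "v=spf1 mx ?all"   ; neutral: still accepted

Only messages failing a "-all" policy get rejected at receipt; the other two qualifiers pass through as before.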
Clark

From cboylan at sapwetik.org  Mon Jun 10 19:33:02 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 10 Jun 2019 12:33:02 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 11, 2019
Message-ID: <6928876d-5905-4399-9b9a-15a492f0da79@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
** Clarkb out June 17-20. Need a volunteer to run the meeting June 18.
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190611)
** https mirror update (clarkb 20190611)
*** Ready to deploy across all regions? Seems like the ubuntu mirror issues have subsided?
** GitHub replication (corvus 20190611)
*** http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005007.html
** Requesting Spamhaus PBL exceptions (fungi 20190611)
** Replacing SSL certs this week (clarkb 20190611)
* Open discussion

From iwienand at redhat.com  Tue Jun 11 08:00:01 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 11 Jun 2019 18:00:01 +1000
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
Message-ID: <20190611080001.GA13504@fedora19.localdomain>

Hello,

I started to look at the system-config base -devel job, which runs Ansible & ARA from master (this job has been quite useful in flagging issues early across Ansible, testinfra, ARA etc., but it takes a bit of effort for us to keep it stable...). It seems ARA 1.0 has moved in some directions we're not handling right now. Playing with [1] I've got ARA generating and uploading its database.

Currently, Apache matches an ara-report/ directory on logs.openstack.org and sends the request to the ARA wsgi application, which serves the response from the sqlite db in that directory [2].
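(For anyone who hasn't looked at [2]: the effect of that middleware boils down to roughly the following -- an illustrative Python sketch with made-up paths, not the deployed code:)

    import os

    LOG_ROOT = "/srv/static/logs"  # assumed location of the on-disk log tree

    def application(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if "/ara-report/" in path:
            # Everything before ara-report/ identifies the job; each job
            # ships its own sqlite db alongside its logs.
            job_dir = path.split("/ara-report/", 1)[0].lstrip("/")
            db = os.path.join(LOG_ROOT, job_dir, "ara-report", "ansible.sqlite")
            if os.path.isfile(db):
                # Hand off to the ARA wsgi app configured against this
                # per-job database (elided).
                pass
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

The point being: one database per job, located purely from the request path, with no central server involved.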
If I'm understanding, we now need ara-web [3] to display the report page we all enjoy. However, this web app currently only gets data from an ARA server instance that provides a REST interface to the info?

I'm not really seeing how this fits with the current middleware deployment? (unfortunately [4] or an analogue in the new release seems to have disappeared). Do we now host a separate ARA server on logs.openstack.org on some known port that knows how to turn /*/ara-report/ URL requests into access of the .sqlite db on disk and thus provide the REST interface? And then somehow we host an ara-web instance that knows how to request from this?

Given I can't see us wanting to do a bunch of puppet hacking to get new services on logs.openstack.org, but yet also it requiring fairly non-trivial effort to get the extant bits and pieces on that server migrated to an all-Ansible environment, I think we have to give some thought as to how we'll roll this out (plus add in containers, possible logs on swift, etc ... for extra complexity :)

So does anyone have thoughts on a high-level view of how this might hang together?

-i

[1] https://review.opendev.org/#/c/664478/
[2] https://opendev.org/opendev/puppet-openstackci/src/branch/master/templates/logs.vhost.erb
[3] https://github.com/ansible-community/ara-web
[4] https://ara.readthedocs.io/en/stable/advanced.html

From dmsimard at redhat.com  Tue Jun 11 20:39:58 2019
From: dmsimard at redhat.com (David Moreau Simard)
Date: Tue, 11 Jun 2019 16:39:58 -0400
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: <20190611080001.GA13504@fedora19.localdomain>
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: 

Thanks for starting this thread Ian.

On Tue, Jun 11, 2019 at 4:09 AM Ian Wienand wrote:
> It seems ARA 1.0 has moved in some directions we're not handling right
> now. Playing with [1] I've got ARA generating and uploading its
> database.

Patch looks good to me. Thanks for playing with it :)

> Currently, Apache matches an ara-report/ directory on
> logs.openstack.org and sends the request to the ARA wsgi application,
> which serves the response from the sqlite db in that directory [2].

Correct.

> If I'm understanding, we now need ara-web [3] to display the report
> page we all enjoy. However, this web app currently only gets data from
> an ARA server instance that provides a REST interface to the info?

Correct.

> I'm not really seeing how this fits with the current middleware
> deployment? (unfortunately [4] or an analogue in the new release seems
> to have disappeared). Do we now host a separate ARA server on
> logs.openstack.org on some known port that knows how to turn
> /*/ara-report/ URL requests into access of the .sqlite db on disk and
> thus provide the REST interface?

The sqlite middleware doesn't have an equivalent in 1.0 right now.

Although it was first implemented as somewhat of a hack to address the lack of scalability of HTML generation, I've gotten to like the design principle of isolating a job's result in a single database.

It is easy to scale and keeps latency to a minimum compared to a central database server. I'm convinced that implementing a similar approach for 1.0 would make sense. I would happily accept any input here.

> And then somehow we host an ara-web instance that knows how to
> request from this?

Right now the API server is defined by the configuration [1].
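(i.e. something along these lines -- the field name here is from memory and illustrative only, [1] is the authoritative copy:)

    {
      "apiURL": "https://ara.example.org"
    }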
We would need to implement a more "dynamic" way of being able to specify an API endpoint to query.

For example, we recently added support for ara-web to prompt a login page if the API server requires authentication. The credentials are stored locally and then used to authenticate queries against the API. It might not be too much of a stretch to implement a way to store the API endpoint locally like we do for credentials. We wouldn't want the user(s) to type in the API endpoint every time, though.

There is a little thinking to do about how to glue the things together. The combination of the rewrite rule, the wsgi middleware and the fact that 0.x was a monolithic app definitely made this easier.

> Given I can't see us wanting to do a bunch of puppet hacking to get
> new services on logs.openstack.org, but yet also it requiring fairly
> non-trivial effort to get the extant bits and pieces on that server
> migrated to an all-Ansible environment, I think we have to give some
> thought as to how we'll roll this out (plus add in containers,
> possible logs on swift, etc ... for extra complexity :)
>
> So does anyone have thoughts on a high-level view of how this might
> hang together?

For now I've created an etherpad [2] as a starting point to summarize some of what has been written here as well as other points which are perhaps more Zuul-specific.

[1]: https://github.com/ansible-community/ara-web/blob/master/public/config.json
[2]: https://etherpad.openstack.org/p/ara-1.0-in-zuul

David Moreau Simard
dmsimard = [irc, github, twitter]

From corvus at inaugust.com  Tue Jun 11 23:38:01 2019
From: corvus at inaugust.com (James E. Blair)
Date: Tue, 11 Jun 2019 16:38:01 -0700
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: <20190611080001.GA13504@fedora19.localdomain> (Ian Wienand's message of "Tue, 11 Jun 2019 18:00:01 +1000")
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: <874l4vae1y.fsf@meyer.lemoncheese.net>

Ian Wienand writes:

> Given I can't see us wanting to do a bunch of puppet hacking to get
> new services on logs.openstack.org, but yet also it requiring fairly
> non-trivial effort to get the extant bits and pieces on that server
> migrated to an all-Ansible environment, I think we have to give some
> thought as to how we'll roll this out (plus add in containers,
> possible logs on swift, etc ... for extra complexity :)

Given that our direction is to remove logs.o.o from the system along with all of its proxy services and only serve static files from swift, I think we will have to continue to use the old version of ARA until the story with static generation is worked out for 1.0.

-Jim

From iwienand at redhat.com  Tue Jun 18 00:32:59 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 18 Jun 2019 10:32:59 +1000
Subject: [OpenStack-Infra] Meeting Agenda for June 18, 2019
Message-ID: <20190618003259.GA20591@fedora19.localdomain>

== Agenda for next meeting ==

* Announcements
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (ianw 20190618)
** https mirror update (ianw 20190618)
*** kafs in production update
*** https://review.opendev.org/#/q/status:open+branch:master+topic:kafs
* Open discussion

From iwienand at redhat.com  Tue Jun 18 01:55:27 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 18 Jun 2019 11:55:27 +1000
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: 
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: <20190618015527.GA26490@fedora19.localdomain>

On Tue, Jun 11, 2019 at 04:39:58PM -0400, David Moreau Simard wrote:
> Although it was first implemented as somewhat of a hack to address the
> lack of scalability of HTML generation, I've gotten to like the design
> principle of isolating a job's result in a single database.
>
> It is easy to scale and keeps latency to a minimum compared to a
> central database server.

I've been ruminating on how all this can work given some constraints:

- keep the current model of "click on a link in the logs, see the ara results"
- no middleware to intercept such clicks with logs on swift
- we don't actually know where the logs are if using swift (not just logs.openstack.org/xy/123456/), which makes it harder to find job artefacts like sqlite db's post job run (have to query gerrit or the zuul results db?)
- some jobs, like in system-config, have "nested" ARA reports from subnodes; essentially reporting twice. Can the ARA backend import a sqlite run after the fact?

I agree that adding latency to jobs running globally by sending results piecemeal back to a central db isn't going to work; but if each job logged everything to a local db as now, and then we uploaded that to a central location in post, that might work?

Although we can't run services/middleware on logs directly, we could store the results as we see fit and run services on a separate host. If, say, you had a role that sent the generated ARA sqlite.db to ara.opendev.org and got back a UUID, then it could write into the logs an ara-report/index.html which might just be a straight 301 redirect to https://ara.opendev.org/UUID.
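(On swift we probably can't do a real server-side 301, so that index.html might just be a static stub along these lines -- hypothetical, with UUID filled in by the role:)

    <!-- hypothetical ara-report/index.html written at upload time -->
    <html>
      <head>
        <meta http-equiv="refresh" content="0; url=https://ara.opendev.org/UUID/">
      </head>
      <body>
        Redirecting to <a href="https://ara.opendev.org/UUID/">the ARA report</a> ...
      </body>
    </html>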
This satisfies the "just click on it" part.

It seems that "all" that needs to happen is for requests to https://ara.opendev.org/uuid/api/v1/... to query just the results for "uuid" in the db. And could the ara-web app (which is presumably then just statically served from that host) know that when started as https://ara.opendev.org/uuid it should talk to https://ara.opendev.org/uuid/api/...?

I think, though, this might be relying on a feature of the ara REST server that doesn't exist -- the idea of unique "runs"? Is that something you'd have to paper over with, say, wsgi starting a separate ara REST process/thread to respond to each incoming /uuid/api/... request (maybe the process just starts pointing to /opt/logs/uuid/results.sqlite)?

This doesn't have to grow indefinitely; we can similarly just have a cron query to delete rows older than X weeks.

Easy in theory, of course ;)

-i

From ssbarnea at redhat.com  Thu Jun 20 09:12:09 2019
From: ssbarnea at redhat.com (Sorin Sbarnea)
Date: Thu, 20 Jun 2019 10:12:09 +0100
Subject: [OpenStack-Infra] getting some advanced stats out of gerrit reviews
Message-ID: 

I would like to build some stats related to openstack gerrit reviews and I would like to know a few things:

* how can I get a dump of the reviews with all comments? (queries on the live DB are not an option)
* has anyone already invested time in this direction? (happy to team up instead of going DIY)

One graph I do want to build would show the average number of rechecks done on reviews before they are merged, per month. As you can see, that is not a trivial query to do, but it is doable.

Another metric it would be fun to build is the estimated failure rate for a single CR. As I can compute an average failure rate for each job, it should not be hard to compute an estimated failure rate when I have the list of jobs triggered for a specific change. That second one may not be very practical, but I am really curious what kind of values I would end up with (obviously the range would vary a lot between projects).

Thanks
Sorin

From 751484782 at qq.com  Mon Jun 24 01:54:28 2019
From: 751484782 at qq.com (wangchengcheng)
Date: Mon, 24 Jun 2019 09:54:28 +0800
Subject: [OpenStack-Infra] New contributor
Message-ID: 

751484782 at qq.com

From cboylan at sapwetik.org  Mon Jun 24 17:00:34 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 24 Jun 2019 10:00:34 -0700
Subject: [OpenStack-Infra] getting some advanced stats out of gerrit reviews
In-Reply-To: 
References: 
Message-ID: <7833fde9-de78-4664-8834-43ccef108202@www.fastmail.com>

On Thu, Jun 20, 2019, at 2:15 AM, Sorin Sbarnea wrote:
> I would like to build some stats related to openstack gerrit reviews
> and I would like to know a few things:
> * how can I get a dump of the reviews with all comments? (queries on
> the live DB are not an option)

You can query for the list of changes using https://review.opendev.org/Documentation/rest-api-changes.html#list-changes. The query can include a start and stop date too. Then iterate over that list of changes and get their comments with https://review.opendev.org/Documentation/rest-api-changes.html#list-change-comments.
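Roughly like this (an untested sketch using python-requests; Gerrit prefixes its JSON responses with a ")]}'" line to prevent XSSI, so you have to strip that before parsing):

    import json
    import requests

    GERRIT = "https://review.opendev.org"

    def get(path, **params):
        # Strip Gerrit's ")]}'" anti-XSSI line, then parse the JSON.
        text = requests.get(GERRIT + path, params=params).text
        return json.loads(text.split("\n", 1)[1])

    # Changes merged in a one month window (operators per the docs above).
    changes = get("/changes/",
                  q="status:merged after:2019-05-01 before:2019-06-01", n=100)
    for change in changes:
        comments = get("/changes/%s/comments" % change["id"])
        # ... count "recheck" comments per change here.

You will also want to page through results with the S/start parameter, since the server caps how many changes a single query returns.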
> * has anyone already invested time in this direction? (happy to team up
> instead of going DIY)

I believe ttx has done similar work to find different stats in the past. I want to say one of the big takeaways from that was to not query all changes every time. You want to run periodically and then query only the changes that have updated since your last run.

> One graph I do want to build would show the average number of rechecks
> done on reviews before they are merged, per month. As you can see, that
> is not a trivial query to do, but it is doable.
>
> Another metric it would be fun to build is the estimated failure rate
> for a single CR. As I can compute an average failure rate for each job,
> it should not be hard to compute an estimated failure rate when I have
> the list of jobs triggered for a specific change. That second one may
> not be very practical, but I am really curious what kind of values I
> would end up with (obviously the range would vary a lot between
> projects).

You may be able to approximate this simply by checking graphite for failure rates on jobs, then assuming failures are independent (this assumption is likely untrue but probably good enough for a start), and calculating the aggregate chance of failure for those jobs.

> Thanks
> Sorin

From cboylan at sapwetik.org  Mon Jun 24 20:58:37 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 24 Jun 2019 13:58:37 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 25, 2019
Message-ID: <9d05af64-753f-4081-a6aa-c61684c09e81@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
** Zuul Cloner shim and Bindep fallback file removed from base OpenDev jobs
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190625)
** https mirror update (clarkb 20190625)
*** AFS on Bionic (kafs vs openafs)
*** Status update
*** https://review.opendev.org/#/q/status:open+branch:master+topic:kafs
* Open discussion