From cboylan at sapwetik.org  Mon Jun 3 19:09:04 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 03 Jun 2019 12:09:04 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 04, 2019
Message-ID: <27c3b6ae-fa8e-4af5-a35d-4146db0804a0@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190604)
** https mirror update (clarkb 20190604)
*** Ready to deploy across all regions?
** ARM64 Nodepool builder status (clarkb 20190604)
* Open discussion

From fungi at yuggoth.org  Fri Jun 7 21:44:46 2019
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Fri, 7 Jun 2019 21:44:46 +0000
Subject: [OpenStack-Infra] SPF enforcement on mailing lists
Message-ID: <20190607214446.y57ptqc52eshphmh@yuggoth.org>

This is just a quick heads up that, in order to deal with recent spam flooding from spoofed E-mail addresses claiming to originate from domains which provide strict IETF RFC 7208 SPF records[*], we have instituted a change to reject messages failing SPF "-all" policies for their domains (not those with "?all" or "~all") at time of receipt to our listservs. If anyone encounters issues which look like a rejection resulting from this change in behavior, please reach out to us in the #openstack-infra channel on the Freenode IRC network.

[*] https://tools.ietf.org/html/rfc7208
-- 
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From cboylan at sapwetik.org  Fri Jun 7 21:52:35 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Fri, 07 Jun 2019 14:52:35 -0700
Subject: [OpenStack-Infra] SPF enforcement on mailing lists
In-Reply-To: <20190607214446.y57ptqc52eshphmh@yuggoth.org>
References: <20190607214446.y57ptqc52eshphmh@yuggoth.org>
Message-ID: 

On Fri, Jun 7, 2019, at 2:51 PM, Jeremy Stanley wrote:
> This is just a quick heads up that, in order to deal with recent
> spam flooding from spoofed E-mail addresses claiming to originate
> from domains which provide strict IETF RFC 7208 SPF records[*], we
> have instituted a change to reject messages failing SPF "-all"
> policies for their domains (not those with "?all" or "~all") at time
> of receipt to our listservs. If anyone encounters issues which look
> like a rejection resulting from this change in behavior, please
> reach out to us in the #openstack-infra channel on the Freenode IRC
> network.
>
> [*] https://tools.ietf.org/html/rfc7208
> -- 
> Jeremy Stanley

And to confirm that ?all isn't rejected, here is an email from a domain with an ?all in its spf record.
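For reference, the three qualifier flavours look roughly like this (illustrative records only, not any real domain's DNS):

    example.org.  IN TXT  "v=spf1 mx -all"   ; hard fail: forged mail now rejected
    example.org.  IN TXT  "v=spf1 mx ~all"   ; soft fail: still accepted
    example.org.  IN TXT  "v=spf1 mx ?all"   ; neutral: still accepted

Only messages failing a "-all" policy get rejected at receipt; the other two qualifiers pass through as before.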
Clark

From cboylan at sapwetik.org  Mon Jun 10 19:33:02 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 10 Jun 2019 12:33:02 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 11, 2019
Message-ID: <6928876d-5905-4399-9b9a-15a492f0da79@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
** Clarkb out June 17-20. Need a volunteer to run the meeting June 18.
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190611)
** https mirror update (clarkb 20190611)
*** Ready to deploy across all regions? Seems like the ubuntu mirror issues have subsided?
** GitHub replication (corvus 20190611)
*** http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005007.html
** Requesting Spamhaus PBL exceptions (fungi 20190611)
** Replacing SSL certs this week (clarkb 20190611)
* Open discussion

From iwienand at redhat.com  Tue Jun 11 08:00:01 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 11 Jun 2019 18:00:01 +1000
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
Message-ID: <20190611080001.GA13504@fedora19.localdomain>

Hello,

I started to look at the system-config base -devel job, which runs Ansible & ARA from master (this job has been quite useful in flagging issues early across Ansible, testinfra, ARA etc., but it takes a bit of effort for us to keep it stable...). It seems ARA 1.0 has moved in some directions we're not handling right now. Playing with [1] I've got ARA generating and uploading its database.

Currently, Apache matches an ara-report/ directory on logs.openstack.org and sends the request to the ARA wsgi application, which serves the response from the sqlite db in that directory [2].
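(For anyone who hasn't looked at [2]: the effect of that middleware boils down to roughly the following -- an illustrative Python sketch with made-up paths, not the deployed code:)

    import os

    LOG_ROOT = "/srv/static/logs"  # assumed location of the on-disk log tree

    def application(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if "/ara-report/" in path:
            # Everything before ara-report/ identifies the job; each job
            # ships its own sqlite db alongside its logs.
            job_dir = path.split("/ara-report/", 1)[0].lstrip("/")
            db = os.path.join(LOG_ROOT, job_dir, "ara-report", "ansible.sqlite")
            if os.path.isfile(db):
                # Hand off to the ARA wsgi app configured against this
                # per-job database (elided).
                pass
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

The point being: one database per job, located purely from the request path, with no central server involved.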
If I'm understanding, we now need ara-web [3] to display the report page we all enjoy. However, this web app currently only gets data from an ARA server instance that provides a REST interface to the info?

I'm not really seeing how this fits with the current middleware deployment? (unfortunately [4] or an analogue in the new release seems to have disappeared). Do we now host a separate ARA server on logs.openstack.org on some known port that knows how to turn /*/ara-report/ URL requests into access of the .sqlite db on disk and thus provide the REST interface? And then somehow we host an ara-web instance that knows how to request from this?

Given I can't see us wanting to do a bunch of puppet hacking to get new services on logs.openstack.org, but yet also it requiring fairly non-trivial effort to get the extant bits and pieces on that server migrated to an all-Ansible environment, I think we have to give some thought as to how we'll roll this out (plus add in containers, possible logs on swift, etc ... for extra complexity :)

So does anyone have thoughts on a high-level view of how this might hang together?

-i

[1] https://review.opendev.org/#/c/664478/
[2] https://opendev.org/opendev/puppet-openstackci/src/branch/master/templates/logs.vhost.erb
[3] https://github.com/ansible-community/ara-web
[4] https://ara.readthedocs.io/en/stable/advanced.html

From dmsimard at redhat.com  Tue Jun 11 20:39:58 2019
From: dmsimard at redhat.com (David Moreau Simard)
Date: Tue, 11 Jun 2019 16:39:58 -0400
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: <20190611080001.GA13504@fedora19.localdomain>
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: 

Thanks for starting this thread Ian.

On Tue, Jun 11, 2019 at 4:09 AM Ian Wienand wrote:
> It seems ARA 1.0 has moved in some directions we're not handling right
> now. Playing with [1] I've got ARA generating and uploading its
> database.

Patch looks good to me. Thanks for playing with it :)

> Currently, Apache matches an ara-report/ directory on
> logs.openstack.org and sends the request to the ARA wsgi application,
> which serves the response from the sqlite db in that directory [2].

Correct.

> If I'm understanding, we now need ara-web [3] to display the report
> page we all enjoy. However, this web app currently only gets data from
> an ARA server instance that provides a REST interface to the info?

Correct.

> I'm not really seeing how this fits with the current middleware
> deployment? (unfortunately [4] or an analogue in the new release seems
> to have disappeared). Do we now host a separate ARA server on
> logs.openstack.org on some known port that knows how to turn
> /*/ara-report/ URL requests into access of the .sqlite db on disk and
> thus provide the REST interface?

The sqlite middleware doesn't have an equivalent in 1.0 right now.

Although it was first implemented as somewhat of a hack to address the lack of scalability of HTML generation, I've gotten to like the design principle of isolating a job's result in a single database.

It is easy to scale and keeps latency to a minimum compared to a central database server. I'm convinced that implementing a similar approach for 1.0 would make sense. I would happily accept any input here.

> And then somehow we host an ara-web instance that knows how to
> request from this?

Right now the API server is defined by the configuration [1].
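(i.e. something along these lines -- the field name here is from memory and illustrative only, [1] is the authoritative copy:)

    {
      "apiURL": "https://ara.example.org"
    }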
We would need to implement a more "dynamic" way of being able to specify an API endpoint to query.

For example, we recently added support for ara-web to prompt a login page if the API server requires authentication. The credentials are stored locally and then used to authenticate queries against the API. It might not be too much of a stretch to implement a way to store the API endpoint locally like we do for credentials. We wouldn't want the user(s) to type in the API endpoint every time, though.

There is a little thinking to do about how to glue the things together. The combination of the rewrite rule, the wsgi middleware and the fact that 0.x was a monolithic app definitely made this easier.

> Given I can't see us wanting to do a bunch of puppet hacking to get
> new services on logs.openstack.org, but yet also it requiring fairly
> non-trivial effort to get the extant bits and pieces on that server
> migrated to an all-Ansible environment, I think we have to give some
> thought as to how we'll roll this out (plus add in containers,
> possible logs on swift, etc ... for extra complexity :)
>
> So does anyone have thoughts on a high-level view of how this might
> hang together?

For now I've created an etherpad [2] as a starting point to summarize some of what has been written here as well as other points which are perhaps more Zuul-specific.

[1]: https://github.com/ansible-community/ara-web/blob/master/public/config.json
[2]: https://etherpad.openstack.org/p/ara-1.0-in-zuul

David Moreau Simard
dmsimard = [irc, github, twitter]

From corvus at inaugust.com  Tue Jun 11 23:38:01 2019
From: corvus at inaugust.com (James E. Blair)
Date: Tue, 11 Jun 2019 16:38:01 -0700
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: <20190611080001.GA13504@fedora19.localdomain> (Ian Wienand's message of "Tue, 11 Jun 2019 18:00:01 +1000")
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: <874l4vae1y.fsf@meyer.lemoncheese.net>

Ian Wienand writes:

> Given I can't see us wanting to do a bunch of puppet hacking to get
> new services on logs.openstack.org, but yet also it requiring fairly
> non-trivial effort to get the extant bits and pieces on that server
> migrated to an all-Ansible environment, I think we have to give some
> thought as to how we'll roll this out (plus add in containers,
> possible logs on swift, etc ... for extra complexity :)

Given that our direction is to remove logs.o.o from the system along with all of its proxy services and only serve static files from swift, I think we will have to continue to use the old version of ARA until the story with static generation is worked out for 1.0.

-Jim

From iwienand at redhat.com  Tue Jun 18 00:32:59 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 18 Jun 2019 10:32:59 +1000
Subject: [OpenStack-Infra] Meeting Agenda for June 18, 2019
Message-ID: <20190618003259.GA20591@fedora19.localdomain>

== Agenda for next meeting ==

* Announcements
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (ianw 20190618)
** https mirror update (ianw 20190618)
*** kafs in production update
*** https://review.opendev.org/#/q/status:open+branch:master+topic:kafs
* Open discussion

From iwienand at redhat.com  Tue Jun 18 01:55:27 2019
From: iwienand at redhat.com (Ian Wienand)
Date: Tue, 18 Jun 2019 11:55:27 +1000
Subject: [OpenStack-Infra] ARA 1.0 deployment plans
In-Reply-To: 
References: <20190611080001.GA13504@fedora19.localdomain>
Message-ID: <20190618015527.GA26490@fedora19.localdomain>

On Tue, Jun 11, 2019 at 04:39:58PM -0400, David Moreau Simard wrote:
> Although it was first implemented as somewhat of a hack to address the
> lack of scalability of HTML generation, I've gotten to like the design
> principle of isolating a job's result in a single database.
>
> It is easy to scale and keeps latency to a minimum compared to a
> central database server.

I've been ruminating on how all this can work given some constraints:

- keep the current model of "click on a link in the logs, see the ara results"
- no middleware to intercept such clicks with logs on swift
- we don't actually know where the logs are if using swift (not just logs.openstack.org/xy/123456/), which makes it harder to find job artefacts like sqlite db's post job run (have to query gerrit or the zuul results db?)
- some jobs, like in system-config, have "nested" ARA reports from subnodes; essentially reporting twice. Can the ARA backend import a sqlite run after the fact?

I agree that adding latency to jobs running globally by sending results piecemeal back to a central db isn't going to work; but if each job logged everything to a local db as now, and then we uploaded that to a central location in post, that might work?

Although we can't run services/middleware on logs directly, we could store the results as we see fit and run services on a separate host. If, say, you had a role that sent the generated ARA sqlite.db to ara.opendev.org and got back a UUID, then it could write into the logs an ara-report/index.html which might just be a straight 301 redirect to https://ara.opendev.org/UUID.
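(On swift we probably can't do a real server-side 301, so that index.html might just be a static stub along these lines -- hypothetical, with UUID filled in by the role:)

    <!-- hypothetical ara-report/index.html written at upload time -->
    <html>
      <head>
        <meta http-equiv="refresh" content="0; url=https://ara.opendev.org/UUID/">
      </head>
      <body>
        Redirecting to <a href="https://ara.opendev.org/UUID/">the ARA report</a> ...
      </body>
    </html>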
This satisfies the "just click on it" part.

It seems that "all" that needs to happen is for requests to https://ara.opendev.org/uuid/api/v1/... to query just the results for "uuid" in the db. And could the ara-web app (which is presumably then just statically served from that host) know that when started as https://ara.opendev.org/uuid it should talk to https://ara.opendev.org/uuid/api/...?

I think, though, this might be relying on a feature of the ara REST server that doesn't exist -- the idea of unique "runs"? Is that something you'd have to paper over with, say, wsgi starting a separate ara REST process/thread to respond to each incoming /uuid/api/... request (maybe the process just starts pointing to /opt/logs/uuid/results.sqlite)?

This doesn't have to grow indefinitely; we can similarly just have a cron query to delete rows older than X weeks.

Easy in theory, of course ;)

-i

From ssbarnea at redhat.com  Thu Jun 20 09:12:09 2019
From: ssbarnea at redhat.com (Sorin Sbarnea)
Date: Thu, 20 Jun 2019 10:12:09 +0100
Subject: [OpenStack-Infra] getting some advanced stats out of gerrit reviews
Message-ID: 

I would like to build some stats related to openstack gerrit reviews and I would like to know a few things:

* how can I get a dump of the reviews with all comments? (queries on the live DB are not an option)
* has anyone already invested time in this direction? (happy to team up instead of going DIY)

One graph I do want to build would show the average number of rechecks done on reviews before they are merged, per month. As you can see, that is not a trivial query to do, but it is doable.

Another metric it would be fun to build is the estimated failure rate for a single CR. As I can compute an average failure rate for each job, it should not be hard to compute an estimated failure rate when I have the list of jobs triggered for a specific change. That second one may not be very practical, but I am really curious what kind of values I would end up with (obviously the range would vary a lot between projects).

Thanks
Sorin

From 751484782 at qq.com  Mon Jun 24 01:54:28 2019
From: 751484782 at qq.com (wangchengcheng)
Date: Mon, 24 Jun 2019 09:54:28 +0800
Subject: [OpenStack-Infra] New contributor
Message-ID: 

751484782 at qq.com

From cboylan at sapwetik.org  Mon Jun 24 17:00:34 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 24 Jun 2019 10:00:34 -0700
Subject: [OpenStack-Infra] getting some advanced stats out of gerrit reviews
In-Reply-To: 
References: 
Message-ID: <7833fde9-de78-4664-8834-43ccef108202@www.fastmail.com>

On Thu, Jun 20, 2019, at 2:15 AM, Sorin Sbarnea wrote:
> I would like to build some stats related to openstack gerrit reviews
> and I would like to know a few things:
> * how can I get a dump of the reviews with all comments? (queries on
> the live DB are not an option)

You can query for the list of changes using https://review.opendev.org/Documentation/rest-api-changes.html#list-changes. The query can include a start and stop date too. Then iterate over that list of changes and get their comments with https://review.opendev.org/Documentation/rest-api-changes.html#list-change-comments.
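Roughly like this (an untested sketch using python-requests; Gerrit prefixes its JSON responses with a ")]}'" line to prevent XSSI, so you have to strip that before parsing):

    import json
    import requests

    GERRIT = "https://review.opendev.org"

    def get(path, **params):
        # Strip Gerrit's ")]}'" anti-XSSI line, then parse the JSON.
        text = requests.get(GERRIT + path, params=params).text
        return json.loads(text.split("\n", 1)[1])

    # Changes merged in a one month window (operators per the docs above).
    changes = get("/changes/",
                  q="status:merged after:2019-05-01 before:2019-06-01", n=100)
    for change in changes:
        comments = get("/changes/%s/comments" % change["id"])
        # ... count "recheck" comments per change here.

You will also want to page through results with the S/start parameter, since the server caps how many changes a single query returns.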
> * has anyone already invested time in this direction? (happy to team up
> instead of going DIY)

I believe ttx has done similar work to find different stats in the past. I want to say one of the big takeaways from that was to not query all changes every time. You want to run periodically and then query only the changes that have updated since your last run.

> One graph I do want to build would show the average number of rechecks
> done on reviews before they are merged, per month. As you can see, that
> is not a trivial query to do, but it is doable.
>
> Another metric it would be fun to build is the estimated failure rate
> for a single CR. As I can compute an average failure rate for each job,
> it should not be hard to compute an estimated failure rate when I have
> the list of jobs triggered for a specific change. That second one may
> not be very practical, but I am really curious what kind of values I
> would end up with (obviously the range would vary a lot between
> projects).

You may be able to approximate this simply by checking graphite for failure rates on jobs, then assuming failures are independent (this assumption is likely untrue but probably good enough for a start), and calculating the aggregate chance of failure for those jobs.

> Thanks
> Sorin

From cboylan at sapwetik.org  Mon Jun 24 20:58:37 2019
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 24 Jun 2019 13:58:37 -0700
Subject: [OpenStack-Infra] Meeting Agenda for June 25, 2019
Message-ID: <9d05af64-753f-4081-a6aa-c61684c09e81@www.fastmail.com>

== Agenda for next meeting ==

* Announcements
** Zuul Cloner shim and Bindep fallback file removed from base OpenDev jobs
* Actions from last meeting
* Specs approval
* Priority Efforts (Standing meeting agenda items. Please expand if you have subtopics.)
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/task-tracker.html A Task Tracker for OpenStack]
** [http://specs.openstack.org/openstack-infra/infra-specs/specs/update-config-management.html Update Config Management]
*** topic:update-cfg-mgmt
*** Zuul as CD engine
** OpenDev
*** Next steps
* General topics
** Trusty Upgrade Progress (clarkb 20190625)
** https mirror update (clarkb 20190625)
*** AFS on Bionic (kafs vs openafs)
*** Status update
*** https://review.opendev.org/#/q/status:open+branch:master+topic:kafs
* Open discussion